Interested in a HN-like source of information and discussions on AI news. Ideally it would include slightly more in-depth and in the weeds discussions on AI research and developments, while staying away from basic news stories and applications.
https://paperswithcode.com/ is arguably the best source and overview of all the research. It's also (somewhat) unbiased, being owned by Meta rather than an SEO-optimised company blog.
A lot of the latest high-performing models aren't making it to HN. I use paperswithcode by scanning major tasks for new models that come out #1 across multiple benchmarks and then reading those papers.
Reminds me of a reddit meme I saw a few weeks ago of a picture of a boxer trying to keep up with the rate of change and getting tired out, then saying "damn, singularity got hands"
The average academic paper is about 12-14 pages including citations.
So on average probably 50 pages an hour.
With practice, that's entirely reasonable.
Without practice it's a lot.
Reading technical literature is a skill that develops over time. At first it goes slow. Then it gets normal. At first it might take an hour to read a 12 page paper. After a few months, fifteen minutes becomes enough to read a paper and make a cup of coffee.
My take goes against most of the other comments here – don't keep up. It's not practical, the amount of new information and development is too much to process.
I have a daily workflow of scanning r/ML and HN and I subscribe to a few newsletters that I came across. I save bookmarks of tools and repos to raindrop.io and articles to readwise/reader. One good trick is to use the readwise feed email when subscribing to newsletters, so the newsletters go to Readwise instead of your personal email.
My big unsolved problem is Twitter — how do I keep my time on Twitter under half an hour a day, ideally by using some type of Twitter-based filter/aggregator?
Labml daily is a relatively good trend aggregator informed by Twitter. But I still keep discovering interesting things on Twitter not covered by any of the above. And BTW I bookmark twitter threads to Readwise/reader as well.
I used to deal with the "avoiding going on Twitter" problem by subscribing to interesting AI Twitter feeds in Feedbin, an RSS aggregator. Unfortunately that doesn't work any more because the "genius" in charge revoked Feedbin's Twitter API access a week or so ago. So now I don't check Twitter at all.
The people posting on Twitter do it to be heard by others. As the audience decreases, the significance of the platform decreases. Thus, people stop posting on said platform and use other avenues to get their voices heard.
As users drop off, advertisers leave, removing a large part of their revenue.
Your claim is so completely out of touch with reality...
The koolaid level is high with you.
No one stops posting on Twitter because Feedbin can't repost. You care about people who engage, not someone who doesn't follow, comment, or show up in the view count.
I think Twitter lists are a good starting point. There are plenty of ML/AI lists around. I start from there and whittle down to a good signal/noise ratio, meaning I avoid people who post overly frequent, fluffy hot takes. People who retweet good stuff are also worth a follow. Ultimately what I want to get out of Twitter is tools, papers, and good blog links.
For specific research-oriented follows I would suggest starting with Ilya Sutskever and seeing who he is following or retweeting. And for practical stuff, start with Simon Willison.
A blog which in the past speculated about Covid and bird flu, and now tells stories around generative methods. I would not classify this as keeping up with AI in the technical sense.
Zvi's updates are very comprehensive and detailed -- and his commentary is excellent, because he combines intellectual curiosity with a careful scrupulousness for factual accuracy. And his mind has some interesting corners.
I find that I always come away learning a thing or two from his updates -- and feel as though I'm keeping up with at least those developments which relate to commercially-available AI.
His blog is not a repository of scientific work like arXiv, but more of a curated summary of AI news. It is, after all, a blog.
"hijacking" the post to ask where I can find a good introduction to machine learning and AI. Not how to use this or this library but the fundamentals and principles behind. Preferably something explaining clearly the principles first then explaining the maths (from the beginning, my maths are quite far now) then showing practical usage/development (in any high level language like python or julia). I do not need to jump straight to the latest algorithms, I prefer starting with building bricks first
The usual recommendations, I think, are: Andrew Ng's Coursera course for the fundamentals, Andrej Karpathy's videos (https://karpathy.ai/zero-to-hero.html) for something more practical and LLM-focused, and also Fast.ai's courses. I've done some of the first two and they seem great.
The Little Learner[0]: Learn the basics using Scheme. This is fundamental; no frameworks, just Scheme and functions. From here you can go anywhere.
Then watch Neural Networks: Zero to Hero[1] to learn how autograd works, learn a few advanced topics, and most importantly, to see what the workflow of an expert looks like.
Then read PyTorch tutorials and the PyTorch docs while doing some real projects.
If PyTorch starts to feel too rigid, then look into Jax or Julia's Flux library.
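If it helps to see where this path leads, here is a minimal sketch of the standard PyTorch training loop; the data and tiny model below are made up purely for illustration:

```python
# Minimal PyTorch training-loop sketch (toy data, toy model; illustrative only).
import torch
import torch.nn as nn

# Toy data: 256 samples, 10 features, binary labels.
X = torch.randn(256, 10)
y = torch.randint(0, 2, (256,))

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    opt.zero_grad()              # reset gradients from the previous step
    loss = loss_fn(model(X), y)  # forward pass + loss
    loss.backward()              # backprop via autograd (the Zero to Hero material)
    opt.step()                   # gradient-descent update
    print(epoch, loss.item())
```

Every "real project" is essentially this loop with a proper dataset, model, and evaluation wrapped around it.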
Karpathy recently tweeted that we don't need more researchers, we need prompt engineers. That goes beyond just writing prompts to actually writing code that makes AI agents carry out multiple tasks; AutoGPT, for instance, is kind of amazing. I think if you built something like that you'd get job offers from AI labs. They're all moving from research to products, meaning they need people to create use cases around their new products.
Is it just me, or is this post complete nonsense? This AI hype seems to be propagated by people who have never programmed more than 1 kloc in their lives. But maybe that's the point?
For example:
> “babyagi” is a program that given a task, creates a task list and executes the tasks over and over again. It’s now been open sourced and is the top trending repos on Github atm [Link]. Helpful tip on running it locally [Link]
The babyagi project is an extremely simple 180-line Python script. The tips for running it just rephrase the readme: set some environment variables.
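For anyone curious, the pattern those 180 lines implement boils down to roughly the following. This is my own paraphrase, not the actual babyagi code, and `call_llm` is a stand-in for whatever completion API you'd plug in:

```python
# Rough paraphrase of the "task list + execute in a loop" pattern; not babyagi's real code.
from collections import deque

def call_llm(prompt: str) -> str:
    """Stand-in for an LLM completion call (hypothetical)."""
    raise NotImplementedError

def agent_loop(objective: str, max_steps: int = 10) -> None:
    tasks = deque([f"Make a plan to achieve: {objective}"])
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()
        result = call_llm(f"Objective: {objective}\nTask: {task}\nDo the task.")
        # Ask the model for follow-up tasks and push them onto the queue.
        new_tasks = call_llm(
            f"Objective: {objective}\nLast result: {result}\nList new tasks, one per line."
        )
        tasks.extend(t.strip() for t in new_tasks.splitlines() if t.strip())
```

That's the whole trick: a task queue and two prompts in a loop.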
Lots of sources. However, Last Week in AI has been a great podcast since I started listening a couple of months ago; they quickly summarize and discuss papers and news. As with Covid, beware of resources that only started covering AI because it's trendy lately.
For years I have followed top researchers on Twitter, and it has helped quite a bit to stay up to date on the topic. Today I think it's still quite good for that purpose, despite the countless ways that Musk is trying to make it worse...
HN sits on the Early Majority border of the innovation adoption curve. It's also highly resistant to new tech of all sorts and tries to bury it.
HN is okay as a backstop for areas you don't watch closely, so you don't miss out on anything cool. It used to be 1-2 years behind the curve in the GPT-3 era, but now that things are moving faster, it's only around 3 months or so behind.
While we are on the topic, can somebody give a TLDR of what breakthroughs made the current AI advancements possible? From what I understand, the "foundation" is exactly the same as it was 40 years ago: same neural networks, same activation functions, same architectures, same gradient descent. If I ask the "skeptical" crowd, they say: "nothing is new, we just started using GPUs". Some say there were breakthroughs in learning algorithms to facilitate deep learning (i.e. that features are learned automatically by deeper layers). Can someone elaborate on this, please? I tried googling and I only get crap articles that just "wave hands".
I studied machine learning 20 years ago, and have only been a casual observer since. I hope someone else corrects me where I'm wrong.
I'd say the big change was first deep nets, mostly CNNs. 20 years ago there was a lot of talk about fully connected hidden layers, and how a multi-layer net is equivalent to an ANN with one hidden layer if you allow exponential scaling of the number of perceptrons. They took forever to converge. CNNs reduce the number of parameters, and hence the training time, by quite a lot. They also make it obvious that even if multi-hidden-layer nets are theoretically equivalent, in practice you can cheat a lot using convolutions, making that theory pointless.
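To put a rough number on that parameter reduction, here's a quick PyTorch comparison on a 3x32x32 input (purely illustrative):

```python
# Compare parameter counts: a fully connected layer vs. a convolution over the same input size.
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# Input: a 3x32x32 image (3072 values), output kept at the same size.
fc = nn.Linear(3 * 32 * 32, 3 * 32 * 32)         # dense: every input connects to every output
conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # conv: small shared 3x3 kernels

print(n_params(fc))    # 9440256
print(n_params(conv))  # 84
```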
Then came embeddings that segmented training into an embedding stage and a prediction stage, further reducing training time.
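A tiny sketch of that two-stage split, with random vectors standing in for pre-trained word embeddings (word2vec/GloVe style):

```python
# Two-stage idea: reuse a pre-trained embedding table, train only the predictor on top.
import torch
import torch.nn as nn

vocab_size, dim = 10_000, 50
pretrained = torch.randn(vocab_size, dim)                     # stand-in for word2vec/GloVe vectors
emb = nn.Embedding.from_pretrained(pretrained, freeze=True)   # embedding stage, not trained here
clf = nn.Linear(dim, 2)                                       # prediction stage, trained on the task

tokens = torch.tensor([[1, 5, 42]])                           # a toy 3-token "sentence"
logits = clf(emb(tokens).mean(dim=1))                         # average word vectors, then classify
```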
RNNs never really took off as a name, and seem to have been replaced by LSTMs, a subset of the vague RNN term. Having a well-understood way of introducing memory is very useful for sequences, and makes scaling a model to sequences much cheaper. So, yet another input-size-reducing thing. (I guess we're basically building Markov processes on top of ANNs now? When I was in school, it was an either-or choice.)
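For reference, a minimal LSTM-over-a-sequence sketch in PyTorch; the shapes are the whole point here:

```python
# Minimal LSTM over a sequence: the hidden state carries "memory" across time steps.
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 20, 8)    # batch of 4 sequences, 20 steps, 8 features each
out, (h, c) = lstm(x)        # out: per-step outputs; (h, c): final hidden/cell state
print(out.shape, h.shape)    # torch.Size([4, 20, 16]) torch.Size([1, 4, 16])
```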
On a meta-level, there are ensemble models, which is just taking many models, running them in parallel and combining the outputs. Simple idea, but apparently effective enough that people keep doing it. :)
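A sketch of that idea, with untrained linear models standing in for separately trained ones:

```python
# Ensemble sketch: average the predictions of several independently trained models.
import torch
import torch.nn as nn

models = [nn.Linear(10, 2) for _ in range(5)]    # stand-ins for separately trained models
x = torch.randn(1, 10)
avg_logits = torch.stack([m(x) for m in models]).mean(dim=0)
prediction = avg_logits.argmax(dim=1)            # class chosen by the averaged ensemble
```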
Then there was attention being added, which adds a stage where you find similarities between entities. This means the ANN doesn't have to figure out the clustering itself, which again speeds up convergence. (I think it also makes the models more modular, but I'm not sure.)
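The similarity stage is small enough to write out. A minimal scaled dot-product attention sketch, here as self-attention over 5 toy tokens:

```python
# Scaled dot-product attention: scores are similarities between queries and keys.
import math
import torch

def attention(q, k, v):
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # pairwise similarity
    weights = scores.softmax(dim=-1)                           # normalise to a distribution
    return weights @ v                                         # weighted mix of values

q = k = v = torch.randn(1, 5, 16)   # self-attention: 5 tokens, 16 dims each
out = attention(q, k, v)            # shape (1, 5, 16)
```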
Reinforcement learning (Q-learning, from the late 1980s) arguably benefits tremendously from all the improvements to the ANN models, so I think time will tell how awesome this will be. For now, it seems we're all amazed by how much just the ANN (the Q-function) can do on its own.
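A stripped-down sketch of the "Q-function as a neural net" idea (DQN-style), reduced to a single made-up transition and one update step:

```python
# One temporal-difference update with a neural-net Q-function (toy transition, illustrative only).
import torch
import torch.nn as nn

n_states, n_actions, gamma = 4, 2, 0.99
q_net = nn.Sequential(nn.Linear(n_states, 32), nn.ReLU(), nn.Linear(32, n_actions))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# One (state, action, reward, next_state) transition, made up for illustration.
s, a, r, s_next = torch.randn(n_states), 1, 1.0, torch.randn(n_states)

target = r + gamma * q_net(s_next).max().detach()   # bootstrap from the net's own estimate
loss = (q_net(s)[a] - target) ** 2                  # temporal-difference error
opt.zero_grad(); loss.backward(); opt.step()        # standard gradient step on the Q-net
```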
All of this is enabled by vastly improved matrix arithmetic hardware, software libraries and RAM prices. (Aside from each of these techniques just reducing training time/size.)
And then there's the inevitable hype bubble that just turns it all up to 11 for a while, because people with money want to invest and people without money want to pretend to know things. I wouldn't discount the availability of money as a "news" here, though.
Sure, I can provide a brief overview of the key breakthroughs and advancements that have contributed to the current state of AI, particularly in the domain of deep learning.
1. Availability of data: The explosion of digital data, especially from the internet, has provided a massive amount of training data for AI models. This has allowed AI systems to learn patterns, features, and representations from various data sources more effectively than before.
2. Hardware improvements: The introduction of GPUs (Graphics Processing Units) and specialized hardware, like TPUs (Tensor Processing Units), has significantly accelerated the training of large neural networks. These advancements enable researchers to experiment with larger and more complex models, leading to improved performance.
3. Algorithmic innovations: Key algorithmic advancements have been made to train deep neural networks more efficiently. Some notable examples include:
a. Backpropagation: This algorithm is used to train neural networks by minimizing the loss function through gradient descent. Although it was introduced in the 1980s, it became more widely used and optimized in recent years.
b. Activation functions: Non-linear activation functions like ReLU (Rectified Linear Unit) have been crucial in addressing the vanishing gradient problem and improving training efficiency in deep networks.
c. Dropout: This regularization technique helps prevent overfitting by randomly dropping out neurons during training, encouraging the network to learn more robust features.
4. Architectural advancements: The development of various neural network architectures has led to improved performance in specific tasks. Some prominent architectures include:
a. Convolutional Neural Networks (CNNs): These networks are especially effective at image recognition tasks due to their ability to capture spatial patterns and hierarchical features.
b. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM): These architectures excel at handling sequence data, such as time series or natural language processing tasks.
c. Transformers: Introduced in 2017, the transformer architecture has become a key component in state-of-the-art natural language processing models like BERT and GPT, due to its self-attention mechanism and ability to handle long-range dependencies.
5. Transfer learning and pre-training: Instead of training models from scratch, researchers have found it effective to pre-train models on large datasets, followed by fine-tuning them on specific tasks. This approach reduces training time, requires less labeled data, and often leads to better performance.
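A minimal sketch of this pre-train/fine-tune pattern, assuming a recent torchvision and its ImageNet-pretrained ResNet-18 as the backbone:

```python
# Transfer-learning sketch: reuse a pretrained backbone, train only a new task-specific head.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # backbone pre-trained on a large dataset
for p in model.parameters():
    p.requires_grad = False                        # freeze the pre-trained features

model.fc = nn.Linear(model.fc.in_features, 10)     # fresh head for a 10-class target task
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)  # fine-tune only the new head
```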
These breakthroughs and advancements, combined with a growing research community and increased investment in AI, have led to the current state of AI, where deep learning models can achieve human-level or near-human-level performance on a variety of tasks.
Various Discord channels, if you want the latest. As much as I hate Discord's UI and ecosystem, its value for up-to-date information about AI can't be matched.