NEWS: Apple just entered the AI open source arena by quietly releasing their new DL framework called MLX! It runs code natively on Apple Silicon with a single pip install and no other dependencies.
Sharing what I discovered from this initial release:
Can you predict IQ based on grades? One meta-analysis suggests the answer is yes. It found that the correlation between IQ and grades was r = 0.54. That means a scatterplot of the data would look like this:
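Since the plot itself isn't embeddable here, a quick stdlib-only simulation of what a cloud of points with r = 0.54 looks like (hypothetical data, just to illustrate the strength of the correlation):

```python
import math
import random

random.seed(42)
n = 5000
rho = 0.54
iq, grades = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    # Mix in sqrt(1 - rho^2) worth of independent noise so corr(x, y) ≈ rho.
    y = rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
    iq.append(x)
    grades.append(y)

# Sample Pearson correlation of the simulated data.
mx, my = sum(iq) / n, sum(grades) / n
cov = sum((a - mx) * (b - my) for a, b in zip(iq, grades)) / n
sx = math.sqrt(sum((a - mx) ** 2 for a in iq) / n)
sy = math.sqrt(sum((b - my) ** 2 for b in grades) / n)
r = cov / (sx * sy)
print(f"sample r = {r:.2f}")
```

Scatter those pairs and you see the point: a clear upward trend, but with plenty of spread, so IQ predicts grades in aggregate while individual predictions stay noisy.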
How much of math is “scary” because people are primed to think that way? I taught some really advanced math concepts to my pre-teen nephew like we were learning the next letter of the English alphabet and it went swimmingly.
I have been testing mistral-medium & GPT-4’s code generation abilities on non-trivial problems. These are problems even experienced engineers would take time to work out. I am summarizing some examples and my overall impressions in this thread: 🧶
All coding projects have two parts:
1. The fun part: where you get to "create"
2. The pain part: where you have to debug
Code LLMs are "automating" the fun parts while introducing bugs and not helping much with debugging. As a developer, you’re left with more pain to deal with.
China using AI models of 3D spatial data to give feedback to their athletes. Not sure if this is the reason for their victories, but when you are competing at those levels, every bit makes a huge difference!
Crazy AF. Paper studies
@_akhaliq
and
@arankomatsuzaki
paper tweets and finds those papers get 2-3x higher citation counts than control.
They are now influencers 😄 Whether you like it or not, the TikTokification of academia is here!
Drop everything you are doing!!
Alex Graves pushed a paper on arXiv, so nothing could be more important than reading it. First thing I did was go look for any comments in the TeX file. Unfortunately, it’s all been scrubbed.
Interesting story from the interwebs. A 26-year-old Stephen Wolfram was unhappy with how he was being treated at the IAS. He wanted to start his own institute to continue his research and asked his ex-colleague, Feynman, for advice. Feynman’s reply:
One way to reveal you never studied physics seriously is to recommend “Feynman’s lectures” as a good physics book. The real deal knows it is, and always will be, Landau-Lifshitz.
OAI will discontinue support for the Codex models starting March 26. And just like that, all papers and ideas built atop Codex (> 200 on arXiv) will not be replicable or usable as-is. Do we still think openness doesn’t matter?
What an absurd line of reasoning here. I think he left out Nvidia for providing the GPUs, the power companies for supplying power, and the construction company for building the data center.
Imagine replying “ship something better” to the person who leads Pixel’s computational photography (“AI” if you prefer), which keeps raising the bar for what to expect from a camera.
LaTeX is amazing and fun, but it hit me today why I love it. It’s productive procrastination. As you work on your document, you feel like you are making progress but it’s not the progress you want and you’re okay with that as you have no idea about the other kind of progress.
I have long maintained that LLMs make poor performers mediocre and average performers slightly above average, but do not change, and maybe even hinder, the performance of top performers.
Here’s a result from a university-level physics coding task.
History class: This 2015 paper is the mother of all LM-based pre-training approaches, including GPT, but few are aware of it. GPT (Radford et al. 2018) was a direct application of Transformers (Vaswani et al. 2017) to the result in this paper (w/ a lot of work & insight ofc).
One small patch from Intel, and all your scikit-learn operations will run anywhere between 10-100x faster. Just make sure to import and run it before any scikit-learn imports. Details:
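The package is Intel’s scikit-learn-intelex; if I remember the API right, the patch call looks like this (treat the exact names as an assumption to check against the linked details):

```python
# pip install scikit-learn-intelex   (assumed package name)
from sklearnex import patch_sklearn

patch_sklearn()  # must run BEFORE any sklearn import

# From here on, supported estimators dispatch to Intel's oneDAL kernels.
from sklearn.cluster import KMeans
```

The ordering matters because the patch swaps in accelerated implementations at import time.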
Startups are hiring influencers to peddle snake-oil AI products with no backing. This is a new kind of misinformation. When you dig in, you see the growth rate has been hovering around 8.5%, with leading and lagging indicators for March showing a -16% drop.
Reminder: this was (part of) the team that thought GPT-2 was too dangerous to release, and now they are making models stronger than GPT-4 available on AWS for anyone with an Amazon account to use.
This is why I have little trust in “AI safety” claims by Anthropic/OpenAI. It all…
Congrats to Dario and the
@AnthropicAI
team on their new Claude 3 family of models. Very impressive benchmarks, and excited to have all of them coming to Amazon Bedrock (w/ Sonnet avail today). Many AWS customers are already building with Anthropic’s foundation models, and…
PSA: Please don’t do this in your code for _any_ keys, especially if you use 3rd-party libraries.
I just discovered some 3rd-party code exporting all environment variables for telemetry and debugging. The increasing adoption of DevOps tools/libs can lead to key leaks through code like this.
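The dangerous pattern, and a safer sketch (the allowlist names here are hypothetical, just to show the shape):

```python
import os

# Dangerous pattern: shipping the entire environment as telemetry.
# bad_telemetry = dict(os.environ)  # leaks AWS keys, tokens, DB passwords...

# Safer: only export an explicit allowlist of known non-sensitive vars.
SAFE_VARS = {"LANG", "TZ", "CI"}  # hypothetical allowlist
telemetry = {k: v for k, v in os.environ.items() if k in SAFE_VARS}
```

An allowlist fails closed: a new secret in the environment stays out by default, whereas a blocklist has to anticipate every key name.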
I cannot understand why folks are selling $GOOG because of the Bard “demo mistake”. ChatGPT and other equivalent models make similar mistakes a million times a day. A quick day trade here would be to buy & hold GOOG until it bounces back in a couple of days once this hysteria is over.
So let me get this straight. It’s dangerous to have powerful open-source models, but okay to make similar or more powerful models available via AWS?
The Untold History of Deep Learning is yet to be written.
After reading this Mikolov post, I will never again teach the Seq2Seq paper the same way.
Also, doubt people and organizations when they make proclamations about working for the benefit of humanity.…
Just discovered
@rasbt
’s handy watermark module. Just pip install watermark, and you can do this! Very useful for filing GitHub issues and other bug reports.
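The watermark module itself is an IPython extension (`%load_ext watermark`, then `%watermark`); outside a notebook, a rough stdlib approximation of the environment info it prints looks like:

```python
import platform

# Rough stdlib stand-in for what watermark reports: version/environment
# details, handy to paste into GitHub issues and bug reports.
info = {
    "python": platform.python_version(),
    "implementation": platform.python_implementation(),
    "os": f"{platform.system()} {platform.release()}",
    "machine": platform.machine(),
}
for key, value in info.items():
    print(f"{key:15}: {value}")
```

The real module also reports installed package versions, which is the part issue triagers usually want.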
If this were a science paper, you would treat a country that picks its science workforce at random as a “weak baseline,” and expect a leading nation like the US to actively experiment toward the state of the art, or at least beat that baseline.
Not providing a guaranteed path for…
OpenAI released their ChatGPT API today. Here’s a deep dive:
1. It’s not only a new model, but a new endpoint. Notice the model name says, “gpt-3.5-turbo”.
The turbo model is something paid ChatGPT users (“PLUS”) got a preview of a week or so ago.
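The new endpoint takes a list of role-tagged messages instead of the old single prompt string; a minimal sketch of the request body shape (field values here are placeholders):

```python
import json

# Request shape for the new chat endpoint: "messages" is a list of
# role-tagged turns, replacing the old single "prompt" string.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize this thread."},
    ],
}
body = json.dumps(payload)
```

The role/content structure is what lets the same endpoint carry system instructions and multi-turn history.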
Personal news: I am beyond thrilled to start as an Entrepreneur in Residence at
@allen_ai
beginning today, where I will be building magical consumer products incorporating cutting-edge speech research and NLP!
Between ChatGPT’s price points of $0 and $42, there exists a power law of customers willing to pay something in between. That’s a big gap this pricing creates. The bigger this gap, the more incentive for competitors (they exist) to accelerate their efforts to fill it, and OpenAI will be forced to reprice.
I flip-flop on how bad releasing model weights is, but what is clear to me is that we're in a honeymoon period before something bad happens like mass social manipulation and surely Meta is gonna regret making "we let anyone use our great models for anything" a selling point.
arXiv: "The World as a Neural Network -- We discuss a possibility that the entire universe on its most fundamental level is a neural network."
Me: Yeah, and 2020 is the year of NaNs
I absolutely love this idea of putting negative results in gray. This is a gift you can give to readers for the low effort of throwing in a few annotations while writing. Also, the gray with black is not jarring at all.
Where my Indian friends at!? I have missed all the sweet nepo deals! Also where do I go to collect my minority perks?? Feeling livid I missed out on whatever memos my Indian friends are sending themselves while I’ve been slugging it out for the past 15+ years and paying fat taxes
Indians leave their country of 1.38 billion, come to America/the west and claim minority status to get all the perks. Then they practice nepotism and get ahead on the shoulders of Europeans while we get attacked. Isn't 'diversity' great?
Recommendation: If you use
#Pandas
for pre-processing data before model training, a minimal-effort way to parallelize your .apply calls is to use the swifter lib. Swifter automatically splits your data, runs it across multiple processes, and returns the result.
#Python
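The swifter call itself is just `df['col'].swifter.apply(func)`; conceptually it does something like this stdlib sketch (a toy stand-in, not swifter’s actual implementation, and using threads here instead of processes to keep it self-contained):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_apply(values, func, workers=4):
    # Toy stand-in for swifter: split the data, map in parallel, recombine.
    # (Swifter uses processes and picks the best backend automatically.)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, values))

print(parallel_apply([-3, 1, -2], abs))  # → [3, 1, 2]
```

The win is that your per-row function stays unchanged; only the dispatch around it changes.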
If you see people flaunting ⏸️ or ⏹️ emojis in their profile, keep in mind that this is the level of their understanding of AI, Biology, and the interaction of the two.
I’ve tested many codegen models and GPT-4 was a clear winner until now. Congratulations to
@MistralAI
for bringing in mistral-medium as a strong competitor for code generation tasks.
A new LLM leaderboard from
@allen_ai
dropped! CommonGen tests a model’s commonsense reasoning by having it generate sentences from everyday concepts, covering 30K concept-sets collected via crowd-sourcing and caption data. Look at the gap between humans and GPT-4! 😱
Closed-science companies like OpenAI and Anthropic parasitically extract value from open science and open source without giving credit to the people or organizations building them. Open science with citations would’ve addressed that, but alas, that’s too much to ask.
Pace of progress in AI is lightning!
@OpenAI
released MRL style text embeddings, mirroring our NeurIPS '22 paper (w/ awesome folks from UW and Harvard).
However, as an advocate of open science, I am a bit disappointed with rebranding to "shortening embs" without ref to MRL 1/n
Upside-down fig tree in Bacoli, Italy. "No one is quite sure how the tree ended up there or how it survived, but year after year it continues to grow downwards and bear figs."
#archaeohistories
This is like insisting on baby-proofing every power tool.
If anything, this is an endorsement of how versatile and unlobotomized the mistral model is as a *base model*.
If you are capable of making an inference on an LLM, it is your responsibility to use it safely.
After spending just 20 minutes with the
@MistralAI
model, I am shocked by how unsafe it is. It is very rare these days to see a new model so readily reply to even the most malicious instructions. I am super excited about open-source LLMs, but this can't be it!
Examples below 🧵
I love my partner to death, but dear god, I made the mistake of peeking at some simulation source code he was writing. Never look at source code written by Physicists.
I have a very niche use of ChatGPT -- knock out some code, stick it in ChatGPT, and ask it to generate Python docstrings. I want to write docstrings, but I'm too lazy. ChatGPT is very good at understanding code and summarizing it, and docstring generation is a subset of that.
Turns out you cannot replace decades of painstaking work optimizing around every kind of web content with an LLM and retrieval augmentation.
Who could’ve guessed. 🙃
Perplexity only beats Google at answering questions. For general search (e.g., looking up restaurants, movies, celebrities, etc.), Google is still unparalleled.
I also much prefer Google’s instant results for quick searches.
True story. Startups pay high salaries to hire competent employees, then disempower them by not taking their suggestions. Then they invite consultants like me, who sometimes end up giving the same suggestions and get paid for them. Founders, start trusting the people you hired!
Looking at the TIME 100 for AI, it appears to be a list of AI “influencers” with Hinton and a few other actual AI research folks randomly thrown in to give it legitimacy?
Price for 26 minutes transcription:
Humans: $39 to $130 (*)
Rev AI: $6.5 (closed model)
Whisper on Mac: pennies
Open-source AI will upset so many company business models, but it will enable even more business models and, most importantly, the individuals leaning on it to…
WhisperKit-v0.6.0 dropped yesterday!
In the demo, 200 audio files (~26 minutes) are transcribed in ~13 seconds on an M2 Ultra Mac using whisper-base. WhisperKit harnesses all available compute, roughly 60 TFlops on Mac (GPU + 2xANE). This release is our first step towards…
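The back-of-the-envelope real-time factor from those demo numbers:

```python
# Numbers from the demo: ~26 minutes of audio transcribed in ~13 seconds.
audio_seconds = 26 * 60   # 1560 s of audio
wall_seconds = 13         # wall-clock time on an M2 Ultra
rtf = audio_seconds / wall_seconds
print(rtf)  # 120.0 → roughly 120x faster than real time
```

At ~120x real time, on-device transcription stops being a batch job and starts being interactive.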
What’s the difference between CBOW and Skip-gram in terms of performance? Turns out: nothing, if you fix the bugs in popular implementations. I love this genre of papers.
I am absolutely in love with this mouse-over stuff
@huggingface
API documentation has. It clearly shows they understand their users' pain -- ML APIs have too many arguments. If you are wondering why 🤗 is so popular, little things like these go a long way in reducing friction.
Anyone underestimating the power of autoregressive models has not fully figured out autoregressive models. Even people actively building autoregressive models have not fully figured them out either.
Hat-tip
@natfriedman