Some people said that closed APIs were winning...
but we will never give up the fight for open source AI ⚔️⚔️
Today is a big day as we launch the first open source alternative to ChatGPT:
HuggingChat 💬
Powered by Open Assistant's latest model – the best open source chat…
🦄 We ported
@openai
's GPT-2 to run on-device (using Swift and CoreML on iOS) 🔥🔥
Large transformers models can now live on the edge. 📱📲
The video below is GPT-2 running locally (no network) on the device!
Code:
Built w/
@LysandreJik
at
@huggingface
Prediction:
there'll be 10 open reproductions of ChatGPT in 6 months
and if I understand correctly, these models are significantly smaller than the largest LLMs, so they might be much easier to deploy at scale (maybe even on a single GPU)
we just shipped HuggingChat on iOS 💬
The app is super polished and gives you access to the community's best open AI models, on the go.
Give it a try!
link to Appstore below ⤵️
Ten days ago I posted about GPT Store being a bit sad 😢:
What if we could build an open source alternative, with the full power of the Community?
So last Friday we launched Hugging Chat Assistants, and the adoption has been impressive:
- 4,000 Assistants have been created on…
This changes everything.
HuggingChat with Web Search is now available.
It fetches Google's top results to help the model generate its answer using data from the Web...
...And it's free 🤯
Chat-UI, the codebase behind our HuggingChat app, is now open sourced on GitHub. 🎉
We are looking for contributors 🏴☠️
Do YOU want to push the state of Open source AI forward? ☝️
The codebase is in
@sveltejs
SvelteKit but you can contribute in many different ways, not just…
🔥🔥 Series A!! 🔥🔥
Solving natural language is going to be the biggest achievement of our lifetime, and it is the best proxy for artificial intelligence.
No single company, not even the Tech Titans, will be able to do it alone – the only way we'll achieve this is by working together
At Hugging Face we have crazy engineers like
@MorganFunto
who are building racks of M1 Mac minis
The goal is to try and build the most efficient FLOPS/$ machine learning rig
AND the most energy-friendly CO2_eq/FLOPS machine learning rig
Has anyone here tried something like it?
GPT-2 on device is blazing fast on iPhone 11 ⚡️
Core ML 3 is officially out, so we can do state-of-the-art text generation on mobile (117M parameters running ~3 times per second on the Neural Engine!)
We put together a small video benchmark ⬇️
Apple, are you trying to bankrupt us?
At ~500 MB per average model download, 90k hits (Apple's IP addresses are in the 17.0.0.0/8 block) translate to ~45 TB of downstream bandwidth… which starts being costly for us.
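The back-of-the-envelope math checks out; a quick sketch in Python (the 500 MB average and 90k hit count are the figures from the tweet):

```python
# Estimate Apple's download bandwidth from the tweet's figures.
avg_model_mb = 500          # ~500 MB per average model download
downloads = 90_000          # ~90k hits from Apple's IP range

total_mb = avg_model_mb * downloads
total_tb = total_mb / 1_000_000  # MB -> TB (decimal units)
print(f"{total_tb:.0f} TB")  # → 45 TB
```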
Your new ML platform Overton () ⤵️
We have decided to update text-generation-inference (TGI)'s license.
We're switching the license from HFOIL (our custom license) back to Apache 2.0, making the library fully open source again.
Read below for why we are making this change 👀
my prediction for 2024 (yes, i have only one) 💡
Local ML is going to be huge.
It will be driven in part by the adoption of Apple Silicon and other innovative hardware, but also by plain CPUs and mobile devices
In many cases except for the largest of LLMs, local inference will…
Today we’re announcing 3 big things on the
@huggingface
Hub 🔥
Open this 🧵 to see all 3️⃣ of them. I'm very excited ❤️
1️⃣ The first one is that we’ve just rolled out Spaces GPU Upgrades for everyone
You can now upgrade to T4 and A10G, and we have A100 in private beta.
At NAACL last week we built a new side project, Write With Transformer.
It lets you trigger GPT-2 completions multiple times, in a Google Doc-like interface.
🦄 It's like having a unicorn friend that completes your thoughts 🦄 cc
@gdb
@AlecRad
Try it:
Introducing: Zephyr 141B-A35B 🥁
🔥Mixtral-8x22B fine-tune
🤯 Using ORPO: new alignment algorithm (no SFT, open )
🚀 With 7k instances of (open) data
Very strong IFEval, BBH, AGIEval... Enjoy! 🤗
Introducing... HF Training Cluster as a service! 🔥🔥
Access to a large compute cluster is key for large-scale model training, but historically it's been hard to secure access to large numbers of hardware accelerators, even with a hefty budget. 💰
With Training cluster as a…
Very happy to announce that we've moved our
@huggingface
AutoTrain compute to Quebec 🇨🇦(running on
@ovhcloud
)
Our compute now emits 11x less CO₂ 🔥 and runs on 99% renewable energy 🍀
Hat tip to our Infra team, to
@electricityMap
for the very neat CO₂ maps, and to
@SashaMTL
💚
sending many hugs to the
@huggingface
Infra Team who stayed up all night to mitigate a db scaling issue.
Post mortem will follow later in the week, in the meantime please join me in thanking Remy/Michelle/Eliott/Adrien/..
you folks all rock 💎
Congrats to
@allen_ai
for the excellent release of OLMo
It's a true open source end-to-end release:
not just
• model code
• model weights
but also
• training code
• training data (and associated toolkits)
• eval toolkits
This is pushing the envelope of open-source AI 🔥
What's the AI equivalent for forking a software repo?
Finetuning a model!
Today we introduce a new feature on the Hub:
We now display which base model a given model is:
• fine-tuned from,
• or an adapter (LoRA, PEFT, etc.) for.
This also enables a genealogical graph of all…
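For anyone who wants their model to show up in this lineage view: as I understand it, the link is declared via the `base_model` key in the model card's YAML metadata. A minimal sketch (the model name is just a placeholder):

```yaml
# YAML front matter at the top of the model's README.md
base_model: mistralai/Mistral-7B-v0.1   # the model this one was fine-tuned from
```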
I'm beyond stoked to launch the v2 of the
@huggingface
model hub today 🔥
Each of our 2,000 models now has an inference widget that lets you try it (text-classification, token-classification, translation, etc.) directly from the model page.
It's all powered by the community 💖
We just helped a team member from Mongolia get a work visa to France.
June 10: offer sent to candidate(&accepted)
June 16: appointment at the FR consulate in Ulaanbaatar, Mongolia 🇲🇳
June 22: visa application approved
June 30:
@mishig25
will travel to France 🎉
Go France!! 🇫🇷🇫🇷
The famous
@ycombinator
dropped a very cool article showcasing their 25+ companies that have trained their own AI models, rather than just using someone else's closed model through an API as a black box. ⤵️
“With these GPU optimizations, we were able to use 2000+ Azure GPU Virtual Machines across four regions to serve over 1 million BERT inferences per second worldwide”
Bing, using (distilled, 3-layer) BERT in production.
via
@rangthan
Update to 𝗢𝗨𝗥 𝗗𝗘𝗙𝗜𝗡𝗜𝗧𝗜𝗩𝗘 𝗧𝗨𝗧𝗢𝗥𝗜𝗔𝗟 🔥
You can now click the "Open in Colab" link and run it as a simple notebook. Hat/tip
@srush_nlp
for the idea and
@AdityaMalte
for the implementation help 🔥
➡️
Llama 2 (70B) just landed in HuggingChat💬
This is the largest running version of the model from
@MetaAI
, running on fast optimized inference on
@huggingface
infra.
Unleash the llamas! 🦙🦙
Try it out now
✨ Distil-All The Things✨
Today we release:
- 🐎DistilGPT-2 support in Write With Transformer. Compare your GPT-2 generations to our distilled, faster model:
- 🚨 DistilBERT for question answering for iOS
- 🦄 DistilGPT-2 for iOS:
NVIDIA A100 GPUs are now available on
@huggingface
Spaces, at competitive prices 🔥🔥🔥
• USD 4.13/hour in self-serve
• lower prices for Enterprise customers
Spaces will expose more cool hardware for ML in the coming months, ping us if you have specific needs 🙏
🔥 Thrilled to release our Swift Core ML implementation of BERT for question answering.🔥🔥
Transformers models now also live on the edge. 📱📲
You now CAN do state-of-the-art NLP on mobile devices!
Built w/
@LysandreJik
and
@Thom_Wolf
at
@huggingface
@huggingface
@awscloud
This is not about BLOOM or ChatGPT.
This is about the dozens of BLOOMs and ChatGPTs that are going to be released by the community in the coming months, and years.
For the record:
we've always been pretty frugal/scrappy at
@huggingface
.
This is the first >$100k piece of equipment we've bought, ever
I still believe ideas are worth way more than compute (though we do have access to ample compute now)
Photo credit:
@gloriamika
@MorganFunto
We just passed 1️⃣0️⃣,0️⃣0️⃣0️⃣ models on
@huggingface
🔥
This amounts to more than 1 trillion (10^12) machine learned parameters 🤯
Here's to the next 100k models 💜
Here's to the community building the future of AI with us ❤️🧡🖤
You can now filter for models that include data about their training's energy consumption and equivalent CO2 emissions ⚡️
To add this metadata to your models, see the screenshot below
hat/tip
@SashaMTL
@abhi1thakur
@mmitchell_ai
🙏
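For reference, a minimal sketch of the model-card metadata that feeds this filter (assuming the Hub's `co2_eq_emissions` key; the values are placeholders):

```yaml
# YAML front matter in the model's README.md
co2_eq_emissions:
  emissions: 1200                        # grams of CO2-eq emitted during training
  source: "ML CO2 Impact calculator"
  training_type: "fine-tuning"
```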
Today I am excited to release pytorch-block-sparse: a *drop-in* replacement of
@PyTorch
Linear with GPU-efficient sparsity:
75% sparsity➡️4x less memory + 2x the speed!
Code & tutorials to use to train your own sparse models:
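The memory math behind the headline number is simple: with block sparsity, only the non-zero blocks of the weight matrix are stored. A rough illustration (ignoring the small block-index overhead; this is not the library's actual storage format):

```python
# Why 75% sparsity gives ~4x less memory for a linear layer's weights.
in_features, out_features = 1024, 1024
density = 0.25                # 75% sparsity -> only 25% of blocks are kept
bytes_per_param = 4           # fp32

dense_bytes = in_features * out_features * bytes_per_param
sparse_bytes = dense_bytes * density   # only the kept blocks are stored

print(dense_bytes / sparse_bytes)  # → 4.0
```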
Jensen:
> If not for Llama 2, if not for mistral 7b, and all the open source AI models, the whole energy of the generative AI field would be very different
Yesterday
@deepfloydai
released IF, a new text-to-image diffusion model that can (among other things) generate text inside images.
We host the official demo on
@huggingface
Spaces 🔥
Traffic has been massive, so the best time to try it is now, before the U.S. wakes up 😛…
Actually, let's make bolder predictions for 2024:
W̶e̶'̶l̶l̶ ̶h̶a̶v̶e̶ ̶a̶t̶ ̶l̶e̶a̶s̶t̶ ̶o̶n̶e̶ ̶o̶p̶e̶n̶ ̶s̶o̶u̶r̶c̶e̶ ̶m̶o̶d̶e̶l̶ n̶e̶x̶t̶ ̶y̶e̶a̶r̶ ̶t̶h̶a̶t̶'̶s̶ ̶a̶s̶ ̶g̶o̶o̶d̶ ̶a̶s̶ ̶O̶p̶e̶n̶A̶I̶ ̶ ❌
Most open source models next year will be better than OpenAI's ✅
Congrats to Streamlit for the exciting news 🎉
We'll continue supporting Streamlit apps on
@huggingface
Spaces, and along with
@Gradio
we aim to build the best env for the whole community to build and deploy ML apps and demos 🔥🔥
Let's make Spaces 100x better in the next 2 yrs
What if you could casually access your remote GPU in HF Spaces from the comfort of your local VSCode 🤯
EDIT: did I mention we have an insane Infra team at
@huggingface
??
Phil Wang aka `lucidrains` wrote yet another awesome PyTorch implementation, this time of Enformer, DeepMind's model for predicting gene expression.
Thanks to
@NielsRogge
we can now load the pretrained models from the huggingface hub directly from the Enformer codebase 🔥
In case you missed it this week:
🔥🔥 Mistral-7B-Instruct is now available on HuggingChat 🔥🔥
It's the best 7B model to date, released under the permissive Apache 2.0 license
✍️
I don't know who gas9S9zw3P9c is on Hacker News (I promise it's not me 😉), but I'll tell you what, those comments are EXACTLY why we're getting up in the morning
Are you interested in Inference optimization?
🔥 Step 1: export your model to ONNX directly from transformers:
🏎 Step 2: Use our new repo Optimum to optimize your ONNX (operator fusion etc), including to a target hardware
A few thoughts on PyTorch Mobile (released last Thursday at PyTorch dev con: ):
Interesting, but *not* the future of edge ML, at least for large models. ⤵️
On iOS, Core ML runs a ResNet50 inference in 20ms on an iPhone 11 (including image preprocessing).
I've said it before, 2024 is the year of the synthetic dataset.
Not just for fine-tuning but also for training from scratch.
In her new guide the awesome
@LoubnaBenAllal1
reveals the actual tips and tricks to:
1/ generate a good 25B tokens dataset
AND
2/ train a good 1B LLM on…