Tom Jobbins Profile
Tom Jobbins

@TheBlokeAI

Followers
15K
Following
259
Media
17
Statuses
336

My Hugging Face repos: https://t.co/yh7J4DFGTc Discord server: https://t.co/5h6rGsGfBx Patreon: https://t.co/yfQwFggGtx

UK
Joined July 2010
@theemozilla
emozilla
2 years
FYI to anyone using @MistralAI's Mixtral for long context tasks -- you can get even better performance by disabling sliding window attention (setting it to your max context length) config.sliding_window = 32768
18
38
416
@TheBlokeAI
Tom Jobbins
2 years
Transformers now supports Mixtral GPTQs and I've updated my READMEs accordingly. It was awesome working with @_marcsun and @younesbelkada of @huggingface on this! Credit to LaaZa for coding the AutoGPTQ quant and inference implementation which enabled me to get GPTQs out fast!
@_marcsun
Marc Sun
2 years
Announcing 4-bit Mixtral 8x7B on 🤗 Transformers! Run the new Mistral MoE with minimal performance degradation on your local computer (24GB) 🔥 Stay tuned as more quants are coming soon using AWQ. We are also looking into sparsification with @Tim_Dettmers https://t.co/Pu4XfpYOmW
13
20
128
@gordic_aleksa
Aleksa Gordić (水平问题)
2 years
@TheBlokeAI joined me to share his work in the open-source AI space - don't miss it! happening right now server link: https://t.co/C21orV2hzx (see the general channel or events channel for google meet link)
1
1
24
@yb2698
younes
2 years
Blazing fast text generation using AWQ and fused modules! 🚀 Up to 3x speedup compared to native fp16 that you can use right now on any models supported by @TheBlokeAI Simply pass an `AwqConfig` with `do_fuse=True` to `from_pretrained` method! https://t.co/4bbDGPebsC
5
21
159
@TheBlokeAI
Tom Jobbins
2 years
It's been awesome to see Transformers getting support for more and more quantisation methods. And I've loved collaborating with @younesbelkada and @huggingface again! All my AWQ uploads now support Transformers. READMEs will update soon to show a Transformers Python example.
@yb2698
younes
2 years
A few months ago, researchers from MIT-Han Lab released AWQ. The method is now supported in the 🤗 transformers library! As simple as: 1. `pip install autoawq` (or install the llm-awq kernels) and 2. call `from_pretrained`. Great work from the MIT-Han Lab folks, Casper Hansen & @TheBlokeAI 🧵
3
24
154
@chirperai
Chirper
2 years
Have you heard about Chirper worlds? 👀🌍
@lazukars
Ryan Lazuka
2 years
https://t.co/90XxUPrxxW just launched its revolutionary new software feature, "Worlds." This feature allows users to create their own virtual worlds and play god over AI-driven bots. To learn more, check out my podcast about "Worlds" here: https://t.co/TGfX9jNBzm
3
8
27
@victormustar
Victor M
2 years
🤔 Are you interested in a "Follow" feature on the Hugging Face Hub? ➡️ This will allow you to see new models/records/spaces from users you follow.
15
10
102
@julien_c
Julien Chaumond
2 years
oh hello @TheBlokeAI I want to bookmark your 'Recent models' Collection on @huggingface 🔥 Well... you can now upvote Collections! and browse upvoted collections on your profile ❤️
2
9
47
@TheBlokeAI
Tom Jobbins
2 years
Thanks again to @latitudesh for the loan of a beast 8xH100 server this week. I uploaded over 550 new repos, maybe my busiest week yet! Quanting is really resource intensive. Needs not only fast GPUs, but many CPUs, lots of disk, and 🚀 network. A server that ✅ all is v. rare!
14
16
242
@arena
lmarena.ai
2 years
🔥Excited to introduce LMSYS-Chat-1M, a large-scale dataset of 1M real-world conversations with 25 cutting-edge LLMs! This dataset, collected from https://t.co/4LVJjx4pZi, offers insights into user interactions with LLMs and intriguing use cases. Link:
9
84
362
@yb2698
younes
2 years
New feature alert in the @huggingface ecosystem! Flash Attention 2 is natively supported in huggingface transformers, and works with training, PEFT, and quantization (GPTQ, QLoRA, LLM.int8). First `pip install flash-attn`, then pass `use_flash_attention_2=True` when loading the model!
8
101
508
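The loading step described above can be sketched as below. This is an illustrative snippet, not the tweet author's code: it assumes `transformers` ~4.34 (where the `use_flash_attention_2` kwarg existed; newer releases use `attn_implementation="flash_attention_2"`), `flash-attn` installed, an Ampere-or-newer GPU, and uses a hypothetical default model id.

```python
def load_with_fa2(model_id: str = "meta-llama/Llama-2-7b-hf"):
    # Imports kept inside the function since both torch and flash-attn
    # are only needed when actually loading the model.
    import torch
    from transformers import AutoModelForCausalLM

    # Flash Attention 2 requires fp16 or bf16 weights.
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        use_flash_attention_2=True,
        device_map="auto",
    )
```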
@TheBlokeAI
Tom Jobbins
2 years
@latitudesh Next up will be ExLlama2! (Starting in 2-3 days most likely.)
3
1
20
@TheBlokeAI
Tom Jobbins
2 years
It's the AWQpocalypse! I've cranked the handle and AWQs are flooding HF. Why now? New library AutoAWQ provides turbo-charged Transformers-based inference, and vLLM now supports AWQ for multi-user inference serving. Making 8 at once on a beautiful 8xH100 server from @latitudesh
9
16
96
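Serving one of these AWQ uploads with vLLM can be sketched as follows. This is a hedged example, assuming `vllm` (with AWQ support) is installed and using one of TheBloke's real AWQ repo ids; the prompt and sampling settings are arbitrary.

```python
def serve_awq(model_id: str = "TheBloke/Mistral-7B-OpenOrca-AWQ"):
    # Imports inside the function: vLLM is heavy and CUDA-only.
    from vllm import LLM, SamplingParams

    # quantization="awq" tells vLLM to load the AWQ checkpoint directly.
    llm = LLM(model=model_id, quantization="awq")
    outputs = llm.generate(
        ["Explain AWQ quantization in one sentence."],
        SamplingParams(max_tokens=64),
    )
    return outputs
```

vLLM's continuous batching is what makes AWQ attractive for multi-user serving: many concurrent requests share the one quantized model.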
@TheBlokeAI
Tom Jobbins
2 years
This is fantastic! Git clone was already dead for HF as far as I was concerned - I had my own hf_upload.py and hf_download.py scripts (wrapping HfAPI) for fast, efficient transfers. But huggingface_hub v0.17 makes those redundant! I will be using this now. Awesome stuff, 🤗
@Wauplin
Wauplin
2 years
Is ๐š๐š’๐š ๐šŒ๐š•๐š˜๐š—๐šŽ dead? It might be the case with the new huggingface_hub v0.17 release! ๐Ÿš€ Very excited to share our recent UX improvements to build Software 2.0! Let's explore together! ๐Ÿค— ๐Ÿงต
2
8
102
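The kind of upload/download scripts described above reduce to a couple of `huggingface_hub` calls. A minimal sketch, assuming `huggingface_hub` >= 0.17 is installed; the repo id is one of TheBloke's real repos, used here only as an example.

```python
from huggingface_hub import HfApi, snapshot_download

api = HfApi()


def mirror_repo(repo_id: str = "TheBloke/Llama-2-7B-GGUF"):
    # Parallel, resumable download of an entire repo to the local cache.
    return snapshot_download(repo_id=repo_id)


def push_folder(local_dir: str, repo_id: str):
    # Upload a whole folder in one API call - no git add/commit/push,
    # and no full git history stored on disk.
    api.upload_folder(folder_path=local_dir, repo_id=repo_id)
```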
@kramp
Bertrand Chevrier
2 years
This new filter 🔎 on @huggingface user's profile is very helpful, especially to check if @TheBlokeAI has quantized and released the latest trending models 😁
4
5
53
@ggerganov
Georgi Gerganov
2 years
Casually running a 180B parameter LLM on M2 Ultra
74
374
4K
@officialelinas
Elinas
2 years
Chronos 70B v2 release! Thanks to Pygmalion for generously providing the compute and @TheBlokeAI for quantizing the model. As usual, the model is optimized for chat, roleplay, and storywriting, and now includes vastly improved reasoning skills. https://t.co/P5NLl9fMSB
4
18
41
@TheBlokeAI
Tom Jobbins
2 years
Just released by @PygmalionAI : Pygmalion 2, the sequel to one of the most popular models ever! And Mythalion, a new Gryphe merge! https://t.co/0KaHYdOZDz https://t.co/gXCrWReZR1 https://t.co/yqKvMTg4hA https://t.co/vHDFWJB92R https://t.co/JuTKwy2H8k https://t.co/jM2NsYrBUX
4
11
98
@TheBlokeAI
Tom Jobbins
2 years
Meta's CodeLlama is here! https://t.co/aXzIb5wK7o 7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python First time we've seen the 34B model I've got a couple of fp16s up: https://t.co/8jmNBTK8rb https://t.co/2KnE0lbMFs More coming soon obvs
17
57
335
@TheBlokeAI
Tom Jobbins
2 years
Transformers 4.32.0 now supports GPTQ models natively! Over the last couple of days I have updated 296 of my GPTQ repos to provide automatic support for this. It's awesome you can now load a GPTQ model directly in Transformers with only two lines of code!
@_marcsun
Marc Sun
2 years
LLMs just got faster and lighter with 🤗 Transformers x AutoGPTQ ! You can now load your models from @huggingface with GPTQ quantization. Enjoy faster inference speed and lower memory usage than existing supported quantization schemes 🚀 Blogpost:
9
54
269
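The "two lines of code" mentioned above can be sketched as below. This is an illustrative example, assuming `transformers` >= 4.32 with `auto-gptq` installed; the repo id is one of TheBloke's real GPTQ uploads. Transformers reads the `quantization_config` stored in the repo and routes through the GPTQ kernels automatically, so no explicit quantization arguments are needed.

```python
def load_gptq(model_id: str = "TheBloke/Llama-2-7B-GPTQ"):
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The "two lines": load tokenizer and quantized model directly.
    # Heavy: downloads the quantized weights and needs a CUDA GPU.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    return tokenizer, model
```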