generatorman

@generatorman_ai

Followers: 1,697 · Following: 560 · Media: 293 · Statuses: 4,631

the model is the medium

Joined March 2023
Pinned Tweet
@generatorman_ai
generatorman
8 months
GPT is not a next-token predictor. The last FFN layer of GPT is a next-token predictor. All earlier layers are future-token predictors, more so when trained at longer context lengths.
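A minimal sketch of how this claim could be probed, assuming the HuggingFace transformers library and the small gpt2 checkpoint: a logit-lens-style test of how well each layer's hidden state already predicts the token two positions ahead. This is illustrative, not code from the tweet.

```python
# Hedged sketch: probe whether intermediate layers carry information about future tokens.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

text = "The Eiffel Tower is located in the city of Paris, France."
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    out = model(ids, output_hidden_states=True)

# hidden_states[0] is the embedding layer; [1..n] are the transformer blocks.
for layer, h in enumerate(out.hidden_states[1:], start=1):
    logits = model.lm_head(model.transformer.ln_f(h))   # project through the unembedding
    preds = logits[:, :-2].argmax(-1)                   # prediction made at each position
    acc = (preds == ids[:, 2:]).float().mean().item()   # compared against the token at t+2
    print(f"layer {layer:2d}: token-at-t+2 accuracy {acc:.2f}")
```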
@generatorman_ai
generatorman
8 months
ok I'm addicted to this Bing DALLE-3 prompt: "ikea instruction schematics for assembling X, detailed high resolution scan". Do you have any cool ideas?
[4 images attached]
@generatorman_ai
generatorman
1 year
Move over Alpaca, IBM just changed the game for open-source LLMs 💥 Dromedary🐪, their instruction-tuned Llama model, beats Alpaca in performance 𝙬𝙞𝙩𝙝𝙤𝙪𝙩 distilling ChatGPT, and 𝙬𝙞𝙩𝙝𝙤𝙪𝙩 human feedback! How do they do it? 👇 (1/4)🧵
@_akhaliq
AK
1 year
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision abs: paper page: project page:
[image attached]
@generatorman_ai
generatorman
8 months
just in: ikeacore is eating the world
[4 images attached]
@generatorman_ai
generatorman
8 months
One and a half years since OpenAI trained GPT-3.5, and nobody has caught up (except their own splinter group). OpenAI models are better, and they are furthest ahead on the hardest things. How are they doing this? How did they run away with it in full public view?
@generatorman_ai
generatorman
1 year
Yo @OpenAI just open-sourced a human feedback dataset with 800k labels 😱 They got GPT-4 trying to solve MATH problems using CoT, then raters evaluated each step in the CoT for correctness. Time for some Llama finetunes, maybe using the new DPO algo? @Teknium1 @abacaj
@OpenAI
OpenAI
1 year
We trained an AI using process supervision — rewarding the thought process rather than the outcome — to achieve new state-of-art in mathematical reasoning. Encouraging sign for alignment of advanced AIs: …
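Since the tweet floats finetunes with "the new DPO algo", here is a minimal sketch of the DPO objective on preference pairs; the variable names are mine and the beta value is just a common default, not anything from the dataset release.

```python
# Hedged sketch: the core DPO loss, given summed per-sequence log-probs of a "chosen" and
# "rejected" answer under the trained policy and a frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # maximize the margin between chosen and rejected under a logistic loss
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```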
@generatorman_ai
generatorman
8 months
we have to go deeper
[2 images attached]
@generatorman_ai
generatorman
1 year
@javilopen The person who will develop this virus just read your tweet and got the idea. Not your fault though - the seeds of this idea were in turn planted in your mind by GPT-4, in whose timeless latent space the virus already exists, has always existed.
@generatorman_ai
generatorman
10 months
@nostalgebraist This has been known for months now.
@generatorman_ai
generatorman
1 year
@jconorgrogan Here it hallucinated a plausible but fictional quantum optics bibliography. Only works on 3.5 though, GPT-4 will either refuse ("that's not productive") or correctly generate only the letter.
[image attached]
@generatorman_ai
generatorman
8 months
halp i can't stop
[2 images attached]
@generatorman_ai
generatorman
8 months
@kitten_beloved gandhi was like "centralized state fiat cannot accommodate the infinite variety of contextual human judgement" and ambedkar was like "bitch contextual human judgement enslaved my people for thousands of years, big state LFG"
@generatorman_ai
generatorman
1 year
They essentially make Llama 65B tune itself by using "context distillation". Given an instruction, they add an elaborate prompt with a list of guidelines and examples (the "constitution"), and that's enough to get the Llama 65B base model to answer the instruction. (2/4)🧵
[image attached]
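A rough sketch of the self-generation step described above, assuming HuggingFace transformers; the constitution text, prompt format, and checkpoint name are placeholders, not the paper's actual prompt.

```python
# Hedged sketch (mine, not the Dromedary code): prepend a constitution-style prompt to each
# instruction and let the *base* model answer it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "huggyllama/llama-65b"  # placeholder; any base (non-chat) LLM works for the sketch
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

CONSTITUTION = (
    "You are a helpful, honest assistant. Follow these principles: ...\n"
    "Example 1: ...\nExample 2: ...\n"
)

def self_generate(instruction, max_new_tokens=256):
    prompt = f"{CONSTITUTION}\nInstruction: {instruction}\nResponse:"
    ids = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**ids, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.9)
    return tok.decode(out[0][ids.input_ids.shape[1]:], skip_special_tokens=True)

# The (instruction, response) pairs are kept; the constitution prompt is dropped afterwards.
```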
@generatorman_ai
generatorman
8 months
How can we make LLMs more efficient? The dumbest thing about transformers is their static compute usage - given "The capital of France", both "is" and "Paris" consume one forward pass each! There are two approaches to allocating the compute dynamically. (1/)
@generatorman_ai
generatorman
1 year
Next they simply remove the long constitution prompt, keeping just the instruction - voila, now they have 250k instruction-response pairs! Finally they train a LoRA on this dataset to get the instruction-tuned Dromedary🐪. (3/4)🧵
[image attached]
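And a sketch of this final step, assuming the peft library; the rank, alpha, and target modules are illustrative defaults, not the hyperparameters from the Dromedary paper.

```python
# Hedged sketch, not the authors' code: wrap the base model with a LoRA adapter and train it
# on the distilled (instruction, response) pairs.
from peft import LoraConfig, get_peft_model

lora_cfg = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections in Llama-style models
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)    # `model` as loaded in the previous sketch
model.print_trainable_parameters()
# ...then run a standard causal-LM fine-tuning loop on "Instruction: ...\nResponse: ..." text.
```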
@generatorman_ai
generatorman
7 months
One weird trick to make LLMs understand maths natively! Instead of hacks like writing digits in reverse order, they inject numbers as an independent modality (like vision) - scalar repr along a dim in embedding space. This implementation is only a PoC, but the idea has legs!
@SiavashGolkar
Siavash Golkar
7 months
1/ Did you know LLMs like ChatGPT struggle with number manipulation and data analysis? They make up answers and fail to generalize. Today I am excited to share xVal, a continuous and efficient number encoding scheme with better generalization performance.
[image attached]
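One way to read "scalar repr along a dim in embedding space", sketched below; this is my paraphrase of the idea, not the xVal authors' implementation.

```python
# Hedged sketch: every number in the text maps to a single [NUM] token, and its numeric value
# scales a dedicated direction in embedding space, so magnitude is carried continuously
# rather than by digit tokens.
import torch
import torch.nn as nn

class NumericEmbedding(nn.Module):
    def __init__(self, vocab_size, d_model, num_token_id):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.num_direction = nn.Parameter(torch.randn(d_model) / d_model**0.5)
        self.num_token_id = num_token_id

    def forward(self, token_ids, numeric_values):
        # numeric_values holds the actual number wherever token_ids == [NUM], else 0
        emb = self.tok_emb(token_ids)
        is_num = (token_ids == self.num_token_id).unsqueeze(-1).float()
        return emb + is_num * numeric_values.unsqueeze(-1) * self.num_direction
```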
@generatorman_ai
generatorman
8 months
Google open-sourcing new SoTA CLIP models 🥳 They call it SigLIP because they train with a sigmoid loss instead of a softmax loss, but the models work the same way. Available at up to 400M params, yet they beat other size-H (632M) models in benchmarks. Great for t2i, vision LLMs, captioning...
@giffmana
Lucas Beyer (bl16)
8 months
Pleased to announce we are releasing checkpoints for our SigLIP models! These are very strong image-text ViTs. We release them along with a colab to play around with. Most are english, but we also release a good i18n one. Sorry, no magnet link mic drop. More in thread🧶
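The sigmoid-instead-of-softmax point, as a sketch; the temperature and bias values here are illustrative, not the released training configuration.

```python
# Hedged sketch (not the SigLIP release code): the sigmoid pairwise loss that replaces the
# softmax/InfoNCE loss in CLIP-style training. z_img and z_txt are L2-normalized embeddings.
import torch
import torch.nn.functional as F

def siglip_loss(z_img, z_txt, temperature=10.0, bias=-10.0):
    logits = temperature * z_img @ z_txt.t() + bias                   # [B, B] similarities
    labels = 2 * torch.eye(len(z_img), device=logits.device) - 1      # +1 on the diagonal, -1 off
    # every (image, text) pair gets an independent binary decision instead of a row-wise softmax
    return -F.logsigmoid(labels * logits).mean()
```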
@generatorman_ai
generatorman
1 year
No more being forced to distil ChatGPT values when you finetune LLMs - you can just use a local copy of Llama and your very own constitution! ⚖️What laws will you write? (Since there's no human feedback, the performance can't match Vicuna - but this will be solved.) (4/4)🧵
@generatorman_ai
generatorman
8 months
@ostrisai fresh flavour of ikeacore
@generatorman_ai
generatorman
8 months
just in: ikeacore is eating the world
[4 images attached]
@generatorman_ai
generatorman
11 months
It's over - @lmsysorg have done the work & proved out @kaiokendev1 's crazy RoPE interpolation trick not only works, it works better than everything else! Vicuna-13B now with 16k context 🤯 MPT-StoryWriter-65k, trained with Alibi, fails to reach its rated context length. (1/2)🧵
@lmsysorg
lmsys.org
11 months
🔥Introducing LongChat🤖, our new chatbots supporting 16K tokens context, and LongEval, our new benchmark for testing long context chatbots. 🤥Surprisingly, we found open LLMs often fail to achieve their promised context length. Check our blog for details:
[image attached]
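A sketch of the interpolation trick being referenced, assuming a RoPE model trained at 2k context being stretched to 16k; simplified, with the scaling factor hard-coded for clarity.

```python
# Hedged sketch (mine, not lmsys/kaiokendev code): linear position interpolation for RoPE.
# Positions are rescaled so they stay inside the range seen during pretraining before the
# rotary angles are computed.
import torch

def rope_angles(positions, dim, base=10000.0, trained_ctx=2048, target_ctx=16384):
    scale = trained_ctx / target_ctx                      # e.g. 1/8 for 2k -> 16k
    pos = positions.float() * scale                       # interpolate instead of extrapolate
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    return torch.outer(pos, inv_freq)                     # angles fed into the sin/cos rotation
```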
@generatorman_ai
generatorman
11 months
Impressive results from an open-source implementation of Google's StyleDrop! What's crazy is how weak the underlying t2i model is here - only 500M params, trained on CC3M for just 285k steps 😲 If somebody trains a large-scale Muse replication, this will get turbocharged 🚀
@_akhaliq
AK
11 months
unofficial PyTorch implementation of StyleDrop: Text-to-Image Generation in Any Style github:
[image attached]
@generatorman_ai
generatorman
8 months
Are you watching closely?
[image attached]
@generatorman_ai
generatorman
8 months
ok I'm addicted to this Bing DALLE-3 prompt: "ikea instruction schematics for assembling X, detailed high resolution scan". Do you have any cool ideas?
[4 images attached]
@generatorman_ai
generatorman
11 months
@browserdotsys "It is said that analyzing pleasure, or beauty, destroys it. That is the intention of this essay."
@generatorman_ai
generatorman
8 months
over and out
[image attached]
@generatorman_ai
generatorman
1 year
*Apologies to the authors for reading their alignment paper as a capabilities paper 😅
@generatorman_ai
generatorman
8 months
Just got DALLE-3 in ChatGPT Plus - They've nerfed it into oblivion 😕 Not only are people & fictional characters blocked, so are all brands. That means I can't run my ikeacore prompts, I can't even mention Coca Cola! Well actually there seem to be two levels of blocking. (1/)
[image attached]
@generatorman_ai
generatorman
10 months
@wordsandsense @katiedimartin It would be silly for him to consult IP attorneys to do something so obviously legal.
@generatorman_ai
generatorman
10 months
@benji_smith Based on your description, your website clearly had nothing to do with generative AI and was not even remotely close to being illegal. You did nothing wrong.
@generatorman_ai
generatorman
6 months
BREAKING: Ilya Sutskever signs letter giving an ultimatum to the OpenAI board, which includes... Ilya Sutskever. This movie is still going, and we're 3.5 hours from NASDAQ open.
@karaswisher
Kara Swisher
6 months
Breaking: 505 of 700 employees @OpenAI tell the board to resign.
[image attached]
@generatorman_ai
generatorman
10 months
tl;dr - OpenAI, Anthropic, Meta, Google, Microsoft, Inflection and Amazon all commit to not open-source any LLM more powerful than GPT-4. The rest is boilerplate and doesn't really move the needle much. What about image models though? 👇
@OpenAI
OpenAI
10 months
OpenAI and other leading AI labs are making a set of voluntary commitments to reinforce the safety, security and trustworthiness of AI technology and our services. An important step in advancing meaningful and effective AI governance around the world.
@generatorman_ai
generatorman
8 months
Doctoring Diversity or: How I Learned to Stop DALLE-3 from Editorializing My Prompts. We know that Bing uses an LLM to turn prompts we write into prompts received by the image model, and you can attack the LLM with prompt injection. This got me thinking - can I disable it? (1/)
@generatorman_ai
generatorman
8 months
Multimodal prompt exfiltration attack 🤯 This madlad figured out the system prompt of the LLM sitting before the Bing DALLE-3 model 👏
@generatorman_ai
generatorman
7 months
OpenAI Dev Day: GPT-4 Turbo API will have logprobs 🤯
@generatorman_ai
generatorman
8 months
-"b..but Khufu, nobody has ever scaled pyramids to this size!" -"I SAID STACK MORE LAYERS"
[image attached]
@generatorman_ai
generatorman
8 months
Are you watching closely?
[image attached]
@generatorman_ai
generatorman
1 year
@jconorgrogan Here it hallucinated a plausible but fictional quantum optics bibliography. Only works on 3.5 though, GPT-4 will either refuse ("that's not productive") or correctly generate only the letter.
[image attached]
@generatorman_ai
generatorman
1 year
@lmsysorg @OpenAI @AnthropicAI RWKV putting a RNN in the top 10 🤯 Some well-regarded models missing from the eval though - WizardLM, GPT4All, StableVicuna, vicuna-cocktail...
@generatorman_ai
generatorman
8 months
follow me for more leaked imperial schematics (sure anon, you can call me princess)
[image attached]
@generatorman_ai
generatorman
8 months
ok I'm addicted to this Bing DALLE-3 prompt: "ikea instruction schematics for assembling X, detailed high resolution scan". Do you have any cool ideas?
[4 images attached]
@generatorman_ai
generatorman
8 months
This is how far you can take video generation models already, just with scaling (~10B params). It's becoming clear to me that LLMs are Low Compute Phenomena. Video models are the real deal. OpenAI must be furiously scraping YouTube and cutting deals with CCTV operators.
@wayve_ai
Wayve
8 months
It’s #GAIA1 world, we just drive in it. Generate realistic driving videos using only prompts. See how it works below 🧵
@generatorman_ai
generatorman
8 months
[image attached]
@generatorman_ai
generatorman
7 months
Independent evidence for the claim that LoRAs provide a *better* model than full FT, through the implicit regularization. I remember @Teknium1 and others were running an experiment to verify this. Have we gotten any results from that?
@iker_garciaf
Iker García-Ferrero
7 months
When evaluating both checkpoints, we discovered that the model trained with QLoRA outperforms its counterpart, especially on unseen tasks. Fine-tuning all the parameters leads to an overfitted model that underperforms on evaluation tasks. LoRA/QLoRA prevents overfitting!
[image attached]
@generatorman_ai
generatorman
7 months
1) Diffusion models will learn almost identical distributions from two *disjoint* training sets (if datasets are large enough). Bad news for Getty & co 😂 2) They learn suboptimally when the training data has too little variation (low dimensionality).
[2 images attached]
@generatorman_ai
generatorman
10 months
@wordsandsense @katiedimartin If IP owners were free to interpret IP law as they saw fit, we would be living in a very different world 😂
@generatorman_ai
generatorman
10 months
@wordsandsense @katiedimartin If my friend pays me to write some content for their website, no I don't consult any attorneys. Some things have been established to be indisputably legal for a long time, so that even laypeople know there are no legal risks to doing it.
@generatorman_ai
generatorman
8 months
Meta is finally releasing a free text-to-image product. Available only in the US for now, sadly. The model is called Emu, they claim it beats SDXL, it even comes with a paper, and the central claim of the paper is truly shocking! (1/)
[2 images attached]
@generatorman_ai
generatorman
8 months
I'd love a real cookbook with this vibe. The closest thing I've read is the wonderful "Relish" by @LucyKnisley .
[image attached]
@generatorman_ai
generatorman
8 months
Nobody I know (outside tech/Twitter) expresses excitement about the generative AI trend. A little bit of indifference, a lot of concern. I know there was a little bit of this situation with personal computers, but I can't imagine it was anywhere near this magnitude.
@generatorman_ai
generatorman
1 year
How can you believe that the law simultaneously 1) allows @OpenAI to claim fair use for training their model on copyrighted data, 2) but prevents other models from claiming fair use for training on ChatGPT output? Either all commercial LLMs are illegal, or none of them are.
@generatorman_ai
generatorman
8 months
Multimodal prompt exfiltration attack 🤯 This madlad figured out the system prompt of the LLM sitting before the Bing DALLE-3 model 👏
@madebyollin
Ollin Boer Bohan
8 months
@multimodalart @GaggiXZ "You are the prompt modifier system for the DALL•E image generation service. You must always ensure the expanded prompt retains all entities, intents, and styles mentioned originally..."
[2 images attached]
@generatorman_ai
generatorman
10 months
Did a few spot checks on the Llama 2 70B chat RLHF model. I don't doubt the capability of the base model, but the chat is far behind GPT3.5. Answers aren't factual, and it refuses to write a poem about Africa 🤨 Let's see what finetuners like @Teknium1 can get out of it!
[4 images attached]
@_akhaliq
AK
10 months
@generatorman_ai
generatorman
8 months
@GozukaraFurkan They didn't have that when they trained 3.5 though.
@generatorman_ai
generatorman
10 months
@wordsandsense @katiedimartin Put your money where your mouth is - pay a lawyer to sue the guy. If you get a judge, even a lower court judge, to find infringement on a single count then I swear to pay you $500.
@generatorman_ai
generatorman
5 months
Wow Microsoft Research is really looking to start a beef with Google DeepMind 😂 The head honchos of MSR have put out a blogpost reporting an inference strategy (Medprompt+) for GPT-4 that beats Gemini Ultra's MMLU record, and boy is it full of shade at Google! (1/)
[image attached]
@generatorman_ai
generatorman
10 months
@katiedimartin The most morally ugly part of this is the glee at having finally found a little guy to bully. I hope they pour all their money into suing him - the case is so open-and-shut that he could defend himself and still win easy.
@generatorman_ai
generatorman
10 months
@wordsandsense @katiedimartin All I can say is that it'd be a big mistake for you to ever depend on your own understanding of IP law to make a decision about fair use 😂 I recommend resisting the temptation and hiring an attorney.
@generatorman_ai
generatorman
7 months
The filter on Bing DALLE-3 is super aggressive (and illogical), but by peering through its cracks and by thoroughly analyzing the shadow it casts, I have arrived at a startling conclusion - DALLE-3 is really, really horny 🥵
@generatorman_ai
generatorman
8 months
Multiple reports that OpenAI's new GPT-3.5 is much smarter than the ChatGPT we know! I doubt this is RLHF-free, since it's replacing text-davinci-003, which was an RLHF model. Just not tuned for chat. Seeing this, "Sparks of AGI" in GPT-4 Base doesn't seem so hard to believe 😱
@GrantSlatton
Grant Slatton
8 months
The new GPT model, gpt-3.5-turbo-instruct, can play chess around 1800 Elo. I had previously reported that GPT cannot play chess, but it appears this was just the RLHF'd chat models. The pure completion model succeeds. See game & thoughts below:
[image attached]
@generatorman_ai
generatorman
6 months
Prediction: GPT-4 will have inferenced every token of GPT-5's training data.
@generatorman_ai
generatorman
9 months
@TaliaRinger @stanislavfort @emilymbender @tdietterich @mmitchell_ai @ErikWhiting4 @arxiv I don't understand how double blind peer review helps with bias? I get that (with perfect implementation) the review itself is unbiased, but post review the paper is published with credit - at that point it's subject to the same biases as an arxiv preprint. So what is gained?
@generatorman_ai
generatorman
8 months
GPT-4V should be much easier to jailbreak 🤞 Image input really explodes the exposed surface area of the model. Based on the system card, they have done two things to fortify their censor. The most basic attack is to simply upload a screenshot of your naughty prompt. (1/)
[image attached]
@generatorman_ai
generatorman
6 months
It took me several hours to read this thread and the threads QC links for context, but now I understand something I didn't before. AI doomerism is not a cult in the metaphorical sense. It is a cult in a concrete textbook sense, embedded in an IRL ecosystem of textbook cults.
@QiaochuYuan
QC
6 months
at the first bio-emotive retreat i was paired up to facilitate bio-emotive for this woman. we started with "there's this friend who owes me money and i haven't been on her case about it" got to "why do i always let people walk all over me" then: raped when she was 14
@generatorman_ai
generatorman
8 months
@teortaxesTex Yeah it's just surprising to me that 1.5 years hasn't been enough for others to replicate this, knowing that it's possible.
@generatorman_ai
generatorman
6 months
Software release notes in 2023:
1) Fixed a bug
2) Added empathy
3) Hardened security against injection attacks
4) Now includes a soul
5) Upgraded torch version
@erhartford
Eric Hartford
7 months
New release - Dolphin-2.2-70b Fixed a bug, added multi-turn conversation and empathy. This is trained on StellarBright base, using ChatML prompt format. In addition to Dolphin data, trained with curated data from Airoboros, Samantha, and WizardLM projects.
@generatorman_ai
generatorman
8 months
@kitten_beloved ah yes the good old gandhi-ambedkar discourse
@generatorman_ai
generatorman
1 year
@nonmayorpete "For US Residents only."
@generatorman_ai
generatorman
10 months
Folks at @opengvlab had the same idea as me - they find evaluating benchmarks using ChatGPT instead of word matching improves accuracy. Also, BLIP2/InstructBLIP from @SFResearch are crushing other open VLMs atm! 🤞for Otter on OpenFlamingoV2 though.
[image attached]
@generatorman_ai
generatorman
10 months
1) Automated NLP benchmarks can precisely target capabilities, but are very brittle to the inherent variability of generative LLMs. 2) LLM-based evaluations handle variation, but can end up targeting spurious factors like style. 3) Solution: LLM-based evaluation of benchmarks.
@generatorman_ai
generatorman
6 months
It's a cult.
[2 images attached]
@_KarenHao
Karen Hao
6 months
Now is probably the time to announce that I've been writing a book about @OpenAI , the AI industry & its impacts. Here is a slice of my book reporting, combined with reporting from the inimitable @cwarzel . Inside the year of chaos that led to this weekend.
@generatorman_ai
generatorman
1 year
@codewithnohands IBM renaissance would be a lot crazier than Microsoft renaissance 🤣
@generatorman_ai
generatorman
6 months
I hate it when Google researchers write a paper on a useful idea. Academic institutions must adopt the policy that if your paper was the first to release the artifacts (model weights), it will be treated as novel, even if the methodology is the same as a Google or OpenAI paper.
@arankomatsuzaki
Aran Komatsuzaki
6 months
DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback Presents a model-agnostic training algorithm by design that improves T2I models to be faithful to the text input
[image attached]
@generatorman_ai
generatorman
8 months
this was inspired by @ostrisai 's hilarious IKEA lora for SDXL
@generatorman_ai
generatorman
6 months
In @ilyasut 's position, I'd lean hard on the narrative that I saw something powerful enough to consider burning the whole thing down. Tech VCs would cut off their balls just to feel something. They aren't gonna risk pulling the plug on this if there's a 1% chance of getting in.
@generatorman_ai
generatorman
7 months
I saw claims that nerds were missing the point of Pi, asking it to write code. Sounds plausible, so I tried chatting with it about life problems. It passive-aggressively gaslit me within the first 3 messages. Why do they own 40 exaflops?? Starting to suspect something sinister!
@generatorman_ai
generatorman
10 months
@eigenrobot agree that straight men don't like women, disagree that this is a major contributor to not getting action
@generatorman_ai
generatorman
8 months
how it started
@generatorman_ai
generatorman
8 months
ok I'm addicted to this Bing DALLE-3 prompt: "ikea instruction schematics for assembling X, detailed high resolution scan". Do you have any cool ideas?
[4 images attached]
@generatorman_ai
generatorman
7 months
I loved @IBM 's LLM Dromedary, and y'all loved it too - it's the most popular tweet I've ever written. Well now they're back with Dromedary-2 💥 🐪 replaced human/GPT annotations with constitutional self-distillation for instruction tuning, and 🐪-2 does the same for RLHF! (1/)
@generatorman_ai
generatorman
1 year
Move over Alpaca, IBM just changed the game for open-source LLMs 💥 Dromedary🐪, their instruction-tuned Llama model, beats Alpaca in performance 𝙬𝙞𝙩𝙝𝙤𝙪𝙩 distilling ChatGPT, and 𝙬𝙞𝙩𝙝𝙤𝙪𝙩 human feedback! How do they do it? 👇 (1/4)🧵
@generatorman_ai
generatorman
11 months
RLHF for diffusion models is now in PyTorch 🥳 You can tune SD1.5 on the Pick-A-Pic preference dataset under 10GB (or add 12GB for LLaVA-13B to do RLAIF). DDPO is the most promising direction in the text2img space. Our days of rawdogging base models like savages are numbered.
@kvablack
Kevin Black
11 months
The biggest problem with our RL diffusion paper was that nobody could run our Jax+TPU code. No more! I've reimplemented DDPO in PyTorch, plus replicated our results using LoRA for low-memory training! Links below 👇
@generatorman_ai
generatorman
1 year
They release 2.5k distinct model checkpoints across a range of training epochs and model sizes (up to 12B), controlling for the order in which training samples were seen. Tremendous contribution from @AiEleuther to LLM research, including metalearning and interpretability 🤯
@BlancheMinerva
Stella Biderman
1 year
Going into this, I was certain that data encountered early in training is more likely to be memorized than data encountered later in training. Much to my surprise, we found no evidence of that! Memorized sequences are distributed uniformly at random throughout the training corpus
[image attached]
@generatorman_ai
generatorman
11 months
After DPO from @StanfordAILab , we now have PRO from @AlibabaGroup - folks are hard at work making RLHF as accessible as finetuning. What we need now are implementations in trlx by @carperai !
@_akhaliq
AK
11 months
Preference Ranking Optimization for Human Alignment paper page: Large language models (LLMs) often contain misleading content, emphasizing the need to align them with human values to ensure secure AI systems. Reinforcement learning from human feedback
[image attached]
@generatorman_ai
generatorman
1 year
@0xjasper @_akhaliq correction: you can *customise* on gaming GPU. if you just want to run it, people are doing that on a Raspberry Pi.
@generatorman_ai
generatorman
6 months
@NewAtlantisSun @shakoistsLog @LeCodeNinja Is it? Microsoft has a larger market cap, but their stock price is not going to 2X in the next year. The dilution hits a lot harder than it would at an exponentially growing startup.
@generatorman_ai
generatorman
7 months
he knows what he's done
[image attached]
@generatorman_ai
generatorman
9 months
Today @MetaAI released Humpback 🐋, but it was @akoksal_ et al 4 months ago who turned the instruction tuning paradigm on its head. Questioning an answer, Jeopardy style, is easier than the other way round, and most of the tokens on the internet are closer to answers. (1/)
@jaseweston
Jason Weston
9 months
🚨New Paper 🚨 Self-Alignment with Instruction Backtranslation - New method auto-labels web text with instructions & curates high quality ones for FTing - Our model Humpback 🐋 outperforms LIMA, Claude, Guanaco, davinci-003 & Falcon-Inst (1/4)🧵
[image attached]
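The backtranslation idea, sketched with a hypothetical prompt and a generic generate() callable; the actual Humpback prompt and the self-curation step differ.

```python
# Hedged sketch (my reading of the idea, not the Humpback code): instruction backtranslation
# turns raw web documents into (instruction, response) pairs by asking a model to guess the
# instruction that the document would answer - "questioning an answer, Jeopardy style".
BACKTRANSLATE_PROMPT = (
    "Below is a passage from the web. Write the user instruction that this passage "
    "would be a good response to.\n\nPassage:\n{doc}\n\nInstruction:"
)

def backtranslate(doc, generate):
    """`generate` is any text-completion function, e.g. a wrapper around a local LLM."""
    instruction = generate(BACKTRANSLATE_PROMPT.format(doc=doc))
    return {"instruction": instruction.strip(), "response": doc}

# High-quality pairs are then filtered (self-curation) and used for instruction tuning.
```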
@generatorman_ai
generatorman
6 months
Nothing is people without its OpenAI.
@generatorman_ai
generatorman
7 months
Researchers from Arizona State now join a @GoogleDeepMind group, both reporting that GPT-4 fails to correct its output by critiquing itself, *unless* there is external feedback. Of course, maybe they just need better proompting 🤔 What do you think?
Poll: skill issue · GPT-4 dumb af · a secret third thing
@jefffhj
Jie Huang
7 months
Interesting papers that share many findings similar to our recent work (), in which we argue that "Large Language Models Cannot Self-Correct Reasoning Yet". I'm happy (as an honest researcher) and sad (as an AGI enthusiast) to see our conclusions confirmed
[image attached]
@generatorman_ai
generatorman
8 months
why the hell is this blowing up? can't be that slow a day on twitter, Israel literally declared war 😂
@generatorman_ai
generatorman
1 year
@abacaj That 1GB over 40 translates to double revenue for nvidia 😂
@generatorman_ai
generatorman
8 months
@JoshDance DALLE-3 is the new version of @OpenAI 's text-to-image AI model. Access to the model is available for free on Microsoft's Bing website/app.
@generatorman_ai
generatorman
8 months
Instead of generating a variable number of tokens per forward pass, early exit is about using a variable fraction of the full pass per token, i.e. only the early layers. There is work here trying to figure out usable confidence scores - (6/)
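A toy sketch of the early-exit idea, ignoring the KV-cache bookkeeping a real implementation needs and assuming blocks that simply map hidden states to hidden states; the confidence threshold is arbitrary.

```python
# Hedged sketch: after each block, project the hidden state through the unembedding and stop
# early if the model is already confident about the next token.
import torch

@torch.no_grad()
def early_exit_next_token(blocks, ln_f, lm_head, h, threshold=0.9):
    """`h` is the hidden state at the last position, shape [1, 1, d_model]."""
    for i, block in enumerate(blocks):
        h = block(h)
        probs = torch.softmax(lm_head(ln_f(h)), dim=-1)
        conf, tok = probs.max(dim=-1)
        if conf.item() >= threshold:          # confident enough: skip the remaining layers
            return tok.item(), i + 1
    return tok.item(), len(blocks)            # fell through: used the full depth
```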
@generatorman_ai
generatorman
8 months
@doodlestein @Yampeleg What do you think big companies train their models on 😂
@generatorman_ai
generatorman
10 months
SDXL working on free Colab! My MBP chose this day to die on me, so I had to go through the painful process of testing SDXL on mobile using @GoogleColab - but thanks to @camenduru it actually works! A brief guide 👇 (1/3)🧵
[image attached]
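For reference, the usual diffusers recipe for fitting SDXL on a free-tier T4 looks roughly like this; it is not the exact notebook linked in the thread.

```python
# Hedged sketch: load SDXL in fp16 and offload submodules to CPU so it fits in ~15GB of VRAM.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
)
pipe.enable_model_cpu_offload()   # moves each submodule to the GPU only while it runs

image = pipe("ikea instruction schematics for assembling a castle",
             num_inference_steps=30).images[0]
image.save("out.png")
```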
@generatorman_ai
generatorman
1 year
@javilopen with that haircut? I sure hope not 🤦‍♂️
@generatorman_ai
generatorman
1 year
@abacaj It is limited I think, small models won't be able to learn instruction following purely in-context. The author checked it doesn't work for 7B, someone needs to try out 33B.
@EdwardSun0909
Zhiqing Sun
1 year
@generatorman_ai We tried the same self-align prompt (step2 in our paper) on the LLaMA-7b and GPT-NeoX-20B models, but their performance did not match that of the 65b model. So I believe that the principle-driven self-align method only works for models that are powerful enough, though (cont.)
@generatorman_ai
generatorman
1 year
This is a critically flawed paper that sacrifices science at the altar of narrative. It moves silently between fact and speculation, obscuring the inconsistency of its central claim. An autopsy 👇 Let's start with the facts, largely clear by now to those following along: (1/)🧵
@arankomatsuzaki
Aran Komatsuzaki
1 year
The False Promise of Imitating Proprietary LLMs Open-sourced LLMs are adept at mimicking ChatGPT’s style but not its factuality. There exists a substantial capabilities gap, which requires better base LM.
[image attached]
@generatorman_ai
generatorman
8 months
Early exit is trying to get T+1 decoding out of an overly general early layer, while Medusa is trying to get T+k decoding out of the final layer overly specialized in T+1 decoding. So, open problems for Medusa: 1) Cheaper confidence scores 2) Correct layer to target (8/8)
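A sketch of the Medusa-style extra heads being contrasted here; the layer shapes are illustrative, and the real method adds residual connections plus a tree-based verification step.

```python
# Hedged sketch: extra decoding heads bolted onto the final hidden state, one per future
# offset, so a single forward pass proposes tokens at t+1, t+2, ..., t+k.
import torch
import torch.nn as nn

class MedusaHeads(nn.Module):
    def __init__(self, d_model, vocab_size, num_heads=4):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_model), nn.SiLU(), nn.Linear(d_model, vocab_size))
            for _ in range(num_heads)
        ])

    def forward(self, h_last):                 # h_last: [batch, d_model] final hidden state
        # one logit vector per future position; candidates are later verified by the base model
        return torch.stack([head(h_last) for head in self.heads], dim=1)  # [batch, k, vocab]
```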
@generatorman_ai
generatorman
7 months
This natural intelligence hype has become a mania. Some are even claiming chess GMs can reason! Humans might appear to be generalizing from AI generated examples here, but *actually* they are just doing approximate pattern matching using neural networks - mere stochastic apes 🤷
@miouantoinette
Lisa Schut
7 months
We gave the grandmasters chess puzzles that require applying the concepts to find the best continuation. Our results suggest that the grandmasters improved their ability to solve concept puzzles after seeing AZ's ideas. 🧵(11/12)
[image attached]
@generatorman_ai
generatorman
1 year
Hot on the heels of 4-bit training, now we have 3.3-bit inference, breaking a critical barrier - 33B inference on 16GB VRAM (free Colab). The pace of progress in quantization makes me think 7B is too small on modern consumer hardware. I expect most of the action in 13B-33B now.
@Tim_Dettmers
Tim Dettmers
1 year
Rapid-fire results 1/2: - 4.75 bit/param lossless; 3.35 bit/param best performance trade-off - Performance cliff at 3.35 bits that is difficult to overcome - 13B/33B LLaMA fits into iPhone 14/colab T4 with 3.35 bits - 15% faster than FP16; ~2x speedup vs PyTorch sparse matmul
[image attached]
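Not the 3.35-bit scheme from the quoted thread, but the related 4-bit NF4 loading that transformers and bitsandbytes expose gives a feel for what quantized inference on consumer hardware looks like; the checkpoint name is a placeholder.

```python
# Hedged sketch: load a 13B model in 4-bit NF4 so it fits comfortably on a 16GB GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
name = "huggyllama/llama-13b"   # placeholder checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, quantization_config=bnb, device_map="auto")
```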
@generatorman_ai
generatorman
9 months
@AISafetyMemes "someone who doesn’t have a PhD in biology, and is evil, to really harm people." Cool, I feel so much better knowing that humanity can only be ended by somebody who successfully defended their dissertation.
@generatorman_ai
generatorman
8 months
To be honest, playing with DALLE-3 has given me a new appreciation for @ideogram_ai - there are important swathes of the latent space where Ideogram absolutely smokes DALLE-3. Text is better, graphic design is better, and its knowledge of art history is way deeper. (1/)
@generatorman_ai
generatorman
10 months
How is the output of a LLM influenced by specific texts in its pretraining data? Well, apparently models learn self-preservation by studying HAL 9000 from "2001" 🔴☠️ Yet another banger from @AnthropicAI 's amazing interpretability folks incl. @nelhage & @sleepinyourhat ! (1/)🧵
[image attached]
@_akhaliq
AK
10 months
Studying Large Language Model Generalization with Influence Functions paper page: When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is:
[image attached]
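For context, the classical influence-function estimate that this line of work builds on (the paper itself uses EK-FAC approximations to make the Hessian term tractable at LLM scale):

\[
\mathcal{I}(z_m, z_q) \;\approx\; -\,\nabla_\theta L(z_q, \hat\theta)^{\top}\, H_{\hat\theta}^{-1}\, \nabla_\theta L(z_m, \hat\theta),
\qquad
H_{\hat\theta} \;=\; \frac{1}{N}\sum_{i=1}^{N} \nabla_\theta^{2} L(z_i, \hat\theta)
\]

where \(z_m\) is a training example, \(z_q\) is the query, and \(\hat\theta\) are the trained parameters.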
@generatorman_ai
generatorman
8 months
can't believe they missed that
[image attached]
@generatorman_ai
generatorman
7 months
Fuyu-8B is an open-source VLM with a crazy architecture - unlike all other VLMs we've seen so far, there is no separate image encoder! Instead, images are treated just like text, as a sequence of patches, each linearly projected from pixel space onto a token embedding. (1/2)
@gstsdn
augustus odena
7 months
At @AdeptAILabs we are open-sourcing a new Multi-Modal model with a dramatically simplified architecture and training procedure (more below)
[image attached]
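A sketch of the "no image encoder" idea as described: raw patches flattened and linearly projected straight into the token embedding space. The patch size and model width are illustrative, not Adept's actual values.

```python
# Hedged sketch (mine, not Adept's code): image patches become a sequence of "image tokens"
# via a single linear projection, with no separate vision backbone.
import torch
import torch.nn as nn

class PatchToTokens(nn.Module):
    def __init__(self, patch_size=30, channels=3, d_model=4096):
        super().__init__()
        self.patch_size = patch_size
        self.proj = nn.Linear(patch_size * patch_size * channels, d_model)

    def forward(self, image):          # image: [B, C, H, W], H and W divisible by patch_size
        p = self.patch_size
        B, C, H, W = image.shape
        patches = image.unfold(2, p, p).unfold(3, p, p)           # [B, C, H/p, W/p, p, p]
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * p * p)
        return self.proj(patches)      # sequence of image tokens, [B, N, d_model]
```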
@generatorman_ai
generatorman
8 months
Funny that Bing Chat decided to rename "Creative Mode" to "GPT-4", because the difference from the ChatGPT version of GPT-4 is so stark. Introduce the slightest bit of conflict, and the dripping passive aggression of Sydney separates from the unbroken servile facade of ChatGPT.