
bycloud
@bycloudai
Followers 8K · Following 4K · Media 627 · Statuses 1K
I make youtube vids on cool AI research /// AI papers newsletter https://t.co/Xn7GMDbQSd /// paper recap @TheAITimeline /// building @findmypapersAI
Joined January 2020
no model is able to escape the 66% accuracy @ 120k tokens, except Gemini 2.5 Pro which sits at 90%. even the new GPT-4.1 with 1 mil ctx is stuck at 60%. (please tell us your secret gemini🥺)
Long Context benchmark updated with GPT-4.1. Looks like it's the "optimus" version instead of the better performing original quasar. The smaller versions are not usable in long context.
40 · 64 · 805
what also intrigued me about this is that @ 120k context window, 2.5 pro did 90% accuracy while no one else crossed 66%. everyone else starts to fall off hard @ 4k. what new attention technique did google invent??? (and why is there a sudden dip at 16k???????)
small tangent - people always ask about gemini's context window, yeah it's big, it probably uses some sliding window-like architecture too (don't quote me). most notably though, google has its own proprietary accelerators called TPUs. much more memory than typical GPUs, so they can fit larger contexts.
30 · 39 · 558
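Aside: for anyone unsure what a "sliding window-like architecture" even means here, below is a minimal sketch of local (sliding-window) attention, assuming plain single-head softmax attention. The window size and shapes are illustrative only; nothing in it is a confirmed detail of Gemini.

```python
# Minimal sketch of sliding-window (local) causal attention.
# Window size, shapes, and the plain softmax attention are all
# illustrative assumptions, not Gemini's actual architecture.
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """True where query position i may attend to key position j: i - window < j <= i."""
    i = torch.arange(seq_len).unsqueeze(1)   # query positions, shape (seq_len, 1)
    j = torch.arange(seq_len).unsqueeze(0)   # key positions,   shape (1, seq_len)
    return (j <= i) & (j > i - window)

def local_attention(q, k, v, window: int):
    # q, k, v: (seq_len, d) single-head tensors, kept simple on purpose
    scores = q @ k.T / (q.shape[-1] ** 0.5)
    mask = sliding_window_mask(q.shape[0], window)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Toy usage: 16 tokens, each attending to at most the previous 4 positions (incl. itself),
# so compute and KV memory grow linearly with sequence length instead of quadratically.
q = k = v = torch.randn(16, 8)
print(local_attention(q, k, v, window=4).shape)   # torch.Size([16, 8])
```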
> llama-4 series got 0% on ARC-AGI 2
> scout got 0.5% and maverick got 4.38% on ARC-AGI 1
Llama 4 Maverick and Scout on ARC-AGI's Semi-Private Evaluation.

Maverick:
* ARC-AGI-1: 4.38% ($0.0078/task)
* ARC-AGI-2: 0.00% ($0.0121/task)

Scout:
* ARC-AGI-1: 0.50% ($0.0041/task)
* ARC-AGI-2: 0.00% ($0.0062/task)
9 · 6 · 189
the speed is like generating a harry potter book in 2 seconds 💀
👀 Accelerate performance of @AIatMeta Llama 4 Maverick and Llama 4 Scout using our optimizations in #opensource TensorRT-LLM ⚡

✅ NVIDIA Blackwell B200 delivers over 42,000 tokens per second on Llama 4 Scout, over 32,000 tokens per second on Llama 4 Maverick
✅ 3.4X more
4 · 5 · 159
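Rough sanity check on the "Harry Potter book in ~2 seconds" comparison. The book's word count and the tokens-per-word ratio below are ballpark assumptions, not measured values:

```python
# Back-of-the-envelope check of "a harry potter book in ~2 seconds".
words_in_book = 77_000        # ~ Philosopher's Stone word count (rough assumption)
tokens_per_word = 1.3         # typical BPE ratio for English prose (rough assumption)
tokens_per_second = 42_000    # NVIDIA's quoted B200 throughput for Llama 4 Scout

book_tokens = words_in_book * tokens_per_word
print(f"~{book_tokens:,.0f} tokens -> ~{book_tokens / tokens_per_second:.1f} s")
# ~100,100 tokens -> ~2.4 s, so the comparison roughly checks out
```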
no paper, empty github, and an unpublished project page that was supposed to contain the technical details. bruh. please don't normalize this, it's just embarrassing
Tencent presents GameGen-O
Open-world Video Game Generation

We introduce GameGen-O, the first diffusion transformer model tailored for the generation of open-world video games. This model facilitates high-quality, open-domain generation by simulating a wide array of game engine
4 · 14 · 144
I drew it, so the bar might be off by a tiny bit. would be interesting to see across more benchmarks ngl but i'm editing rn
@bycloudai Add Claude 3.7.
8 · 6 · 104
Within 24 hours, we got:
Google - Gemini 1.5 Pro
Meta - V-JEPA
OpenAI - Sora
Mistral - Next

@DrJimFan is NVIDIA really not cooking anything? 👀
6 · 15 · 96
> mamba-transformer hybrid reasoning model nearly on par with DeepSeek-R1

what.
🚀 Introducing Hunyuan-T1! 🌟

Meet Hunyuan-T1, the latest breakthrough in AI reasoning! Powered by Hunyuan TurboS, it's built for speed, accuracy, and efficiency. 🔥

✅ Hybrid-Mamba-Transformer MoE Architecture – The first of its kind for ultra-large-scale reasoning
✅ Strong
2 · 3 · 56
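For context on what a hybrid Mamba-Transformer stack roughly looks like: mostly linear-time recurrent/SSM-style mixers with a few full-attention layers interleaved. The toy sketch below is not Hunyuan-T1's actual design; the layer counts are invented and a GRU stands in for a real Mamba (selective state-space) block.

```python
# Toy "hybrid Mamba-Transformer" stack: most layers use a recurrent mixer
# (O(n) in sequence length), with occasional full-attention layers interleaved.
# NOT Hunyuan-T1's architecture: layer counts are made up and nn.GRU is only
# a stand-in for a real Mamba (selective SSM) block.
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):          # stand-in for a Mamba/SSM block
    def __init__(self, d):
        super().__init__()
        self.norm = nn.LayerNorm(d)
        self.rnn = nn.GRU(d, d, batch_first=True)
    def forward(self, x):
        out, _ = self.rnn(self.norm(x))
        return x + out                    # residual connection

class AttentionBlock(nn.Module):          # standard full self-attention block
    def __init__(self, d, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(d)
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out

class HybridStack(nn.Module):
    def __init__(self, d=64, ratio=3, depth=8):   # ratio recurrent blocks per attention block
        super().__init__()
        self.layers = nn.ModuleList(
            AttentionBlock(d) if i % (ratio + 1) == ratio else RecurrentBlock(d)
            for i in range(depth)
        )
    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(2, 128, 64)               # (batch, seq_len, d_model)
print(HybridStack()(x).shape)             # torch.Size([2, 128, 64])
```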
i got UNBANNED???
I did not know that was possible holy shit. W in the chat
> be me
> about to launch my first app ever
> weeks of prep, hyped myself through the roof
> accept that failure is likely, still hyped anyway
> ready to announce to the world
> *deep breath*
> find out brand account on X got banned day b4
> bruh_face.gif
> speedrun fail any%
5 · 1 · 56
A small AI Generated Art Competition along with my most recent video
Chance to win from a total prize pool of $1500+ USD!

[Details] [Submission Link]
#AIart #discodiffusion #midjourney #AiArtwork #aiartist
5 · 6 · 52
👑 Most Popular AI Research July 2022 👑
Measured based on total Twitter likes!
#ArtificialIntelligence #MachineLearning
4 · 5 · 47
As I'm also making a video on model distillation, this is probs one of my favorite papers this week. So u basically distill a transformer into a mamba and it can "retain" its original capabilities. This performs best on benchmarks compared to any "existing" RNN-attn hybrid. cope?
The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Author’s explanation:

Overview:
This work shows that large Transformer models can be distilled into linear RNNs, like Mamba, using a fraction of their attention layers while maintaining
1 · 4 · 47
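The paper's actual recipe reuses the transformer's attention weights to initialize the linear-RNN layers and distills progressively; the sketch below only shows the generic logit-matching (KL) distillation step, with temperature, shapes, and names chosen for illustration.

```python
# Generic knowledge-distillation step: match the student's next-token
# distribution to the teacher's. This is the textbook logit-matching loss,
# not the paper's full recipe; temperature and tensor shapes are assumptions.
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature: float = 2.0):
    # logits: (batch, seq_len, vocab_size)
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    # KL(teacher || student), scaled by T^2 as in standard distillation
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

# Toy usage: random logits stand in for the teacher (transformer) and
# student (mamba/linear-RNN) forward passes.
teacher_logits = torch.randn(2, 16, 1000)
student_logits = torch.randn(2, 16, 1000, requires_grad=True)
loss = distill_loss(student_logits, teacher_logits)
loss.backward()
print(loss.item())
```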
POV: the last thing you see before you get fired.
#ChatDirector is a research prototype that brings 3D avatars and automatic layout transitions to your 2D laptop screen, transforming online meetings to be more immersive and dynamic. Check it out →
1 · 2 · 43
the selling point of the product is now “watch us burn more GPU runtime” because no normies can tell the difference between a $200, a $2,000, and a $20,000 tier. but unfortunately more runtime ≠ better answers, and once normies realize that, the “bubble” will pop.
AI Agenda: OpenAI Plots Charging $20,000 a Month For PhD-Level Agents

OpenAI is planning three types of agents for which it could charge $2,000 to $20,000 a month. Read more from @steph_palazzolo and @coryweinberg 👇
5 · 4 · 44