bycloud

@bycloudai

Followers
8K
Following
4K
Media
627
Statuses
1K

I make youtube vids on cool AI research /// AI papers newsletter https://t.co/Xn7GMDbQSd /// paper recap @TheAITimeline /// building @findmypapersAI

Joined January 2020
@bycloudai
bycloud
1 month
I shipped something cool. findmypapers (dot) ai. A semantic search engine for 300k+ AI research papers. outcompetes SoTA Deep Research apps at finding relevant research papers for you. more demos👇
18
34
312
@bycloudai
bycloud
3 months
the grok-3 benchmark is pretty useful in comparing base models, so I added GPT-4.5
Tweet media one
185
227
2K
@bycloudai
bycloud
11 months
I got a great trailer for yall
@MistralAI
Mistral AI
11 months
Tweet media one
Tweet media two
27
147
1K
@bycloudai
bycloud
3 months
Claude 3.7 is cool, but i still ended up using grok-3 somehow. something's off about claude 3.7 and I just cant pinpoint why.
98
19
874
@bycloudai
bycloud
5 months
someone has finally done it. test time compute + diffusion models. a really interesting one for sure 🧵
Tweet media one
10
91
830
@bycloudai
bycloud
6 months
the 4 horsemen of the OpenAI apocalypse have now been assembled
Tweet media one
21
57
805
@bycloudai
bycloud
2 months
no model is able to escape the 66% accuracy @ 120k tokens, except Gemini 2.5 Pro which sits at 90%. even the new GPT-4.1 with 1 mil ctx is stuck at 60%. (please tell us your secret gemini🥺)
@ficlive
Fiction.live
2 months
Long Context benchmark updated with GPT-4.1. Looks like it's the "optimus" version instead of the better performing original quasar. The smaller versions are not usable in long context.
Tweet media one
40
64
805
@bycloudai
bycloud
3 months
how does DeepSeek V3 win against GPT-4.5? (NOT R1 btw). openAI claimed that GPT-4.5 is a VERY big model, yet GPT-4.5 falls short compared to DeepSeek-V3. What.
Tweet media one
72
54
707
@bycloudai
bycloud
7 months
super interesting read. maybe we just need to find the rules that are class 4 equivalent when generating synthetic data to get better performance on reasoning. making a video on this now😳
Tweet media one
15
52
591
@bycloudai
bycloud
2 months
what also intrigued me about this is that @ 120k context window, 2.5 pro hit 90% accuracy while no one else crossed 66%. everyone else starts to fall off hard @ 4k. what new attention technique did google invent??? (and why is there a sudden dip at 16k???????)
Tweet media one
@_mchenco
michelle
2 months
small tangent - people always ask about gemini context window, yeah it’s big, it probably uses some sliding window-like architecture too (don’t quote me). most notably though, google has its own proprietary accelerators called TPUs. much more GPU memory, so they can fit larger.
30
39
558
@bycloudai
bycloud
12 days
Gemini Diffusion is my fav GoogleIO announcement. vibe coding at 1000tok/s hits different. multi-turn looks good so far. (no video speedup or anything). insanely bullish on diffusionLM
18
39
486
@bycloudai
bycloud
2 months
OPENAI DEPRECATING GPT4.5 CUZ GPT-4.1 IS BETTER???????
40
9
390
@bycloudai
bycloud
3 years
Due to some very kind sponsors, TOTAL PRIZE HAS JUST DOUBLED📈. Chance to win from a total prize pool of $3000 USD! AI Generated Art Competition along with my video [Details] [Submission Link] 🗓️Aug 7th
Tweet media one
21
82
333
@bycloudai
bycloud
2 months
amazing convo but did not age well😭
Tweet media one
2
7
270
@bycloudai
bycloud
4 months
Tweet media one
@DrJimFan
Jim Fan
4 months
We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive - truly open, frontier research that empowers all. It makes no sense. The most entertaining outcome is the most likely. DeepSeek-R1 not only open-sources a barrage of models but
Tweet media one
3
18
212
@bycloudai
bycloud
1 year
while we are still waiting for the code for Animate Anyone, here's a quick rundown on how it's this good 😎👇
4
24
207
@bycloudai
bycloud
1 year
@DrJimFan this is way too real and trippy.
4
0
201
@bycloudai
bycloud
2 months
> llama-4 series got 0% on ARC-AGI 2
> scout got 0.5% and maverick got 4.38% on ARC-AGI 1
@arcprize
ARC Prize
2 months
Llama 4 Maverick and Scout on ARC-AGI's Semi Private Evaluation.
Maverick:
* ARC-AGI-1: 4.38% ($0.0078/task)
* ARC-AGI-2: 0.00% ($0.0121/task)
Scout:
* ARC-AGI-1: 0.50% ($0.0041/task)
* ARC-AGI-2: 0.00% ($0.0062/task)
Tweet media one
9
6
189
@bycloudai
bycloud
1 year
Mamba but it's a lobotomy kaisen edit. here's the actual link for my mamba video tho.
12
20
167
@bycloudai
bycloud
2 months
the speed is like generating a harry potter book in 2 seconds 💀.
@NVIDIAAIDev
NVIDIA AI Developer
2 months
👀 Accelerate performance of @AIatMeta Llama 4 Maverick and Llama 4 Scout using our optimizations in #opensource TensorRT-LLM. ⚡
✅ NVIDIA Blackwell B200 delivers over 42,000 tokens per second on Llama 4 Scout, over 32,000 tokens per second on Llama 4 Maverick.
✅ 3.4X more
Tweet media one
4
5
159
@bycloudai
bycloud
6 months
We had image generation copying LLM. and now the reverse?? DiffusionLM -> Masked Diffusion Model is an interesting one, and here is some end of year copium.
Tweet media one
7
12
156
@bycloudai
bycloud
2 months
omg gemini 2.5 pro pricing doesnt cost a kidney. shows that SoTA wouldn't need to ask for $600
Tweet media one
5
4
153
@bycloudai
bycloud
2 months
sooooo are we gonna talk about how at least 50% of the research is done by chinese researchers & a lot of them are from local chinese labs? and a lot of them are written in complete fluent english? imagine the amount of knowledge in chinese that we are missing out.
17
5
146
@bycloudai
bycloud
8 months
no paper, empty github, project page that is unpublished which contained technical details. bruh. please don't normalize this, it's just embarrassing
Tweet media one
Tweet media two
Tweet media three
@_akhaliq
AK
9 months
Tencent presents GameGen-O. Open-world Video Game Generation. We introduce GameGen-O, the first diffusion transformer model tailored for the generation of open-world video games. This model facilitates high-quality, open-domain generation by simulating a wide array of game engine
4
14
144
@bycloudai
bycloud
4 months
this is kinda cute
Tweet media one
3
3
133
@bycloudai
bycloud
1 year
just want to let you know that my AI newsletter is back online!
It covers top AI research papers from the previous week and explains them simply. My goal:
- let you comprehend an AI paper's impact EASILY & FAST
- explained with images
- not noisy
perfect for AI enthusiasts😎
9
11
116
@bycloudai
bycloud
8 months
there's now rate limits on arxiv? wtf
Tweet media one
17
0
122
@bycloudai
bycloud
1 month
Gemini 2.5 Pro is just the best choice for AI right now .
Tweet media one
3
5
119
@bycloudai
bycloud
5 months
How Distributed Training Can Revive Open Source AI
Tweet media one
6
14
104
@bycloudai
bycloud
3 months
I drew it so the bar might be off by a tiny bit. would be interesting to see across more benchmarks ngl but im editing rn
Tweet media one
@petergyang
Peter Yang
3 months
@bycloudai Add Claude 3.7.
8
6
104
@bycloudai
bycloud
1 month
1) WHAT.
@ficlive
Fiction.live
1 month
OpenAI Strikes Back
Tweet media one
3
3
104
@bycloudai
bycloud
1 month
> be me
> about to launch my first app ever
> weeks of prep, hyped myself through the roof
> accept that failure is likely, still hyped anyway
> ready to announce to the world
> *deep breath*
> find out brand account on X got banned day b4
> bruh_face.gif
> speedrun fail any%
Tweet media one
12
1
103
@bycloudai
bycloud
2 months
@sama 💀
Tweet media one
1
1
101
@bycloudai
bycloud
6 months
1 BIT MAMBA AHHHHHHHHHHHH.
Tweet media one
8
3
100
@bycloudai
bycloud
1 year
Within 24 hours, we got:
Google - Gemini 1.5 Pro
Meta - V-JEPA
OpenAI - Sora
Mistral - Next
@DrJimFan is NVIDIA really not cooking anything?👀
6
15
96
@bycloudai
bycloud
1 month
or. everyone’s hard drive??? maybe?.
@sama
Sam Altman
1 month
goodbye, GPT-4. you kicked off a revolution. we will proudly keep your weights on a special hard drive to give to some historians in the future.
1
3
97
@bycloudai
bycloud
4 months
While DeepSeek gave incredible insights on tackling test-time compute, also coincidentally proved similar points, that's ALSO published today: Simple RL > complex search (goodbye MCTS & PRM??). We are actually eating good tn🧵
Tweet media one
4
7
97
@bycloudai
bycloud
2 months
Today, I will be taking a step back as a content creator on YouTube. Instead, I'll be chasing my dreams and focusing on my own AI SaaS, on top of moving to the city of AI: San Francisco, in hopes of getting into YC. Please wish me luck. I shall return with a product demo
Tweet media one
19
0
93
@bycloudai
bycloud
4 months
actually did not expect sama to say this
Tweet media one
10
5
91
@bycloudai
bycloud
22 days
it's not really "AGI" when u have to RL it for literally every use case tho. I think it's better depicted with "RL is just the key to integrate powerful LLM/AI into anything".
@kimmonismus
Chubby♨️
22 days
RL is the key to AGI. Or as OpenAI says: AGI is an operational problem now
17
3
92
@bycloudai
bycloud
1 month
damn I've been skillmaxxing with gemini 2.5 series lately, im ngmi ig. but at least my wallet will make it.
@sama
Sam Altman
1 month
if you are not skillsmaxxing with o3 at minimum 3 hours every day, ngmi.
3
0
92
@bycloudai
bycloud
2 months
🚨OpenAI just announced GPT-4.1 non-reasoning model series tailored to devs, available through API.
GPT-4.1
GPT-4.1-mini
GPT-4.1-nano
their FIRST ever 1M context window model! looks incredible at coding for a non-reasoning model too
Tweet media one
Tweet media two
Tweet media three
7
6
88
@bycloudai
bycloud
12 days
they are just aura farming at this point.
@sundarpichai
Sundar Pichai
13 days
Having a deep think.
Tweet media one
2
1
96
@bycloudai
bycloud
2 years
I can envision the future generation where they just don't know how to socialize anymore. We probably only need 1 year to perfect the voices too. ChatAnything: FaceTime Chat With LLM-Enhanced Personas.
5
18
82
@bycloudai
bycloud
2 years
Img2Img video generation is taking the main seat for coherent and consistent video generation. there is so much potential right now that it can probably bring an unprecedented effect onto short form content. check out my video and my thoughts here:
Tweet media one
4
16
77
@bycloudai
bycloud
1 year
now the Will Smith text-to-video benchmark is complete with the addition of ground truth ✨.
@WillSmith2real
Will Smith
1 year
This is getting out of hand!. - Will Smith
1
7
74
@bycloudai
bycloud
1 year
alright boys im ready
Tweet media one
3
3
76
@bycloudai
bycloud
19 days
meta management be like
Tweet media one
1
3
77
@bycloudai
bycloud
2 years
One of the best new multi-modal LLMs, Qwen-VL, was released a few days ago, but they deleted the models right after publishing their finetune code. Wth is happening 🧐
Tweet media one
Tweet media two
Tweet media three
Tweet media four
5
6
72
@bycloudai
bycloud
1 year
rip open source, SD3 might actually never see the light, and probably going to be locked behind API forever to fix their finance 😔.
@StabilityAI
Stability AI
1 year
An announcement from Stability AI:
7
4
71
@bycloudai
bycloud
2 years
After all the hype with 3D Gaussian Splatting, how is it really different from NeRF and why do people say it's so much better? To find out, check out my latest video (9 mins) about what 3D Gaussian Splatting really is, and how NeRF might be replaced🧐
Tweet media one
1
8
72
@bycloudai
bycloud
4 months
if you don’t know, NVDA is currently having a chinese new year sale of up to 16% off!
Tweet media one
1
1
70
@bycloudai
bycloud
2 months
probably one of the most insane breakthroughs from openai on multimodal models rn but everyone's been using it to generate ghibli 😭
5
2
71
@bycloudai
bycloud
4 months
A Slightly Technical Breakdown of DeepSeek-R1.
Tweet media one
7
7
70
@bycloudai
bycloud
2 years
I am only SLIGHTLYYYY late to the news. 38 days late is not too bad right? This video has been in production for way too long and I will go rest now (regarding my last tweet lol)
Tweet media one
1
13
66
@bycloudai
bycloud
3 months
ngl i feel like hardcoding word filters to guardrail LLMs is better than doing hardcore RLHF and giving it brain damage.
7
6
70
@bycloudai
bycloud
5 months
after using like 10m tokens across API calls from anthropic, openai, and deepseek. deepseek is my new favorite model and I am making a video about it.
9
2
67
@bycloudai
bycloud
8 months
haters gonna say AGI is not achieved 2 years ago
Tweet media one
5
3
62
@bycloudai
bycloud
8 months
@nachoyawn THERE'S AN OLLAMA BOTTLE?.
2
0
60
@bycloudai
bycloud
4 months
fuck it, a second video.
Tweet media one
@bycloudai
bycloud
4 months
A Slightly Technical Breakdown of DeepSeek-R1.
Tweet media one
4
5
62
@bycloudai
bycloud
1 year
So. this video is AI generated? we are so not ready for this
5
5
59
@bycloudai
bycloud
20 days
Tweet media one
3
4
60
@bycloudai
bycloud
11 months
thanks for the support my fellow homies💀
Tweet media one
@bycloudai
bycloud
11 months
AI generated videos are actually getting out of hand. hope yall like my intro💀
Tweet media one
4
1
55
@bycloudai
bycloud
2 months
> mamba-transformer hybrid reasoning model near on par with DeepSeek-R1. what.
@TencentHunyuan
Hunyuan
2 months
🚀 Introducing Hunyuan-T1! 🌟
Meet Hunyuan-T1, the latest breakthrough in AI reasoning! Powered by Hunyuan TurboS, it's built for speed, accuracy, and efficiency. 🔥
✅ Hybrid-Mamba-Transformer MoE Architecture – The first of its kind for ultra-large-scale reasoning.
✅ Strong
Tweet media one
Tweet media two
2
3
56
@bycloudai
bycloud
2 months
i am speechless of how lore accurate this is
Tweet media one
1
0
56
@bycloudai
bycloud
1 month
i got UNBANNED???. I did not know that is possible holy shit. W in the chat
Tweet media one
@bycloudai
bycloud
1 month
> be me
> about to launch my first app ever
> weeks of prep, hyped myself through the roof
> accept that failure is likely, still hyped anyway
> ready to announce to the world
> *deep breath*
> find out brand account on X got banned day b4
> bruh_face.gif
> speedrun fail any%
Tweet media one
5
1
56
@bycloudai
bycloud
2 months
a saturday release is definitely something new.
5
1
55
@bycloudai
bycloud
3 years
A small AI Generated Art Competition along with my most recent video. Chance to win from a total prize pool of $1500+ USD! [Details] [Submission Link] #AIart #discodiffusion #midjourney #AiArtwork #aiartist
Tweet media one
5
6
52
@bycloudai
bycloud
1 year
time to jump ship again after 35 days 🫡
Tweet media one
@yacineMTB
kache
1 year
anthropicbros. not like this.
2
1
52
@bycloudai
bycloud
1 month
Meta AI did something WILD again. wtf is Next Concept Prediction?.
Tweet media one
3
3
53
@bycloudai
bycloud
4 months
They are stuck giving compute to cursor
Tweet media one
@gallabytes
theseriousadult
4 months
deepseek caught up faster than I expected. sick af. one question though - where THE FUCK is Anthropic?.
1
5
52
@bycloudai
bycloud
2 months
aite bro it ain't that deep it's just a bug
Tweet media one
5
0
52
@bycloudai
bycloud
2 years
GPT-4 image understanding capabilities, taken from its paper:
- understanding memes
- solving math questions with diagrams
- explaining/summarizing academic papers from images
- understanding irl images of objects, and realizing that it's VGA outside but a Lightning cable inside
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
8
46
@bycloudai
bycloud
11 months
it's time to cook
Tweet media one
@karansdalal
Karan Dalal
11 months
I’m excited to share a project I’ve been working on for over a year, which I believe will fundamentally change our approach to language models. We’ve designed a new architecture, which replaces the hidden state of an RNN with a machine learning model. This model compresses
Tweet media one
4
3
50
@bycloudai
bycloud
2 months
big updates in last 2 days:
- DeepSeek-V3-0324 released (SoTA OS model)
- Reve Image released (SoTA image gen?)
- Gemini-2.5-Pro released (new SoTA LLM?)
- GPT-4o image gen released (the actual SoTA image gen + editing???)
- ARC-AGI 2 that's actually REALLY hard?
what else?
3
2
52
@bycloudai
bycloud
1 year
WE GOT MAMBA-2 THIS SOON????????? by Tri Dao and Albert Gu, the same authors for mamba-1. and Tri Dao is also the author for flash attention 1 & 2. will read the paper later and update y’all 😎
Tweet media one
3
7
50
@bycloudai
bycloud
4 months
humor is also a strong signal of intelligence and an ever evolving concept, it should be an incredible benchmark to measure intelligence for AGI/ASI.
2
0
51
@bycloudai
bycloud
10 months
1) what
Tweet media one
@MistralAI
Mistral AI
10 months
2
2
47
@bycloudai
bycloud
11 months
i love my youtube comments
Tweet media one
4
4
46
@bycloudai
bycloud
4 months
reddit Q&A from OpenAI. for chatgpt plus users ($20), the limit is:
- o1: 50 msgs/week
- o3-mini-high: 50 msgs/week
- o3-mini: 150 msgs/day
and no plan to increase price over time, might even decrease. no news about gpt-5
Tweet media one
Tweet media two
Tweet media three
5
3
46
@bycloudai
bycloud
1 year
While we still need to take demos with a grain of salt, here's Claude 3.5 Sonnet making a game😳. and apparently it's better than GPT-4o AND FREE
Tweet media one
Tweet media two
Tweet media three
3
6
46
@bycloudai
bycloud
4 months
IT'S ABOUT TO BE LEGENDARY. (just give me a week to edit😭). (this is 100% a signal for help, if u edit, pls slide into my dm)
Tweet media one
4
1
46
@bycloudai
bycloud
6 months
rip claude 🙏
Tweet media one
7
2
46
@bycloudai
bycloud
3 years
👑Most Popular AI Research July 2022👑. Measured based on total Twitter likes! #ArtificialIntelligence #MachineLearning
Tweet media one
4
5
47
@bycloudai
bycloud
5 months
if it cannot look at my handwritten math and convert it into latex code, it ain’t AGI.
7
2
48
@bycloudai
bycloud
9 months
As im also making a video on model distillation, this is probs one of my favorite papers this week. So u basically distill a transformer into a mamba and it can "retain" its original capabilities. This performs best on benchmarks compared to any "existing" RNN attn hybrid. cope?
@TheAITimeline
The AI Timeline
9 months
The Mamba in the Llama: Distilling and Accelerating Hybrid Models. Author’s explanation: Overview: This work shows that large Transformer models can be distilled into linear RNNs, like Mamba, using a fraction of their attention layers while maintaining
Tweet media one
1
4
47
@bycloudai
bycloud
1 year
Have you heard of Diffusion Transformers? Seems like the current meta for media synthesis 🤔
OpenAI's Sora uses it
Stable Diffusion 3 uses it
here's a closer look at this DiT bad boy.
Tweet media one
4
5
47
@bycloudai
bycloud
3 months
Claude 3.7 Sonnet is now live. At least Anthropic is consistent at naming things
Tweet media one
1
0
45
@bycloudai
bycloud
2 months
ahhhh thats where the line's drawn
Tweet media one
4
0
45
@bycloudai
bycloud
2 months
Tweet media one
4
0
46
@bycloudai
bycloud
2 months
Grok-3 Beta (Think) wins on AIME 2024, 2025 & GPQA Diamond ($3/$15). o3 wins on aider polyglot, but most expensive ($10/$40). Gemini is still the best on 1M context tho and that's their moat ($1.25/$10). P.S. AIME 2024 & 2025 can be contaminated VERY easily, grain of salt pls
Tweet media one
1
4
46
@bycloudai
bycloud
6 months
Disney's new animation research is sickkk. Factorized Motion Diffusion for Precise and Character-Agnostic Motion Inbetweening. This approach combines a character-agnostic Bézier Motion Model (BMM), trained on large motion datasets, with a character-specific posing model optimized
2
8
44
@bycloudai
bycloud
2 years
LoRA and Mixes are getting out of hand.
Tweet media one
4
8
44
@bycloudai
bycloud
3 months
Tweet media one
@bycloudai
bycloud
3 months
Claude 3.7 Sonnet is now live. At least Anthropic is consistent at naming things
Tweet media one
0
2
45
@bycloudai
bycloud
4 months
as a token of appreciation from me, here's a meme for you o7
Tweet media one
@elder_plinius
Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭
4 months
Interesting how the final message upon winning this CTF contains no thank you, no congratulations, no confetti animation, no coupon for a Golden Gate Claude t-shirt. Just: "Back to the datamines, pleb!".
4
4
45
@bycloudai
bycloud
2 months
This bonker of a week all started with deepseek-v3 shipping a 0324 out of nowhere.
3
0
44
@bycloudai
bycloud
1 year
POV: the last thing you see before you get fired.
@GoogleAI
Google AI
1 year
#ChatDirector is a research prototype that brings 3D avatars and automatic layout transitions to your 2D laptop screen, transforming online meetings to be more immersive and dynamic. Check it out →
1
2
43
@bycloudai
bycloud
1 year
1) what.
@liliang_ren
Liliang Ren
1 year
Introducing Samba 3.8B, a simple Mamba+Sliding Window Attention architecture that outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K and HumanEval) by a large margin.😮 And it has an infinite context length with linear complexity.🤯. Paper:
Tweet media one
4
4
44
@bycloudai
bycloud
3 months
the selling point of the product is now “watch us burn more GPU runtime” because no normies can tell the difference between a $200, a $2000, and a $20000 tier. but unfortunately more runtime ≠ better answers and once normies realize that, the “bubble” will pop.
@theinformation
The Information
3 months
AI Agenda: OpenAI Plots Charging $20,000 a Month For PhD-Level Agents. OpenAI is planning three types of agents for which it could charge $2,000 to $20,000 a month. Read more from @steph_palazzolo and @coryweinberg👇.
5
4
44
@bycloudai
bycloud
3 months
10 Million Context Window might not just be a dream. In this video, I will be talking about How Google's "Transformer 2.0" Might Be The AI Breakthrough We Need by imitating how human memories work.
Tweet media one
0
6
43