Christian S. Perone
@tarantulae
Followers
8K
Following
7K
Media
949
Statuses
10K
Machine Learning, Computer Science, Math. Computer Science (UPF Brazil) 🇧🇷🧉 / Machine Learning (@polymtl/@UMontreal). Working with Autonomous Vehicles in the UK
London, United Kingdom
Joined February 2009
Gemma3n was released a few months ago, and since I couldn't find much information about it, and found it a *very interesting* architecture with a lot of innovations (Matryoshka Transformer, MobileNetV5, etc.), I decided to dig further. Here are the slides of this talk: https://t.co/izMbmav1cO
1
5
15
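For context on the Matryoshka (MatFormer) idea mentioned above, here is a minimal numpy sketch of the nesting trick: a smaller sub-model reuses a prefix slice of the full model's FFN weights, so one checkpoint serves several deployment sizes. Shapes, names, and the activation are illustrative assumptions, not Gemma3n's actual implementation.

```python
import numpy as np

def matformer_ffn(x, w_in, w_out, frac):
    # Matryoshka-style FFN: a sub-model with fraction `frac` of the
    # hidden width uses only a prefix slice of the full weights.
    d_ff = w_in.shape[1]
    k = int(d_ff * frac)                   # nested sub-network width
    h = np.maximum(x @ w_in[:, :k], 0.0)   # activation over the prefix slice
    return h @ w_out[:k, :]

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
w_in = rng.normal(size=(d_model, d_ff))
w_out = rng.normal(size=(d_ff, d_model))
x = rng.normal(size=(d_model,))

y_full = matformer_ffn(x, w_in, w_out, frac=1.0)  # largest model
y_half = matformer_ffn(x, w_in, w_out, frac=0.5)  # nested half-width model
```

Training typically samples several values of `frac` per step so every nested slice stays usable on its own.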
Really happy to share our new paper on using AlphaEvolve for mathematical exploration at scale, written with Javier Gómez-Serrano, Terence Tao, and @GoogleDeepMind's Bogdan Georgiev. We tested it on 67 problems and documented all our successes and failures. 🧵
19
136
845
Leaving Meta and PyTorch
I'm stepping down from PyTorch and leaving Meta on November 17th. tl;dr: Didn't want to be doing PyTorch forever, seemed like the perfect time to transition right after I got back from a long leave and the project built itself around me. Eleven years
484
498
10K
(1) Our team at @GoogleDeepMind has been collaborating with Terence Tao and Javier Gómez-Serrano to use our AI agents (AlphaEvolve, AlphaProof, & Gemini Deep Think) for advancing Maths research. They find that AlphaEvolve can help discover new results across a range of problems.
26
181
2K
44% of released incarcerated people in the U.S. are arrested within a year.
3 years? 68%
5 years? 77%
Two factors consistently lower re-offense rates:
- Education while incarcerated
- Strong family communication
Full article in comments @PrisonPolicy
@ElizabethHolmes
9
7
83
When we began applying diffusion to language in my lab at Stanford, many doubted it could work. That research became the Mercury diffusion LLM: 10X faster, more efficient, and now the foundation of @_inception_ai. Proud to raise $50M with support from top investors.
Today’s LLMs are painfully slow and expensive. They are autoregressive and spit out words sequentially. One. At. A. Time. Our dLLMs generate text in parallel, delivering answers up to 10X faster. Now we’ve raised $50M to scale them. Full story from @russellbrandom in
38
85
1K
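A toy control-flow sketch of the speed argument in the tweets above: autoregressive decoding spends one model call per token, while a diffusion LLM refines all positions in parallel and spends one call per denoising step. The dummy models, the MASK token id, and the step counts are made-up stand-ins, not Mercury's method.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, LENGTH, STEPS = 100, 16, 8
MASK = 0  # hypothetical mask/noise token id

def ar_decode(next_token):
    # Autoregressive: one token per model call -> LENGTH sequential calls.
    seq = []
    for _ in range(LENGTH):
        seq.append(next_token(seq))
    return seq

def diffusion_decode(denoise):
    # Diffusion-style: every position is refined in parallel on each
    # call -> only STEPS calls, independent of sequence length.
    seq = np.full(LENGTH, MASK)
    for t in range(STEPS):
        seq = denoise(seq, t)
    return seq

# Dummy stand-ins so the control flow runs; a real dLLM is a trained
# denoiser, not random sampling.
next_token = lambda prefix: int(rng.integers(1, VOCAB))
denoise = lambda seq, t: np.where(rng.random(LENGTH) < (t + 1) / STEPS,
                                  rng.integers(1, VOCAB, LENGTH), seq)

print(ar_decode(next_token))       # 16 model calls
print(diffusion_decode(denoise))   # 8 model calls
```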
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200–300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built
531
1K
9K
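For readers unfamiliar with what "200–300 sequential tool calls" implies, here is a generic, hedged sketch of such an agent loop. The `model` callable, message format, and tool dictionary are hypothetical stand-ins, not Kimi's actual API.

```python
import json

def agent_loop(model, tools, prompt, max_calls=300):
    # Generic sequential tool-use loop: the model either returns a
    # final answer or names a tool to run; the tool's output is fed
    # back and the loop continues, up to a budget of `max_calls`.
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_calls):
        reply = model(messages)               # one reasoning step
        if reply.get("tool") is None:         # no tool requested: done
            return reply["content"]
        result = tools[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "tool-call budget exhausted"
```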
jax.config.update('jax_default_matmul_precision', 'highest')
Our TPUs are headed to space! Inspired by our history of moonshots, from quantum computing to autonomous driving, Project Suncatcher is exploring how we could one day build scalable ML compute systems in space, harnessing more of the sun’s power (which emits more power than 100
0
0
5
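A hedged expansion of the one-liner above: the flag forces full-precision matmuls globally (on TPU this disables the faster reduced-precision passes), and the same knob also exists per call.

```python
import jax
import jax.numpy as jnp

# Global default, as in the tweet above: trade speed for accuracy
# in every matmul.
jax.config.update('jax_default_matmul_precision', 'highest')

# Per-operation variant, often preferable to a global switch:
a = jnp.ones((128, 128), dtype=jnp.float32)
b = jnp.ones((128, 128), dtype=jnp.float32)
c = jnp.dot(a, b, precision=jax.lax.Precision.HIGHEST)
```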
🌀New paper on the generation phases of Flow Matching https://t.co/tzG2kPVGsE Are FM & diffusion models nothing more than denoisers trained at every noise level? In theory yes, *if trained optimally*. But in practice, do all noise levels matter equally?
6
105
665
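For context on the "denoisers at every noise level" framing, this is the standard conditional flow-matching objective with a linear interpolation path (a textbook form, not necessarily the paper's exact setup):

$$\mathcal{L}_{\mathrm{CFM}}(\theta)=\mathbb{E}_{t\sim\mathcal{U}[0,1],\;x_0\sim p_0,\;x_1\sim p_1}\,\bigl\|\,v_\theta(x_t,t)-(x_1-x_0)\,\bigr\|^2,\qquad x_t=(1-t)\,x_0+t\,x_1 .$$

At the optimum, $v_\theta(x_t,t)=\mathbb{E}[\,x_1-x_0\mid x_t\,]$, a conditional expectation at each noise level $t$, which is exactly the denoiser view whose practical weighting the tweet questions.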
Interesting, but I was expecting an Orin or Thor evaluation for deployment.
Excited to unveil @nvidia's latest work on #Reasoning Vision–Language–Action (#VLA) models — Alpamayo-R1! Alpamayo-R1 is a new #reasoning VLA architecture featuring a diffusion-based action expert built on top of the #Cosmos-#Reason backbone. It represents one of the core
0
0
0
Ah! If you recently came across claims like "A100 are known bad for RL" on your feed and, like me, you raised an eyebrow (because how on earth does such a statement make any sense?!), here is the likely resolution:
@danielhanchen, glad you liked the post! You're spot on to suspect lower-level implementation issues. That's exactly what we found in the original blog. The disable_cascade_attn finding (Sec 4.2.4) was the symptom, but the root cause was that silent FlashAttention-2 kernel bug
17
29
467
Last time I checked, humans were part of nature.
Fei-Fei Li (@drfeifei) on limitations of LLMs. "There's no language out there in nature. You don't go out in nature and there's words written in the sky for you… There is a 3D world that follows laws of physics." Language is purely generated signal. https://t.co/FOomRpGTad
1
0
6
Many people are confused by Minimax's recent return to full attention (especially since it was the first large-scale pivot toward hybrid linear attention), and by Kimi's later adoption of hybrid linear variants (as well as earlier attempts by Qwen3-Next or Qwen3.5). I actually
12
64
505
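For readers new to the hybrid-attention debate above, here is a minimal numpy sketch of the two layer types and a typical interleaving. The feature map, the non-causal formulation, and the 1-in-4 ratio are illustrative assumptions, not MiniMax's or Kimi's actual recipes.

```python
import numpy as np

def softmax_attn(q, k, v):
    # Full (quadratic) attention: every query attends to every key.
    s = (q @ k.T) / np.sqrt(q.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ v

def linear_attn(q, k, v, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Linear attention (non-causal form): a positive feature map phi
    # lets the (d x d) key-value state be computed once, so cost grows
    # linearly in sequence length instead of quadratically.
    kv = phi(k).T @ v               # (d, d) key-value state
    z = phi(k).sum(axis=0)          # (d,) normalizer
    return (phi(q) @ kv) / (phi(q) @ z)[:, None]

def hybrid_layout(n_layers, full_every=4):
    # A common hybrid recipe: mostly linear layers, with one full
    # softmax layer every `full_every` layers to restore global recall.
    return ['full' if (i + 1) % full_every == 0 else 'linear'
            for i in range(n_layers)]

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(3, 6, 4))   # toy (T=6, d=4) activations
print(softmax_attn(q, k, v).shape)     # (6, 4)
print(linear_attn(q, k, v).shape)      # (6, 4)
print(hybrid_layout(8))                # ['linear', 'linear', 'linear', 'full', ...]
```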
Put simply, the real-world web onboards robots to new physical spaces instantly. Simplify robot deployment → scale robot deployment. Accelerate.
2
15
61
Google Gemma is being trained on some peculiar datasets 😅
0
0
2
The few-step diffusion model field is wild, and there are many methods trying to train a high-quality few-step generator from scratch: Consistency Models, Shortcut Models, and MeanFlow. It turns out they can be unified in a quite elegant way, which we did in our recent work.
3
50
395
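For reference, hedged textbook forms of the three objects being unified (standard statements of each method, not the paper's exact notation or its unified objective):

$$f_\theta(x_t,t)=f_\theta(x_{t'},t')\quad\text{for any } t,t' \text{ on the same probability-flow ODE trajectory (Consistency Models)},$$
$$x_{t+d}\approx x_t+d\,s_\theta(x_t,t,d)\quad\text{(Shortcut Models, conditioned on step size } d\text{)},$$
$$u_\theta(x_t,r,t)\approx\frac{1}{t-r}\int_r^t v(x_\tau,\tau)\,d\tau\quad\text{(MeanFlow average velocity)}.$$

All three parameterize the net motion over a finite interval rather than the instantaneous velocity, which is presumably the shared structure a unification can exploit.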
Q2: Which languages actually help each other during training? And how much? 🌟Answer: We measure this empirically. We built a 38×38 transfer matrix, or 1,444 language pairs, the largest such resource to date. We highlight the top 5 most beneficial source languages for each
1
3
20
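A hedged sketch of how such a transfer matrix is typically built and queried (the paper's exact metric and pipeline may differ; the scoring function and language subset here are stand-ins):

```python
import numpy as np

langs = ['en', 'de', 'pt']   # illustrative subset; the paper uses 38

def transfer_score(src, tgt):
    # Stand-in: a real pipeline would train with source language `src`,
    # evaluate on target `tgt`, and subtract a target-only baseline.
    return 0.1 if src != tgt else 0.0

n = len(langs)
transfer = np.zeros((n, n))   # 38 x 38 in the paper = 1,444 pairs
for i, src in enumerate(langs):
    for j, tgt in enumerate(langs):
        transfer[i, j] = transfer_score(src, tgt)

# Top source languages for a given target = largest entries in its column.
j = langs.index('pt')
top = np.argsort(transfer[:, j])[::-1][:5]
print([langs[i] for i in top])
```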
Another WILD week in Wiami 🌴 Here’s how the ecosystem evolved this week:
9
8
55
"- Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel." "OpenAI: we declare AGI." 8 months later "Expert panel: once AGI is defined, that definition will now be verified by another independent expert panel.
OpenAI has completed its recapitalization. The nonprofit is now called the OpenAI Foundation. Big changes to the relationship between OpenAI and Microsoft. We finally get to see the ownership numbers. I will quote: 'Microsoft holds an investment in OpenAI Group PBC valued at
0
0
2
To give context on age, Grace Hopper began computing at 38, completed the first compiler at 46, helped shape COBOL at 53, kept developing COBOL for the Navy in her 70s, retired from the Navy at 80 & then became a consultant for the Digital Equipment Corporation. v/@cooperx86
33
272
1K
Congratulations @Yoshua_Bengio!! Possibly the first scientist with one million citations in the world. Crazy how fast the field has grown 🤯
78
375
5K