Christian S. Perone

@tarantulae

Followers: 8K · Following: 7K · Media: 949 · Statuses: 10K

Machine Learning, Computer Science, Math. Computer Science (UPF Brazil) 🇧🇷🧉 / Machine Learning (@polymtl/@UMontreal). Working with Autonomous Vehicles in UK

London, United Kingdom
Joined February 2009
@tarantulae
Christian S. Perone
1 year
I'm also on bsky:
0
1
4
@tarantulae
Christian S. Perone
1 day
Gemma 3n was released a few months ago. I wasn't able to find much more info about it, and I found it a *very interesting* architecture with a lot of innovations (Matryoshka Transformer, MobileNetV5, etc.), so I decided to dig further. Here are the slides of this talk: https://t.co/izMbmav1cO
1
5
15
@azwagner_
Adam Zsolt Wagner
3 days
Really happy to share our new paper on using AlphaEvolve for mathematical exploration at scale, written with Javier Gómez-Serrano, Terence Tao, and @GoogleDeepMind's Bogdan Georgiev. We tested it on 67 problems and documented all our successes and failures. 🧵
19
136
845
@soumithchintala
Soumith Chintala
2 days
Leaving Meta and PyTorch. I'm stepping down from PyTorch and leaving Meta on November 17th. tl;dr: I didn't want to be doing PyTorch forever, and it seemed like the perfect time to transition, right after I got back from a long leave and with the project having built itself around me. Eleven years…
484
498
10K
@pushmeet
Pushmeet Kohli
3 days
(1) Our team at @GoogleDeepMind has been collaborating with Terence Tao and Javier Gómez-Serrano to use our AI agents (AlphaEvolve, AlphaProof, & Gemini Deep Think) for advancing Maths research. They find that AlphaEvolve can help discover new results across a range of problems.
26
181
2K
@JordanRejaud
Jordan Réjaud
2 months
44% of released incarcerated people in the U.S. are arrested within a year. 3 years? 68%. 5 years? 77%. Two factors consistently lower re-offense rates:
- Education while incarcerated
- Strong family communication
Full article in comments @PrisonPolicy @ElizabethHolmes
9
7
83
@StefanoErmon
Stefano Ermon
2 days
When we began applying diffusion to language in my lab at Stanford, many doubted it could work. That research became Mercury diffusion LLM: 10X faster, more efficient, and now the foundation of @_inception_ai. Proud to raise $50M with support from top investors.
@_inception_ai
Inception
2 days
Today’s LLMs are painfully slow and expensive. They are autoregressive and spit out words sequentially. One. At. A. Time. Our dLLMs generate text in parallel, delivering answers up to 10X faster. Now we’ve raised $50M to scale them. Full story from @russellbrandom in
38
85
1K
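The claim above — parallel generation instead of one-token-at-a-time decoding — can be illustrated with a toy sketch. This is a conceptual illustration only, not Inception's actual dLLM: both "models" here are stand-ins that simply reveal a fixed target string, so only the step counts are meaningful.

```python
# Toy contrast between autoregressive and diffusion-style decoding.
TARGET = list("hello world")
MASK = "_"

def autoregressive_decode(target):
    """Emit one token per step, left to right: len(target) model calls."""
    out, steps = [], 0
    for tok in target:
        out.append(tok)  # one token revealed per sequential step
        steps += 1
    return "".join(out), steps

def diffusion_decode(target, rounds=3):
    """Refine all positions in parallel over a few denoising rounds."""
    seq = [MASK] * len(target)
    for step in range(1, rounds + 1):
        # Each round touches every position at once; here we mimic
        # refinement by revealing a growing fraction of positions.
        keep = len(target) * step // rounds
        seq = target[:keep] + [MASK] * (len(target) - keep)
    return "".join(seq), rounds

ar_text, ar_steps = autoregressive_decode(TARGET)
dl_text, dl_steps = diffusion_decode(TARGET)
print(ar_steps, dl_steps)  # 11 sequential steps vs. 3 parallel rounds
```

The speedup claim rests on exactly this difference: the sequential decoder needs one model call per output token, while the parallel refiner needs a small, fixed number of rounds regardless of length.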
@Kimi_Moonshot
Kimi.ai
2 days
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200–300 sequential tool calls without human intervention
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built…
531
1K
9K
@tarantulae
Christian S. Perone
2 days
jax.config.update('jax_default_matmul_precision', https://t.co/pt7fplFj9R_HIGHEST)
@sundarpichai
Sundar Pichai
4 days
Our TPUs are headed to space! Inspired by our history of moonshots, from quantum computing to autonomous driving, Project Suncatcher is exploring how we could one day build scalable ML compute systems in space, harnessing more of the sun's power (which emits more power than 100…
0
0
5
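The code in the jax tweet above appears to have been mangled by the t.co link shortener. For reference, a sketch of the same setting using JAX's documented configuration knobs (the intent to force highest matmul precision is my reading of the fragment):

```python
import jax
import jax.numpy as jnp

# Set the default matmul precision globally: 'highest' forces full
# float32 accumulation instead of the faster reduced-precision passes
# used by default on TPU.
jax.config.update('jax_default_matmul_precision', 'highest')

# The same knob can also be set per-operation:
x = jnp.ones((4, 4))
y = jnp.dot(x, x, precision=jax.lax.Precision.HIGHEST)
```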
@mathusmassias
Mathurin Massias
4 days
🌀 New paper on the generation phases of Flow Matching https://t.co/tzG2kPVGsE Are FM & diffusion models nothing else than denoisers trained at every noise level? In theory yes, *if trained optimally*. But in practice, do all noise levels matter equally?
6
105
665
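For context on the "denoisers trained at every noise level" framing above, the conditional flow-matching objective, as usually written with a linear interpolant (notation here is the standard one, not taken from the linked paper):

```latex
% Linear interpolant between noise x_0 and data x_1:
%   x_t = (1 - t)\, x_0 + t\, x_1
\mathcal{L}_{\mathrm{CFM}}(\theta)
  = \mathbb{E}_{t \sim \mathcal{U}[0,1],\; x_0,\; x_1}
    \left\| v_\theta(x_t, t) - (x_1 - x_0) \right\|^2
```

The expectation is taken uniformly over all noise levels $t$, which is exactly what makes the "do all noise levels matter equally in practice?" question non-trivial.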
@jiqizhixin
机器之心 JIQIZHIXIN
6 days
Wow, language models can talk without words. A new framework, Cache-to-Cache (C2C), lets multiple LLMs communicate directly through their KV-caches instead of text, transferring deep semantics without token-by-token generation. It fuses cache representations via a neural…
19
291
6K
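The cache-level communication described above can be sketched with a toy example. This is a conceptual illustration with made-up shapes and a fixed blend, not the C2C framework's actual mechanism, which the tweet says uses a neural fusion module:

```python
# Toy cache-to-cache communication: model B ingests a fused version of
# model A's KV-cache instead of re-reading A's text output.

def fuse_kv(cache_a, cache_b, alpha=0.5):
    """Blend two same-shaped KV caches element-wise.

    cache_*: list of (key, value) pairs, one per layer, where key and
    value are lists of floats. A real system would use a learned
    projection here instead of a fixed convex combination.
    """
    fused = []
    for (ka, va), (kb, vb) in zip(cache_a, cache_b):
        k = [alpha * a + (1 - alpha) * b for a, b in zip(ka, kb)]
        v = [alpha * a + (1 - alpha) * b for a, b in zip(va, vb)]
        fused.append((k, v))
    return fused

cache_a = [([1.0, 2.0], [3.0, 4.0])]   # one layer, toy dimensions
cache_b = [([3.0, 4.0], [5.0, 6.0])]
fused = fuse_kv(cache_a, cache_b)
print(fused)  # [([2.0, 3.0], [4.0, 5.0])]
```

The key point the tweet is making survives even in this toy: the receiving model never sees tokens, only fused cache tensors.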
@tarantulae
Christian S. Perone
5 days
Interesting, but I was expecting an Orin or Thor evaluation for deployment.
@drmapavone
Marco Pavone
5 days
Excited to unveil @nvidia's latest work on #Reasoning Vision–Language–Action (#VLA) models — Alpamayo-R1! Alpamayo-R1 is a new #reasoning VLA architecture featuring a diffusion-based action expert built on top of the #Cosmos-#Reason backbone. It represents one of the core…
0
0
0
@giffmana
Lucas Beyer (bl16)
6 days
Ah! If you recently came across claims like "A100 are known bad for RL" on your feed and like me you raised an eyebrow, because how on earth does such a statement make any sense?! Here is the likely resolution:
@RichardYRLi
Yingru Li
7 days
@danielhanchen, glad you liked the post! You're spot on to suspect lower-level implementation issues. That's exactly what we found in the original blog. The disable_cascade_attn finding (Sec 4.2.4) was the symptom, but the root cause was that silent FlashAttention-2 kernel bug
17
29
467
@tarantulae
Christian S. Perone
7 days
Last time I checked, humans were part of nature.
@rohanpaul_ai
Rohan Paul
7 days
Fei-Fei Li (@drfeifei) on limitations of LLMs. "There's no language out there in nature. You don't go out in nature and there's words written in the sky for you… There is a 3D world that follows laws of physics." Language is purely generated signal. https://t.co/FOomRpGTad
1
0
6
@SonglinYang4
Songlin Yang
9 days
Many people are confused by Minimax's recent return to full attention — especially since it was the first large-scale pivot toward hybrid linear attention — and by Kimi's later adoption of hybrid linear variants (as well as earlier attempts by Qwen3-Next, or Qwen3.5). I actually…
12
64
505
@tarantulae
Christian S. Perone
9 days
Google Gemma is being trained on some peculiar datasets 😅
0
0
2
@huijiezh
Huijie Zhang
12 days
The few-step diffusion model field is wild, and there are many methods trying to train a high-quality few-step generator from scratch: Consistency Models, Shortcut Models, and MeanFlow. It turns out they can be unified in a quite elegant way, which is what we did in our recent work.
3
50
395
@tarantulae
Christian S. Perone
11 days
https://t.co/S4kVCbDmHX, very cool move from NVIDIA.
0
0
0
@ShayneRedford
Shayne Longpre
11 days
Q2: Which languages actually help each other during training? And how much? 🌟 Answer: We measure this empirically. We built a 38×38 transfer matrix, or 1,444 language pairs — the largest such resource to date. We highlight the top 5 most beneficial source languages for each…
1
3
20
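Reading off "the top k most beneficial source languages for each target" from a transfer matrix, as described above, is a simple lookup. A toy sketch with made-up languages and scores (the paper's matrix is 38×38; everything here is illustrative):

```python
# transfer[target][source] = measured benefit of `source` for `target`
transfer = {
    "sw": {"en": 0.9, "ar": 0.4, "fr": 0.7, "hi": 0.2},
    "fi": {"en": 0.6, "et": 0.95, "fr": 0.3, "hi": 0.1},
}

def top_sources(matrix, target, k=2):
    """Return the k most beneficial source languages for `target`."""
    scores = matrix[target]
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(top_sources(transfer, "fi"))  # ['et', 'en']
```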
@tarantulae
Christian S. Perone
11 days
"- Once AGI is declared by OpenAI, that declaration will now be verified by an independent expert panel." "OpenAI: we declare AGI." 8 months later "Expert panel: once AGI is defined, that definition will now be verified by another independent expert panel.
@AndrewCurran_
Andrew Curran
11 days
OpenAI has completed its recapitalization. The nonprofit is now called the OpenAI Foundation. Big changes to the relationship between OpenAI and Microsoft. We finally get to see the ownership numbers. I will quote: 'Microsoft holds an investment in OpenAI Group PBC valued at…
0
0
2
@MIT_CSAIL
MIT CSAIL
13 days
To give context on age, Grace Hopper began computing at 38, completed the first compiler at 46, helped shape COBOL at 53, kept developing COBOL for the Navy in her 70s, retired from the Navy at 80 & then became a consultant for the Digital Equipment Corporation. v/@cooperx86
33
272
1K
@danijarh
Danijar Hafner
15 days
Congratulations @Yoshua_Bengio!! Possibly the first scientist with one million citations in the world. Crazy how fast the field has grown 🤯
78
375
5K