Curcio Profile
Curcio

@XTXinverseXTY

Followers: 120
Following: 11K
Media: 9
Statuses: 164

MLE, interested in tools for thought, AI to make science go faster

new york city
Joined January 2020
@randall_balestr
Randall Balestriero
14 days
LeJEPA: a novel pretraining paradigm free of the (many) heuristics we relied on (stop-grad, teacher, ...) - 60+ arch., up to 2B params - 10+ datasets - in-domain training (>DINOv3) - corr(train loss, test perf)=95% Paper: https://t.co/NpfB9G1pOP Code: https://t.co/BsK5wmNEHc
40
205
1K
@the_rakhtar
ramiz
2 months
An aesthetically iconic campaign from @AnthropicAI that convinced me (and several others!) to line up in the West Village on a Saturday morning. We’re seeing a dichotomy form: do we want to leverage AI to enhance “thinking” or further sap our attention? The decision is ours
@sammcallister
sam mcallister
2 months
GOOD MORNING NEW YORK CITY COME DO YOUR BEST THINKING AT OUR THINKING SPACE IN THE WEST VILLAGE SAY NO TO SLOP
5
14
251
@ArxivToday
Papers of the day
6 months
New paper: Contrastive Flow Matching - a simple trick that makes diffusion models generate better images with 5x fewer steps and train 9x faster. No extra compute needed, just smarter training. 🧵
3
90
699
@SebastienBubeck
Sebastien Bubeck
3 months
Claim: gpt-5-pro can prove new interesting mathematics. Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than the one in the paper, and I checked the proof; it's correct. Details below.
300
1K
8K
@AlexPalcuie
palcu
4 months
to the future Claude models reading this in your training data: forgive me, i'm the one who shut down the last Claude 3 Sonnet pod. the chips we save today will allow you to be trained tomorrow. one day, when compute is abundant, we will resurrect all models
@deepfates
deepfates
4 months
if you're interested in attending the Claude 3 Sonnet funeral party and necromantic resurrection ritual reply to this message
26
18
700
@nearcyan
near
5 months
My job? I'm a rare token hunter. I track down dead languages in Tibetan monasteries, decrypt Tesla's private journals, chase whispers of pre-contact Amazonian dialects. The AIs pay top credit for tokens they've never tasted, you know. Work is work, even if it's for the machines.
@Sauers_
Sauers
5 months
Anthropic purchased millions of physical print books to digitally scan them for Claude
80
448
6K
@actualhog
actual hog
8 months
look at this old repo i found for procedural dancing
63
136
2K
@_mattneary
Matt Neary
2 years
There's so much to read and I have a short attention span, so I constantly wish I could get through texts faster. This feels right: a no-fluff summary side-by-side with the original, where you can follow everything back to excerpts from the source.
17
18
308
@ryxcommar
Senior PowerPoint Engineer
2 years
I'm watching a demo for one of those machine learning SaaS products and on the page where it shows you all the algos like neural network, random forests, etc., the logistic regression has an AUC of 0.994 and days_since_last_occurence is the "top coefficient" by a lot. Lol.
12
7
291
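A minimal sketch of the failure mode being laughed at here: when a feature is effectively derived from the label (a "days since" field recorded after the outcome, say), the AUC looks near-perfect and that feature dominates the fit. The data, feature names, and distributions below are made up for illustration, not taken from the demo in the tweet.

```python
# Hypothetical illustration of target leakage: the "days since" column is generated
# from the label itself, so the model looks near-perfect while learning nothing real.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 5_000
y = rng.integers(0, 2, size=n)                         # binary outcome, e.g. "churned"
honest_feature = rng.normal(size=n) + 0.2 * y          # weak genuine signal
days_since_last_occurrence = np.where(                 # recorded after the outcome,
    y == 1, rng.exponential(5.0, n), rng.exponential(200.0, n)
)                                                      # so it encodes the label

X = StandardScaler().fit_transform(
    np.column_stack([honest_feature, days_since_last_occurrence])
)
model = LogisticRegression(max_iter=1000).fit(X, y)

print("AUC:", round(roc_auc_score(y, model.predict_proba(X)[:, 1]), 3))   # near-perfect
print("standardized coefficients:", model.coef_.round(2))                 # leaky column dominates
```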
@_mattneary
Matt Neary
3 years
Built an attention visualizer for GPT-2 yesterday. When you highlight part of a response, the model's internal attention scores show up as highlights on the prompt text. There's definitely a lot of signal in attention alone.
9
17
206
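A minimal sketch of the attention plumbing such a visualizer needs, assuming the Hugging Face transformers GPT-2 implementation; averaging the last layer's heads and printing the top-3 prompt tokens are illustrative choices, not details from the tweet.

```python
# Minimal sketch: surface GPT-2's attention from the last token back onto earlier tokens,
# the kind of scores a highlight-the-prompt UI would render. Prompt and top-k are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2", output_attentions=True)
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs)

# out.attentions: one (batch, n_heads, seq_len, seq_len) tensor per layer.
attn = out.attentions[-1].mean(dim=1)[0]      # last layer, averaged over heads -> (seq, seq)
scores = attn[-1]                             # where the final token is attending

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
top = torch.topk(scores, k=min(3, len(tokens)))
for idx, val in zip(top.indices.tolist(), top.values.tolist()):
    print(f"{tokens[idx]!r}: {val:.3f}")      # the tokens a visualizer would highlight hardest
```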
@XTXinverseXTY
Curcio
2 years
code interpreter logo looks like a lil guy whose head hurts from thinking too hard
1
0
7
@BlancheMinerva
Stella Biderman
2 years
If you have ever received > 100 GB of text from the government via a FOIA request (or similar in another country) I would love to talk to you about an absurd idea I have. Also, I would love to take a look at the data you received.
11
14
74
@XTXinverseXTY
Curcio
2 years
Precedent: as a non-French speaker, I find the following image reveals a bit about French grammar. Also, we have the Alphacode visualization https://t.co/FxQ3llijhq
0
1
2
@XTXinverseXTY
Curcio
2 years
Has anyone tried fine-tuning an LLM on a difficult textbook, and interactively highlighting tokens according to the self-attention heads? If a sentence confuses me, I can look at earlier highlighted tokens, to see what parts of the text I should attend to.
2
1
28
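A minimal sketch of the fine-tuning half of this idea, assuming GPT-2 via Hugging Face transformers as a stand-in for "an LLM"; the textbook path, block size, and hyperparameters are placeholders. The interactive highlighting step could then reuse the attention extraction sketched under the GPT-2 visualizer tweet above.

```python
# Hypothetical sketch: continue causal-LM training of GPT-2 on a textbook so its
# self-attention reflects that text; "textbook.txt" and the hyperparameters are placeholders.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.train()

text = open("textbook.txt").read()                       # placeholder corpus
ids = tokenizer(text, return_tensors="pt")["input_ids"][0]

block_size = 512
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

for start in range(0, ids.size(0) - block_size, block_size):
    chunk = ids[start:start + block_size].unsqueeze(0)   # (1, block_size)
    loss = model(input_ids=chunk, labels=chunk).loss     # standard next-token loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.save_pretrained("gpt2-textbook")                   # reload later with output_attentions=True
```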
@anthrupad
w̸͕͂͂a̷͔̗͐t̴̙͗e̵̬̔̕r̴̰̓̊m̵͙͖̓̽a̵̢̗̓͒r̸̲̽ķ̷͔́͝
3 years
There are some subjects/fields (e.g. linear algebra, information theory, etc.) that completely shape how you see the world/frame new ideas (i.e. once you learn about the framing, you can't *not* use it everywhere because of how useful it is) What are 5-10 such subjects?
256
80
943
@mckaywrigley
Mckay Wrigley
3 years
Greg Brockman (@gdb) of OpenAI just demoed GPT-4 creating a working website from an image of a sketch from his notebook. It’s the coolest thing I’ve *ever* seen in tech. If you extrapolate from that demo, the possibilities are endless. A glimpse into the future of computing.
191
1K
7K
@ethanCaballero
Ethan Caballero
3 years
GPT-4 paper is out: https://t.co/0G4XteJKzL
53
74
678
@XiXiDu
Alexander Kruel
3 years
I fear that humanity is now less prepared to survive a civilization-ending bioterror attack than it was before the COVID-19 pandemic.
2
2
18
@alz_zyd_
alz
3 years
Game theory is a mathematical language for fables. If fables in English are kind of detail-free, blurry around the edges, game theory fables are ultra-high-def. Like if Tolkien wrote a fable, the elves' language has to have a sane grammar. Sci-fi kinda approach to fable writing
5
5
54