Romain Beaumont @rom1504 X Profile

Romain Beaumont

@rom1504

Followers

2K

Following

12K

Media

20

Statuses

976

LLM infra at YouTube. Current interests: multimodal and reasoning

Paris, France

Joined July 2008


Romain Beaumont

@rom1504

21 days

github.com

Real-time CNN training dashboard with web interface. Train neural networks on MNIST with live metrics, interactive charts, and image classification. 🤖 Fully live-coded with Claude on the server! -...

0

3

Romain Beaumont

@rom1504

21 days

Built a real-time CNN training dashboard in one conversation with Claude.
- SSH'd from phone via Termius
- Claude coding directly on GPU server
- From "check if I have GPU" to full ML web app in ~1 hour
Live training metrics, drawing canvas, file upload! Inspired by

@levelsio

27 days

I made this entire 3d computer thing with Claude Code 100% on the server in just a few hours today. I've never been so fast, no deployment, no Git push and pull, I'm not even deploying to production (like I normally do), I have Claude ON production now. It's vibecoding on

1

0

8

Romain Beaumont

@rom1504

27 days

I only recently realized the energy received by the earth is almost nothing compared to total sun emission (4.5e-10). That's why Dyson spheres make sense.
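The 4.5e-10 figure can be checked with a rough back-of-the-envelope calculation. The sketch below assumes the Sun radiates uniformly in all directions and that the Earth intercepts light like a flat disk of its own radius. The constants (mean Earth radius, one astronomical unit) and the function name `intercepted_fraction` are illustrative choices, not taken from the post.

```python
import math

# Assumed round-number constants (not from the original post).
R_EARTH_M = 6.371e6   # mean radius of the Earth, in meters
AU_M = 1.496e11       # mean Earth-Sun distance, in meters


def intercepted_fraction(radius_m: float, distance_m: float) -> float:
    """Fraction of an isotropic source's output hitting a disk of the given radius."""
    disk_area = math.pi * radius_m ** 2            # cross-section presented to the Sun
    sphere_area = 4 * math.pi * distance_m ** 2    # full sphere at that distance
    return disk_area / sphere_area


fraction = intercepted_fraction(R_EARTH_M, AU_M)
print(f"Fraction of solar output reaching Earth: {fraction:.2e}")
```

With these values the ratio simplifies to (R / 2d)^2, which lands at about 4.5e-10, in line with the number quoted above.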

0

2

Romain Beaumont

@rom1504

1 month

RT @_jasonwei: New blog post about asymmetry of verification and "verifier's law": Asymmetry of verification–the i….

0

245

0

Romain Beaumont

@rom1504

2 months

Would be fun if there was an X bot that would look for interesting ideas in posts, build them and send links to prototypes in replies. It feels like this is almost possible now with coding assistants. It could use the text of the tweet but also the context of the author to.

0

3

Romain Beaumont

@rom1504

2 months

If this is possible it will be the fastest way to replace 1.0 software. It will grow in usage in a decentralized way everywhere by being at the core of existing software deployments rather than the centralized way that tries to eat everything else from the outside. Really shows.

Andrej Karpathy

@karpathy

2 months

The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing:
- Natively multimodal.

0

3

Romain Beaumont

@rom1504

4 months

Cursor and MCP are both pretty awesome. I am really looking forward to more generalized tool usage (and tool building) as part of the generic UIs for consumers. Right now it's a bit dev-only but I am not sure why. Installing an MCP is really like installing an app for an LLM.

0

2

Romain Beaumont

@rom1504

4 months

It can for example use which allows controlling mineflayer bots from an LLM.

1

0

2

Romain Beaumont

@rom1504

4 months

So I was curious to try Cursor and I built this generic MCP chat client in a few hours with it. Now it can use a few example MCP servers in the repo (Cursor can one-shot create new ones on demand), but also any of thousands of existing MCP servers.

2

0

3

Romain Beaumont

@rom1504

4 months

Great speech. I'm really curious how we build this next generation of datasets by making these much more dynamic data engines. Not a large dataset you collect and dump once, but rather an infinite data source that the adaption engine pulls from based on needs.

Jim Fan

@DrJimFan

4 months

The Physical Turing Test: your house is a complete mess after a Sunday hackathon. On Monday night, you come home to an immaculate living room and a candlelight dinner. And you couldn't tell whether a human or a machine had been there. Deceptively simple, insanely hard. It is the

0

1

Romain Beaumont

@rom1504

4 months

Yes! Apps and UIs on demand for even the need of a single person and a single use is the (short term) future. App stores will become prompt stores and will be a lot more composable. Download and connect your favorite tools and prompts and solve your problems in minutes.

Andrej Karpathy

@karpathy

4 months

"Chatting" with LLM feels like using an 80s computer terminal. The GUI hasn't been invented, yet but imo some properties of it can start to be predicted. 1 it will be visual (like GUIs of the past) because vision (pictures, charts, animations, not so much reading) is the 10-lane

0

Romain Beaumont

@rom1504

4 months

I was curious about the Xi'an dialect today. I think it's a bit sad these dialects do not get taught to the newer generation. So I found a dictionary of the dialect, freely available on the Internet Archive, and I had ChatGPT write me a script to use the OpenAI Batch API (which works pretty.

0

Romain Beaumont

@rom1504

4 months

Well, it worked after sending it the error like 4 times, but this seems so basic a use case that I am surprised it's not fixed in post-training by default.

0

Romain Beaumont

@rom1504

4 months

I think it's so surprising that ChatGPT still cannot use the OpenAI API. It tries to use some old version of the API and then gets it wrong even when sent the whole readme containing the doc. Seems so basic a use case to ask ChatGPT: ok you can't do that through UI then just use.

1

0

2

Romain Beaumont

@rom1504

4 months

Well written essay on AI native apps. I especially like the part explaining that we used to have only 2 choices: 1. Build the app 2. Use an existing app. Now there's a third choice which is to let the computer build one app for one user.

Pete Koomen

@koomen

4 months

I wrote an essay about Gmail’s useless email-writing AI assistant: “AI Horseless Carriages”. link and TLDR in thread

1

0

1

Romain Beaumont

@rom1504

4 months

This kind of heavy tool usage is also a start into the "one use only" app direction. LLMs will generate apps based on user needs, even if that need happens only a single time for a single person.

0

Romain Beaumont

@rom1504

4 months

Really impressive overall, tool usage + thinking loop seems very promising. Conversation at

1

0

Romain Beaumont

@rom1504

4 months

"The emergency staircase is on the right-hand side. In the event of fire or earthquake, please descend the stairs calmly. Wheelchair users, please contact the building office. Tel. 03-XXXX-XXXX".That looks correct to me!.

1

0

Romain Beaumont

@rom1504

4 months

So I told it to try writing another script focusing on Japanese braille, and it finally cracked it and then translated the text to English.

1

0

Romain Beaumont

@rom1504

4 months

It decided to go ahead and crop the image to better understand what's happening, then wrote some Python code with OpenCV to analyze it. It had assumed English braille (which is different from Japanese braille), which crashed its script.

1

0