Romain Beaumont

@rom1504

Followers
2K
Following
12K
Media
20
Statuses
976

LLM infra at YouTube. Current interests: multimodal and reasoning

Paris, France
Joined July 2008
@rom1504
Romain Beaumont
21 days
Built a real-time CNN training dashboard in one conversation with Claude.
- SSH'd from phone via Termius
- Claude coding directly on GPU server
- From "check if I have GPU" to full ML web app in ~1 hour
Live training metrics, drawing canvas, file upload! Inspired by…
@levelsio
@levelsio
27 days
I made this entire 3d computer thing with Claude Code 100% on the server in just a few hours today. I've never been so fast, no deployment, no Git push and pull, I'm not even deploying to production (like I normally do), I have Claude ON production now. It's vibecoding on
1
0
8
@rom1504
Romain Beaumont
27 days
I only recently realized the energy received by the Earth is almost nothing compared to the Sun's total emission (a fraction of about 4.5e-10). That's why Dyson spheres make sense.
0
0
2
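The 4.5e-10 figure above can be checked with a quick geometric estimate. The sketch below assumes the Sun radiates uniformly in all directions and that the Earth intercepts light over its circular cross-section. The radius and distance constants are standard approximate values, not taken from the post, and the function name is made up for illustration.

```python
import math

# Approximate values (assumptions, not from the post).
EARTH_RADIUS_M = 6.371e6        # mean radius of the Earth
EARTH_SUN_DISTANCE_M = 1.496e11  # about 1 astronomical unit


def intercepted_fraction(planet_radius_m: float, orbit_distance_m: float) -> float:
    """Fraction of a star's isotropic output that hits a planet's disk."""
    # Area of the planet's circular cross-section facing the star.
    cross_section = math.pi * planet_radius_m ** 2
    # Area of the sphere over which the star's output is spread at that distance.
    sphere_area = 4 * math.pi * orbit_distance_m ** 2
    return cross_section / sphere_area


fraction = intercepted_fraction(EARTH_RADIUS_M, EARTH_SUN_DISTANCE_M)
print(f"Fraction of solar output reaching the Earth: {fraction:.2e}")
```

Under these assumptions the result is about 4.5e-10, in line with the number quoted in the post: the ratio reduces to (R / 2d)^2, so nearly all of the Sun's output misses the planet.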
@rom1504
Romain Beaumont
1 month
RT @_jasonwei: New blog post about asymmetry of verification and "verifier's law": Asymmetry of verification–the i…
0
245
0
@rom1504
Romain Beaumont
2 months
Would be fun if there was an X bot that would look for interesting ideas in posts, build them and send links to prototypes in replies. It feels like this is almost possible now with coding assistants. It could use the text of the tweet but also the context of the author to…
0
0
3
@rom1504
Romain Beaumont
2 months
If this is possible it will be the fastest way to replace 1.0 software. It will grow in usage in a decentralized way everywhere by being at the core of existing software deployments rather than the centralized way that tries to eat everything else from the outside. Really shows…
@karpathy
Andrej Karpathy
2 months
The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing:
- Natively multimodal…
0
0
3
@rom1504
Romain Beaumont
4 months
Cursor and MCP are both pretty awesome. I am really looking forward to more generalized tool usage (and tool building) as part of the generic UIs for consumers. Right now it's a bit dev-only but I am not sure why. Installing an MCP is really like installing an app for an LLM.
0
0
2
@rom1504
Romain Beaumont
4 months
It can for example use which allows controlling mineflayer bots from an LLM.
1
0
2
@rom1504
Romain Beaumont
4 months
So I was curious to try Cursor and I built this generic MCP chat client in a few hours with it. Now it can use a few example MCP servers in the repo (Cursor can one-shot create new ones on demand), but also any of thousands of existing MCP servers.
2
0
3
@rom1504
Romain Beaumont
4 months
Great speech. I'm really curious how we build this next generation of datasets by making these much more dynamic data engines. Not a large dataset you collect and dump once, but rather an infinite data source that the adaption engine pulls from based on needs.
@DrJimFan
Jim Fan
4 months
The Physical Turing Test: your house is a complete mess after a Sunday hackathon. On Monday night, you come home to an immaculate living room and a candlelight dinner. And you couldn't tell whether a human or a machine had been there. Deceptively simple, insanely hard. It is the
0
0
1
@rom1504
Romain Beaumont
4 months
Yes! Apps and UIs on demand, even for the need of a single person and a single use, are the (short-term) future. App stores will become prompt stores and will be a lot more composable. Download and connect your favorite tools and prompts and solve your problems in minutes.
@karpathy
Andrej Karpathy
4 months
"Chatting" with LLM feels like using an 80s computer terminal. The GUI hasn't been invented, yet but imo some properties of it can start to be predicted. 1 it will be visual (like GUIs of the past) because vision (pictures, charts, animations, not so much reading) is the 10-lane
0
0
0
@rom1504
Romain Beaumont
4 months
I was curious about the Xi'an dialect today. I think it's a bit sad these dialects do not get taught to the newer generation. So I found a dictionary of the dialect, freely available on the Internet Archive, and I had ChatGPT write me a script to use the OpenAI Batch API (which works pretty…
0
0
0
@rom1504
Romain Beaumont
4 months
Well, it worked after sending it the error like 4 times, but this seems such a basic use case that I am surprised it's not fixed in post-training by default.
0
0
0
@rom1504
Romain Beaumont
4 months
I think it's so surprising that ChatGPT still cannot use the OpenAI API. It tries to use some old version of the API and then gets it wrong even when I send it the whole readme containing the doc. Seems such a basic use case to ask ChatGPT: ok you can't do that through the UI then just use…
1
0
2
@rom1504
Romain Beaumont
4 months
Well written essay on AI native apps. I especially like the part explaining that we used to have only 2 choices: 1. Build the app 2. Use an existing app. Now there's a third choice which is to let the computer build one app for one user.
@koomen
Pete Koomen
4 months
I wrote an essay about Gmail’s useless email-writing AI assistant: “AI Horseless Carriages”. link and TLDR in thread
1
0
1
@rom1504
Romain Beaumont
4 months
This kind of heavy tool usage is also a start into the "one use only" app direction. LLMs will generate apps based on user needs, even if that need happens only a single time for a single person.
0
0
0
@rom1504
Romain Beaumont
4 months
Really impressive overall, tool usage + thinking loop seems very promising. Conversation at
1
0
0
@rom1504
Romain Beaumont
4 months
"The emergency staircase is on the right-hand side. In the event of fire or earthquake, please descend the stairs calmly. Wheelchair users, please contact the building office. Tel. 03-XXXX-XXXX". That looks correct to me!
1
0
0
@rom1504
Romain Beaumont
4 months
So I told it to try and write another script focusing on Japanese braille and finally it cracked it and then translated it to English.
1
0
0
@rom1504
Romain Beaumont
4 months
It decided to go ahead and crop the image to understand better what's happening, then write some Python code with OpenCV to analyze it. It had assumed English braille (which is different from Japanese braille), which crashed its script.
1
0
0