
Brad Hilton
@bradthilton
Followers: 868
Following: 90K
Media: 312
Statuses: 3K
Reinforcement Learning Research Engineer • Sometimes Political Commentator • Husband and Father • Believer in Jesus Christ
Orem, UT
Joined February 2013
looks like there's a sweet spot for planning.
Learning When to Plan. LLM agents trained with dynamic planning learn when to spend test-time compute, balancing cost & performance. This is the first work to explore training LLM agents for dynamic test-time compute allocation in sequential decision-making tasks.
0
0
2
this would be really entertaining.
Enough with the backroom scheming. If @realDonaldTrump is serious about intervening in the mayoral race, he should come to New York City and debate me directly.
0
0
1
is this mostly due to distillation of claude models by chinese labs?
Anthropic banned Claude in certain regions, explicitly labeling China as an “adversarial nation” in yesterday’s blog. Many Chinese people say they’re unsubscribing and switching from Claude Code to OpenAI Codex. What did Dario see during his 1 year at Baidu?
0
0
1
RT @Yampeleg: 𝗦𝘁𝗲𝗽 𝟭: Buy a sh*tload of GPUs. 𝗦𝘁𝗲𝗽 𝟮: 𝚠𝚑𝚒𝚕𝚎 𝚃𝚛𝚞𝚎: θᵢ ← θᵢ − lr · (∂L / ∂θᵢ). 𝗦𝘁𝗲𝗽 𝟯: Profit.
0
3
0
RT @corbtt: Ok, some big news that I've been sitting on for a minute: @openpipeai is getting acquired by @coreweave!
0
15
0
RT @l2k: CoreWeave is buying @OpenPipeAI - I'm a big fan of the @corbtt and the ART RL library and excited to work together!
0
6
0
burnout comes from working on something you don’t believe in.
If you tell your friends you're burned out, they'll always prescribe the same thing: "You need to take a vacation". But it never works. Burnout is completely misunderstood. This new article by @bscholl is the best explanation I've seen.
1
0
6
agree with the shorthand observation. the writing is a bit too dense. o3 was easier to read.
I find GPT5 "thinking" to be much less clear than o3 was. Responses are full of jargon, shorthand, much harder to use and interpret. It's like talking to someone who wants to show off his expertise but doesn't really want to help. Is it just me?
0
0
1
every startup should do this from series a on.
In our Series C round at @linear, we gave all current and former teammates the opportunity to sell a portion of their vested options. From the start, we’ve aimed to make Linear’s equity program as employee-friendly as possible. Now including path to liquidity.
2
0
3
RT @dvdcrbt: Small update to MCP•RL: you can now automatically generate training scenarios through the art.mcp package.
0
2
0
RT @corbtt: Really cool post by a community member sharing some of their results running RL with ART!
0
6
0
RT @mattshumer_: OpenPipe is the singular player bringing RL to regular devs. They make it as simple as hell to build products that learn…
0
5
0
RT @corbtt: Super excited to announce the official integration between ART and LangGraph! You can now easily train your LangGraph agents w…
0
40
0
making the agent reinforcement trainer more accessible to langgraph users with this latest integration. amazing work by andie jones from openpipe!
Super excited to announce the official integration between ART and LangGraph! You can now easily train your LangGraph agents with reinforcement learning, automatically improving reasoning, tool use, and adaptability. More info below:
0
0
4