I have mad respect for Karpathy. But RL agents will not find exploits in physics that give us infinite energy, or anything like that.
As somebody who knows a thing or two about both AI and physics, I am quite certain of this. The so-called standard model of particle physics…
Banning GPU sales to China will slow them down in AI for a few years, but will push them to develop their own GPUs faster. Ironically this political move threatens NVIDIA's global dominance by encouraging a well-funded competitor to invest urgently.
Waaat? You can find eigenvectors of a Hermitian matrix from the eigenvalues of it and its submatrices? New linear algebra fact discovered by physicists researching neutrinos
I didn't quite believe it so I tried it myself: It works.
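For anyone else who wants to try it: here's a minimal numpy sketch of the identity (strictly, it gives the squared magnitude of eigenvector component j, from the eigenvalues of A and of A with row/column j deleted):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                      # random real symmetric (Hermitian) matrix

lam, V = np.linalg.eigh(A)             # eigenvalues ascending; columns are eigenvectors
i, j = 1, 2                            # check component j of eigenvector i
M = np.delete(np.delete(A, j, axis=0), j, axis=1)   # submatrix: delete row/col j
mu = np.linalg.eigvalsh(M)             # eigenvalues of the submatrix

lhs = abs(V[j, i]) ** 2 * np.prod([lam[i] - lam[k] for k in range(n) if k != i])
rhs = np.prod(lam[i] - mu)
print(np.isclose(lhs, rhs))  # True
```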
Deep Learning replaced linear models because it automated feature engineering by trying thousands of possible nonlinearities and learning which ones worked. Similarly, transformers are replacing task-specific NN architectures by learning how to combine the input signals. 1/
This is pretty cool - AWS announces a public quantum computing (QC) platform named after my grandfather's notation for wave functions. Great that even in its infancy, QC is already getting democratized, not centralized in the hands of a few powerful research groups.
Stochastic Weight Averaging (SWA) is totally magic! It takes me days of training to push my validation score from 2.40 down to 2.30, but just taking the mean of a bunch of checkpoints I had lying around gets me down to 2.07. Thanks
@andrewgwils
and team for figuring this stuff out.
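The core of the trick is just parameter averaging. A toy sketch with hypothetical numpy "checkpoints" (real SWA averages snapshots along an SGD trajectory with a suitable learning-rate schedule):

```python
import numpy as np

def average_checkpoints(state_dicts):
    """Average each parameter across checkpoints -- the heart of SWA."""
    return {name: np.mean([sd[name] for sd in state_dicts], axis=0)
            for name in state_dicts[0]}

# Hypothetical checkpoints saved along one training run:
ckpts = [{"w": np.array([1.0, 2.0])},
         {"w": np.array([3.0, 4.0])},
         {"w": np.array([5.0, 6.0])}]
swa = average_checkpoints(ckpts)
print(swa["w"])  # [3. 4.]
```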
Nice video interview of my grandpa. Content rambles between symmetry, gravity, time -- the deep physics questions of the late 20th century. Not scientifically important, but I think the best video recording of him I've seen.
My thoughts on Google’s
#QuantumSupremacy
claim, which I haven't seen elsewhere. Even though my last name is Dirac, and my grandpa discovered lots of quantum theory, I’m no expert in QC. But I did get A’s in my quantum classes, and understand numerical computing very well. 1/8
Most ML hyper-parameters are naturally log-scaled. If you're doing an automated configuration sweep, don't let your tools linearly search over things like learning rate, weight decay, or embedding dimension.
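A minimal sketch of what log-scaled search means (function name is illustrative):

```python
import numpy as np

def log_uniform(low, high, size, seed=0):
    """Sample uniformly in log space -- the natural prior for scale-like
    hyperparameters such as learning rate or weight decay."""
    rng = np.random.default_rng(seed)
    return np.exp(rng.uniform(np.log(low), np.log(high), size))

lrs = log_uniform(1e-5, 1e-1, size=1000)
# Each decade [1e-5,1e-4), [1e-4,1e-3), ... gets roughly 25% of the samples,
# whereas a linear sweep over [1e-5, 1e-1] would put almost all of them above 1e-4.
```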
Trying to speed up your python programs? Tired of writing "print(time.time() - start_time)"? Check out "timebudget", a new tool I built for very simple profiling in python. Inspired by tqdm's simplicity. Literally just a few lines of code.
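The core idea fits in a context manager. This is just the shape of it, not the library's actual implementation:

```python
import time
from contextlib import contextmanager

@contextmanager
def timebudget(name):
    """Print elapsed wall-clock time for a block of code."""
    start = time.monotonic()
    try:
        yield
    finally:
        print(f"{name} took {(time.monotonic() - start) * 1000:.1f}ms")

with timebudget("napping"):
    time.sleep(0.05)   # prints something like "napping took 50.3ms"
```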
This is currently my favorite quick introduction to the standard model. Does a great job of separating the parts of physics that affect our reality from all the details we understand but that have no bearing on anyone's life except physicists'.
When building Amazon Machine Learning in 2013, I was the only deep learner on the team. I remember in an early planning meeting sharing that I wanted to be able to check my training jobs from my phone. Never happened. Now that I'm using
@weights_biases
I finally can.
When will we see NN chips built specifically for Transformers? (I'm often asked.) With NVIDIA, I'd say we're already there. Look at benchmarks for the new H100 chip - the task it is best at is a Transformer. Seems they're already optimizing their silicon for Transformers. 1/
People still love this talk I gave on how Transformers work and the history of NLP that led to them. Years later people keep finding and watching it. Makes me think I should record more.
LSTM is dead. Long Live Transformers
This is one of the best talks explaining the downsides of Recurrent Networks and diving deep into the Transformer architecture.
Transformer models have dramatically changed NLP in recent years, outperforming previous techniques like LSTM in almost every way. An exception has been that they don’t scale well to large documents because they cost O(N^2) in document length. This paper offers an O(N) solution.
Thrilled to share our new work! "Linformer: Self-attention with Linear Complexity".
We show that self-attention is low rank, and introduce a linear-time transformer that performs on par with traditional transformers.
Check it out here:
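A toy numpy sketch of the idea (shapes and projection matrices here are illustrative, not the paper's code): project the length-n key/value sequences down to k ≪ n before attention, so the score matrix is n×k instead of n×n.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(Q, K, V, E, F):
    """Linformer-style attention: compress n keys/values to k << n,
    so cost is O(n*k) rather than O(n^2)."""
    K_proj = E @ K                                  # (k, d)
    V_proj = F @ V                                  # (k, d)
    scores = Q @ K_proj.T / np.sqrt(Q.shape[-1])    # (n, k) score matrix
    return softmax(scores) @ V_proj                 # (n, d)

n, d, k = 16, 8, 4
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
E, F = (rng.standard_normal((k, n)) / np.sqrt(n) for _ in range(2))
out = linformer_attention(Q, K, V, E, F)
print(out.shape)  # (16, 8)
```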
There are so many good reasons to use on-prem GPU workstations for AI instead of putting everything in the cloud. Cloud is great, of course. But local workstation hardware can be not only cheaper, but also more productive than shared cloud resources.
Cool new AutoML from AWS: SageMaker Autopilot. Unlike other AutoML black boxes which just give you a model, this also gives you working code that you can learn from, adapt, and customize. Great to set baselines and get started. (Disclosure: this was my baby before I left Amazon.)
Previously anybody could buy GPUs, so it was a commercial matter for Chinese companies to compete on, who would need to justify investment for possible return. Now it's the CCP's problem, and they have a massive firehose of money suddenly motivated to fix the problem.
It's typically an order of magnitude more work to write code that's generic for lots of purposes than to write one-off code that just does what you're trying to do right now.
I was recently asked if I'd prefer to use
@PyTorch
or
@TensorFlow
for a small task, with indication they'd prefer TensorFlow. I invoked this tweet to support my choice.
Self-supervised learning (SSL) can seem confusingly magical - without labels, how could it learn a semantically useful representation? It learns from the input data's distribution, plus the augmentations that change the input but which you insist must not change the representation.
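A toy sketch of that invariance objective (the encoder and augmentation here are hypothetical stand-ins, not any particular SSL method):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))                      # stand-in "encoder" weights
encode = lambda x: np.tanh(W @ x)
augment = lambda x: x + 0.05 * rng.standard_normal(x.shape)  # small perturbation

x = rng.standard_normal(8)
z1, z2 = encode(augment(x)), encode(augment(x))      # two views of the same input
# Invariance loss: minimizing this trains the encoder to ignore the
# augmentation while still depending on the input's distribution.
loss = np.sum((z1 - z2) ** 2)
```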
In this sense, every spectrometer in every chemistry lab could be considered a quantum computer that has achieved
#QuantumSupremacy
for decades. But only if you include very specific computational problems that are intrinsically quantum in nature. 8/8
I built this “electronic funhouse mirror” for Halloween to turn people’s faces into animals and monsters using modern deep learning methods in real time. Thanks to great tools from
@nvidia
and
@PyTorch
I could do this in a weekend on my laptop!
I've said it before and I'll say it again: SWA is like magic. I think for many deep learning practitioners this will be the fastest, easiest way to improve your model quality with almost trivial code changes.
Stochastic Weight Averaging (SWA) is a simple procedure that improves generalization in deep learning over Stochastic Gradient Descent (SGD). PyTorch 1.6 now includes SWA natively. Learn more from
@Pavel_Izmailov
,
@andrewgwils
and Vincent:
So it does seem fairly inevitable to me that Transformers will effectively take over deep learning, ML, and AI in coming years. Even if they're not optimal for every task, their generality will lead to a useful standardization of tools, algorithms, and even hardware. 6/
"AI" in 2020 := Any algorithm that is so complex it defies explanation or interpretation, and is capable of both impressive results and embarrassing failures.
Instead of manually engineering the inductive bias by carefully picking which neurons to connect, and which weights to re-use, Transformers connect all the neurons in every combination, and learn which connections to actually use through attention mechanisms. 2/
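For concreteness, this is plain scaled dot-product attention in numpy; the softmax weights are the learned, input-dependent "which connections to actually use":

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every position attends to every
    position, and the softmax weights pick which connections matter."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
n, d = 5, 8
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out, w = attention(Q, K, V)
# Each row of w sums to 1: a learned, input-dependent connection pattern.
```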
Playing with
@PyTorch
JIT compiler. Very cool stuff that lets you write your code within standard python, and later compile it to TorchScript for production inference or embedded use. I wrote a simple function decorator to make it easier to try:
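The rough shape of such a decorator (a hypothetical sketch, not the actual code) might be:

```python
def maybe_jit(fn):
    """Hypothetical sketch: compile a function with TorchScript when torch
    is available, otherwise fall back to eager python unchanged."""
    try:
        import torch
        return torch.jit.script(fn)
    except Exception:   # no torch (or scripting failed): run eagerly
        return fn

@maybe_jit
def add_one(x: int) -> int:   # type annotations help TorchScript
    return x + 1
```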
My company Groundlight is coming out of stealth today, offering Computer Vision powered by Natural Language for industrial and commercial applications, integrated from edge to cloud to real-time human monitoring.
In this way Transformers perform something a lot like NAS (Neural Architecture Search) but within a single simple SGD process instead of complex inner and outer optimization loops needing RL techniques or surrogate models. 5/
My kids' room has glow-in-the-dark stars on the ceiling, accurately showing Orion and Taurus. About a year ago during some horseplay, Betelgeuse got knocked down. Now it seems this accident might just turn out to be prophetic.
@hardmaru
Open source licenses. Newer versions of bash use GPLv3 which is a pretty business hostile license. Apple was stuck on a very old version of bash that had GPLv2. Licenses matter.
I wish GitHub issues were more like StackOverflow questions. I see at the top it's closed, but I have to sift through pages and pages of comments to figure out why it was closed. Was it ever fixed? Did the team say they'd never fix it? Was it merged? Auto-closed?
Postgres is adding the ability to query for similar vectors with kNN. All those years at AWS lobbying for a kNN service, seeing so many groups with huge databases of embeddings struggle to deploy them in production.
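The operation itself is simple; the hard part is doing it at scale inside a database. A brute-force numpy sketch of what a kNN vector query computes (names and data here are illustrative):

```python
import numpy as np

def knn(query, vectors, k):
    """Brute-force k-nearest-neighbors by cosine similarity."""
    vn = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    qn = query / np.linalg.norm(query)
    sims = vn @ qn
    return np.argsort(-sims)[:k]          # indices of the k most similar rows

rng = np.random.default_rng(0)
db = rng.standard_normal((100, 16))       # hypothetical embedding table
q = db[42] + 0.01 * rng.standard_normal(16)   # near-duplicate of row 42
print(knn(q, db, k=3)[0])  # 42
```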
However everybody should be fully aware and honest about the fact that QC has exactly zero practical applications today. Top QC researchers are still trying to find anything that today's QCs can do that's actually useful. You wouldn't know this by reading
For me, Google’s
#QuantumSupremacy
claim is similarly unimpressive. They carefully designed a problem that could be solved much faster on QC than on a classical computer. But it’s not an interesting problem that anybody would want to solve in any other setting. 5/8
Many hyperparameter optimization (HPO) runs yield negligible gains. Too often these "gains" are just a lucky sample from the training noise. BUT some big steps can also be seen as HPO. VGG was AlexNet with HPO. EfficientNet largely too. Scaling up often requires careful HPO.
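A quick way to feel the noise problem: simulate trials that are identical except for training randomness, and look at the "best" one.

```python
import numpy as np

# 50 "HPO trials" of the exact same config; only the training noise
# differs. The best trial still looks like a real improvement.
rng = np.random.default_rng(0)
true_skill, noise = 0.900, 0.010
scores = true_skill + noise * rng.standard_normal(50)
print(f"best 'gain': {scores.max() - true_skill:.3f}")  # pure luck
```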
Transformers can learn to apply something a lot like a convolution when needed, or like a recurrent connection if only the previous input is useful to process the next. But critically they can learn much more complex relationships between inputs. 3/
As with linear models -> NNs, with transformers we again have a step up in computational complexity in exchange for less problem-specific analysis, which is almost certainly sub-optimal from a model quality perspective. 4/
Overheard, my 5yo talking to herself: "Live from NPR news, I'm Jack Speer. President Trump said Blah blah-blah blah BLAH." My little self-supervised learner seems to be overfitting.
Reading RL papers with amazing results from
@OpenAI
and
@DeepMindAI
I'm struck by a stylistic difference. One spends most of the paper proving how awesome their result is compared to other techniques. The other goes into great detail explaining how and why their techniques work.
I don't see a fundamental problem using massive compute to advance AI - research means pushing what's possible with today's tech to inform tomorrow. But I fully support the proposal by
@etzioni
and others to publish compute cost and efficiency in papers.
Impressive AutoML paper from my old team at Amazon: An elegant solution to the critical real-world problem in hyperparameter tuning of picking your search ranges. A simple data-driven approach works surprisingly well. Hope this gets into SageMaker soon!
I think it's awesomely hilarious when people assume the lab workspace behind me is a zoom background. Then I walk back into it and play with the robot.
Automatic Domain Randomization (ADR) is one of the key innovations enabling
@OpenAI
's recent impressive Rubik's cube result. Having absorbed the whole paper today (great use of a day!) I'll summarize the ADR algorithm here
#MLTLDR
1/7
I designed SageMaker Autopilot to be AutoML for all skill levels. New users can build a decent model, and look at the generated code for data prep and training to learn good practices. Hand it to expert coworkers for an easy baseline to build upon.
If I wasn't clear, I'm a huge fan of the
@huggingface
transformer package. NLP problems that took dozens of engineer/scientist years of effort just a few years back are now straightforward for motivated individuals. Total sea-change.
Today I ran to every Dick’s Drive-In in Seattle, eating at every one.
#BurgerRally2020
with
@rachelbeda
and others. 5 stops, 18 miles, three milkshakes, three fries, three deluxe and a special. (Minus beef.) Still kinda hungry.
It's very satisfying to finish a careful hyperparameter search and realize that the configuration you'd already hand selected after a bit of experimenting is, in fact, just about optimal.
My mind is kinda blown. Did you know you can just run python+numpy code directly in a browser? On my mac it's 15x faster than plain javascript (but still 40x slower than native CPU). I'm becoming convinced WASM is an important trend.
In 2014, a bot beat the Turing Test by pretending to be a non-native-English speaking teenager who understandably avoided lots of questions. They won, but who cares? We're still a long way from AI capable of convincing conversation. 4/8
@_brohrer_
It really doesn’t help to leave out sensitive features like race or gender. These are highly correlated with many useful behavioral features. And you can’t leave out everything they’re correlated with - there would be nothing left sometimes.
Now that we understand deep learning reasonably well, it means any process or system that can be differentiated through can be directly optimized with neural networks. Making a differentiable physics simulator is a difficult task, but would enable some amazing things.
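A toy illustration of the principle (this is a closed-form "simulator", not a real physics engine): if the simulation is differentiable, gradient methods can optimize directly through it.

```python
import numpy as np

g = 9.8
def sim_range(theta, v=10.0):
    """Toy differentiable simulator: ballistic range vs launch angle."""
    return v ** 2 * np.sin(2 * theta) / g

def d_range(theta, v=10.0):
    """Hand-written derivative (an autodiff system would give this for free)."""
    return 2 * v ** 2 * np.cos(2 * theta) / g

theta = 0.2
for _ in range(200):
    theta += 0.01 * d_range(theta)   # gradient ascent on range
# Converges to the textbook optimum of 45 degrees (pi/4).
print(round(theta, 3))  # 0.785
```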
Breakfast options for the kindergartner: “Would you like corn spheres, oat toroids, wheat manifolds, or corn matrices?”
“Corn matrices please.”
#geekparenting
By analogy, it's possible in theory, but computationally intractable, to discover through quantum simulation that water molecules “vibrate” at 2.45 GHz. But this result is very easy to obtain in a physical lab by measuring the interacting wave functions of actual water. 7/8
I have a feeling this technique could become standard for all computer vision in coming years. Big claim I know. But this seems to elegantly solve a fundamental problem with CNNs.
Neural Networks Are Not Robust Enough: they exploit the association between local patches (e.g., background) and the label. Our
#NeurIPS2019
paper w.
@zacharylipton
fights against this tendency.
Camera-ready:
It also introduced ImageNet-Sketch dataset.
Google's problem involves simulating interacting quantum wave functions. Computational chemists know these calculations are notoriously hard. But ironically (or obviously) physical systems perform these “calculations” almost instantaneously in the real world. 6/8
Classic Google: "Cloud Print needs some engineers to do some boring work to keep it running.
Bueller?
Let's just cancel it. The world doesn't need our expertise for this anymore."
Contrast Amazon's customer obsession against Google's employee obsession.
One of my friends that I used to go to this thing in the desert with became fascinated with a little town in the desert we drove through, got to know some people there, wrote a book about them, then that book became a movie, and people voted it the best movie of the year. Wow.
Yesterday I ran my first marathon. Surprised at how fast I ran - under 4 hours including a wrong turn (extra 1/3 mile) and a couch break for some cake and to snuggle the Blerch. Thanks
@Oatmeal
for organizing a super fun event! Finished 10th out of 131.
I wonder if any countries other than China will be able to contain their covid-19 outbreaks. I wouldn't be surprised if it's only been possible there because of a population that truly believes in collective responsibility, coupled with a very decisive government.
I dropped my phone in the ocean today while SUP’ing. I’m about to go see a friend for dessert. Without a phone. Leaving the house without Internet. First time in … years. Slightly terrified. Send me strength. I can do it.
So what comes next after feature engineering and then NN architecture engineering? I’m guessing “sequence engineering” where we try to figure out the right way to phrase our tasks as questions that Transformers can answer. 7/
I realized how silly it was to have a space heater under my desk in the garage when I have a rack of GPUs in the other corner. Now the GPUs vent right at my desk. Mmmmm, toasty warm training jobs.
I gave a talk this week on methods for quantifying uncertainty in neural nets, and posted the notebooks I used to generate the example data. Fun to compare Deep Ensembles, MC Dropout and GP's on a toy regression problem.
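The Deep Ensembles idea in miniature (polynomial fits standing in for neural nets, bootstrap resampling standing in for random initializations; all names here are illustrative):

```python
import numpy as np

# Fit several models with different randomness, then read the spread of
# their predictions as uncertainty.
x = np.linspace(-1, 1, 20)
y = np.sin(3 * x) + 0.1 * np.random.default_rng(0).standard_normal(x.size)

preds = []
for seed in range(5):
    rng = np.random.default_rng(seed)
    idx = rng.choice(x.size, size=x.size)          # bootstrap resample
    coefs = np.polyfit(x[idx], y[idx], deg=5)
    preds.append(np.polyval(coefs, x))
preds = np.array(preds)

mean, std = preds.mean(axis=0), preds.std(axis=0)  # prediction + uncertainty band
```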
I see this as very similar to when the Turing Test was officially beaten in 2014. While the accomplishment met the original criterion, it missed the spirit of the goal. The Turing Test was a (THE?) key goalpost in AI for decades, but the actual victory seems meaningless now. 3/8
@dhuang26
True! But it's a far cry from exploiting buffer overflows in reality - this is the natural 2024 extension of many decades of fusion research. Dream big. But ground in realism for getting things done. AI won't discover magic, it can help us build things that will seem magical.
"Four years ago! Shocking."
"This aged like fine wine."
"I need that belt."
^^^ three most recent youtube comments on my 2019 video explaining transformers.
My hard-working GPU laptop sounds like it grew an angry floppy drive that's constantly seeking. Poor thing wasn't built to optimize neural nets 24/7. And the warranty just expired. So I'm grabbing the precision screwdrivers and going in! Wish me luck...
I have a number in [-1,1] range I want to stretch toward 0, so I'll raise it to a power like 2 or 3. I'll have to correct the sign, right? Let's check...
>>> -0.5**2
-0.25
Huh - I guess python has some math magic here. Cool.
NOPE!
>>> (-0.5)**2
0.25
Gets me every time. 😂
Finished installing solar panels on our VW camper van (named Beethoven). People often ask if it can drive on solar power. It can't. But if it could, it would probably take about 2 hours of full sun to get enough charge to drive a single mile. 😂
HR tip: Don't schedule interviews for the day before a holiday. Four years ago I interviewed at OpenAI, on the Wed before Thanksgiving. I thought all the interviews went well except the last one of the day, where the interviewers seemed just annoyed from the start. Go figure.
One-off code is especially important in data science, where you often face questions like "does this technique help or even work for us?" The answer is often "No" and the less code you can write to reach that answer the better.
I am super impressed with University of Washington's COVID testing. I developed some symptoms, so yesterday morning I called them at 8am. Got a test appointment for 2 hrs later, and (neg) results posted by 10pm that night. 14hrs from calling for a full PCR test result. So good!
Chewing on this paper showing that a single biological neuron needs a temporal convolutional network (TCN) with at least 5 layers to replicate its behavior. RNNs seem like a much better choice than TCNs - harder to train, but much more biologically plausible.
I love the sentiment here to try to avoid algorithmic bias. But the scientific advice is actually bad and misleading. Excluding sensitive features like gender from your inputs really doesn’t help. Better to identify them and force algorithmic fairness.
LoRA is the fine-tuning algorithm I always wished I had when I was building models in the early days. It's easy to ensure you don't catastrophically forget the original data, and gives you a simple knob to say how important your fine-tuned dataset is vs the original.
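The math behind that knob, in a numpy sketch (shapes and names are illustrative): the frozen pretrained weight W gets a low-rank update (alpha/r)·BA, and alpha scales how much the fine-tuned data matters.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """LoRA sketch: frozen weight W plus low-rank update (alpha/r) * B @ A."""
    r = A.shape[0]
    return x @ (W + (alpha / r) * (B @ A)).T

d_in, d_out, r = 8, 4, 2
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01     # trainable low-rank factor
B = np.zeros((d_out, r))                      # B=0, so fine-tuning starts exactly at W

x = rng.standard_normal(d_in)
# With B zero, the adapted model matches the original pretrained model:
assert np.allclose(lora_forward(x, W, A, B, alpha=16), x @ W.T)
```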