Atla
@Atla_AI
Followers: 409 · Following: 147 · Media: 42 · Statuses: 117

Find and fix AI agent failures. Backed by Y Combinator, Creandum, and the founders of Reddit, Cruise, etc.

Joined July 2024
Atla @Atla_AI · 5 days
🌍 Trusted by patent attorneys in 15+ countries, ClaimWise is moving faster than ever, spotting agent failure modes in days instead of weeks with Atla. "[Atla's] real-time data crunching and presentation…"
1 · 0 · 4
Atla @Atla_AI · 11 days
👀
Toby Drane @toby_drane · 11 days
New instrumentation library, who dis 👀 We're building out the ability for those who build their agents in TypeScript to use the Atla Insights platform!
0 · 0 · 2
Atla @Atla_AI · 13 days
RT @AgnoAgi: New integration alert! @AgnoAgi's production-ready multi-agent orchestration with the @Atla_AI sophisticated agent monitorin…
0 · 6 · 0
Atla @Atla_AI · 14 days
🚀 Monitor and optimize a multi-agent workflow in this interactive demo. Among teams using our platform that build with frameworks, one has stood out as the most popular: @AgnoAgi. We built a demo to show the combo →
1 · 7 · 13
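As a rough illustration of the combo above, here is a minimal sketch of an Agno two-agent workflow wired into Atla Insights monitoring. The import paths and names used (agno.agent.Agent, agno.team.Team, OpenAIChat, atla_insights.configure, instrument_agno) are assumptions about the two SDKs' public APIs rather than verified usage, so consult the linked demo for the real setup.

```python
# Illustrative sketch only: the imports and function names below are assumptions
# about the Agno and Atla Insights SDKs and may differ from the released packages.
import os

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.team import Team
from atla_insights import configure, instrument_agno

# Point the instrumentation at the Atla Insights platform (token assumed to come
# from the dashboard and live in an environment variable).
configure(token=os.environ["ATLA_INSIGHTS_TOKEN"])

researcher = Agent(
    name="Researcher",
    model=OpenAIChat(id="gpt-4o-mini"),
    instructions="Collect facts relevant to the user's question.",
)
writer = Agent(
    name="Writer",
    model=OpenAIChat(id="gpt-4o-mini"),
    instructions="Turn the researcher's notes into a concise answer.",
)
team = Team(members=[researcher, writer], model=OpenAIChat(id="gpt-4o-mini"))

# Everything run inside the instrumented context is traced step by step,
# so each agent's calls can be inspected for failure modes afterwards.
with instrument_agno("openai"):
    team.print_response("Summarise this week's agent-evaluation news.")
```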
Atla @Atla_AI · 1 month
🌕 Start evaluating with Selene 1:
🌔 Selene 1 (70B):
🌖 Selene 1 quantized:
[Link card: huggingface.co]
0 · 0 · 3
Atla @Atla_AI · 1 month
Big news: we're open sourcing Selene 1! We've been inspired by @huggingface's new native @vllm_project support for inference endpoints. We're celebrating by making Selene, our 70B LLM Judge model, available to the wider community. Links 🧵
1 · 2 · 6
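A minimal sketch of what "start evaluating with Selene" can look like locally, serving the open-sourced judge with vLLM. The Hugging Face repo id below is an assumption (check the links tweet above for the exact 70B and quantized model names), and 70B weights need several GPUs.

```python
# Sketch: running an open LLM Judge model locally with vLLM and scoring one response.
from vllm import LLM, SamplingParams

MODEL_ID = "AtlaAI/Selene-1-Llama-3.3-70B"  # assumed repo id; verify on huggingface.co

# tensor_parallel_size shards the 70B weights across GPUs.
judge = LLM(model=MODEL_ID, tensor_parallel_size=4)

prompt = """You are an expert evaluator.
Criteria: Is the response factually accurate and does it answer the question?

Question: What does GAIA benchmark?
Response: GAIA benchmarks general-purpose AI assistants on tasks that require
reasoning, web browsing, and tool use.

Give a short critique, then a score from 1 to 5 on the final line."""

outputs = judge.generate([prompt], SamplingParams(temperature=0.0, max_tokens=256))
print(outputs[0].outputs[0].text)  # critique followed by a 1-5 score
```

In practice you would also apply the model's chat template and its recommended judge prompt format rather than a raw string like the one above.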
Atla @Atla_AI · 2 months
Why do deep research agents fail? We ran Open Deep Research through GAIA, a benchmark that requires web browsing and reasoning. The biggest issues: planning errors and reasoning errors. So, what can we do about it? We break it down in our latest blog post.
1 · 1 · 2
Atla @Atla_AI · 2 months
Researcher @sashankpisupati showed off a research demo at @AITinkerers earlier this week: a human-in-the-loop evaluator that helps stop agents from taking questionable actions. The task now is getting AI evaluators to do the same. Lots of shared pain around agents and their…
Tweet media one
0 · 2 · 5
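A minimal sketch of the idea behind that demo: a gate that routes risky agent actions to a human reviewer before they execute. The tool names and approval flow are illustrative, not the research code shown at the event.

```python
# Sketch: human-in-the-loop gate that reviews an agent's proposed action before it runs.
from dataclasses import dataclass


@dataclass
class ProposedAction:
    tool: str
    arguments: dict


# Actions treated as risky enough to need a human decision (illustrative list).
RISKY_TOOLS = {"delete_file", "send_email", "execute_sql"}


def human_approves(action: ProposedAction) -> bool:
    """Ask a human reviewer to approve or reject a risky action."""
    answer = input(f"Agent wants to call {action.tool}({action.arguments}). Allow? [y/N] ")
    return answer.strip().lower() == "y"


def gated_execute(action: ProposedAction, execute) -> str:
    """Run the action only if it is low-risk or a human signs off on it."""
    if action.tool in RISKY_TOOLS and not human_approves(action):
        return f"Action {action.tool} blocked by reviewer."
    return execute(action)


# Example: the reviewer is asked before a destructive call goes through.
result = gated_execute(
    ProposedAction(tool="delete_file", arguments={"path": "report_draft.txt"}),
    execute=lambda a: f"{a.tool} executed with {a.arguments}",
)
print(result)
```

The research direction described in the tweet amounts to swapping human_approves for an AI evaluator that makes the same allow/block call.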
Atla @Atla_AI · 2 months
RT @AAyman_1302: Got a room of 60+ people from @cohere @Synthesia @11x_official and more for the second @_ai_collective demo night in Londo…
0 · 2 · 0
Atla @Atla_AI · 3 months
Building with AI agents? The framework you choose will lay the foundation for how your agents think and act. We broke down 9 top open-source frameworks, from LangGraph to Google ADK, to help you find the right fit (link in comments).
🤔 Structured vs flexible
🤔 Lightweight vs…
2 · 1 · 5
Atla @Atla_AI · 3 months
We are working to automate this improvement layer, with promising early findings. Stay tuned and follow for more 🫡
0 · 0 · 1
Atla @Atla_AI · 3 months
Most errors were terminal. Agents rarely recovered on their own. So how do we help agents recover? We tested an actor-critic loop:
The agent acts.
A critic evaluates each step.
The agent self-corrects.
With human critics, task success jumped 30%, reaching 80–90% accuracy. No…
Tweet media one
1 · 0 · 3
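A minimal sketch of the actor-critic loop described above, with the actor and critic as placeholder callables: the agent proposes a step, the critic returns a verdict plus feedback, and the agent retries with that feedback. This is an illustration, not Atla's implementation.

```python
# Sketch: one agent step with a critic that can trigger self-correction.
from typing import Callable, Tuple


def actor_critic_step(
    task: str,
    act: Callable[[str, str], str],                     # (task, feedback) -> candidate step
    critique: Callable[[str, str], Tuple[bool, str]],   # (task, step) -> (accepted?, feedback)
    max_retries: int = 3,
) -> str:
    """Run one step, letting the critic's feedback drive retries."""
    feedback = ""
    for _ in range(max_retries):
        step = act(task, feedback)
        ok, feedback = critique(task, step)
        if ok:
            return step  # critic accepted the step
    return step  # give up after max_retries; caller can mark the step terminal


# Toy example: the "critic" demands that the step loads the data before aggregating.
step = actor_critic_step(
    task="Compute average revenue per user from sales.csv",
    act=lambda task, fb: "df.groupby('user').revenue.mean()" if not fb
        else "df = pd.read_csv('sales.csv'); df.groupby('user').revenue.mean()",
    critique=lambda task, s: (("read_csv" in s), "Load sales.csv before aggregating."),
)
print(step)
```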
Atla @Atla_AI · 3 months
First, we diagnosed agent error types at the step level. In DA-Code, the most common error category was reasoning errors (incorrect logic, hallucinated information, and not following instructions). Want to understand why your agents fail? Get in touch:
Tweet media one
1 · 0 · 1
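A minimal sketch of the step-level diagnosis described above: each step in a trace gets an error label (here pre-labelled stand-ins for an LLM judge's output), and the labels are tallied into a per-category breakdown. The taxonomy mirrors the categories named in the thread; the judge itself is a placeholder.

```python
# Sketch: tagging trace steps with error categories and counting them per trajectory.
from collections import Counter
from enum import Enum


class StepError(Enum):
    NONE = "none"
    INCORRECT_LOGIC = "incorrect_logic"
    HALLUCINATED_INFO = "hallucinated_information"
    IGNORED_INSTRUCTIONS = "not_following_instructions"


def classify_step(step: dict) -> StepError:
    """Placeholder judge: in practice an LLM evaluator labels each trace step."""
    return StepError(step.get("label", "none"))


def error_breakdown(trace: list[dict]) -> Counter:
    """Count error categories across one agent trajectory, ignoring clean steps."""
    labels = (classify_step(step) for step in trace)
    return Counter(label for label in labels if label is not StepError.NONE)


# Toy trace with pre-labelled steps standing in for judge outputs.
trace = [
    {"action": "read file", "label": "none"},
    {"action": "join tables", "label": "incorrect_logic"},
    {"action": "report metric", "label": "hallucinated_information"},
]
print(error_breakdown(trace))
```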
Atla @Atla_AI · 3 months
Before we can fix broken AI agents, we have to know how they're breaking. We analyzed failures from DA-Code, a coding benchmark for data science tasks, to understand how agents break and what we can do about it. Here's what we found 🧵
1 · 2 · 8
Atla @Atla_AI · 3 months
AI agents are powerful, but still fail in unpredictable ways. Plus, debugging them is a mess. We just published a blog breaking down why agents still fail and what better observability could unlock. Check it out 🧵
Tweet media one
1 · 0 · 5