Comet
@Cometml
Followers: 15K · Following: 2K · Media: 802 · Statuses: 3K
Comet provides an end-to-end model evaluation platform for AI developers, with best-in-class LLM evaluations, experiment tracking, and production monitoring.
New York, NY
Joined October 2017
For many teams, hallucinations and security concerns are top of mind when building agents. On Nov 20, join Sarah Ostermeier + the @awscloud team to learn practical ways to build reliable agents, then watch them build a customer support agent live. 🔗 https://t.co/OGvG6ohR6Y
0
0
1
Opik just surpassed 15,000 stars on GitHub⭐ Guess we're not stopping anytime soon. 💻 https://t.co/yJqXZpp71M
3
4
11
I've been working on agent optimization for real-world prompts (the prompt is ~10k tokens), and our new algorithm is already up 17%! Seeing some interesting differences between benchmarks and real-world performance; more to come soon.
1
3
5
Do people really try to one-shot features with Claude Code? I shipped Dark Mode for Opik in less than a day, but it took no fewer than 3 iterations to get to something that was ready to be merged. A thread on how I use Claude Code 🧵
2
2
5
Dark mode is here — easy to toggle and perfect for those late-night debugging sessions 🌙 Because great tools adapt to your preferences, not the other way around.
1
1
3
Had a great convo with Gideon Mendels @Cometml CEO
Episode 3 of Cloudbreak is out! Tune in for a great conversation on all things AI w/ Gideon Mendels, co-founder and CEO of @Cometml and @yuvaln of Trilogy. YouTube: https://t.co/NLxCCutASR Or watch/listen on @Spotify, @ApplePodcasts and @amazonmusic!
0
1
2
No context switching or manual digging. OpikAssist provides actionable insights that turn trace data into concrete improvements. Right where you’re already working.
0
0
1
🚀 Meet OpikAssist: our new AI-powered trace inspector built into Opik that helps you: ✅ Analyze complex LLM conversations ✅ Spot where things go wrong ✅ Get actionable suggestions
1
0
1
🚨 LLM monitoring alone isn't enough. True observability means being able to act on what you see and actually improve your LLM system.
1
0
2
Our R&D team just wrapped up an incredible week in Rome 🇮🇹 Remote-first doesn't have to mean distant when you are intentional about reconnecting as a team. Here are some highlights of their week together!
1
0
5
Everyone’s building GenAI apps. Few are evaluating them well. On Sept 24, Claire Longo is running a live workshop on: → feedback loops for conversational agents → logging traces → LLM-as-a-judge metrics Definitely one you’ll want to check out 🔗 https://t.co/hR2L9GC0YH
0
1
6
New Cometeers joined us this summer across the US, UK, and Greece ☀️ Beyond their impressive professional experience, they bring diverse passions: board games, golf, hiking, and professional soccer. Welcome aboard! 👋
0
0
3
Our video with the Google Partners team is live 🎉 We're proud to be part of the ISV Startup Springboard program, officially available as a marketplace offering deployed on GCP. Big thanks to @googlecloud + our team at Comet for making this happen 🙌
2
3
8
Today, we're building a CodeArena, where you can compare any two code-gen models side-by-side. Tech stack: - @LiteLLM for orchestration - @Cometml's Opik to build the eval pipeline - @OpenRouterAI to access cutting-edge models - @LightningAI for hosting CodeArena Let's go! 🚀
1
5
29
The line between pretraining and fine-tuning is increasingly blurry, making “training” harder to define. In this piece, @anmorgan24 unpacks how shifting methods and terms complicate LLM behavior—and why pretraining remains key to scaling models responsibly.
1
0
5
Before we dive in, here's a quick demo of what we're building! Tech stack: - @LiteLLM for orchestration - @Cometml's Opik to build the eval pipeline (open-source) - @OpenRouterAI to access the models You'll also learn about G-Eval & building custom eval metrics. Let's go! 🚀
2
5
78
AI coding tools are changing how we build 💡 @StatInStilettos built an AI app from scratch using Cursor — and shared a full breakdown of what worked, what didn’t, and why she thinks there’s a better alternative to “vibe coding.” Read the full breakdown 👇
comet.com
A seasoned software dev shows you how to get the most out of AI-assisted code tools like Cursor and avoid the pitfalls of pure vibe coding.
0
1
2
🧠 Build autonomous AI agents that think, remember, and act. @akshay_pachaar’s new crash course walks through: ✅ Tool integration via MCP servers ✅ Memory with Zep’s Graphiti ✅ Tracing and observability with Comet’s Opik All orchestrated with CrewAI — and fully
1
11
40
A Crash Course on Building AI Agents! Here's what it covers: - What is an AI agent - Connecting agents to tools - Overview of MCP - Replacing tools with MCP servers - Setting up observability and tracing All with 100% open-source tools!
35
179
936