Comet
@Cometml
Followers
15K
Following
2K
Media
803
Statuses
3K
Comet provides an end-to-end model evaluation platform for AI developers, with best-in-class LLM evaluations, experiment tracking, and production monitoring.
New York, NY
Joined October 2017
Our CEO and Co-founder, Gideon Mendels, will take the stage to discuss the next frontier of AI -- agent optimization. Tune in here🔗 https://t.co/ENqglYJdMM
1
0
0
For many teams, hallucinations and security concerns are top of mind when building agents. On Nov 20, join Sarah Ostermeier + the @awscloud team to learn practical ways to build reliable agents, then watch them build a customer support agent live. 🔗 https://t.co/OGvG6ohR6Y
0
0
2
Opik just surpassed 15,000 stars on GitHub⭐ Guess we're not stopping anytime soon. 💻 https://t.co/yJqXZpp71M
3
4
11
I've been working on agent optimization for real-world prompts (prompt is ~10k tokens) and our new algorithm is already up 17%! Seeing some interesting differences between benchmarks and real-world performance, more to come soon
1
3
5
Do people really try to one-shot features with Claude Code? I shipped Dark Mode for Opik in less than a day but it took no fewer than 3 iterations before getting to something that was ready to be merged. A thread on how I use Claude Code 🧵
2
2
5
Dark mode is here — easy to toggle and perfect for those late-night debugging sessions 🌙 Because great tools adapt to your preferences, not the other way around.
1
1
3
Had a great convo with Gideon Mendels @Cometml CEO
Episode 3 of Cloudbreak is out! Tune in for a great conversation on all things AI w/ Gideon Mendels, co-founder and CEO of @Cometml and @yuvaln of Trilogy. YouTube: https://t.co/NLxCCutASR Or watch/listen on @Spotify, @ApplePodcasts and @amazonmusic!
0
1
2
No context switching or manual digging. OpikAssist provides actionable insights that turn trace data into concrete improvements. Right where you’re already working.
0
0
1
🚀 Meet OpikAssist: our new AI-powered trace inspector built into Opik that helps you: ✅ Analyze complex LLM conversations ✅ Spot where things go wrong ✅ Get actionable suggestions
1
0
1
🚨LLM monitoring alone isn't enough. True observability means being able to act on what you see and actually improve your LLM system.
1
0
2
Our R&D team just wrapped up an incredible week in Rome 🇮🇹 Remote-first doesn't have to mean distant when you are intentional about reconnecting as a team. Here are some highlights of their week together!
1
0
5
Everyone’s building GenAI apps. Few are evaluating them well. On Sept 24, Claire Longo is running a live workshop on: → feedback loops for conversational agents → logging traces → LLM-as-a-judge metrics Definitely one you’ll want to check out 🔗 https://t.co/hR2L9GC0YH
0
1
6
New Cometeers joined us this summer across the US, UK, and Greece ☀️ Beyond their impressive professional experience, they bring diverse passions: board games, golf, hiking, and professional soccer. Welcome aboard! 👋
0
0
3
Our video with the Google Partners team is live 🎉 We're proud to be part of the ISV Startup Springboard program, officially available as a marketplace offering deployed on GCP. Big thanks to @googlecloud + our team at Comet for making this happen 🙌
2
3
8
Today, we're building a CodeArena, where you can compare any two code-gen models side-by-side. Tech stack: - @LiteLLM for orchestration - @Cometml's Opik to build the eval pipeline - @OpenRouterAI to access cutting-edge models - @LightningAI for hosting CodeArena Let's go!🚀
1
5
29
The line between pretraining and fine-tuning is increasingly blurry, making “training” harder to define. In this piece, @anmorgan24 unpacks how shifting methods and terms complicate LLM behavior—and why pretraining remains key to scaling models responsibly.
1
0
5
Before we dive in, here's a quick demo of what we're building! Tech stack: - @LiteLLM for orchestration - @Cometml's Opik to build the eval pipeline (open-source) - @OpenRouterAI to access the models You'll also learn about G-Eval & building custom eval metrics. Let's go! 🚀
2
5
78
AI coding tools are changing how we build 💡 @StatInStilettos built an AI app from scratch using Cursor — and shared a full breakdown of what worked, what didn’t, and why she thinks there’s a better alternative to “vibe coding.” Read the full breakdown 👇
comet.com
A seasoned software dev shows you how to get the most out of AI-assisted code tools like Cursor and avoid the pitfalls of pure vibe coding.
0
1
2
🧠 Build autonomous AI agents that think, remember, and act. @akshay_pachaar’s new crash course walks through: ✅ Tool integration via MCP servers ✅ Memory with Zep’s Graphiti ✅ Tracing and observability with Comet’s Opik All orchestrated with CrewAI — and fully
1
11
39