Shashank Agarwal
@itsshashank
Followers
4K
Following
5K
Media
416
Statuses
10K
Building https://t.co/8yZmgov4SW of the Internet 🚀 Prev: MagicAPI, AWS, Levity, Activeloop, Pipfeed, Expedia, Hopdata. Weekly thoughts at: https://t.co/CTuxg0X4Hh
Bengaluru, Karnataka, India
Joined September 2008
The solution? Trace-based evaluation. Capture everything the agent does. Analyze the complete journey. This is what separates production-ready agents from prototypes. Learn more: https://t.co/duGOJvNFjS What's your experience been with agent evaluation?
noveum.ai
Real-time monitoring, tracing, and analytics for AI agents in production. 73+ evaluation scorers, multi-agent support, cost tracking.
0
0
1
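A minimal sketch of what "capture everything the agent does" could look like in code. The `Span` and `Trace` classes and the `record` helper are hypothetical names invented for illustration; they are not the noveum.ai API or any particular tracing library.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class Span:
    """One recorded step of an agent run (hypothetical schema)."""
    kind: str                      # e.g. "reasoning", "decision", "tool_call"
    name: str
    inputs: Any
    output: Any = None
    error: Optional[str] = None
    duration_s: float = 0.0


@dataclass
class Trace:
    """Ordered list of every step the agent took."""
    spans: List[Span] = field(default_factory=list)

    def record(self, kind: str, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        """Run fn(*args) and store its inputs, output, error, and timing."""
        span = Span(kind=kind, name=name, inputs=args)
        start = time.perf_counter()
        try:
            span.output = fn(*args)
            return span.output
        except Exception as exc:
            span.error = repr(exc)
            raise
        finally:
            # The span is stored whether the step succeeded or failed.
            span.duration_s = time.perf_counter() - start
            self.spans.append(span)


# Toy agent run: one reasoning step followed by one tool call.
trace = Trace()
query = trace.record("reasoning", "normalize_query",
                     lambda q: q.strip().lower(), "  Weather in Paris ")
result = trace.record("tool_call", "lookup",
                      lambda q: {"query": q, "answer": "unknown"}, query)
```

Because failed steps are recorded too, the trace shows where a run broke, not just that the final answer was wrong.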
This is why so many AI agents fail in production. Teams monitor agents the way they monitor ML models, so they're blind to what's actually happening inside the agent. It's like driving a car while only looking at the speedometer.
1
0
1
With ML models, you measure outputs. With agents, you need to measure the entire TRAJECTORY. Every decision point. Every reasoning step. Every tool call. Because an agent can fail at any point in its journey, not just at the end.
1
0
1
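A rough illustration of output-only versus trajectory-level evaluation. The step format (`kind`, `name`, `ok`) and both helper functions are assumptions made up for this sketch; in practice the `ok` flag would come from a per-step scorer.

```python
from typing import List, Optional


def first_failure(trajectory: List[dict]) -> Optional[int]:
    """Return the index of the first failed step, or None if every step passed."""
    for i, step in enumerate(trajectory):
        if not step["ok"]:
            return i
    return None


def trajectory_score(trajectory: List[dict]) -> float:
    """Fraction of steps that passed their check."""
    if not trajectory:
        return 0.0
    return sum(step["ok"] for step in trajectory) / len(trajectory)


# Hypothetical run: the final step looks fine, but an earlier tool call failed.
trajectory = [
    {"kind": "reasoning", "name": "plan", "ok": True},
    {"kind": "tool_call", "name": "search", "ok": False},
    {"kind": "decision", "name": "pick_result", "ok": True},
    {"kind": "tool_call", "name": "send_reply", "ok": True},
]

output_only_pass = trajectory[-1]["ok"]   # True: looks healthy from the outside
failed_at = first_failure(trajectory)     # 1: the search call is where it broke
score = trajectory_score(trajectory)      # 0.75
```

An output-only check would mark this run as a success, while the per-step view points at the failed search call.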
When I was at AWS, we learned this the hard way. We built monitoring systems for ML models that worked great. Then we tried to apply the same logic to agents. It failed spectacularly. Why? Because we were measuring the wrong things.
1
0
1
An ML model is straightforward: given input X, predict Y. You can measure accuracy, precision, recall. Done. But an AI agent is different. It's a system that:
• Reasons about a problem
• Makes decisions
• Takes actions
• Learns from feedback
It's fundamentally more complex.
1
0
1
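For contrast, the "measure accuracy, precision, recall" side fits in a few lines. This is a plain-Python sketch for binary labels with made-up data; `classification_metrics` is a hypothetical helper, not a library function.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels and predictions for illustration.
metrics = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

Everything here depends only on the final prediction, which is exactly why it says nothing about the intermediate steps an agent takes.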
I've been thinking about how we evaluate AI agents. Most teams treat them like ML models: input → output → score. But that's not how agents work. They're decision-making systems, not prediction systems. This distinction matters more than you think. Let me share what I've
1
0
2
Thanks @coderabbitai @aravindputrevu for sending over this cool stuff! Things like this show how much you value your community and contributors 🙌
My experience with @coderabbitai: - new PR - it adds 50+ comments - I push the fixes - it adds another 15 comments - I push fixes again - it adds yet another set of comments (this repeats at least 3-4 times 😭)
3
1
22
We are hiring for two roles: Python AI/ML Engineer and Full-Stack Next.js Engineer. Both are full-time and completely remote (more details in the comments).
60
21
534
💡 “Your first paying customer matters more than 100 free users.” Our CEO, Shashank Agarwal, shares the #1 startup mistake to avoid 🚀 Too many founders chase growth before proving people will actually pay. The real validation? That very first customer who trusts your
0
1
3
Your prompts just grew arms and legs. We turned every API on https://t.co/Ox5qM9iQdv into an MCP tool—so Claude and Cursor can use them directly. Not just for devs. If you’re a PM, founder, analyst, or designer, you can now run real workflows from a chat window. Same 4 steps for
0
2
4
🎭 Next-Gen Figurine Design with Ultra-Fast AI (Nano-Banana) No more waiting for slow renders — bring your figurine concepts to life instantly with Google’s Nano-Banana powered API. 🚀 Why it matters? ✅ Ultra-fast image processing — no delays ✅ High-precision figurine edits &
1
1
2
To all -> Indian Software Engineers!! Please stop cheating in your Coding Interview!!!
0
0
0
⚡ Detect Anything Instantly with Real-Time Object Detection API (YOLOv8s Worldv2) No more missed details in images or videos! From smart surveillance to retail analytics, traffic monitoring, or warehouse automation — this API delivers blazing-fast, high-accuracy object
1
1
4
Bring your images to life with FaceSwap Image V3 API 🎭 Instantly create high-resolution, ultra-realistic face swaps — no Photoshop, no manual work. Just plug into our simple REST API and start transforming images in seconds. Ideal for content creators, marketers, game
1
3
6
🤝 NovaEval is open source, and I need YOUR help! High-priority areas: • 🧪 Unit tests (currently 23% coverage) • 📚 Real-world examples • 📝 Documentation & guides • 🔍 RAG evaluation metrics • 🤖 Agent evaluation frameworks First-time contributors welcome!
1
2
6
Just like how every business had to get a computer, every business will get AI agents.
0
0
2