Mohak Sharma

@mohak__sharma

Followers
615
Following
610
Media
12
Statuses
181

co-founder and ceo @honeyhiveai - hiring!

NYC
Joined June 2020
@mohak__sharma
Mohak Sharma
7 months
I'm excited to share we've raised $7.4M Seed from @gkm1 at @InsightPartners, along with @petesoder at @ZeroPrimeVC, @antgoldbloom at @AIXVentures, @flo at @468Capital, and others to bring evals & observability to AI agents! We're also going GA today! 🧵
9
5
34
@honeyhiveai
HoneyHive
9 days
The HoneyHive team had a blast at @_odsc west! The highlight? Our CTO @ds3638 spoke to a packed house about why your agents keep breaking in production, and how evaluation-driven development fixes it. What conference should we hit next? Reply w/ your suggestions 👇
0
1
3
@honeyhiveai
HoneyHive
15 days
The @honeyhiveAI team is so excited to be at @_odsc AI West today! If you are at the conference, stop by booth #22 to learn more about our best-in-class platform for designing, evaluating, and monitoring AI agents. 🍯🐝🚀
@_odsc
ODSC (Open Data Science Conference) AI
26 days
🍯 Build, Test, and Monitor AI Agents with HoneyHive at ODSC AI West! 📍 Meet HoneyHive at Booth #22 during the ODSC AI West! 🔗 Learn more: https://t.co/S1dWWNQFNc
0
2
3
@honeyhiveai
HoneyHive
15 days
Tracking your eval scores across experiments just got a whole lot easier. Our new Experiments dashboard visualizes metric trends across all your experiments in one view — making it easy to see how changes affect your agent's quality. ✅ Spot performance regressions at a glance
0
2
3
@mohak__sharma
Mohak Sharma
15 days
One of the things i'm super proud about is the high talent density in our team. We've collectively built the first version of Codex CLI agent, built core infra from 0->1->10 at multiple unicorns, and shipped consumer products used by millions!
@honeyhiveai
HoneyHive
15 days
@honeyhiveai is hiring in NYC 🍎 & SF 🌉 We’re looking for: •  SWEs to build our core product and SDKs •  FDE to help customers scale AI agents If you’re passionate about your craft, love working with customers, and aren’t afraid to solve complex technical problems then
1
0
1
@honeyhiveai
HoneyHive
27 days
Are you going to be in SF from Oct 28–30? Then come to @_odsc west to meet the @honeyhiveai Team! We will be at Booth 22 demoing our platform for designing, evaluating, and monitoring AI agents! For anyone working on the next generation of agentic AI, HoneyHive is the
0
1
2
@scaleupevent
ScaleUp:AI
2 months
At ScaleUp:AI — now less than two weeks away! — we’re looking beyond copilots. AI is evolving into autonomous Agents: digital teammates that can reason, decide, and execute. Moderated by Managing Director George Mathew, this session will unpack what that shift means for
0
1
1
@honeyhiveai
HoneyHive
2 months
🚨 Calling all AI engineers in SF 🚨 We are sponsoring the MCP - AI Agents Hackathon this Friday, Sep 19 at the AWS Builder Loft in San Francisco with over $50k in prizes! Sponsors include @anthropicai, @awscloud, @lovable_dev, @redisinc, and many others. Register below 👇
3
4
9
@mohak__sharma
Mohak Sharma
2 months
been fun watching this debate from the sidelines :) throwback to our first deck in 2022 - the why has obv changed a lot (eg: no one cares about rlhf anymore) but the core thesis holds - you need evals + observability + a/b testing for any real chance at alignment
@kamathematic
anirudh
2 months
@benhylak no dog in this fight but idt it's either or so much as it's you need both? idt people are "bragging" about evals except model providers. when i did recsys/trad ML we had both statsig + offline eval systems that helped us decide what to even put in prod in the first place
1
0
10
@honeyhiveai
HoneyHive
3 months
Introducing Alerts🔔 Alerts in @honeyhiveai give you real-time monitoring over everything that matters in your agent: ✅ Metric drift - Detect quality degradation over time ✅ Cost spikes - Stay within budget thresholds with usage alerts ✅ Guardrail violations - Monitor safety
0
1
6
@mohak__sharma
Mohak Sharma
4 months
As agents become more complex, it's becoming harder than ever to debug and understand what's really happening. Excited to ship something that actually helps with this.
@honeyhiveai
HoneyHive
4 months
Today we're shipping some major quality-of-life improvements to traces 🎁 🔍 Session Summaries: Unified view of metrics, evals, and feedback across all spans in an agent session. No more jumping between individual spans. ⏱️ Timeline View: Flamegraph visualization to identify
0
0
4
@honeyhiveai
HoneyHive
5 months
Introducing Role-Based Access Control (RBAC) in @HoneyHiveAI! Built in partnership with our largest financial services and insurance customers, RBAC brings enterprise-grade security and access controls to your critical observability workflows.
1
2
4
@honeyhiveai
HoneyHive
6 months
Chasing benchmark leaderboards is the easiest way to build an AI product that fails in the real world. Most teams waste 90% of their eval effort on academic benchmarks instead of finding exactly where their system breaks with real users.
1
2
4
@mohak__sharma
Mohak Sharma
6 months
Particularly proud to share that a major Fortune 100 enterprise is already logging their PHI data to our HIPAA-compliant cloud. Learn more about our security policies here:
@honeyhiveai
HoneyHive
6 months
We're excited to announce that HoneyHive has officially achieved SOC 2 Type II, GDPR, and HIPAA compliance! Every LLM interaction today—from user prompts to contextual retrieval or tool-use data—contains potential PII and PHI, which is why we've built our platform with
0
0
4
@mohak__sharma
Mohak Sharma
7 months
Any product designers / people with good UX taste up for grading some UX mockups for a vibe coding eval? 💴 Paid opportunity!
0
0
4
@mohak__sharma
Mohak Sharma
7 months
Most recommendation systems just show you more of what you already like. The trouble is, we can't always describe the new things we might enjoy. This technique works differently. It learns what you like and don't like, then helps you discover truly new content.
@honeyhiveai
HoneyHive
7 months
Traditional vector search systems often struggle with nuanced user preferences that are difficult to articulate. Our latest collaboration with @qdrant_engine showcases an iterative optimization technique that dynamically adapts search results based on user preferences.
0
0
4
@mohak__sharma
Mohak Sharma
7 months
Evals are a means to an end (better product), not the end itself
@eugeneyan
Eugene Yan
7 months
Product evals are misunderstood. Many teams think that adding another tool, metric, or llm-as-judge will solve all their problems and save their product. But that just dodges the hard truth and avoids the real work. Here's how to fix your process instead. https://t.co/vG8XE5bait
0
0
8
@jobergum
Jo Kristian Bergum
7 months
Because we struggle even with a single agent, never mind the exponential complexity of n agents.
@jobergum
Jo Kristian Bergum
7 months
There is something about it. Vendors hype agents, tools and MCP/A2A. But accuracy is so-so on benchmarks. IMHO most don’t realize that tools and MCP are just stuffing tool descriptions into the context prompt. It’s all just text stuffed into a prompt. No magic sauce.
1
1
6
@mohak__sharma
Mohak Sharma
7 months
Funny how this is being resurfaced 4yrs later after @ds3638 built the original version at Microsoft:
github.com
CLI tool that uses Codex to turn natural language commands into their Bash/ZShell/PowerShell equivalents - microsoft/Codex-CLI
@gdb
Greg Brockman
7 months
Also released today is Codex CLI — an open-source lightweight coding agent that runs in your terminal: https://t.co/CXDz0aK2rX This is the first of a series of tools we'll be releasing over upcoming months, which we think show the future of programming.
0
0
6
@mohak__sharma
Mohak Sharma
7 months
At @honeyhiveai, we call them the "YOLO to prod" people. Too easy to ship a half-baked app when no one knows how to really do ML. The core issue is SWE doesn't prep you for ML's experimentation culture. ML is more science than code, and that's what most people are getting wrong
@jobergum
Jo Kristian Bergum
7 months
0
0
9