Gentrace

@GentraceAI

Followers
207
Following
10
Media
36
Statuses
89

Test generative AI across teams. Automate evals for reliable LLM products and agents.

San Francisco
Joined May 2023
@GentraceAI
Gentrace
16 days
RT @dougsafreno: Agents are significantly more powerful than standalone LLM calls. But, debugging them is a nightmare. You can trace their….
0
4
0
@GentraceAI
Gentrace
6 months
LLM-as-a-judge evals often fail because they ask the same question twice. Instead, give the model an unfair advantage: extra context, constraints, or comparisons that make grading easier than generation. Here's a guide breaking down our approach:
gentrace.ai
LLM-as-a-judge evaluation uses an LLM to grade an output from an AI system, augmenting or replacing manual, human evaluation.
0
0
1
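One way to picture the "unfair advantage" idea from the post above is a grading prompt that carries material the generator never saw. The sketch below is illustrative only and is not Gentrace's implementation or API: `JudgeCase`, `build_judge_prompt`, and every field name are hypothetical, and the model call itself is left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JudgeCase:
    """One example to grade. All names here are hypothetical."""
    question: str
    candidate_answer: str
    reference_answer: str = ""  # extra context the generator never saw
    constraints: List[str] = field(default_factory=list)


def build_judge_prompt(case: JudgeCase) -> str:
    """Assemble a grading prompt that gives the judge more than the generator had."""
    parts = [
        "You are grading an answer produced by an AI system.",
        f"Question:\n{case.question}",
        f"Candidate answer:\n{case.candidate_answer}",
    ]
    if case.reference_answer:
        parts.append(
            f"Reference answer (not shown to the generator):\n{case.reference_answer}"
        )
    if case.constraints:
        checks = "\n".join(f"- {c}" for c in case.constraints)
        parts.append(f"Check each constraint:\n{checks}")
    parts.append('Reply with "PASS" or "FAIL" followed by a one-sentence reason.')
    return "\n\n".join(parts)
```

The returned string would be sent to whichever model acts as the judge. Keeping prompt assembly separate from the model call makes it easy to inspect exactly what extra context each graded example receives.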
@GentraceAI
Gentrace
6 months
Your dataset will never be perfect! But you need one to get started with evals. Instead, the best AI teams don’t chase a "golden" dataset. They start small with 5-10 examples, capture real production use cases, and iterate continuously. Here’s a practical guide for building.
0
0
1
@GentraceAI
Gentrace
7 months
Multiverse is using AI to improve how students learn on the job. Their AI team uses LLMs for delivering realtime feedback to students, requiring high quality and reliability. Before Gentrace, evals were managed in spreadsheets, creating bottlenecks. Now they:
- Use LLM
0
0
2
@GentraceAI
Gentrace
7 months
What a night! Packed house and some of the sharpest minds in AI from Webflow, Asana, and 11x sharing how they’re building with agents today. Big thanks to Ampersand for co-hosting and our speakers @bryantchou @rodrigodavies @prabhavjain @ezelby @patrickt010 for
2
1
7
@GentraceAI
Gentrace
7 months
Agents are changing everything about how we build software. Will agents replace purpose-built tools or create entirely new opportunities? Learn where agents are actually headed this Thurs 2/6 at Gentrace SF with our speakers @bryantchou @rodrigodavies @prabhavjain @ezelby.
0
1
5
@GentraceAI
Gentrace
7 months
Last week, we hosted some incredible AI builders from companies like Asana, Cribl, Block, Vanta, and Pinterest to share their stories on shipping AI apps to production. What we learned is that the road from POC to production isn't a straight path:
1. Turns out the biggest
1
1
6
@GentraceAI
Gentrace
8 months
Self-hosted just got an upgrade. Now you can deploy Gentrace in your @kubernetesio cluster with:
- Helm charts for quick setup
- @istiomesh for secure service-to-service communication
- Support for your existing data infra (Postgres, Kafka, S3)
Get started with our guide:
0
3
5
@GentraceAI
Gentrace
8 months
We're excited to join the @foundersysk showcase on Feb 5th in San Francisco! Meet our cofounders, @dougsafreno and @virtuallyvivek, and learn how we're helping teams at Webflow and Quizlet test their AI apps. Apply here:
newsletter.foundersysk.com
Former attendees on why you should join our next showcase
0
2
4
@GentraceAI
Gentrace
8 months
RT @virtuallyvivek: How do you process 45,000 tasks/day without adding infrastructure complexity? At @gentraceai, we built a task queue wi….
0
3
0
@GentraceAI
Gentrace
8 months
It was a big year for Gentrace in 2024!
📈 21M+ evals and traces ran, growing 2x in the last 6 months (up from 40k when we first started Oct 2023!)
🤖 19 features and improvements launched
🪲 320 bugs squashed
Thank you to our customers and team for making this possible. 🫶
0
1
4
@GentraceAI
Gentrace
8 months
It’s been an exciting year for us with lots of new releases and bugs fixed. Here’s a recap of our 5 favorite things we shipped:
1. Datasets support: organize test data into separate groups within a pipeline.
2. Compare: updated compare mode for easily viewing outputs and test
0
1
3
@GentraceAI
Gentrace
9 months
RT @damndanielliem: Been a chaotic but fun year at @GentraceAI. We hosted 10+ community events, partnering with @modal_labs, @webflow, @me….
0
2
0
@GentraceAI
Gentrace
9 months
RT @dougsafreno: I'll die on this hill: people are ripping on LLM as a judge because they aren't doing it right. 80% are asking the LLM the….
0
5
0
@GentraceAI
Gentrace
9 months
RT @GentraceAI: With the rise of LLMs, the @webflow team set an ambitious goal to use natural language to make modifications to websites. T….
0
2
0
@GentraceAI
Gentrace
9 months
With the rise of LLMs, the @webflow team set an ambitious goal to use natural language to make modifications to websites. To set up evals, they chose Gentrace. With Gentrace, the Webflow team:
- Evaluates multimodal outputs (like website screenshots) using human and
1
2
6
@GentraceAI
Gentrace
9 months
What if your LLM testing system could automatically optimize your prompts? That's where we're headed with Experiments, our new feature helping developers speed up last-mile tuning. Here's how it works:
Unlike prompt playgrounds, Experiments provides a testing environment
1
3
7
@GentraceAI
Gentrace
9 months
RT @dougsafreno: Big news today: @GentraceAI raised our $8M Series A led by @MatrixVC. We’re celebrating by launching Experiments, the fir….
0
25
0
@GentraceAI
Gentrace
10 months
RT @dougsafreno: Most engineers approach LLM-as-a-judge all wrong. The usual high-level metrics like hallucination or safety rarely tell y….
0
5
0