Evidently AI

@EvidentlyAI

Followers
2K
Following
1K
Media
510
Statuses
2K

Open source ML and LLM evaluation 📊, testing 🚦 and monitoring 📈 GitHub: https://t.co/37H9bfnYj6 Discord: https://t.co/ElZ9RlroUa

Joined February 2020
@EvidentlyAI
Evidently AI
9 months
3️⃣ 2️⃣ 1️⃣ Our free course on LLM evaluations for AI product teams starts today! 🎥 7 days of byte-sized videos into your inbox ⭐️ Certificate upon completion 👩‍💻 No coding skills required 👩‍🎓 500+ students have signed up You can still join the course 👇 https://t.co/Go2bNYJXCR
1
1
6
@EvidentlyAI
Evidently AI
18 hours
For more patterns in Gen AI applications, read the blog: https://t.co/WfL237WONA Or explore our database of 650 ML and LLM case studies from over 100 companies: https://t.co/Ipe0L1OfB8 5/5 🧵
0
0
0
@EvidentlyAI
Evidently AI
18 hours
3️⃣ RAG is one of the most popular newcomer use cases. We highlighted RAG as a separate category, with customer support being the most common application. For example, DoorDash created a RAG-based delivery support chatbot: https://t.co/5QnujPusXq 4/5
1
0
0
@EvidentlyAI
Evidently AI
18 hours
2️⃣ RecSys and search are reimagined with GenAI. Search and RecSys are still a core theme, with LLMs adding even better semantic understanding and quality of results. For example, Netflix created a foundation model for personalized recommendations: https://t.co/Uo2gfmKPBw 3/5
1
0
0
@EvidentlyAI
Evidently AI
18 hours
1️⃣ Automation is still king. As with ML, companies pay great attention to optimizing and automating high-volume workflows. Gen AI helps achieve that for more complex flows. For example, Intuit uses GenAI to improve knowledge discovery: https://t.co/gSEO6dnS5X 2/5
1
0
0
@EvidentlyAI
Evidently AI
18 hours
💡 Gen AI use cases in 2025: learnings from 650 examples. We highlighted some new patterns of how top companies apply Gen AI based on a database of AI and ML use cases we’ve been curating: https://t.co/WfL237WONA Here are 3 of them 👇 1/5 🧵
1
1
1
@EvidentlyAI
Evidently AI
3 days
How to control character traits in LLMs? Anthropic’s research identifies patterns that control AI character, allowing you to: 🦹 Monitor its personality changes ✅ Mitigate undesirable personality shifts 🔑 Identify training data leading to these shifts https://t.co/VtBdGLpyTQ
anthropic.com
A paper from Anthropic describing persona vectors and their applications to monitoring and controlling model behavior
0
0
2
@EvidentlyAI
Evidently AI
6 days
📌 In case you missed it Synthetic data generator in Evidently open-source! A tool that helps you create custom test datasets: Specify what you want to generate ➡️ Define user profiles & roles ➡️ Pick LLMs ➡️ Run the generator. Try it out 👇 https://t.co/b2AnvH1MDz
0
1
4
@EvidentlyAI
Evidently AI
7 days
A Friday ML use case 📕 📚 From the database of 500 ML & LLM systems: https://t.co/jJoUj6MfFZ How Nextdoor, the neighborhood network app, uses LLMs to generate engaging email subject lines to boost email opens, clicks, and subsequent platform sessions. https://t.co/54oExGEsE6
engblog.nextdoor.com
Generative AI (Gen AI) has demonstrated proficiency in content generation but does not consistently guarantee user engagement, mainly for…
0
0
3
@EvidentlyAI
Evidently AI
7 days
Practical tips on LLM evaluation 🧠 Booking shares its learnings from building LLM judges: 🎯 Clearly define evaluation metrics 🦾 Choose a strong LLM ✍️ Write a good evaluation prompt 🏅 Evaluate the judge and update the prompt https://t.co/WhAYPIu5XJ
booking.ai
Lessons learned from 1 year of Judge-LLM Development
0
1
1
@Al_Grigor
Alexey Grigorev
11 days
Just finished the module on agents and MCP in my new course I cover: - Function calling - Deep research on complete @DataTalksClub podcast transcript history - MCP server for @cursor_ai with @EvidentlyAI docs - Ton of examples with OpenAI Agents SDK and PydanticAI
0
12
46
@EvidentlyAI
Evidently AI
13 days
📌 In case you missed it 250 LLM benchmarks! We updated the database of LLM benchmarks and datasets used to measure LLM capabilities in reasoning, math, coding, info retrieval, tool use, and safety. Save the list 👇 https://t.co/nZjQF9ljF2
0
0
2
@EvidentlyAI
Evidently AI
14 days
A Friday ML use case 📕 📚 From the database of 500 ML & LLM systems: https://t.co/jJoUj6MfFZ How Yelp, an online reviews platform, uses LLMs to detect threats, harassment, lewdness, personal attacks, and hate speech. https://t.co/XkLbPGUedV
evidentlyai.com
How do top companies apply AI? A database of 650 case studies from 100+ companies with practical ML use cases, LLM applications, and learnings from designing ML and LLM systems.
0
0
2
@AxSaucedo
Alejandro Saucedo | KubeCon 2025 AI Day Keynote
17 days
Production is where machine learning meets business value, and Evidently AI has put together a comprehensive compendium of 650 real production ML/LLM case studies from 100+ companies (e.g., Netflix, Airbnb, DoorDash): https://t.co/TgNQOYTgmK #ML #MachineLearning #AI
0
1
3
@EvidentlyAI
Evidently AI
20 days
📌 In case you missed it RAG evaluation: an in-depth guide! You need a way to test how well your RAG system works – and catch what doesn't. Learn how to evaluate RAG retrieval and generation quality, build test sets, run experiments, and monitor 👇 https://t.co/Jh2adbYeXo
evidentlyai.com
This guide breaks down how to evaluate and test RAG systems. You'll learn how to evaluate retrieval and generation quality, build test sets with synthetic data, run experiments, and monitor in...
0
0
0
@EvidentlyAI
Evidently AI
22 days
🌀 Evidently + Grafana for LLM evals! You can now visualize your Evidently LLM evaluation metrics on a Grafana dashboard. All in open source! Check out the code example: https://t.co/lGKSja0wnd
0
1
2
@Al_Grigor
Alexey Grigorev
22 days
The code for the first module of my AI Bootcamp is ready! There I cover - LLMs and structured output - RAG with FAQ questions - RAG YouTube video + Summarizer - RAG on @EvidentlyAI docs - Search libraries: minsearch, elasticsearch, @qdrant_engine
0
1
9
@EvidentlyAI
Evidently AI
25 days
🛠 See how top companies design their AI systems We updated our database of 650 practical ML use cases, including real-world LLM and Gen AI applications, from 100+ companies. Enjoy the reading 👇 https://t.co/Ipe0L1OfB8
0
0
0
@EvidentlyAI
Evidently AI
27 days
📌 In case you missed it How to evaluate an LLM app? 🎓 An intro from LLM evals course: https://t.co/Go2bNYJXCR 🔑 Prepare a dataset with test inputs ✏️ Manually label responses as Good or Bad 📊 Design LLM evals system for automation Watch the video:
0
1
1
@EvidentlyAI
Evidently AI
28 days
A Friday ML use case 📕 📚 From the database of 500 ML & LLM systems: https://t.co/jJoUj6MfFZ How LinkedIn uses Skills Graph to extract skill data from texts and map the relationships between skills, people, and companies for relevant job matches. https://t.co/3FrKBbhYYj
linkedin.com
0
2
5
@EvidentlyAI
Evidently AI
29 days
Want to see more examples of AI agents in production? We put together a database of 650 practical ML and LLM case studies from over 100 companies. Enjoy the reading! https://t.co/Ipe0L1OfB8 5/5 🧵
evidentlyai.com
How do top companies apply AI? A database of 650 case studies from 100+ companies with practical ML use cases, LLM applications, and learnings from designing ML and LLM systems.
0
0
2