David Hershey Profile
David Hershey

@DavidSHershey

Followers
2K
Following
4K
Media
28
Statuses
224

AI Generalist | Writer of https://t.co/l1jTizWyTv

Joined April 2017
@DavidSHershey
David Hershey
3 months
So, I did a thing 🙂. This was really just a fun little side project - I wanted to spend some time working on agents, and Pokemon was the most fun way I could come up with. And then it kinda took off! 3.7 Sonnet is so fun to watch play!
@AnthropicAI
Anthropic
3 months
A few researchers at Anthropic have, over the past year, had a part-time obsession with a peculiar problem. Can Claude play Pokémon? A thread:
29
22
552
@DavidSHershey
David Hershey
2 years
I built a thing! a chatbot built to help founders learn about the fundamentals of building seed-stage tech companies. Powered by @qdrant_engine ♥️ Definitely the easiest vector database to get started with for LLM applications.
5
8
53
@DavidSHershey
David Hershey
2 years
1/n) Never going to miss a chance to write about new devtools! @doppenhe and I wrote an overview of the new wave of devtools empowering devs to build with large language models! Highlights in this 🧵
1
10
35
@DavidSHershey
David Hershey
3 months
I also want to shout out @computerender - their video on RL for Pokemon was the first inspiration for us to try hooking up Claude to see how it did!
1
1
35
@DavidSHershey
David Hershey
5 years
@UMichAthletics His name is Bo Schembarkler 🐾
Tweet media one
1
7
32
@DavidSHershey
David Hershey
3 months
So naturally I have a stream that I threw together with some friends. Check it out! Let me know if you have any questions! Have fun!
1
0
29
@DavidSHershey
David Hershey
3 years
I'm thrilled to have joined Unusual Ventures as an investor! Get to spend my days talking to smart data + ML folks -- going to be a blast. Reach out if you want to talk shop!
4
1
24
@DavidSHershey
David Hershey
3 years
Tell me about today's news, in the style of a pirate 🏴‍☠️ Having fun with LLMs via @dust4ai .
2
2
23
@DavidSHershey
David Hershey
11 months
Claude 3.5 Sonnet is an incredible model! So excited it's out! I have a new hobby - making stupid games with Sonnet 3.5 + Artifacts. Reminds me of old-school flash game websites, but as fun as your imagination.
4
1
22
@DavidSHershey
David Hershey
2 years
1/ The current generation of ML systems is so much bigger than just "Generative". Finally put words to something I've been stewing on for a while.
1
3
20
@DavidSHershey
David Hershey
2 years
🧵 LLM evaluation is broken right now. There are no good objective measures of the "quality" of a model, so everyone is flying blind. (1/n)
Tweet media one
2
1
19
@DavidSHershey
David Hershey
3 months
I have seen Claude do a lot of adorable things, this has to be at the top.
@zswitten
Zack Witten
3 months
My favorite Claude Plays Pokémon tidbit (mentioned in @latentspacepod) is that when @DavidSHershey told Claude to nickname its Pokémon, it instantly became much more protective of them, making sure to heal them when they got hurt.
1
0
19
@DavidSHershey
David Hershey
3 years
@sarahcat21 @codexeditor Pretty bullish on "LLMs are a new type of database that can power new applications", but whew this is a bit hotter of a take than that 😅.
1
1
17
@DavidSHershey
David Hershey
3 months
This was really fun!
@FanaHOVA
Alessio Fanelli
3 months
New lightning pod with @DavidSHershey on how Claude Plays Pokémon was made, w/ special co-host @vibhuuuus! We covered:
- Designing tools for AI agents playing games
- Managing memory for long running tasks
- Why naming Pokémons is important
Watch now 📺
Tweet media one
1
0
15
@DavidSHershey
David Hershey
1 year
Anthropic rules 🌉.
@AnthropicAI
Anthropic
1 year
This week, we showed how altering internal "features" in our AI, Claude, could change its behavior. We found a feature that can make Claude focus intensely on the Golden Gate Bridge. Now, for a limited time, you can chat with Golden Gate Claude:
Tweet media one
0
1
13
@DavidSHershey
David Hershey
2 years
@goodside @goodside every time a model update happens
0
0
12
@DavidSHershey
David Hershey
3 months
Will try to answer some questions here if anyone has them!
12
0
12
@DavidSHershey
David Hershey
2 months
LFG, come hack on Pokemon with me on Sunday.
@latentspacepod
Latent.Space 🔜 @aiDotEngineer
2 months
THIS SUNDAY IN SF. COME PLAY POKEMON. WITH AI AGENTS. AND CLAUDE. WITH SPECIAL GUEST @DavidSHershey. AND VMS FROM @JESSEMHAN. 100 SEATS ONLY.
1
1
11
@DavidSHershey
David Hershey
2 years
Awesome conversation about open source AI with @vipulved, @rxin and @weiliendang the other day!
@Unusual_VC
Unusual Ventures
2 years
Leading founders weigh in on the intricacies of building with open source AI. @vipulved and @rxin talked w/ @weiliendang and @DavidSHershey about the future of open source AI and foundation models. Hear Reynold explain why @databricks chose to build Dolly w/ open source AI 👇
0
1
9
@DavidSHershey
David Hershey
2 years
3) You have to start by designing good prompts and getting the right data to the LLM context window. @LangChainAI @dust4ai @gpt_index @CognosisAI are rocking the prompt building and chaining. @weaviate_io @qdrant_engine @milvusio @pinecone make vector DBs easy!
Tweet media one
1
1
10
@DavidSHershey
David Hershey
2 years
Seen a weird number of straw man arguments about the shortcomings of AI in the last few days? Congrats on finding something ChatGPT can't do? In the meanwhile everyone else can go change the world figuring out all the things it *can* do lol.
2
1
9
@DavidSHershey
David Hershey
2 years
Such a privilege to get to work with the amazing team at @roboto_ai; they're building such a cool product. If you're in robotics, check out their sandbox!
@Unusual_VC
Unusual Ventures
2 years
Amazing things are happening with AI + robotics. And @roboto_ai is at the center of it all. Learn about the insights and experience from Co-Founders @BBarash and Yves Albers-Schoenberg that led to this startup, and why we led their Seed.
0
4
9
@DavidSHershey
David Hershey
2 years
4/ Shoutouts to some examples of folks building cool apps: @vectara working on revolutionizing search.
2
4
9
@DavidSHershey
David Hershey
1 year
How high do I have to get on hackernews to get my honorary engineer badge
Tweet media one
1
0
9
@DavidSHershey
David Hershey
2 years
1/n GPT-4 just further confirms to me that LLM agents are absolutely the future. The biggest limitation of GPT-3.5 was deficiencies in the ability to do advanced planning, which seems to be changing quickly. 🧵.
1
4
7
@DavidSHershey
David Hershey
2 years
@aniiyengar @imjaredz Finally got a version done last night! S/o to @promptlayer (@imjaredz) for making some of the complex chains of prompts more manageable 🍰!
1
2
7
@DavidSHershey
David Hershey
2 years
The most promising approach (IMO): custom reward models. Model user preference directly and use it to evaluate your app before you ship. Shoutout to @thesephist, one 15-minute convo with him put me down the rabbit hole on this topic. (4/n)
Tweet media one
1
0
6
@DavidSHershey
David Hershey
3 years
@codexeditor @sarahcat21 Another version: "What queries are better suited for LLMs than other databases?".
1
1
6
@DavidSHershey
David Hershey
2 months
@FanaHOVA SO down. Let's make it happen.
0
0
5
@DavidSHershey
David Hershey
2 years
7/7) Thanks to @tristanzajonc @willpienaar and a few others for their input and help, and to lots of others who reviewed this!
0
1
5
@DavidSHershey
David Hershey
2 years
What an awesome view into why training LLMs requires so much high-quality talent. "This level of perfection is like eight billion people copy[ing] the complete works of Shakespeare for the 14 billion years the universe has existed and not have a single person make a mistake!"
@AdeptAILabs
Adept
2 years
If your loss curves look sus, join the club! Giant LLM training runs are full of pitfalls. We learned the hard way. We wrote a deep dive for the community on silent data corruptions (SDCs). Problem and mitigations here:
Tweet media one
0
0
1
@DavidSHershey
David Hershey
2 years
1/6🧵 Happy LLaMA 2 day! In honor of @MetaAI giving us the best reason yet to host your own model, here's a quick thread on how to choose an LLM 👇. Starting with the most popular option: use a hosted model!
Tweet media one
1
0
5
@DavidSHershey
David Hershey
2 years
4) As teams progress, lots of new challenges emerge to improve and maintain LLM features. @humanloop @honeyhiveai are leading the way helping teams manage the complexity of LLMs in prod
Tweet media one
1
1
5
@DavidSHershey
David Hershey
2 years
@qdrant_engine is an incredible team to work with, can't wait to see everything they will accomplish!
@qdrant_engine
Qdrant
2 years
So, we raised a $7.5M seed round, here is what our CEO has to say about it: And here is what we are going to do with it 🧵👇
0
1
4
@DavidSHershey
David Hershey
2 years
6/n As long as the reasoning and planning are sound, it's trivial to hook these up to LLM agents to complete the tasks. We're ~there now. Now it's time for engineering.
1
0
4
@DavidSHershey
David Hershey
2 years
5) We have hot takes!
Tweet media one
1
1
4
@DavidSHershey
David Hershey
2 years
So uh. anyone have access to Claude-100k and want to let me know what inference latency looks like as context size scales? 🙂.
2
0
3
@DavidSHershey
David Hershey
2 years
One primary problem: defining what we're even measuring! "Goodness" for LLMs is complicated and depends on your task. Follow @KevinAFischer if you want to understand why. (2/n)
Tweet media one
1
1
4
@DavidSHershey
David Hershey
2 years
7/n That's why @AdeptAILabs announced a huge fundraise today. LLM-backed agents are about to change our world. Strap in, things are going to get weird. 🚀🚀🚀.
2
0
4
@DavidSHershey
David Hershey
2 years
@KevinAFischer @OpenAI The difficulty of being a research, consumer, and dev tool company at the same time really shows when it comes to the DevEx of their API products.
2
0
4
@DavidSHershey
David Hershey
2 years
Well this would be pretty incredible 👀.
@vipulved
Vipul Ved Prakash
2 years
The era of sub-quadratic LLMs is about to begin. At @togethercompute we've been building next gen models with large space state architectures and training them on very long sequences and the results from the recent builds are. incredible. Will share more as we get closer to
Tweet media one
0
0
3
@DavidSHershey
David Hershey
2 years
5/ This demo from @chillzaza_ is a great example of using LLMs to augment applications.
@chillzaza_
Zahid Khawaja
3 years
Universal Q&A on @lucidweb_ works on Wikipedia articles. Check it out! 🪄 Ask questions about any Wikipedia article and the citation feature will take you straight to the source! Request access at - I'm working super hard to roll this out to everyone! 😅
1
1
4
@DavidSHershey
David Hershey
2 years
2) First we break down how teams are building features with LLMs
Tweet media one
1
1
4
@DavidSHershey
David Hershey
2 years
Tweet media one
2
0
4
@DavidSHershey
David Hershey
3 years
@codexeditor @sarahcat21 I don't think I would frame it as instead! My Q: "here is a performant database that contains the content of the internet and is queried with natural language; what application will you build with that?". There are definitely some questions better answered by LLMs though!
2
1
4
@DavidSHershey
David Hershey
2 years
@OfficialLoganK @Chrisprucha @KevinAFischer @OpenAI I feel like the core issue is that LM model updates are inherently undocumented, breaking API changes. Fine if you assume that "smarter" is all that matters, but for the long tail of use cases these migrations will be pretty painful.
1
0
4
@DavidSHershey
David Hershey
3 months
@benankdev Yeah, it has a handful of small tips that are mostly built around things that it normally gets confused by. Pretty minimal overall though!
0
0
3
@DavidSHershey
David Hershey
3 years
@jheitzeb We should set up some sort of meetup, would be great to get folks together!
1
0
3
@DavidSHershey
David Hershey
2 years
2/n The holy grail is obvious -- right now, we're stuck in AI "copilot" mode, where AI generates (incredible) outputs, and humans validate that output. We're headed toward autopilot for more and more complex tasks.
1
0
3
@DavidSHershey
David Hershey
2 years
There is so much opportunity in AI right now, and Unusual was created to help the best technical founders build great companies. This program is going to kick ass. Such a great opportunity for folks thinking about AI companies to build and learn together.
@Unusual_VC
Unusual Ventures
2 years
Applications are open for our AI studio for builders! Come hang out this summer with our community of AI enthusiasts, builders, and founders!
0
1
3
@DavidSHershey
David Hershey
2 years
@sh_reya Yes! I worked in MLOps for years, hoping that more tools would mean more people could use ML, and it turns out that more general ML models were the answer!
1
0
3
@DavidSHershey
David Hershey
3 years
Nothing more fun than a conversation with @Dpbrinkm and the @mlopscommunity!
@mlopscommunity
MLOps Community
3 years
What a pleasure talking to @DavidSHershey about Building a Movie Recommendation System on @TectonAI with @SnowflakeDB. Tecton integrates with Snowflake and enables data teams to process ML features and serve them in production quickly and reliably,
Tweet media one
1
0
3
@DavidSHershey
David Hershey
2 years
Lots of unique evaluation concepts out there right now! Love the work @lmsysorg is doing giving Elo rankings to models with human comparisons. (3/n)
@lmarena_ai
lmarena.ai
2 years
⚔️Chatbot Arena Leaderboard Update! Exciting to welcome new entrants:
- Google PaLM 2
- Claude-instant-v1
- MosaicML MPT-7B
The competition is heating up🔥 Check out our analysis for all the surprising results at Remember, your vote shapes the arena.
Tweet media one
1
0
3
@DavidSHershey
David Hershey
2 years
6) And reflection!
Tweet media one
1
1
3
@DavidSHershey
David Hershey
2 years
Even the NYT is a ChatGPT bro now 😭.
0
0
3
@DavidSHershey
David Hershey
2 years
@hwchase17 @m_morzywolek I'm currently working on some tools to manage transcripts from my zoom calls, and this is SO helpful and so cool.
2
0
3
@DavidSHershey
David Hershey
2 years
@imjaredz I hear prompt 7 from "10 ChatGPT prompts that will change your life" is a banger though.
0
0
3
@DavidSHershey
David Hershey
2 years
@hwchase17 @m_morzywolek Also wow good timing with this:
@OpenAI
OpenAI
2 years
Our new embedding model is significantly more capable at language processing and code tasks, cost effective, and simpler to use.
0
0
3
@DavidSHershey
David Hershey
2 years
Being an Android user has never felt worse tbh.
0
0
3
@DavidSHershey
David Hershey
2 years
Also @LangChainAI + @qdrant_engine is a match made in heaven; the combo of the two makes building lightning quick. Thanks LangChain team!
0
0
3
@DavidSHershey
David Hershey
1 year
@HamelHusain 🔥🔥🔥 your work is appreciated!
0
0
1
@DavidSHershey
David Hershey
5 years
@AceAnbender Am I allowed to want us to hire John Beilein as our next football head coach?
0
0
1
@DavidSHershey
David Hershey
3 years
@blennon_ for just semantic search.
0
0
2
@DavidSHershey
David Hershey
3 years
👇.
@thesephist
Linus
3 years
The fact that LLMs generate text is not the point. LLMs are cheap, infinitely scalable black boxes to soft human-like reasoning. That's the headline! The text I/O mode is just the API to this reasoning genie. It's a side effect of the training paradigm.
0
0
2
@DavidSHershey
David Hershey
2 years
Check out the whole blog: And if you're building in the reward model space, hit me up, would love to jam on the topic. (/🧵)
1
0
2
@DavidSHershey
David Hershey
3 years
@JordanDAndersen @jheitzeb Specifically around LLM development? Would love to join if any are going strong.
1
0
2
@DavidSHershey
David Hershey
2 years
@HamelHusain I have the same irrational confidence for round two and don't know if that's a good sign or a bad sign 🥴.
0
0
1
@DavidSHershey
David Hershey
2 years
This is awesome -- of all of the (many) meetups that have cropped up, this has to be the most exciting theme. Rock on @swyx @Mappletons!
@swyx
swyx 🔜 @aiDotEngineer (Jun 3-5)
2 years
REQUEST FOR DEMOS. Come join the first AI | UX: Beyond the Textbox! SAVE THE DATE: APR 19 in SF, recorded. Hosted by @mappletons and me, @geoffreylitt, @thesephist, @sgrove. ***If you have a 1-2min AI UX concept to share and want to meet fellow builders, PLS APPLY below!***
Tweet media one
1
0
2
@DavidSHershey
David Hershey
2 years
This is why I get so excited about this space -- we're just scratching the surface of what happens when you build experiences or apps on top of the initial output of an LLM. So many possibilities.
@GrantSlatton
Grant Slatton
2 years
GPT can iteratively write, debug, and test programs to accomplish arbitrary goals. Pictured: GPT reading snippets of HTML from HN and building a headline scraper in Python, overcoming bugs by simply reading the errors and self-judgments and hypothesizing to itself. Thread ↓
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
0
2
@DavidSHershey
David Hershey
3 years
Today:. Aye, hello there mateys! There be a good deal happenin' on the seven seas today, startin' with the news that the US midterm elections be underway. With control of Congress hangin' in the balance, this be a mighty important election for the future of the country.
1
0
2
@DavidSHershey
David Hershey
3 months
@Aizkmusic done!
1
0
2
@DavidSHershey
David Hershey
2 years
3/ I feel like a lot of people in the space are stuck in this rut where they are laser-focused on giving users tools to "generate". Those tools will be awesome, but they're just a part of the story. I'm excited about the apps we can build around LLMs.
1
0
2
@DavidSHershey
David Hershey
2 years
Really excited to chat with some leaders in the open-source AI space on June 21st!
@Unusual_VC
Unusual Ventures
2 years
Interested in hearing more about what's happening in open-source AI? Sign up for our fireside chat on 6/21:
0
1
2
@DavidSHershey
David Hershey
3 months
@DanAdvantage Claude gets a screen overlay that shows coordinates over the screen, then it can choose to go to those coordinates -- it's not great at understanding its relative position to things still, so this gives it a pretty big boost.
2
0
2
@DavidSHershey
David Hershey
2 years
If you want to get crazy, you can even generate simulated interactions with LLMs to pass through a reward model to scale this out. Lots of possibilities. (5/n)
Tweet media one
1
0
2
@DavidSHershey
David Hershey
2 years
3/n Let's do a side-by-side of a few planning queries, using my favorite example from the folks at @oughtinc: "How far would all the film frames that make up the 400-plus episodes of The Simpsons stretch?"
Tweet media one
1
0
2
@DavidSHershey
David Hershey
2 years
last/n engineering around LLMs takes time and tools and hard work (@fixieai @LangChainAI @gpt_index will tell you that!), but we're going to get there really soon.
0
0
2
@DavidSHershey
David Hershey
2 years
Can't say enough about @qdrant_engine -- local mode made getting started simple, and easy to harden over time with their hosted offering.
1
0
2
@DavidSHershey
David Hershey
2 years
@KevinAFischer @OpenAI Especially when each of those products competes very directly for both attention *and* GPU availability.
1
0
2
@DavidSHershey
David Hershey
2 years
Shoutout to @railway (another incredible portco) - I've never hosted a webapp before, and Railway felt like magic making it all work.
1
0
2
@DavidSHershey
David Hershey
2 years
@sarahcat21 Is it so bad to have different stacks? The problem spaces can be so fundamentally different that you can essentially view them as different technologies. Maybe we can see some component consolidation at least 🙂.
2
0
2
@DavidSHershey
David Hershey
2 years
2/ Shoutout to @thesephist whose tweet was what pushed me over the edge to write some thoughts down.
@thesephist
Linus
3 years
Small rant about LLMs and how I see them being put, rather thoughtlessly IMO, into productivity tools. 📄. TL;DR — Most knowledge work isn't a text-generation task, and your product shouldn't ship an implementation detail of LLMs as the end-user interface.
Tweet media one
1
0
2
@DavidSHershey
David Hershey
2 years
@DanielChesley @Work_Bench I might shamelessly steal this for Seattle, I'm so jealous of NYC right now 😭.
0
0
2
@DavidSHershey
David Hershey
3 years
Shoutout to @spolu and the team building for making this ridiculously easy to put together!
0
0
2
@DavidSHershey
David Hershey
2 years
🔥🔥🔥.
@ShreyaR
shreya rajpal
2 years
Excited to release ✨Guardrails AI✨— an open-source package to add SLAs for LLM outputs! Guardrails supports:
🌟 pydantic-style validation of LLM outputs
🌟 corrective actions (e.g. reasking LLM) when needed
🌟 structure and type guarantees (e.g. JSON)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
0
2
@DavidSHershey
David Hershey
2 years
@thesephist +1, and in general I'm surprised by the lack of discourse around building reward models.
0
0
2
@DavidSHershey
David Hershey
3 years
LFG 〽️.
0
0
2
@DavidSHershey
David Hershey
2 years
4/n The response from GPT-3.5 has some gaps. It tries to search for overly complex things (like how many frames are in one episode).
Tweet media one
1
0
2
@DavidSHershey
David Hershey
2 years
8/ @mihail_eric put together an awesome demo ( that blurs the lines between "generative" vs. just providing new interfaces. Makes answering simple data questions so much easier.
1
0
2
@DavidSHershey
David Hershey
2 years
5/n GPT-4 nails it.
Tweet media one
1
0
2
@DavidSHershey
David Hershey
2 years
@RTylerCrown @SarahHinkfuss @RTylerCrown mr. fun fact friday over here 🔥.
0
0
2
@DavidSHershey
David Hershey
7 months
New Sonnet, same hobby
1
1
2
@DavidSHershey
David Hershey
2 years
@KevinAFischer I guess I shouldn't be surprised! I've watched you publicly poke the traits of these models so much, makes sense to measure it too. Would love to chat, will drop you a message.
0
0
1
@DavidSHershey
David Hershey
3 years
@yoheinakajima @dust4ai Full credit to @goodside for the comedic inspiration!
0
0
1
@DavidSHershey
David Hershey
2 years
@spolu Rad! Excited to try it out!
0
0
1
@DavidSHershey
David Hershey
2 years
@thassonjee FWIW I've found it helps non-technical folks ground what I'm talking about. And it's better SEO 🤷‍♂️ the things you do for content lol.
1
0
1
@DavidSHershey
David Hershey
2 years
@sjwhitmore I've been kinda thinking of them as "skills".
0
0
1