David Hershey @DavidSHershey profile

David Hershey

@DavidSHershey

Followers

2K

Following

4K

Media

28

Statuses

224

AI Generalist | Writer of https://t.co/l1jTizWyTv

Joined April 2017

David Hershey

@DavidSHershey

3 months

So, I did a thing 🙂. This was really just a fun little side project - I wanted to spend some time working on agents, and Pokemon was the most fun way I could come up with. And then it kinda took off! 3.7 Sonnet is so fun to watch play!

Anthropic

@AnthropicAI

3 months

A few researchers at Anthropic have, over the past year, had a part-time obsession with a peculiar problem. Can Claude play Pokémon? A thread:

29

22

552

David Hershey

@DavidSHershey

2 years

I built a thing! a chatbot built to help founders learn about the fundamentals of building seed-stage tech companies. Powered by @qdrant_engine ♥️ Definitely the easiest vector database to get started with for LLM applications.

5

8

53

David Hershey

@DavidSHershey

2 years

1/n) Never going to miss a chance to write about new devtools! @doppenhe and I wrote an overview of the new wave of devtools empowering devs to build with large language models! Highlights in this 🧵

1

10

35

David Hershey

@DavidSHershey

3 months

I also want to shout out @computerender - their video on RL for Pokemon was the first inspiration for us to try hooking up Claude to see how it did!

1

35

David Hershey

@DavidSHershey

5 years

@UMichAthletics His name is Bo Schembarkler 🐾

1

7

32

David Hershey

@DavidSHershey

3 months

So naturally I have a stream that I threw together with some friends. Check it out! Let me know if you have any questions! Have fun!

1

0

29

David Hershey

@DavidSHershey

3 years

I'm thrilled to have joined Unusual Ventures as an investor! Get to spend my days talking to smart data + ML folks -- going to be a blast. Reach out if you want to talk shop!

4

1

24

David Hershey

@DavidSHershey

3 years

Tell me about today's news, in the style of a pirate 🏴‍☠️ Having fun with LLMs via @dust4ai.

2

23

David Hershey

@DavidSHershey

11 months

Claude 3.5 Sonnet is an incredible model! So excited it's out! I have a new hobby - making stupid games with Sonnet 3.5 + Artifacts. Reminds me of old-school flash game websites, but as fun as your imagination.

4

1

22

David Hershey

@DavidSHershey

2 years

1/ The current generation of ML systems is so much bigger than just "Generative". Finally put words to something I've been stewing on for a while.

1

3

20

David Hershey

@DavidSHershey

2 years

🧵 LLM evaluation is broken right now. There are no good objective measures of the "quality" of a model, so everyone is flying blind. (1/n)

2

1

19

David Hershey

@DavidSHershey

3 months

I have seen Claude do a lot of adorable things, this has to be at the top.

Zack Witten

@zswitten

3 months

My favorite Claude Plays Pokémon tidbit (mentioned in @latentspacepod) is that when @DavidSHershey told Claude to nickname its Pokémon, it instantly became much more protective of them, making sure to heal them when they got hurt.

1

0

19

David Hershey

@DavidSHershey

3 years

@sarahcat21 @codexeditor Pretty bullish on "LLMs are a new type of database that can power new applications", but whew this is a bit hotter of a take than that 😅

1

17

David Hershey

@DavidSHershey

3 months

This was really fun!

Alessio Fanelli

@FanaHOVA

3 months

New lightning pod with @DavidSHershey on how Claude Plays Pokémon was made, w/ special co-host @vibhuuuus! We covered:
- Designing tools for AI agents playing games
- Managing memory for long running tasks
- Why naming Pokémons is important
Watch now 📺

1

0

15

David Hershey

@DavidSHershey

1 year

Anthropic rules 🌉

Anthropic

@AnthropicAI

1 year

This week, we showed how altering internal "features" in our AI, Claude, could change its behavior. We found a feature that can make Claude focus intensely on the Golden Gate Bridge. Now, for a limited time, you can chat with Golden Gate Claude:

0

1

13

David Hershey

@DavidSHershey

2 years

@goodside @goodside every time a model update happens

0

12

David Hershey

@DavidSHershey

3 months

Will try to answer some questions here if anyone has them!

12

0

12

David Hershey

@DavidSHershey

2 months

LFG, come hack on Pokemon with me on Sunday.

Latent.Space 🔜 @aiDotEngineer

@latentspacepod

2 months

THIS SUNDAY IN SF. COME PLAY POKEMON. WITH AI AGENTS. AND CLAUDE. WITH SPECIAL GUEST @DavidSHershey. AND VMS FROM @JESSEMHAN. 100 SEATS ONLY.

1

11

David Hershey

@DavidSHershey

2 years

Awesome conversation about open source AI with @vipulved, @rxin and @weiliendang the other day!

Unusual Ventures

@Unusual_VC

2 years

Leading founders weigh in on the intricacies of building with open source AI. @vipulved and @rxin talked w/ @weiliendang and @DavidSHershey about the future of open source AI and foundation models. Hear Reynold explain why @databricks chose to build Dolly w/ open source AI 👇

0

1

9

David Hershey

@DavidSHershey

2 years

3) You have to start by designing good prompts and getting the right data to the LLM context window. @LangChainAI @dust4ai @gpt_index @CognosisAI are rocking the prompt building and chaining. @weaviate_io @qdrant_engine @milvusio @pinecone make vector DBs easy!

1

10

David Hershey

@DavidSHershey

2 years

Seen a weird number of straw man arguments about the shortcomings of AI in the last few days? Congrats on finding something ChatGPT can't do? In the meanwhile everyone else can go change the world figuring out all the things it *can* do lol.

2

1

9

David Hershey

@DavidSHershey

2 years

Such a privilege to get to work with the amazing team at @roboto_ai; they're building such a cool product. If you're in robotics, check out their sandbox!

Unusual Ventures

@Unusual_VC

2 years

Amazing things are happening with AI + robotics. And @roboto_ai is at the center of it all. Learn about the insights and experience from Co-Founders @BBarash and Yves Albers-Schoenberg that led to this startup, and why we led their Seed.

0

4

9

David Hershey

@DavidSHershey

2 years

4/ Shoutouts to some examples of folks building cool apps: @vectara working on revolutionizing search.

2

4

9

David Hershey

@DavidSHershey

1 year

How high do I have to get on hackernews to get my honorary engineer badge

1

0

9

David Hershey

@DavidSHershey

2 years

1/n GPT-4 just further confirms to me that LLM agents are absolutely the future. The biggest limitation of GPT-3.5 was deficiencies in the ability to do advanced planning, which seems to be changing quickly. 🧵

1

4

7

David Hershey

@DavidSHershey

2 years

@aniiyengar @imjaredz Finally got a version done last night! S/o to @promptlayer (@imjaredz) for making some of the complex chains of prompts more manageable 🍰!

1

2

7

David Hershey

@DavidSHershey

2 years

The most promising approach (IMO): custom reward models. Model user preference directly and use it to evaluate your app before you ship. Shoutout to @thesephist, one 15-minute convo with him put me down the rabbit hole on this topic. (4/n)

1

0

6

David Hershey

@DavidSHershey

3 years

@codexeditor @sarahcat21 Another version: "What queries are better suited for LLMs than other databases?"

1

6

David Hershey

@DavidSHershey

2 months

@FanaHOVA SO down. Let's make it happen.

0

5

David Hershey

@DavidSHershey

2 years

@matei_zaharia @james_y_zou @OfficialLoganK thoughts? 🤔

0

4

David Hershey

@DavidSHershey

11 months

0

1

5

David Hershey

@DavidSHershey

2 years

7/7) Thanks to @tristanzajonc @willpienaar and a few others for their input and help, and to lots of others who reviewed this!

0

1

5

David Hershey

@DavidSHershey

2 years

What an awesome view into why training LLMs requires so much high-quality talent. "This level of perfection is like eight billion people copy[ing] the complete works of Shakespeare for the 14 billion years the universe has existed and not have a single person make a mistake!"

Adept

@AdeptAILabs

2 years

If your loss curves look sus, join the club! Giant LLM training runs are full of pitfalls. We learned the hard way. We wrote a deep dive for the community on silent data corruptions (SDCs). Problem and mitigations here:

0

1

David Hershey

@DavidSHershey

2 years

1/6🧵 Happy LLaMA 2 day! In honor of @MetaAI giving us the best reason yet to host your own model, here's a quick thread on how to choose an LLM 👇 Starting with the most popular option: use a hosted model!

1

0

5

David Hershey

@DavidSHershey

2 years

4) As teams progress, lots of new challenges emerge to improve and maintain LLM features. @humanloop @honeyhiveai are leading the way helping teams manage the complexity of LLMs in prod

1

5

David Hershey

@DavidSHershey

2 years

@qdrant_engine is an incredible team to work with, can't wait to see everything they will accomplish!

Qdrant

@qdrant_engine

2 years

So, we raised a $7.5M seed round, here is what our CEO has to say about it: And here is what we are going to do with it 🧵👇

0

1

4

David Hershey

@DavidSHershey

2 years

6/n As long as the reasoning and planning are sound, it's trivial to hook these up to LLM agents to complete the tasks. We're ~there now. Now it's time for engineering.

1

0

4

David Hershey

@DavidSHershey

2 years

5) We have hot takes!

1

4

David Hershey

@DavidSHershey

2 years

So uh. anyone have access to Claude-100k and want to let me know what inference latency looks like as context size scales? 🙂

2

0

3

David Hershey

@DavidSHershey

2 years

One primary problem: defining what we're even measuring! "Goodness" for LLMs is complicated and depends on your task. Follow @KevinAFischer if you want to understand why. (2/n)

1

4

David Hershey

@DavidSHershey

2 years

7/n That's why @AdeptAILabs announced a huge fundraise today. LLM-backed agents are about to change our world. Strap in, things are going to get weird. 🚀🚀🚀

2

0

4

David Hershey

@DavidSHershey

2 years

@KevinAFischer @OpenAI The difficulty of being a research, consumer, and dev tool company at the same time really shows when it comes to the DevEx of their API products.

2

0

4

David Hershey

@DavidSHershey

2 years

Well this would be pretty incredible 👀

Vipul Ved Prakash

@vipulved

2 years

The era of sub-quadratic LLMs is about to begin. At @togethercompute we've been building next gen models with large space state architectures and training them on very long sequences and the results from the recent builds are. incredible. Will share more as we get closer to

0

3

David Hershey

@DavidSHershey

2 years

5/ This demo from @chillzaza_ is a great example of using LLMs to augment applications.

Zahid Khawaja

@chillzaza_

3 years

Universal Q&A on @lucidweb_ works on Wikipedia articles. Check it out! 🪄 Ask questions about any Wikipedia article and the citation feature will take you straight to the source! Request access at - I'm working super hard to roll this out to everyone! 😅

1

4

David Hershey

@DavidSHershey

2 years

2) First we break down how teams are building features with LLMs

1

4

David Hershey

@DavidSHershey

2 years

@m_morzywolek @hwchase17 😇

2

0

4

David Hershey

@DavidSHershey

3 years

@codexeditor @sarahcat21 I don't think I would frame it as instead! My Q: "here is a performant database that contains the content of the internet and is queried with natural language; what application will you build with that?" There are definitely some questions better answered by LLMs though!

2

1

4

David Hershey

@DavidSHershey

2 years

@OfficialLoganK @Chrisprucha @KevinAFischer @OpenAI I feel like the core issue is that LM model updates are inherently undocumented, breaking API changes. Fine if you assume that "smarter" is all that matters, but for the long tail of use cases these migrations will be pretty painful.

1

0

4

David Hershey

@DavidSHershey

3 years

@amanjha__ @SaveToNotion #thread

1

0

3

David Hershey

@DavidSHershey

3 months

@benankdev Yeah, it has a handful of small tips that are mostly built around things that it normally gets confused by. Pretty minimal overall though!

0

3

David Hershey

@DavidSHershey

3 years

@jheitzeb We should set up some sort of meetup, would be great to get folks together!

1

0

3

David Hershey

@DavidSHershey

2 years

2/n The holy grail is obvious -- right now, we're stuck in AI "copilot" mode, where AI generates (incredible) outputs, and humans validate that output. We're headed toward autopilot for more and more complex tasks.

1

0

3

David Hershey

@DavidSHershey

2 years

There is so much opportunity in AI right now, and Unusual was created to help the best technical founders build great companies. This program is going to kick ass. Such a great opportunity for folks thinking about AI companies to build and learn together.

Unusual Ventures

@Unusual_VC

2 years

Applications are open for our AI studio for builders! Come hang out this summer with our community of AI enthusiasts, builders, and founders!

0

1

3

David Hershey

@DavidSHershey

2 years

@sh_reya Yes! I worked in MLOps for years, hoping that more tools would mean more people could use ML, and it turns out that more general ML models were the answer!

1

0

3

David Hershey

@DavidSHershey

3 years

Nothing more fun than a conversation with @Dpbrinkm and the @mlopscommunity!

MLOps Community

@mlopscommunity

3 years

What a pleasure talking to @DavidSHershey about Building a Movie Recommendation System on @TectonAI with @SnowflakeDB. Tecton integrates with Snowflake and enables data teams to process ML features and serve them in production quickly and reliably,

1

0

3

David Hershey

@DavidSHershey

2 years

Lots of unique evaluation concepts out there right now! Love the work @lmsysorg is doing giving Elo rankings to models with human comparisons. (3/n)

lmarena.ai

@lmarena_ai

2 years

⚔️Chatbot Arena Leaderboard Update! Exciting to welcome new entrants:
- Google PaLM 2
- Claude-instant-v1
- MosaicML MPT-7B
The competition is heating up🔥 Check out our analysis for all the surprising results at Remember, your vote shapes the arena.

1

0

3

David Hershey

@DavidSHershey

2 years

6) And reflection!

1

3

David Hershey

@DavidSHershey

2 years

Even the NYT is a ChatGPT bro now 😭

0

3

David Hershey

@DavidSHershey

2 years

@hwchase17 @m_morzywolek I'm currently working on some tools to manage transcripts from my zoom calls, and this is SO helpful and so cool.

2

0

3

David Hershey

@DavidSHershey

2 years

@imjaredz I hear prompt 7 from "10 ChatGPT prompts that will change your life" is a banger though.

0

3

David Hershey

@DavidSHershey

2 years

@hwchase17 @m_morzywolek Also wow good timing with this:

OpenAI

@OpenAI

2 years

Our new embedding model is significantly more capable at language processing and code tasks, cost effective, and simpler to use.

0

3

David Hershey

@DavidSHershey

2 years

Being an Android user has never felt worse tbh.

0

3

David Hershey

@DavidSHershey

2 years

Also @LangChainAI + @qdrant_engine is a match made in heaven; the combo of the two makes building lightning quick. Thanks LangChain team!

0

3

David Hershey

@DavidSHershey

1 year

@HamelHusain 🔥🔥🔥 your work is appreciated!

0

1

David Hershey

@DavidSHershey

5 years

@AceAnbender Am I allowed to want us to hire John Beilein as our next football head coach?

0

1

David Hershey

@DavidSHershey

3 years

@blennon_ for just semantic search.

0

2

David Hershey

@DavidSHershey

3 years

👇

Linus

@thesephist

3 years

The fact that LLMs generate text is not the point. LLMs are cheap, infinitely scalable black boxes to soft human-like reasoning. That's the headline! The text I/O mode is just the API to this reasoning genie. It's a side effect of the training paradigm.

0

2

David Hershey

@DavidSHershey

2 years

Check out the whole blog: And if you're building in the reward model space, hit me up, would love to jam on the topic. (/🧵)

1

0

2

David Hershey

@DavidSHershey

3 years

@JordanDAndersen @jheitzeb Specifically around LLM development? Would love to join if any are going strong.

1

0

2

David Hershey

@DavidSHershey

2 years

@HamelHusain I have the same irrational confidence for round two and don't know if that's a good sign or a bad sign 🥴

0

1

David Hershey

@DavidSHershey

2 years

This is awesome -- of all of the (many) meetups that have cropped up, this has to be the most exciting theme. Rock on @swyx @Mappletons!

swyx 🔜 @aiDotEngineer (Jun 3-5)

@swyx

2 years

REQUEST FOR DEMOS. Come join the first AI | UX: Beyond the Textbox! SAVE THE DATE: APR 19 in SF, recorded. Hosted by @mappletons and me, @geoffreylitt, @thesephist, @sgrove. ***If you have a 1-2min AI UX concept to share and want to meet fellow builders, PLS APPLY below!***

1

0

2

David Hershey

@DavidSHershey

2 years

This is why I get so excited about this space -- we're just scratching the surface of what happens when you build experiences or apps on top of the initial output of an LLM. So many possibilities.

Grant Slatton

@GrantSlatton

2 years

GPT can iteratively write, debug, and test programs to accomplish arbitrary goals. Pictured: GPT reading snippets of HTML from HN and building a headline scraper in Python, overcoming bugs by simply reading the errors and self-judgments and hypothesizing to itself. Thread ↓

0

2

David Hershey

@DavidSHershey

3 years

Today: Aye, hello there mateys! There be a good deal happenin' on the seven seas today, startin' with the news that the US midterm elections be underway. With control of Congress hangin' in the balance, this be a mighty important election for the future of the country.

1

0

2

David Hershey

@DavidSHershey

3 months

@Aizkmusic done!

1

0

2

David Hershey

@DavidSHershey

2 years

3/ I feel like a lot of people in the space are stuck in this rut where they are laser-focused on giving users tools to "generate". Those tools will be awesome, but they're just a part of the story. I'm excited about the apps we can build around LLMs.

1

0

2

David Hershey

@DavidSHershey

2 years

Really excited to chat with some leaders in the open-source AI space on June 21st!

Unusual Ventures

@Unusual_VC

2 years

Interested in hearing more about what's happening in open-source AI? Sign up for our fireside chat on 6/21:

0

1

2

David Hershey

@DavidSHershey

3 months

@DanAdvantage Claude gets a screen overlay that shows coordinates over the screen, then it can choose to go to those coordinates -- it's not great at understanding its relative position to things still, so this gives it a pretty big boost.

2

0

2

David Hershey

@DavidSHershey

2 years

If you want to get crazy, you can even generate simulated interactions with LLMs to pass through a reward model to scale this out. Lots of possibilities. (5/n)

1

0

2

David Hershey

@DavidSHershey

2 years

3/n Let's do a side-by-side of a few planning queries, using my favorite example from the folks at @oughtinc: "How far would all the film frames that make up the 400-plus episodes of The Simpsons stretch?"

1

0

2

David Hershey

@DavidSHershey

2 years

last/n engineering around LLMs takes time and tools and hard work (@fixieai @LangChainAI @gpt_index will tell you that!), but we're going to get there really soon.

0

2

David Hershey

@DavidSHershey

2 years

Can't say enough about @qdrant_engine -- local mode made getting started simple, and easy to harden over time with their hosted offering.

1

0

2

David Hershey

@DavidSHershey

2 years

@KevinAFischer @OpenAI Especially when each of those products competes very directly for both attention *and* GPU availability.

1

0

2

David Hershey

@DavidSHershey

2 years

Shoutout to @railway (another incredible portco) - I've never hosted a webapp before, and Railway felt like magic making it all work.

1

0

2

David Hershey

@DavidSHershey

2 years

@sarahcat21 Is it so bad to have different stacks? The problem spaces can be so fundamentally different that you can essentially view them as different technologies. Maybe we can see some component consolidation at least 🙂

2

0

2

David Hershey

@DavidSHershey

2 years

2/ Shoutout to @thesephist whose tweet was what pushed me over the edge to write some thoughts down.

Linus

@thesephist

3 years

Small rant about LLMs and how I see them being put, rather thoughtlessly IMO, into productivity tools. 📄 TL;DR — Most knowledge work isn't a text-generation task, and your product shouldn't ship an implementation detail of LLMs as the end-user interface.

1

0

2

David Hershey

@DavidSHershey

2 years

@DanielChesley @Work_Bench I might shamelessly steal this for Seattle, I'm so jealous of NYC right now 😭

0

2

David Hershey

@DavidSHershey

3 years

Shoutout to @spolu and the team building for making this ridiculously easy to put together!

0

2

David Hershey

@DavidSHershey

2 years

🔥🔥🔥

shreya rajpal

@ShreyaR

2 years

Excited to release ✨Guardrails AI✨— an open-source package to add SLAs for LLM outputs! Guardrails supports:
🌟 pydantic-style validation of LLM outputs
🌟 corrective actions (e.g. reasking LLM) when needed
🌟 structure and type guarantees (e.g. JSON)

0

2

David Hershey

@DavidSHershey

2 years

@thesephist +1, and in general I'm surprised by the lack of discourse around building reward models.

0

2

David Hershey

@DavidSHershey

3 years

LFG 〽️

0

2

David Hershey

@DavidSHershey

2 years

4/n The response from GPT-3.5 has some gaps. It tries to search for overly complex things (like how many frames are in one episode).

1

0

2

David Hershey

@DavidSHershey

2 years

8/ @mihail_eric put together an awesome demo that blurs the lines between "generative" vs. just providing new interfaces. Makes answering simple data questions so much easier.

1

0

2

David Hershey

@DavidSHershey

2 years

5/n GPT-4 nails it.

1

0

2

David Hershey

@DavidSHershey

2 years

@RTylerCrown @SarahHinkfuss @RTylerCrown mr. fun fact friday over here 🔥

0

2

David Hershey

@DavidSHershey

7 months

New Sonnet, same hobby

1

2

David Hershey

@DavidSHershey

2 years

@KevinAFischer I guess I shouldn't be surprised! I've watched you publicly poke the traits of these models so much, makes sense to measure it too. Would love to chat, will drop you a message.

0

1

David Hershey

@DavidSHershey

3 years

@yoheinakajima @dust4ai Full credit to @goodside for the comedic inspiration!

0

1

David Hershey

@DavidSHershey

2 years

@spolu Rad! Excited to try it out!

0

1

David Hershey

@DavidSHershey

2 years

@thassonjee FWIW I've found it helps non-technical folks ground what I'm talking about. And it's better SEO 🤷‍♂️ the things you do for content lol.

1

0

1

David Hershey

@DavidSHershey

2 years

@sjwhitmore I've been kinda thinking of them as "skills".

0

1