Madison May (e/ia)

@pragmaticml

Followers 2K · Following 24K · Media 60 · Statuses 3K

teaching machines @indicodata - professional novice

Asheville, NC
Joined March 2010
@pragmaticml
Madison May (e/ia)
6 years
This week's blog post explores methods for incorporating longer-term context in transformers! Featuring 6 unique approaches: - Sparse Transformers - Adaptive Span Transformers - Transformer-XL - Compressive Transformers - Reformer - Routing Transformer https://t.co/xzl1yrrdyx
pragmatic.ml
Exploring 6 noteworthy approaches for incorporating longer-term context in transformer models.
7
72
293
@soumithchintala
Soumith Chintala
3 days
Leaving Meta and PyTorch I'm stepping down from PyTorch and leaving Meta on November 17th. tl;dr: Didn't want to be doing PyTorch forever, seemed like the perfect time to transition right after I got back from a long leave and the project built itself around me. Eleven years
496
552
11K
@VikParuchuri
Vik Paruchuri
4 days
We're hiring in NYC! DM if you're interested in training SoTA OCR models, and helping thousands of customers (including tier 1 AI labs) work with documents.
@datalabto
Datalab
4 days
We're growing the team! 📍 NYC 🧠 Roles: Research Engineer, Founding Solutions Engineer 🔗 https://t.co/VSSj65go0K You: want to build and scale SoTA infrastructure loved by 50k+ developers and trusted by tier 1 research labs, Fortune 100s, and hyper-growth startups Us:
1
6
42
@pragmaticml
Madison May (e/ia)
17 days
Any good OSS options for translating LLM token usage numbers to $? Looking for an option that properly accounts for cached input pricing and has some auto-update mechanism to pull latest prices.
0
0
0
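No tool is named in the replies, but the core computation the question asks for is small. A minimal sketch of usage-to-dollars conversion with cached-input pricing; the model name and per-million-token prices below are illustrative placeholders, not real vendor rates, and a real tool would refresh the price table automatically:

```python
# Illustrative token-usage -> USD conversion with cached-input pricing.
# Prices are made-up placeholders (USD per 1M tokens), not real vendor rates.
PRICES = {
    "example-model": {"input": 3.00, "cached_input": 0.30, "output": 15.00},
}

def usage_to_usd(model, input_tokens, cached_input_tokens, output_tokens):
    """Bill cached input tokens at the discounted rate; the rest at full price."""
    p = PRICES[model]
    uncached = input_tokens - cached_input_tokens
    cost = (
        uncached * p["input"]
        + cached_input_tokens * p["cached_input"]
        + output_tokens * p["output"]
    ) / 1_000_000
    return round(cost, 6)

print(usage_to_usd("example-model", 10_000, 8_000, 500))
```

The "auto-update mechanism" asked for in the tweet would replace the hard-coded `PRICES` dict with a periodically refreshed feed.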
@wolfshowme
Lei Cui
27 days
Beyond just text quality! We're introducing #DocReward, a model that evaluates and improves the visual structure and style of documents. In our tests, DocReward achieved a 60.8% win rate in generating human-preferred documents, compared to GPT-5's 37.7%. https://t.co/ChY9z0X05f
arxiv.org
Recent advances in agentic workflows have enabled the automation of tasks such as professional document generation. However, they primarily focus on textual quality, neglecting visual structure...
1
4
7
@jeremyphoward
Jeremy Howard
1 month
It's a strange time to be a programmer—easier than ever to get started, but easier to let AI steer you into frustration. We've got an antidote that we've been using ourselves with 1000 preview users for the last year: "solveit" Now you can join us.🧵 https://t.co/GLKm0woI8b
answer.ai
You can now sign up for Solveit, which is a course in how to solve problems (including coding, writing, sysadmin, and research) using fast short iterations, and also provides a platform that makes this...
47
114
876
@thinkymachines
Thinking Machines
1 month
Efficient training of neural networks is difficult. Our second Connectionism post introduces Modular Manifolds, a theoretical step toward more stable and performant training by co-designing neural net optimizers with manifold constraints on weight matrices.
119
461
3K
@simonw
Simon Willison
2 months
I'm ready to accept a definition of "agent" that I think is widely-enough agreed upon to be useful: An LLM agent runs tools in a loop to achieve a goal This is a big piece of personal character development for me! I've been dismissing the term as hopelessly ambiguous for years
134
122
2K
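The definition above ("runs tools in a loop to achieve a goal") is compact enough to sketch. A toy skeleton, assuming nothing about any real API: `call_llm` is a stand-in for a chat-completion call and the tool registry is a made-up example:

```python
# Minimal "tools in a loop" agent skeleton matching Willison's definition.
# call_llm is a stand-in for any chat-completion API; the tool is a toy example.

def add(a, b):
    return a + b

TOOLS = {"add": add}

def call_llm(messages):
    # Stand-in model: requests the add tool once, then answers.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"final": f"The answer is {messages[-1]['content']}"}

def run_agent(goal, max_steps=5):
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):              # the loop
        reply = call_llm(messages)
        if "final" in reply:                # goal reached: stop looping
            return reply["final"]
        result = TOOLS[reply["tool"]](**reply["args"])  # run the requested tool
        messages.append({"role": "tool", "content": result})
    return "gave up"

print(run_agent("What is 2 + 3?"))
```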
@OnlyXuanwo
Xuanwo
3 months
One thing I beg @charliermarsh to build: https://t.co/j1VgMoaA0V for python. - "uv docs": build docs for given python packages - A https://t.co/LIyi0S181w website: similar to https://t.co/j1VgMoaA0V, auto build after every version updates. - Markdown native support with runnable
9
9
308
@RazRazcle
Raza Habib
3 months
So pleased to announce the Humanloop team is joining @AnthropicAI! I couldn't imagine a better home for the team. Everyone I've interacted with at Anthropic has been incredibly talented, high-trust and conscious of the stakes at play. Enormous gratitude to our customers and
@humanloop
Humanloop
3 months
We're thrilled to announce that the Humanloop team is joining @AnthropicAI! Our mission has always been to enable the rapid and safe adoption of AI. Now, as AI progress accelerates, we think Anthropic is the ideal home to continue this work.
73
12
401
@mrsiipa
maharshi
3 months
why aren’t there more research papers like this?
40
110
2K
@karpathy
Andrej Karpathy
4 months
Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly
415
857
8K
@_kevinlu
Kevin Lu
4 months
I really like this diagram from @_jasonwei and @hwchung27 about how to view the bitter lesson: It's a mistake not to add structure now, it's a mistake to not remove that structure later. We're at the precipice of setting up a huge, powerful RL training run that will define the
3
6
137
@karpathy
Andrej Karpathy
5 months
The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystalizing: - Natively multimodal
@osanseviero
Omar Sanseviero
5 months
I’m so excited to announce Gemma 3n is here! 🎉 🔊Multimodal (text/audio/image/video) understanding 🤯Runs with as little as 2GB of RAM 🏆First model under 10B with @lmarena_ai score of 1300+ Available now on @huggingface, @kaggle, llama.cpp, https://t.co/CNDy479EEv, and more
397
1K
11K
@karpathy
Andrej Karpathy
5 months
+1 for "context engineering" over "prompt engineering". People associate prompts with short task descriptions you'd give an LLM in your day-to-day use. When in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window
@tobi
tobi lutke
5 months
I really like the term “context engineering” over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM.
529
2K
14K
@ryolu_
Ryo Lu
5 months
yes to queuing, and how about a kanban board for your whole fleet of agents?
@benhylak
ben
5 months
i want to be able to queue messages in cursor. (e.g. send a follow up while it's still thinking about the previous message) cc/ @ryolu_
243
125
4K
@karpathy
Andrej Karpathy
5 months
Nice - my AI startup school talk is now up! Chapters: 0:00 Imo fair to say that software is changing quite fundamentally again. LLMs are a new kind of computer, and you program them *in English*. Hence I think they are well deserving of a major version upgrade in terms of
@ycombinator
Y Combinator
5 months
Andrej Karpathy's (@karpathy) keynote yesterday at AI Startup School in San Francisco.
225
1K
9K
@j_foerst
Jakob Foerster
5 months
I suggest a new metric: Pass@1/K. For a given K, you only get a point if all K attempts were successful. So it's a continuation of the Pass@K graph to the left-hand side and intuitively measures robustness / confidence.
6
7
122
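The proposed metric is easy to compute as described: a task scores a point only if all K attempts succeed. A sketch assuming boolean per-attempt outcomes (the data below is illustrative); note that Pass@1/1 coincides with ordinary Pass@1, and larger K penalizes flaky solutions:

```python
# Pass@1/K as proposed: a task scores 1 only if all K attempts succeed.
# Attempts are modeled as lists of booleans per task; the data is illustrative.

def pass_all_k(task_attempts, k):
    """Fraction of tasks whose first k attempts all succeeded."""
    scores = [all(attempts[:k]) for attempts in task_attempts]
    return sum(scores) / len(scores)

results = [
    [True, True, True],     # robustly solved
    [True, False, True],    # flaky: counts for Pass@1 but not Pass@1/2
    [False, False, False],  # unsolved
]
print(pass_all_k(results, 1))  # ordinary Pass@1: 2 of 3 tasks
print(pass_all_k(results, 3))  # Pass@1/3: only 1 of 3 tasks
```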
@benhylak
ben
5 months
i want to be able to queue messages in cursor. (e.g. send a follow up while it's still thinking about the previous message) cc/ @ryolu_
24
4
383
@jam3scampbell
James Campbell
5 months
excited to announce i'm leaving my PhD to join @OpenAI! i'll be working on memory + personality for AGI and ChatGPT memory will fundamentally change our relationship to machine intelligence, and i plan to work extraordinarily hard to make sure we get this right for humanity
457
281
9K