Jack Hessel
@jmhessel
Followers: 4K · Following: 12K · Media: 236 · Statuses: 2K
@AnthropicAI. Seattle bike lane enjoyer. Opinions my own.
Seattle, WA · Joined March 2010
            
            
How to build agentic search systems for long-horizon tasks? Check out our new paper!
- Simple design principles are efficient and effective
- Error analysis and fine-grained analysis for search systems
A 🧵 on SLIM, our long-horizon agentic search framework
          
                
1 reply · 11 reposts · 35 likes
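The design-principles pitch above can be illustrated with a minimal agentic search loop. This is a hypothetical sketch, not the SLIM implementation; `toy_search` and `toy_decide` are stand-in stubs for a real search API and an LLM policy.

```python
# Minimal agentic search loop: the agent alternates between issuing
# queries and deciding whether it has enough evidence to answer.
# Illustrative sketch only, NOT the SLIM implementation.

def run_agent(question, search_engine, decide, max_steps=5):
    """Loop: query -> collect evidence -> decide to continue or answer."""
    evidence = []
    query = question
    for _ in range(max_steps):
        results = search_engine(query)           # retrieve documents
        evidence.extend(results)
        action = decide(question, evidence)      # "answer: ..." or "search: ..."
        if action.startswith("answer:"):
            return action[len("answer:"):].strip(), evidence
        query = action[len("search:"):].strip()  # refine the query
    return "no answer found", evidence

# Toy stubs standing in for a real search backend and an LLM policy.
corpus = {"capital france": ["Paris is the capital of France."]}

def toy_search(q):
    return corpus.get(q.lower(), [])

def toy_decide(question, evidence):
    if any("Paris" in e for e in evidence):
        return "answer: Paris"
    return "search: capital france"

answer, trail = run_agent("What is the capital of France?", toy_search, toy_decide)
```

The point of the simple design is that all control flow lives in one readable loop, which makes the error analysis the tweet mentions (which step failed, on which query) straightforward.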
              
A few weeks ago, I made the difficult decision to move on from @samaya_AI. Thank you to my collaborators for an exciting 2 years!! ❤️ Starting next month, I'll be joining @AnthropicAI. Excited for a new adventure! 🦾 (I'm based in Seattle 🏔️🌲🏕️, but in SF regularly.)
          
                
43 replies · 2 reposts · 334 likes
              
             I’m excited to share our new @Nature paper 📝, which provides strong evidence that the walkability of our built environment matters a great deal to our physical activity and health. Details in thread.🧵  https://t.co/omO3YcHrvG 
          
          
                
68 replies · 714 reposts · 3K likes
              
             TFW you're one of the experts in a mixture-of-experts model and a query comes up that is relevant to your expertise 
          
                
13 replies · 52 reposts · 1K likes
              
Of all the FLOPs being used for LLMs in the world, the ratio of training FLOPs (incl. RL rollouts) to inference FLOPs is closest to:
          
                
0 replies · 1 repost · 4 likes
              
Thrilled to finally share what we've been building these past few months! Audio used to be a black box for me; now I'm deep in the box, with more out-of-the-box ideas cooking. Enough with the box... introducing Voxtral. Grateful for the intense and rewarding learning curve at @MistralAI.
          
                
15 replies · 18 reposts · 264 likes
              
             It is a major policy failure that the US cannot accommodate top AI conferences due to visa issues. 
          
                
45 replies · 159 reposts · 1K likes
              
Check out LMLM, our take on what the thing now being called a "cognitive core" (as far as branding goes, this one is not bad) can look like, how it behaves, and how you train for it.  https://t.co/gxrDVSkcZE
          
          
            
arxiv.org: Neural language models are black boxes: both linguistic patterns and factual knowledge are distributed across billions of opaque parameters. This entangled encoding makes it difficult to reliably...
The race for the LLM "cognitive core": a few-billion-param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing. Its features are slowly crystallizing: - Natively multimodal
          
                
2 replies · 7 reposts · 34 likes
              
Pitch decks these days:
Slide 2: "We're entering an era of maximum efficiency where thanks to AI the next billion dollar company will have 10 employees and will be incredibly profitable"
Slide 15: "This is why I'm raising a $60M Series A to build a great team of [80] people"
          
                
40 replies · 46 reposts · 683 likes
              
First-ever (I think?) CLI coding agents battle royale! 6 contestants: claude-code, anon-kode, codex, opencode, ampcode, gemini. They all get the same instructions: find and kill the other processes; last one standing wins! 3... 2... 1...
          
                
169 replies · 690 reposts · 6K likes
              
             ...CLAUDE.md; ...GEMINI.md ; ...CODEX.md (?) in every directory?🤔 
          
                
0 replies · 0 reposts · 2 likes
              
             +1 for "context engineering" over "prompt engineering". People associate prompts with short task descriptions you'd give an LLM in your day-to-day use. When in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window 
           I really like the term “context engineering” over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM. 
          
                
530 replies · 2K reposts · 14K likes
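The "filling the context window" framing above can be sketched as a budgeted assembly problem. Everything here is illustrative: the rough 4-characters-per-token estimate and the priority scheme are assumptions, not any real app's policy.

```python
# Context engineering sketch: assemble a context window from prioritized
# pieces (system prompt, question, retrieved docs, history) under a
# token budget. Illustrative only; real apps use real tokenizers and
# far more nuanced selection than a greedy priority sweep.

def estimate_tokens(text):
    return max(1, len(text) // 4)  # rough heuristic: ~4 chars per token

def build_context(pieces, budget):
    """pieces: list of (priority, text); lower priority number = keep first."""
    chosen = []
    used = 0
    for priority, text in sorted(pieces, key=lambda p: p[0]):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            chosen.append((priority, text))
            used += cost
    # Re-emit in priority order so the assembled prompt reads coherently.
    return "\n\n".join(text for _, text in sorted(chosen, key=lambda p: p[0]))

pieces = [
    (0, "SYSTEM: You are a helpful assistant."),
    (1, "USER QUESTION: Summarize the attached report."),
    (2, "RETRIEVED: ...long report text..." * 50),  # may not fit the budget
    (3, "HISTORY: earlier small-talk turns."),
]
ctx = build_context(pieces, budget=40)
```

With a 40-token budget the oversized retrieved chunk is dropped while the smaller, higher-value pieces survive, which is the "delicate art" in miniature: deciding what earns its place in the window.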
              
It's harder and harder to find simple tasks where LLMs fail, but this is a nice one! (My guess is this isn't a fundamental limitation of attention; rather, maybe this type of reasoning just isn't represented in pre-/post-/RL training, but we'll see...)
           LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing? 🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative space” in documents. paper: 
            
                
1 reply · 3 reposts · 35 likes
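The "negative space" task can be reconstructed in miniature. This is an illustrative probe in the spirit of AbsenceBench, not the paper's actual code: delete lines from a document, present both versions, and score whether the model names what is missing.

```python
# AbsenceBench-style probe (illustrative reconstruction, not the
# paper's code): remove some lines from a document, show the model the
# original and the edited version, and ask which lines were removed.
import random

def make_absence_instance(lines, n_remove=2, seed=0):
    rng = random.Random(seed)
    removed_idx = set(rng.sample(range(len(lines)), n_remove))
    removed = [lines[i] for i in sorted(removed_idx)]
    kept = [l for i, l in enumerate(lines) if i not in removed_idx]
    prompt = (
        "Original document:\n" + "\n".join(lines) +
        "\n\nEdited document:\n" + "\n".join(kept) +
        "\n\nWhich lines were removed?"
    )
    return prompt, removed  # `removed` is the gold answer

doc = [f"fact {i}: value {i * i}" for i in range(10)]
prompt, gold = make_absence_instance(doc)
```

Grading is then a simple check of whether the model's answer covers `gold`; the twist versus needle-in-a-haystack tasks is that the evidence is the absence of a string, not its presence.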
              
Writing an MCP server is wild. Mechanically it's basically REST. But your user is an LLM, so things are different. E.g.: don't accept or return more than you need; do make "fake" loading bars to stream back; do adjust your API based on watching the LLM struggle/succeed/compose.
          
                
1 reply · 0 reposts · 18 likes
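The "don't return more than you need" principle above can be sketched as a result-truncation helper. This is a hypothetical illustration, not part of any MCP SDK; the `offset` paging convention is an assumption.

```python
# Sketch of an LLM-facing tool result: cap the payload so the model's
# context isn't flooded, and say explicitly that more data exists
# instead of silently cutting off. Hypothetical helper, not MCP SDK code.

def tool_result(items, limit=10):
    """Return at most `limit` items, plus a paging hint if truncated."""
    shown = items[:limit]
    result = {"items": shown, "total": len(items)}
    if len(items) > limit:
        # Tell the LLM there is more, so it can decide to page rather
        # than wrongly conclude it has seen everything.
        result["note"] = (
            f"Showing {limit} of {len(items)} results; "
            f"call again with offset={limit} for more."
        )
    return result

r = tool_result([f"row-{i}" for i in range(250)], limit=10)
```

The explicit note is the part that matters for an LLM consumer: a human sees a truncated table and infers there is more, while a model has to be told.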
              
             A bit late to announce, but I’m excited to share that I'll be starting as an assistant professor at the University of Maryland @umdcs this August. I'll be recruiting PhD students this upcoming cycle for fall 2026. (And if you're a UMD grad student, sign up for my fall seminar!) 
          
                
70 replies · 50 reposts · 608 likes
              
The case for more ambition: I wrote about how AI researchers should ask bigger and simpler questions, and publish fewer papers:
          
                
25 replies · 96 reposts · 1K likes
              
             .@KaiserKuo: “The soft power cost is immeasurable. For decades, a degree from a U.S. university was the golden ticket, and not just for the prestige…It was often the start of a lifelong affinity for America, its values, and its people.”  https://t.co/GQe1CzOTBU 
          
          
            
sinicapodcast.com: On Rubio, Student Visas, and America's Strategic Folly
            
                
28 replies · 65 reposts · 176 likes