iislucas (Lucas Dixon)
@iislucas
Followers: 389 · Following: 101 · Media: 4 · Statuses: 249
machines learn, graphs reason, identity is a non-identity, incompetence over conspiracy, evil by association is evil, expression is never free, stay curious
Paris
Joined January 2010
Mapping LLMs with Sparse Autoencoders https://t.co/FIcY91YkS4 An interactive introduction to Sparse Autoencoders and their use cases with Nada Hussein, @shivamravalxai, Jimbo Wilson, Ari Alberich, @NeelNanda5, @iislucas, and @Nithum
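For the gist in code: a sparse autoencoder learns an overcomplete, sparse dictionary of features that reconstructs a model's activations. The sketch below is a minimal illustration assuming PyTorch, toy dimensions, and random stand-in activations; it is not the code behind the article.

```python
# A minimal sparse autoencoder over model activations (illustrative:
# dimensions, L1 weight, and the random "activations" are assumptions,
# not code from the article).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 768, d_hidden: int = 8 * 768):
        super().__init__()
        self.enc = nn.Linear(d_model, d_hidden)   # overcomplete dictionary
        self.dec = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        f = torch.relu(self.enc(x))               # non-negative, sparse-able features
        return self.dec(f), f

sae = SparseAutoencoder()
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
acts = torch.randn(4096, 768)  # stand-in for cached residual-stream activations

for batch in acts.split(256):
    x_hat, f = sae(batch)
    # Reconstruction loss plus an L1 penalty pushing features toward sparsity.
    loss = ((x_hat - batch) ** 2).mean() + 1e-3 * f.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```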
AI is everywhere, but the most impactful features start with a real user need. Ask how AI can add unique value to that experience, not just how to release an AI feature. Here are three questions to consider before you start building. 1️⃣ What’s the core user problem? 2️⃣
my coworkers have been subjected to this yap so I now continue here: this is genuinely one of the most useful papers to engage with for interp imo. easily one of my most referred to papers. if you have not read it, pls do. mandatory reading until the heat death of the universe.
Veo holograms 🦝⚡️ Visualizing animal superpowers! Just discovered Veo 3's amazing ability to render 3d holograms. Virtual interfaces within the simulated world. 🔊 Prompts in 🧵
We are hiring Applied Interpretability researchers on the GDM Mech Interp Team!🧵 If interpretability is ever going to be useful, we need it to be applied at the frontier. Come work with @NeelNanda5, the @GoogleDeepMind AGI Safety team, and me: apply by 28th February as a
We scaled training data attribution (TDA) methods ~1000x to find influential pretraining examples for thousands of queries in an 8B-parameter LLM over the entire 160B-token C4 corpus! https://t.co/4mglIOAjyB
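The paper scales far heavier machinery, but as a rough intuition for what TDA computes: many methods (TracIn-style ones, for instance) score a training example's influence on a query by a dot product of loss gradients. A toy sketch with a made-up linear model:

```python
# Toy gradient-dot-product influence (TracIn-style intuition), not the
# paper's scaled pipeline. Model and data are made up for illustration.
import torch

model = torch.nn.Linear(16, 2)
params = list(model.parameters())
loss_fn = torch.nn.CrossEntropyLoss()

def flat_grad(loss):
    return torch.cat([g.reshape(-1) for g in torch.autograd.grad(loss, params)])

x_train, y_train = torch.randn(8, 16), torch.randint(0, 2, (8,))
x_query, y_query = torch.randn(1, 16), torch.randint(0, 2, (1,))

g_query = flat_grad(loss_fn(model(x_query), y_query))
scores = [
    torch.dot(g_query, flat_grad(loss_fn(model(x_train[i:i+1]), y_train[i:i+1]))).item()
    for i in range(len(x_train))
]
# Higher score = gradient points the same way as the query's = more influential.
print(sorted(range(len(scores)), key=lambda i: -scores[i]))
```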
What a way to celebrate one year of incredible Gemini progress -- #1🥇across the board on overall ranking, as well as on hard prompts, coding, math, instruction following, and more, including with style control on. Thanks to the hard work of everyone in the Gemini team and
Big news on Chatbot Arena 🔥 The new @GoogleDeepMind model gemini-exp-1206 is crushing it, and the race is heating up. Google is back in the #1 spot 🏆overall and tied with O1 for the top coding model! Highlights (improvement since gemini-exp-1121 in parentheses) - First
🚨 New Paper 🚨 Can LLMs perform latent multi-hop reasoning without exploiting shortcuts? We find the answer is yes – they can recall and compose facts that never appeared together in training, rather than guessing the answer, but success depends heavily on the type of the bridge entity (80%+ for
I’m so proud of the updated version of #MusicFXDJ we developed in collaboration with @jacobcollier, available today at https://t.co/pYopej66CL. Over the past year I’ve spent countless hours experimenting with our real-time music models, and it feels like I’ve learned to play a
It’s here! Thrilled to collab with @jacobcollier on our latest #LabSession exploring the magical possibilities of generative music and #MusicFXDJ. Watch full video: https://t.co/5FuHxCZFtn Tune in 10/24 at 5 PM ET as Jacob livestreams a MusicFX DJ sesh on his YT channel.
🧵Responses to adversarial queries can still remain latent in a safety-tuned model. Why are they revealed sometimes, but not others? And what are the mechanics of this latent misalignment? Does it matter *who* the user is? (1/n)
We’re welcoming a new 2 billion parameter model to the Gemma 2 family. 🛠️ It offers best-in-class performance for its size and can run efficiently on a wide range of hardware. Developers can get started with 2B today → https://t.co/hQRWYwGY7q
Can large language models (LLMs) explain their internal mechanisms? Check out the latest AI Explorable on Patchscopes, an inspection framework that uses LLMs to explain the hidden representations of LLMs. Learn more → https://t.co/mvmix9hKs0
Join us live tomorrow at 2:30pm CET for some exciting updates on our research!
Gemma 2 27B is now the best open model while being 2.5x smaller than alternatives! This validates the work done by the team and Gemini. This is just the beginning 💙♊️
We have also collected more votes for Gemma-2-27B (now 5K+) over the past few days. Gemma-2 stays robust against Llama-3-70B and is now the new best open model!
Being able to interpret an #ML model’s hidden representations is key to understanding its behavior. Today we introduce Patchscopes, an approach that uses #LLMs to provide natural language explanations of their own hidden representations. Learn more → https://t.co/WfY1FYa1Wt
I love music most when it’s live, in the moment, and expressing something personal. This is why I’m psyched about the new “DJ mode” we developed for MusicFX: https://t.co/1Qk1VjnjEE It’s an infinite AI jam that you control 🎛️. Try mixing your unique 🌀 of instruments, genres,
Super excited for the Gemma model release, and with it a new debugging tool we built on 🔥LIT - use gradient-based salience to debug and refine complex LLM prompts!
ai.google.dev
Explore Gemma model’s behavior with The Learning Interpretability Tool (LIT), an open-source platform for debugging AI/ML models. ➡️ Improve prompts using saliency methods ➡️ Test hypotheses to improve model behavior ➡️ Democratize access to ML debugging https://t.co/pEyQAi75nk
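LIT computes and visualizes these scores for you; as for what "gradient-based salience" means mechanically, here is a minimal grad-L2 sketch, assuming GPT-2 via Hugging Face transformers as a stand-in for the actual model and tool:

```python
# Minimal grad-L2 token salience (a sketch of the kind of signal LIT's
# saliency view shows; GPT-2 and the prompt are stand-ins, not LIT internals).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tok("The capital of France is", return_tensors="pt").input_ids
emb = model.transformer.wte(ids).detach().requires_grad_(True)

logits = model(inputs_embeds=emb).logits
target_id = tok(" Paris").input_ids[0]       # score the logit of one next token
logits[0, -1, target_id].backward()

salience = emb.grad.norm(dim=-1)[0]          # one gradient-norm score per token
for t, s in zip(tok.convert_ids_to_tokens(ids[0].tolist()), salience.tolist()):
    print(f"{t:>12} {s:.4f}")
```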
Happy to introduce our paper MusicRL, the first music generation system finetuned with human preferences. Paper link:
arxiv.org
We propose MusicRL, the first music generation system finetuned from human feedback. Appreciation of text-to-music models is particularly subjective since the concept of musicality as well as the...
🧵Can we “ask” an LLM to “translate” its own hidden representations into natural language? We propose 🩺Patchscopes, a new framework for decoding specific information from a representation by “patching” it into a separate inference pass, independently of its original context. 1/9
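A rough sketch of that patching step, assuming GPT-2 via Hugging Face transformers as a stand-in and an illustrative "identity"-style target prompt (the real framework, target prompts, and layer choices are in the paper):

```python
# Sketch of the core Patchscopes move: cache a hidden state from a source
# prompt, inject it into a target prompt's forward pass, and read what the
# model says. Model, layer, positions, and prompts are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
LAYER, SRC_POS, TGT_POS = 6, -1, -1

# 1) Run the source prompt and cache one hidden state.
src = tok("Alexander the Great", return_tensors="pt")
with torch.no_grad():
    hs = model(**src, output_hidden_states=True).hidden_states
h_src = hs[LAYER + 1][0, SRC_POS]            # output of block LAYER at SRC_POS

# 2) Patch it into the target pass via a forward hook on the same block.
def patch(module, inputs, output):
    output[0][0, TGT_POS] = h_src            # overwrite the target position
    return output

handle = model.transformer.h[LAYER].register_forward_hook(patch)
tgt = tok("Syria: country in the Middle East, x:", return_tensors="pt")
with torch.no_grad():
    logits = model(**tgt).logits
handle.remove()
print(tok.decode(logits[0, -1].argmax().item()))  # model's reading of h_src
```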
We often interpret neural nets by studying simplified representations (e.g. low-dim visualization). But how faithful are these simplifications to the original model? In our new preprint, we found some surprising "interpretability illusions"... 1/6