 
            
Mathias Lechner (@mlech26l)
Cofounder/CTO at Liquid AI and Research Affiliate MIT
Bay Area · Joined December 2017
Followers: 830 · Following: 891 · Media: 25 · Statuses: 123
            
           I wrote an article summarizing how we designed our tokenizer back in early 2024: 
          
                
We recently ported LFM2 by @LiquidAI_ to Cactus (YC S25); the 350m-i8 runs at 188 tokens/sec on M4 CPU-ONLY. Gemma3 270m-i8 runs at 170 tokens/sec for reference. On an old iPhone 13 Pro, it should reach nearly 100 tokens/sec, no NPU or GPU! It’s officially one of our recommended models
          
                
             Meet LFM2-VL-3B, our latest on-device VLM. Top scores in multi-modal instruction following 
          
                
We have a new nano LFM that is on par with GPT-5 on data extraction with 350M parameters. Introducing LFM2-350M-PII-Extract-JP 🇯🇵 Extracts personally identifiable information (PII) from Japanese text → returns structured JSON for on-device masking of sensitive data. Delivers
          
                
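A hedged sketch of how such structured output could drive on-device masking; the example text and JSON schema below are invented for illustration and are not the model's actual output format:

import json

text = "田中太郎さんの電話番号は090-1234-5678です。"
# Hypothetical extraction result; the real LFM2-350M-PII-Extract-JP schema may differ.
extracted = json.loads(
    '{"entities": [{"type": "NAME", "value": "田中太郎"},'
    ' {"type": "PHONE", "value": "090-1234-5678"}]}'
)

masked = text
for entity in extracted["entities"]:
    # Replace each extracted PII span with a type placeholder.
    masked = masked.replace(entity["value"], f"[{entity['type']}]")
print(masked)  # [NAME]さんの電話番号は[PHONE]です。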
             Day 1 of the @LiquidAI_ fine-tuning hackathon in Tokyo this weekend. Jointly organized with @weights_biases and @LambdaAPI
          
          
                
             It's a good model sir. Very proud of the team, we worked very hard to be on the Pareto frontier of quality and efficiency. Even had the chance to write a CPU-optimized kernel for MoE to squeeze everything from the hardware, and that gave us those sweet throughput results. 
           Meet LFM2-8B-A1B, our first on-device Mixture-of-Experts (MoE)! 🐘 > LFM2-8B-A1B is the best on-device MoE in terms of both quality and speed. > Performance of a 3B-4B model class, with up to 5x faster inference profile on CPUs and GPUs. > Quantized variants fit comfortably on 
            
                
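For context on what an MoE layer computes, a minimal sketch of token-level top-k routing; this is generic illustration code with made-up sizes (8 experts, top-2), not the LFM2-8B-A1B architecture or its CPU-optimized kernel:

import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=128, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        # Each token picks its top_k experts; only those experts run for that token,
        # which is how a large total parameter count keeps a small active profile.
        weights, idx = self.router(x).softmax(dim=-1).topk(self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in idx[:, k].unique():
                mask = idx[:, k] == e
                out[mask] += weights[mask, k, None] * self.experts[int(e)](x[mask])
        return out

print(TinyMoE()(torch.randn(5, 64)).shape)  # torch.Size([5, 64])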
             LFM2-8B-A1B: Our MoE model that runs on a phone. This is just the start, much more to come ... 
          
                
             We achieved strong LLM performance + blazing fast edge inference with just: - Grouped Query Attention (global sequence mixer) - Double Gated short convolutions (local sequence mixer) - No linear attention/SSMs needed 
          
                
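A rough sketch of what a double-gated short-convolution mixer along these lines could look like; the gate placement, kernel size, and projections are assumptions for illustration, not the exact LFM2 block:

import torch
import torch.nn as nn

class DoubleGatedShortConv(nn.Module):
    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        self.in_gate = nn.Linear(d_model, d_model)   # input gate
        self.out_gate = nn.Linear(d_model, d_model)  # output gate
        self.value = nn.Linear(d_model, d_model)
        # Depthwise causal short convolution over the sequence dimension.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              groups=d_model, padding=kernel_size - 1)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        gated_in = torch.sigmoid(self.in_gate(x)) * self.value(x)
        # Conv1d expects (batch, channels, seq_len); trim the extra causal padding.
        h = self.conv(gated_in.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return self.out_proj(torch.sigmoid(self.out_gate(x)) * h)

print(DoubleGatedShortConv(64)(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 16, 64])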
             Just wrote my first Substack post: "Flipping the Script: Why Short Convolutions Don't Need Linear Attention" TL;DR: Everyone's asking why linear attention needs short convolutions. We asked the opposite: do short convolutions need linear attention? LFM2 proves they don't 🎯 
          
                
Cool release by @LiquidAI_: LFM2-Audio-1.5B. It’s a pretty cool omni-architecture that enables prediction of both text and audio tokens, meaning it can handle multi-turn S2S, ASR, and TTS (with voice description) within a single model. Great to see, once again this year, a model
          
                
             Our @LiquidAI_ LFM2-Audio-1.5 in a nutshell: - both text and audio in - both text and audio out - 1.5B -> runs locally - open-weight license 
          
                
             I have a new blog post about the so-called “tokenizer-free” approach to language modeling and why it’s not tokenizer-free at all. I also talk about why people hate tokenizers so much! 
          
                
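One way to make that point concrete (my illustration, not necessarily the post's own example): a byte-level "tokenizer-free" model still maps text onto a fixed vocabulary, namely the 256 possible byte values:

text = "tokenizers 🚀"
byte_ids = list(text.encode("utf-8"))  # bytes are token IDs from a fixed 256-entry vocabulary
print(byte_ids[:4])   # [116, 111, 107, 101]  ('t', 'o', 'k', 'e')
print(len(byte_ids))  # 15 byte "tokens" for 12 characters (the emoji costs 4 bytes)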
             We continue to scale our @LiquidAI_ LFM2 series of super efficient language models with LFM2-2.6B 
          
                
The secret sauce most definitely is in the data, given that the architecture is fairly standard: Qwen3 backbone + NaViT SigLip2 (i.e. it uses packed vision sequences). They use patch_size=16 and pixel_shuffle_scale_factor=2 in order to use fewer image tokens. A 256x256 image will
           1/ Introducing Isaac 0.1 — our first perceptive-language model. 2B params, open weights. Matches or beats models significantly larger on core perception. We are pushing the efficient frontier for physical AI.  https://t.co/dJ1Wjh2ARK 
            
            
                
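Back-of-the-envelope token count for the setup described above (patch_size=16, pixel_shuffle_scale_factor=2), using standard ViT patching arithmetic purely as an illustration:

def image_tokens(height, width, patch_size=16, shuffle=2):
    patches = (height // patch_size) * (width // patch_size)  # raw ViT patches
    return patches // (shuffle * shuffle)  # pixel shuffle merges shuffle x shuffle patches into one token

print(image_tokens(256, 256))  # 256 patches -> 64 image tokens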
             We trained our @LiquidAI_ LFM2-350M model 1400x beyond "compute optimal" > Chinchilla scaling laws: ~20 tokens per param > LFM2-350M: ~28,000 tokens per param (1400x more) Why? Because Chinchilla only concerns training compute, while we care about inference cost 
          
                
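The rough arithmetic behind the 1400x figure:

params = 350e6
chinchilla_tokens = 20 * params     # compute-optimal budget: ~7e9 training tokens
lfm2_tokens = 28_000 * params       # actual budget: ~9.8e12 training tokens
print(lfm2_tokens / chinchilla_tokens)  # 1400.0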
Very proud of our team at Liquid AI Japan! We’ve just released our first Japanese task-specific SLM in the model library (https://t.co/3CFVqrAF01), with many more to come. It’s a small 350M model (i.e. reaches 200 tok/s prefill, 40 tok/s decode on a Raspi5), so you may notice a
Can we get a 350M parameter model to perform as well as GPT-4o on specialized tasks? Today, we release an instance of our LFM2-350M, fine-tuned to perform competitively with GPT-4o on real-time general bi-directional Japanese <> English translation of short to medium context.
            
                
             I wrote a short article about LFM-2's (by @LiquidAI_ ) hybrid architecture w/ illustration + simple pytorch impl. 
          
                