Mathias Lechner

@mlech26l

Followers: 830 · Following: 891 · Media: 25 · Statuses: 123

Cofounder/CTO at Liquid AI and Research Affiliate MIT

Bay Area
Joined December 2017
@mlech26l
Mathias Lechner
4 days
I wrote an article summarizing how we designed our tokenizer back in early 2024:
1
0
3
@mlech26l
Mathias Lechner
4 days
It's a good tokenizer, sir
1
4
36
@Henry_Ndubuaku
Henry Ndubuaku
7 days
We recently ported LFM2 by @LiquidAI_ to Cactus (YC S25); the 350m-i8 runs at 188 tokens/sec on M4 CPU-ONLY. Gemma3 270m-i8 runs at 170 tokens/sec for reference. On an old iPhone 13 Pro, it should reach nearly 100 tokens/sec, no NPU or GPU! It's officially one of our recommended models
0
3
31
@mlech26l
Mathias Lechner
9 days
Meet LFM2-VL-3B, our latest on-device VLM. Top scores in multi-modal instruction following
0
2
6
@LiquidAI_
Liquid AI
18 days
We have a new nano LFM that is on-par with GPT-5 on data extraction with 350M parameters. Introducing LFM2-350M-PII-Extract-JP 🇯🇵 Extracts personally identifiable information (PII) from Japanese text → returns structured JSON for on-device masking of sensitive data. Delivers
14
38
394
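To make the "structured JSON for on-device masking" idea concrete, here is a purely hypothetical sketch of an extract-then-mask flow; the entity types and output schema below are illustrative assumptions, not the documented format of LFM2-350M-PII-Extract-JP.

```python
# Hypothetical illustration only: the real LFM2-350M-PII-Extract-JP output schema
# is not specified in the post above; entity types and JSON layout are assumed.
text = "山田太郎です。電話番号は090-1234-5678です。"  # "I'm Taro Yamada. My phone number is 090-1234-5678."

# Imagine the model returning structured JSON along these lines:
extracted = {
    "entities": [
        {"type": "PERSON_NAME", "text": "山田太郎"},
        {"type": "PHONE_NUMBER", "text": "090-1234-5678"},
    ]
}

# Mask each extracted span on-device before the text leaves the phone.
for entity in extracted["entities"]:
    text = text.replace(entity["text"], f"[{entity['type']}]")

print(text)  # [PERSON_NAME]です。電話番号は[PHONE_NUMBER]です。
```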
@mlech26l
Mathias Lechner
19 days
Day 1 of the @LiquidAI_ fine-tuning hackathon in Tokyo this weekend. Jointly organized with @weights_biases and @LambdaAPI
1
7
50
@harold_matmul
Harold Benoit
24 days
It's a good model sir. Very proud of the team, we worked very hard to be on the Pareto frontier of quality and efficiency. Even had the chance to write a CPU-optimized kernel for MoE to squeeze everything from the hardware, and that gave us those sweet throughput results.
@LiquidAI_
Liquid AI
24 days
Meet LFM2-8B-A1B, our first on-device Mixture-of-Experts (MoE)! 🐘 > LFM2-8B-A1B is the best on-device MoE in terms of both quality and speed. > Performance of a 3B-4B model class, with up to 5x faster inference profile on CPUs and GPUs. > Quantized variants fit comfortably on
0
5
45
@mlech26l
Mathias Lechner
24 days
LFM2-8B-A1B: Our MoE model that runs on a phone. This is just the start, much more to come ...
0
3
25
@LiquidAI_
Liquid AI
24 days
Meet LFM2-8B-A1B, our first on-device Mixture-of-Experts (MoE)! 🐘 > LFM2-8B-A1B is the best on-device MoE in terms of both quality and speed. > Performance of a 3B-4B model class, with up to 5x faster inference profile on CPUs and GPUs. > Quantized variants fit comfortably on
13
92
508
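A minimal sketch of why an "8B total / ~1B active" MoE can be fast: a router picks only a few experts per token, so most weights are never touched for any given token. The expert count, top-k, and layer sizes below are illustrative assumptions, not LFM2-8B-A1B's actual configuration.

```python
# Toy top-k MoE routing sketch (all sizes illustrative, not LFM2-8B-A1B's config).
import torch
import torch.nn as nn

dim, n_experts, top_k = 512, 16, 2
experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
router = nn.Linear(dim, n_experts)

x = torch.randn(4, dim)                                    # 4 tokens
weights, idx = router(x).softmax(-1).topk(top_k, dim=-1)   # pick top-k experts per token

outputs = []
for t in range(x.size(0)):                                 # simple per-token loop for clarity
    token_out = sum(w * experts[int(e)](x[t]) for w, e in zip(weights[t], idx[t]))
    outputs.append(token_out)
y = torch.stack(outputs)

# Only top_k / n_experts of the expert weights run per token (2/16 here),
# which is how active parameters can sit far below total parameters.
```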
@mlech26l
Mathias Lechner
26 days
0
0
4
@mlech26l
Mathias Lechner
26 days
We achieved strong LLM performance + blazing fast edge inference with just: - Grouped Query Attention (global sequence mixer) - Double Gated short convolutions (local sequence mixer) - No linear attention/SSMs needed
0
0
4
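For readers curious what a "double-gated short convolution" might look like in code, here is a minimal PyTorch sketch under stated assumptions: the kernel size, the depthwise causal conv, and the placement of the two sigmoid gates are illustrative choices, not the actual LFM2 implementation.

```python
# Minimal, hypothetical sketch of a double-gated short-convolution block
# (kernel size, projections, and gate placement are assumptions, not LFM2's code).
import torch
import torch.nn as nn

class DoubleGatedShortConv(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 3 * dim)           # value + two gates
        self.conv = nn.Conv1d(dim, dim, kernel_size,
                              groups=dim,                 # depthwise: cheap local mixing
                              padding=kernel_size - 1)    # left context only (causal after slicing)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):                                 # x: (batch, seq, dim)
        v, g_in, g_out = self.in_proj(x).chunk(3, dim=-1)
        v = v * torch.sigmoid(g_in)                       # gate 1: before the conv
        v = self.conv(v.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        v = v * torch.sigmoid(g_out)                      # gate 2: after the conv
        return self.out_proj(v)

# The local mixer handles short-range context; grouped-query attention layers
# provide the global sequence mixing.
y = DoubleGatedShortConv(dim=512)(torch.randn(2, 16, 512))
```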
@mlech26l
Mathias Lechner
26 days
Just wrote my first Substack post: "Flipping the Script: Why Short Convolutions Don't Need Linear Attention" TL;DR: Everyone's asking why linear attention needs short convolutions. We asked the opposite: do short convolutions need linear attention? LFM2 proves they don't 🎯
2
2
14
@eustachelb
Eustache Le Bihan
28 days
Cool release by @LiquidAI_: LFM2-Audio-1.5B It’s a pretty cool omni-architecture that enables prediction of both text and audio tokens, meaning it can handle multi-turn S2S, ASR, and TTS (with voice description) within a single model. Great to see, once again this year, a model
2
32
161
@mlech26l
Mathias Lechner
1 month
Our @LiquidAI_ LFM2-Audio-1.5B in a nutshell: - both text and audio in - both text and audio out - 1.5B -> runs locally - open-weight license
0
9
36
@linguist_cat
Catherine Arnett
1 month
I have a new blog post about the so-called “tokenizer-free” approach to language modeling and why it’s not tokenizer-free at all. I also talk about why people hate tokenizers so much!
25
63
550
@mlech26l
Mathias Lechner
1 month
We continue to scale our @LiquidAI_ LFM2 series of super efficient language models with LFM2-2.6B
1
6
25
@harold_matmul
Harold Benoit
1 month
The secret sauce most definitely is in the data, given that the architecture is fairly standard: Qwen3 backbone + NaViT SigLip2 (i.e. it uses packed vision sequences). They use patch_size=16 and pixel_shuffle_scale_factor=2 in order to use fewer image tokens. A 256x256 image will
@perceptroninc
Perceptron AI
1 month
1/ Introducing Isaac 0.1 — our first perceptive-language model. 2B params, open weights. Matches or beats models significantly larger on core perception. We are pushing the efficient frontier for physical AI. https://t.co/dJ1Wjh2ARK
2
1
18
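As a rough back-of-the-envelope for the settings mentioned above (patch_size=16, pixel_shuffle_scale_factor=2), assuming a plain patchify-then-pixel-shuffle pipeline; the original post is truncated, so the exact figure it was about to state is not reproduced here.

```python
# Rough image-token arithmetic for patch_size=16 and pixel_shuffle_scale_factor=2.
# Assumes a standard patchify + 2x2 pixel-shuffle merge; illustrative only.
image_side = 256
patch_size = 16
shuffle = 2

patches_per_side = image_side // patch_size      # 256 / 16 = 16
raw_patches = patches_per_side ** 2              # 16 * 16 = 256 patches
image_tokens = raw_patches // (shuffle ** 2)     # 2x2 merge -> 256 / 4 = 64 tokens
print(raw_patches, image_tokens)                 # 256 64
```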
@mlech26l
Mathias Lechner
1 month
We trained our @LiquidAI_ LFM2-350M model 1400x beyond "compute optimal" > Chinchilla scaling laws: ~20 tokens per param > LFM2-350M: ~28,000 tokens per param (1400x more) Why? Because Chinchilla only concerns training compute, while we care about inference cost
0
4
29
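A quick sanity check of the rounded numbers in the post above:

```python
# Back-of-the-envelope check of the "1400x beyond compute optimal" claim.
params = 350e6
chinchilla_tokens = 20 * params          # Chinchilla heuristic: ~20 tokens/param -> ~7e9 tokens
lfm2_tokens = 28_000 * params            # ~28,000 tokens/param -> ~9.8e12 tokens
print(lfm2_tokens / chinchilla_tokens)   # 28000 / 20 = 1400
```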
@romu_nishi
Rom Parnichkun
2 months
Very proud of our team at Liquid AI Japan! We've just released our first Japanese task-specific SLM in the model library (https://t.co/3CFVqrAF01), with many more to come. It's a small 350M model (i.e. it reaches 200 tok/s prefill / 40 tok/s decode on a Raspi5), so you may notice a
@LiquidAI_
Liquid AI
2 months
Can we get a 350M parameter model to perform as well as GPT-4o on specialized tasks? Today, we release an instance of our LFM2-350M, fine-tuned to perform competitively with GPT-4o on real-time general bi-directional Japanese <> English translation of short to medium context.
1
14
48
@omkizzy
omkaar
2 months
I wrote a short article about LFM2's (by @LiquidAI_) hybrid architecture w/ illustration + simple PyTorch impl.
13
18
201