Aleph Alpha

@Aleph__Alpha

Followers: 9K · Following: 513 · Media: 192 · Statuses: 484

Our mission is a European generalizable AI. We're hiring: https://t.co/k7MxJK1XU1 #AGI, #artificialintelligence, #writtenbyahuman, #writtenbyanAI

Heidelberg, Germany
Joined December 2018
@Aleph__Alpha
Aleph Alpha
2 months
Inference isn’t just about speed. Our latest blog breaks down how hardware choices impact latency, throughput & cost when serving massive models like DeepSeek v3. Read the blog & full report: https://t.co/hGgzBPFh3k
1
1
25
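The tradeoff the blog refers to can be made concrete with a back-of-envelope calculation. The sketch below is illustrative only: GPU count, hourly price, and throughput numbers are made-up placeholders, not figures from the Aleph Alpha report.

```python
# Illustrative back-of-envelope model of the latency / throughput / cost tradeoff
# when serving a large model. All numbers are made-up placeholders.

def serving_cost(gpu_count: int, gpu_hour_usd: float, tokens_per_sec: float) -> float:
    """Cost in USD per million generated tokens for a given deployment."""
    cluster_usd_per_sec = gpu_count * gpu_hour_usd / 3600
    return cluster_usd_per_sec / tokens_per_sec * 1_000_000

def per_user_latency(batch_size: int, tokens_per_sec: float, output_tokens: int) -> float:
    """Seconds a single user waits when cluster throughput is shared across a batch."""
    per_stream = tokens_per_sec / batch_size
    return output_tokens / per_stream

# Larger batches raise cluster throughput (cheaper tokens), but each user
# receives a smaller share of it (higher latency) -- the core tradeoff.
for batch, tps in [(1, 60), (8, 350), (32, 900)]:
    print(f"batch={batch:>2}",
          f"{serving_cost(8, 3.0, tps):6.2f} $/Mtok",
          f"{per_user_latency(batch, tps, 500):6.1f} s for 500 tokens")
```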
@Aleph__Alpha
Aleph Alpha
3 months
Try them out, experiment with the architecture, and see how tokenizer-free models handle your language or domain. Find our blog post here: https://t.co/2lgLEVju6a Or read our model cards:
huggingface.co
0
0
27
@Aleph__Alpha
Aleph Alpha
3 months
We’re shipping both models with: Hugging Face inference for easy testing, and a vLLM fork optimized for HAT (note: still under active development). Weights & inference code are available today under the Open Aleph license for research and educational use.
1
0
22
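A minimal sketch of what "Hugging Face inference for easy testing" could look like. The repo ID is an assumption based on the model name in this thread, and since HAT is a custom architecture, loading with trust_remote_code is also an assumption; check the model cards for the actual usage.

```python
# Hypothetical loading sketch -- repo ID, trust_remote_code, and generation
# settings are assumptions, not taken from the release notes.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Aleph-Alpha/Llama-TFree-HAT-Pretrained-7B-DPO"  # assumed repo ID

# A tokenizer-free model may expose its byte-level preprocessing through custom
# code; trust_remote_code=True is assumed to be required for that reason.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True, device_map="auto"
)

prompt = "Explain briefly what a tokenizer-free language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```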
@Aleph__Alpha
Aleph Alpha
3 months
In LLM-as-a-judge evaluations (MTBench), the 7B-DPO model’s answers were preferred over Llama 3.1 8B in 69% of English samples and 75% of German samples.
1
0
19
@Aleph__Alpha
Aleph Alpha
3 months
Language performance (comparing the DPO model): German: outperforms Llama 3.1 8B Instruct on 67% of benchmarks. English: on par with Llama 3.1 8B Instruct. Consistent bilingual quality while keeping the model compact and efficient.
1
0
20
@Aleph__Alpha
Aleph Alpha
3 months
By skipping subword tokenizers (which fragment non-English text), HAT packs more information per position. Higher compression → fewer generation steps → fewer FLOPs. In our tests, TFree-HAT delivers on average +40% compression in German and +16% in English vs. standard tokenization.
1
1
28
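The compression claim can be illustrated by counting positions: a subword tokenizer produces one autoregressive step per subword, while a hierarchical byte model like HAT operates on far fewer word-level chunks in its backbone. The snippet below only demonstrates the metric with a generic GPT-2 tokenizer as a stand-in for "standard tokenization"; it is not the evaluation behind the +40%/+16% numbers.

```python
# Illustration of the compression metric: subword positions vs. word-level
# positions. GPT-2's tokenizer is used purely as a generic subword example;
# word splitting is a crude stand-in for HAT's word-level chunking.
from transformers import AutoTokenizer

text_de = ("Grundstücksverkehrsgenehmigungszuständigkeit und "
           "Donaudampfschifffahrtsgesellschaft sind lange deutsche Wörter.")

tok = AutoTokenizer.from_pretrained("gpt2")
subword_positions = len(tok(text_de)["input_ids"])
word_positions = len(text_de.split())
utf8_bytes = len(text_de.encode("utf-8"))

print(f"UTF-8 bytes:          {utf8_bytes}")
print(f"Subword positions:    {subword_positions}")
print(f"Word-level positions: {word_positions}")
print(f"Backbone steps saved: {subword_positions / word_positions:.1f}x fewer")
```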
@Aleph__Alpha
Aleph Alpha
3 months
We’re releasing:
- TFree-HAT-Pretrained-7B-Base – trained from scratch in English & German
- Llama-TFree-HAT-Pretrained-7B-DPO – post-trained on the base version for stronger instruction following
Designed for efficient deployment with on-device and local inference setups.
2
0
28
@Aleph__Alpha
Aleph Alpha
3 months
Introducing two new tokenizer-free LLM checkpoints from our research lab: TFree-HAT 7B. Built on our Hierarchical Autoregressive Transformer (HAT) architecture, these models achieve top-tier German and English performance while processing text on a UTF-8 byte level.
17
46
442
@Aleph__Alpha
Aleph Alpha
3 months
We signed it, sending a signal to the world. The EU’s AI Code of Practice isn’t just another piece of regulation. It’s a declaration. Of what AI can be when built with transparency, responsibility and sovereignty. Just like ours. https://t.co/P7wmxza5gJ
aleph-alpha.com
We signed the Code of Practice. We also helped shape it. Invited by the EU’s AI Office to contribute during its drafting, we supported a framework grounded in the realities of building AI in Europe....
2
3
12
@Aleph__Alpha
Aleph Alpha
6 months
We gathered here to mourn the tokenizer, whose time has passed. We released three 8B models – and said goodbye to an old friend. Find our models on Hugging Face: https://t.co/7ydS4f4DKq And read more in our paper: https://t.co/3t3z9kYxyn #RIPTokenizer
2
7
36
@Aleph__Alpha
Aleph Alpha
7 months
For #ICLR2025 we are unveiling a new, high-quality pretraining dataset for German LLMs. Shared to strengthen the open research community. Shaped by our belief in excellence and transparency. https://t.co/DxI4kVaosA
3
8
45
@Aleph__Alpha
Aleph Alpha
7 months
🐍 #PyCon has kicked off! Swing by our booth to chat about your career aspirations and explore how we can collaborate 🤝 Let us know you're coming by filling out this form: https://t.co/uXzeKrjbT9
0
0
5
@Aleph__Alpha
Aleph Alpha
7 months
Excited for ICLR 2025 in Singapore? Join our BoF Social (24 Apr, 12:30 p.m., Opal 103-104) on tokenizer-free, end-to-end architectures. Ready for insightful discussions and networking? Sign up here https://t.co/x7szBWOCg2 #ICLR2025 #AIResearch #EnterpriseAI #Tokenizers
0
4
8
@Aleph__Alpha
Aleph Alpha
10 months
🚀 Exciting Announcement from Davos: Aleph Alpha Unveils Tokenizer-Free LLMs! 🚀 We’re thrilled to announce a pioneering innovation that was unveiled yesterday at the World Economic Forum in Davos: Aleph Alpha has introduced a groundbreaking tokenizer-free (T-Free) LLM
5
15
81
@Aleph__Alpha
Aleph Alpha
1 year
@GCResearchTeam Here are the links for this release: Huggingface: https://t.co/4e6GMt0FOS Paper: https://t.co/xjuyPtWBdH Codebase:
1
0
7
@Aleph__Alpha
Aleph Alpha
1 year
New in Neural Network Parametrization Technique: Introducing Unit-Scaled Maximal Update Parametrization (u-μP). In partnership with @GCResearchTeam, u-μP merges μP and Unit Scaling to boost training stability & hyperparameter transfer across model sizes. Read more about the
2
12
61
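A simplified sketch of the Unit Scaling half of the idea: draw weights with unit variance and apply a static 1/sqrt(fan_in) scale in the forward pass, so activation magnitudes stay near one regardless of model width. This is a didactic reconstruction of the principle, not the u-μP reference implementation released with the paper.

```python
# Didactic sketch of a unit-scaled linear layer (one ingredient of u-muP).
# Unit-variance weights plus a static 1/sqrt(fan_in) forward scale keep
# activation magnitudes roughly independent of width, which is what lets
# hyperparameters transfer when the model is scaled up.
import math
import torch
import torch.nn as nn

class UnitScaledLinear(nn.Module):
    def __init__(self, fan_in: int, fan_out: int):
        super().__init__()
        # Weights drawn with unit variance instead of ~1/fan_in variance.
        self.weight = nn.Parameter(torch.randn(fan_out, fan_in))
        self.scale = 1.0 / math.sqrt(fan_in)  # static scale applied in forward

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.linear(x, self.weight) * self.scale

x = torch.randn(1024, 512)          # unit-variance input
layer = UnitScaledLinear(512, 2048)
y = layer(x)
print(y.std().item())               # stays close to 1 for any width
```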
@Aleph__Alpha
Aleph Alpha
1 year
Read our full paper here: https://t.co/WtBLFm3SCA Dive into the source code of T-Free: https://t.co/kzGPEew958 Try out our interim research model checkpoints: https://t.co/Dm01KoW1oN https://t.co/R0KDYbxWGk (3/3)
0
1
20
@Aleph__Alpha
Aleph Alpha
1 year
Our innovation, T-Free, offers a novel approach to tokenization, boosting tokenizer fertility across various languages, and reducing the size of the embedding layer by up to 75% compared to traditional tokenizers. Early experiments with T-Free show promising results and could
1
2
19
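At a high level, the T-Free paper describes representing each whitespace-delimited word through hashed character trigrams that activate several rows of a much smaller embedding table, which is why the embedding layer shrinks. The toy sketch below uses made-up sizes and a simple hash to show the shape of that idea; it is not the released implementation.

```python
# Toy illustration of the T-Free idea: decompose each word into character
# trigrams, hash the trigrams into a small embedding table, and sum the
# activated rows. Table size, hash, and aggregation are illustrative only.
import hashlib
import torch

VOCAB_ROWS = 8_192   # small table; typical subword vocabularies need far more rows
DIM = 64

embedding = torch.nn.Embedding(VOCAB_ROWS, DIM)

def trigrams(word: str):
    padded = f" {word} "                       # mark word boundaries
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def word_embedding(word: str) -> torch.Tensor:
    # Hash each trigram into the small table and sum the activated rows.
    idx = [int(hashlib.md5(t.encode()).hexdigest(), 16) % VOCAB_ROWS
           for t in trigrams(word)]
    return embedding(torch.tensor(idx)).sum(dim=0)

sentence = "Grundstücksverkehrsgenehmigung bleibt ein Wort"
vectors = torch.stack([word_embedding(w) for w in sentence.split()])
print(vectors.shape)   # one vector per word, drawn from an 8k-row table
```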
@Aleph__Alpha
Aleph Alpha
1 year
Today we introduce T-Free, a new paradigm in language processing. Tokenization is one of the core building blocks of large language models (LLMs), transforming natural language into numeric representations for further processing. (1/3) 🔗 https://t.co/V6ipiPVVLe
1
35
142