I'm happy to share the release of gemma.cpp - a lightweight, standalone C++ inference engine for Google's Gemma models:
Have to say, it’s one of the best project experiences of my career.
gemma.cpp is a minimalist implementation of Gemma 2B and 7B models:
focusing on simplicity and directness rather than full generality, it takes inspiration from ggml, llama.c, and other "integrated" model implementations.
The goal of the project is a small, hackable inference engine for experimentation and research.
The codebase has minimal dependencies and is portable, pure C++ (taking advantage of Highway for portable SIMD).
The core implementation is ~2K LOC, w/ ~4K LOC of supporting code. It’s meant to be both hackable and embeddable as a library w/ CMake.
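For a flavor of what Highway-style portable SIMD looks like, here's a minimal sketch (modeled on Highway's quick-start pattern, not code taken from gemma.cpp; it assumes n is a multiple of the vector lane count):

```cpp
#include <cstddef>

#include "hwy/highway.h"

namespace hn = hwy::HWY_NAMESPACE;

// out[i] += w[i] * x, vectorized for whatever SIMD width the target CPU
// offers (SSE4/AVX2/AVX-512/NEON/...), with no per-ISA code paths.
// Assumes n is a multiple of the lane count.
void MulAddRow(const float* HWY_RESTRICT w, float x,
               float* HWY_RESTRICT out, size_t n) {
  const hn::ScalableTag<float> d;  // "as many float lanes as the CPU has"
  const auto vx = hn::Set(d, x);   // broadcast the scalar to all lanes
  for (size_t i = 0; i < n; i += hn::Lanes(d)) {
    const auto vw = hn::Load(d, w + i);
    const auto vo = hn::Load(d, out + i);
    hn::Store(hn::MulAdd(vw, vx, vo), d, out + i);  // vw * vx + vo
  }
}
```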
Prototype your apps with local LLM inference as a C++ function call. Add runtime support for your own research with a few lines of code.
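To make that concrete, here's a hypothetical sketch of the embedding pattern - LocalModel, Generate, and the weights filename are illustrative placeholders, not gemma.cpp's actual API:

```cpp
#include <functional>
#include <iostream>
#include <string>

class LocalModel {
 public:
  // Load tokenizer + weights once at startup; later calls reuse them.
  explicit LocalModel(const std::string& weights_path) {
    // ... map weights, initialize tokenizer and KV cache here ...
    (void)weights_path;
  }

  // Generate a completion, streaming tokens to a callback as they decode.
  void Generate(const std::string& prompt,
                const std::function<void(const std::string&)>& on_token) {
    // ... tokenize prompt, run transformer forward passes, sample ...
    (void)prompt;
    on_token("(token)");  // placeholder: a real model streams decoded text
  }
};

int main() {
  LocalModel model("gemma-weights.sbs");  // illustrative filename
  model.Generate("Write a haiku about SIMD.",
                 [](const std::string& token) { std::cout << token; });
  std::cout << "\n";
}
```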
Beyond the interactive terminal UI for playing with the model, near-instant model loading lets you use gemma as a local-first command-line LLM tool.
Jan Wassenberg (author of Highway) and I started gemma.cpp as a small project just a few months ago.
We were lucky to find amazing collaborators from around Google - @PhilCulliton, @dancherp, Paul Chang, and of course, the GDM Gemma team.
What's next? There’s a lot of low-hanging fruit - we welcome external collaborators.
I'm most excited to enable new research on co-design between models + inference engines. Stay tuned.
“Now that things are so simple, there's so much to do.” - M. Feldman
@yvyuz Would be happy to work together in some form - @ggerganov has done a lot for open models.
There was already a patch to get Gemma into llama.cpp pretty early today:
@ramkumarkoppu We're starting with portable CPU SIMD as a common denominator; accelerator support is an important next priority. Happy to have collaborators join on this.
@austinvhuang Developers can now access an open-source LLM that packs the advanced capabilities and performance of models like Gemini and LLaMA into a compact model optimized for mainstream use.