Georgi Gerganov

@ggerganov

Followers
53K
Following
3K
Media
292
Statuses
2K

24th at the Electrica puzzle challenge | https://t.co/baTQS2bdia

Joined May 2015
@ggerganov
Georgi Gerganov
9 months
New account for ggml news and notable PRs
11
30
268
@ggerganov
Georgi Gerganov
22 days
In collaboration with NVIDIA, the new Nemotron 3 Nano model is fully supported in llama.cpp. Nemotron 3 Nano features an efficient hybrid, Mamba, MoE architecture. It's a promising model, suitable for local AI applications on mid-range hardware. The large context window makes it
developer.nvidia.com
Agentic AI systems increasingly rely on collections of cooperating agents—retrievers, planners, tool executors, verifiers—working together across large contexts and long time spans.
8
43
405
@ggerganov
Georgi Gerganov
26 days
> llama-cli -hf org/model
12
62
565
@ngxson
Xuan-Son Nguyen
27 days
Introducing: the new llama-cli 🦙🦙
> Clean looking interface
> Multimodal support
> Conversation control via commands
> Speculative decoding support
> Jinja fully supported
2
23
116
@ggerganov
Georgi Gerganov
1 month
The new Mistral 3 models in llama.cpp
14
24
366
@ggerganov
Georgi Gerganov
1 month
We joined forces with NVIDIA to unlock high-speed AI inference on RTX AI PCs and DGX Spark using llama.cpp. The latest Ministral-3B models reach 385+ tok/s on @NVIDIA_AI_PC GeForce RTX 5090 systems. Blog:
developer.nvidia.com
The new Mistral 3 open model family delivers industry-leading accuracy, efficiency, and customization capabilities for developers and enterprises. Optimized from NVIDIA GB200 NVL72 to edge platforms…
16
42
427
@LysandreJik
Lysandre
1 month
Transformers v5's first release candidate is out 🔥 The biggest release of my life. It's been five years since the last major (v4). From 20 architectures to 400, 20k daily downloads to 3 million. The release is huge, w/ tokenization (no slow tokenizers!), modeling & processing.
20
89
577
@ngxson
Xuan-Son Nguyen
1 month
WIP: using multiple models at the same time with llama-server 🦙
3
2
22
@geerlingguy
Jeff Geerling
2 months
Just tried out the new built-in WebUI feature of llama.cpp and it couldn't be easier. Just start llama-server with a host and port, and voila!
15
11
163
@erusev
Emanuil Rusev
2 months
@fishright @ggerganov Just pushed a fix for this — this is what first launch is going to look like in the next version.
1
1
10
@ggerganov
Georgi Gerganov
2 months
LlamaBarn v0.10.0 (beta) is out - feedback appreciated
16
15
214
@ggerganov
Georgi Gerganov
2 months
A detailed look into the new WebUI of llama.cpp
29
83
773
@ClementDelangue
clem 🤗
2 months
When you run AI on your device, it is more efficient and less big brother and free! So it's very cool to see the new llama.cpp UI, a chatgpt-like app that fully runs on your laptop without needing wifi or sending any data external to any API. It supports: - 150,000+ GGUF models
51
175
2K
@ggerganov
Georgi Gerganov
2 months
The new WebUI in combination with the advanced backend capabilities of llama.cpp delivers the ultimate local AI chat experience. It's fast, private, free and open-source. It runs on any hardware - today. Huge thanks to the team at @huggingface for initiating, leading and
github.com
Overview This guide highlights the key features of the new SvelteKit-based WebUI of llama.cpp. The new WebUI in combination with the advanced backend capabilities of the llama-server delivers the u...
6
17
149