Explore tweets tagged as #llama_cpp
Analysis and exploitation of a heap buffer overflow in Llama.cpp https://t.co/iUswoSCEhr
#cybersecurity #llama
Released a GPU-accelerated llama-cpp-python 0.3.16 wheel for Python 3.14. Built with CUDA 13.1 — supports full layer offload and tested at ~85 tokens/second on Llama 3 8B Q4_K_M (RTX 3090). Runtime requires only NVIDIA drivers; no toolkit needed. https://t.co/UCjb25zGqm
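For illustration, a minimal sketch of full layer offload with llama-cpp-python, assuming a CUDA-enabled wheel is installed; the model filename is a placeholder:

```python
from llama_cpp import Llama

# Assumes a CUDA-enabled llama-cpp-python build; the model path is a placeholder.
llm = Llama(
    model_path="./llama-3-8b-instruct.Q4_K_M.gguf",  # hypothetical local GGUF file
    n_gpu_layers=-1,  # offload every layer to the GPU
    n_ctx=4096,       # context window
)

out = llm("Q: What does llama.cpp do? A:", max_tokens=64)
print(out["choices"][0]["text"])
```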
I built nano-llama.cpp, a minimal 3k-line implementation where I reverse-engineered llama.cpp from older commits to understand the core features. It has: 1. how to convert a Llama checkpoint to a basic ggml binary file. 2. Q4_0 quantization: implements block-wise 4-bit
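A minimal sketch of the Q4_0 idea the tweet cuts off on: each block of 32 floats shares one scale derived from its largest-magnitude value, and each value is stored as a 4-bit offset. Rounding and bit-packing details are simplified relative to ggml's reference implementation.

```python
import numpy as np

QK4_0 = 32  # llama.cpp's Q4_0 block size

def quantize_q4_0(block: np.ndarray):
    """Quantize one block of 32 floats to 4-bit codes plus one float scale.

    Simplified sketch of Q4_0: the signed value with the largest magnitude
    maps to -8, and everything else scales linearly into [0, 15].
    """
    assert block.shape == (QK4_0,)
    signed_max = block[np.argmax(np.abs(block))]
    d = signed_max / -8.0 if signed_max != 0 else 1.0
    q = np.clip(np.round(block / d) + 8, 0, 15).astype(np.uint8)
    return d, q  # real ggml packs two 4-bit codes per byte

def dequantize_q4_0(d: float, q: np.ndarray) -> np.ndarray:
    return (q.astype(np.float32) - 8) * d

x = np.random.randn(QK4_0).astype(np.float32)
d, q = quantize_q4_0(x)
print("max abs error:", np.max(np.abs(x - dequantize_q4_0(d, q))))
```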
I was using ollama on Termux, but the number of usable models was small, so I looked into it: most common LLMs are based on llama.cpp anyway, which works here too and is lighter, so ollama is out and llama.cpp is in. Attach the local server called the WebUI and a chat UI pops up, which is convenient.
[Local LLM Primer] What is "GGUF"? A refresher on "GGUF", the standard format familiar from llama.cpp. When you spot this mysterious term, think of it like this: GGUF = a game cartridge for AI
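To make the cartridge analogy concrete, here is a small sketch based on the public GGUF spec that reads just the fixed file header (magic, version, tensor count, metadata count); the file path is a placeholder.

```python
import struct

def read_gguf_header(path: str):
    """Read the fixed-size GGUF header (a sketch based on the public spec)."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        version, = struct.unpack("<I", f.read(4))    # little-endian uint32
        n_tensors, = struct.unpack("<Q", f.read(8))  # uint64 tensor count
        n_kv, = struct.unpack("<Q", f.read(8))       # uint64 metadata kv count
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# print(read_gguf_header("model.gguf"))  # path is a placeholder
```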
In case you missed it - llama.cpp now supports Live Model Switching
Local AI just unlocked a feature that cloud providers can't match. llama.cpp now supports live model switching. No restart. No reload. Instant. This changes everything. 🧵
💡 Big llama.cpp news: llama.cpp's new model router turns a single-model local LLM server into a stable, multi-model, OpenAI-API-compatible platform that can dynamically load, switch, and evict models without restarts, something older setups could not do efficiently or safely.
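A sketch of what this could look like from the client side, assuming a llama.cpp server on localhost:8080 exposing the OpenAI-compatible chat endpoint, with the served model selected by the request's model field; the URL and model names are placeholders.

```python
import json
import urllib.request

# Placeholder endpoint for a local llama.cpp server with the model router.
URL = "http://localhost:8080/v1/chat/completions"

def chat(model: str, prompt: str) -> str:
    body = json.dumps({
        "model": model,  # the router uses this field to pick which model serves the call
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Two consecutive calls against different models, no server restart in between.
print(chat("llama-3-8b", "Summarize GGUF in one line."))
print(chat("qwen2.5-7b", "Summarize GGUF in one line."))
```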
llama.cpp is finally coming for Ollama's lunch! llama.cpp is a large-model inference engine implemented in C++, while ollama is essentially a web interface wrapped around llama.cpp. To be fair, llama.cpp had a web interface before too, but it was quite crude. Today it got a major update; let me walk you through it:
LLMlet: P2P distributed LLM inference in browsers with Wasm-compiled llama.cpp + WebRTC Repo: https://t.co/v0pJciWWxt Demo: https://t.co/Zq3jCj7fMa A model that can't fit in one tab can be split and run across multiple browsers. Still experimental; parallelism and a TURN service are missing.
Nvidia Thor running Nemotron-Nano on llama.cpp: 17.5 tokens per second.
🚀 Transformers v5 is OUT and it's a full ecosystem rebirth. Some key features:
> Unified tokenizer stack (simpler, faster, no Fast/Slow confusion)
> First-class quantization for efficient training + inference
> Seamless interoperability across libraries (MLX, llama.cpp, ONNX,
In collaboration with NVIDIA, the new Nemotron 3 Nano model is fully supported in llama.cpp. Nemotron 3 Nano features an efficient hybrid Mamba MoE architecture. It's a promising model, suitable for local AI applications on mid-range hardware. The large context window makes it