Explore tweets tagged as #llama_cpp
@rohanpaul_ai
Rohan Paul
14 days
👨‍🔧 GitHub: Microsoft open-sourced bitnet.cpp, a 1-bit inference framework that makes massive LLMs run efficiently on CPUs. You don't even need a GPU to run some LLMs with it. 20.9K stars ⭐️
- 6× faster inference
- 82% lower energy use
- Supports Llama 3, Falcon 3, and BitNet
Tweet media one
13
57
331
@LukeSwitzer_
Lμke Swi☨zer
1 day
Exposed LLM server queries for @shodanhq:
port:11434 "Ollama"
port:8000 "vLLM"
port:8000 "llama.cpp"
port:8080 "llama.cpp"
port:1234 "LM Studio"
port:4891 "GPT4All"
port:8000 "LangChain"
4
59
310
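The dorks above can also be run programmatically. Below is a minimal sketch using the official `shodan` Python client; the `SHODAN_API_KEY` environment variable and the `count_exposed` helper name are assumptions for illustration, not part of the original tweet.

```python
# Sketch: counting hosts matching each exposed-LLM-server dork via the
# `shodan` client library (pip install shodan). Requires a Shodan API key.
import os

# The exposed-server queries from the tweet, one per product.
DORKS = [
    'port:11434 "Ollama"',
    'port:8000 "vLLM"',
    'port:8000 "llama.cpp"',
    'port:8080 "llama.cpp"',
    'port:1234 "LM Studio"',
    'port:4891 "GPT4All"',
    'port:8000 "LangChain"',
]

def count_exposed(api_key: str) -> dict:
    """Return {query: total_matches} for each dork (makes network calls)."""
    import shodan  # imported lazily so the rest of the file works without it
    api = shodan.Shodan(api_key)
    return {q: api.count(q)["total"] for q in DORKS}

if __name__ == "__main__":
    key = os.environ.get("SHODAN_API_KEY")
    if key:
        for query, total in count_exposed(key).items():
            print(f"{total:>8}  {query}")
```

Note the overlap on port 8000: vLLM, llama.cpp, and LangChain deployments all commonly sit there, so the banner string in quotes is what actually distinguishes them.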
@UnslothAI
Unsloth AI
17 days
Learn to fine-tune OpenAI gpt-oss with our new step-by-step guide! Learn about:
• Local gpt-oss training + inference FAQ & tips
• Evaluation, hyperparameters & overfitting
• Reasoning effort, data prep
• Running & saving your LLM to llama.cpp GGUF, HF
🔗
Tweet media one
9
143
852
@0xor0ne
0xor0ne
13 days
Llama.cpp exploitation (heap buffer overflow). #cybersecurity #llama
Tweet media one
Tweet media two
3
66
255
@LylePro892440
Lyle.Pro
5 days
One step closer to MVP, thanks to node-llama-cpp. This is running gpt-oss-20b 100% local on my machine, which is a 2016 MBP M1.
0
1
5
@ggerganov
Georgi Gerganov
14 days
llama.cpp is on fire today! It's amazing to see this open-source collaboration getting stronger every day
Tweet media one
14
47
582
@ngxson
Xuan-Son Nguyen
21 days
Kudos to Google and the llama.cpp team! 🤝 GGUF support for Gemma 270M right from day 0
Tweet media one
9
12
148
@ai_hakase_
ハカセ アイ (Ai-Hakase) 🐾 X for the latest AI trends 🐾
11 days
[Breaking!] ByteDance's Seed-OSS has been merged into llama.cpp! 🎉 Big news, everyone! ✨ Support for Seed-OSS, the large language model (LLM) developed by ByteDance, has been merged into the llama.cpp project! This is a development you won't want to miss!
Tweet media one
1
1
2
@ngxson
Xuan-Son Nguyen
15 days
Since Firefox 142.0beta, wllama (🦙 llama.cpp webassembly) is available as a built-in API for extensions 🔥
Tweet media one
3
0
17
@DataChaz
Charly Wargnier
7 days
More about llama.cpp here:
Tweet media one
2
7
88
@AGI0K
k0
27 days
My GUI for llama.cpp. #LLAMACPP #LocalLLaMA
Tweet media one
0
0
2
@BrodyAutonomous
Brody | Rent-to-Own GPUs
13 days
Qwen3-30B running at ~3929 tok/s on EdgeAI Computer (2 x 4090D) with llama.cpp ⚡️
2
0
22
@DataChaz
Charly Wargnier
7 days
This is wild. A real-time webcam demo using SmolVLM from @huggingface and llama.cpp! 🤯 Running fully local on a MacBook M3.
68
517
4K
@alifcoder
Alif Hossain
12 days
One of the quickest ways to start playing with a good local LLM on macOS (if you have ~12GB of free disk space and RAM) - using llama-server and gpt-oss-20b:
brew install llama.cpp
llama-server -hf ggml-org/gpt-oss-20b-GGUF \
  --ctx-size 0 --jinja -ub 2048 -b 2048 -ngl 99 -fa
Tweet media one
12
3
19
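Once the `llama-server` command in the tweet above is running, it serves an OpenAI-compatible HTTP API (on http://localhost:8080 by default). A minimal stdlib-only client sketch follows; the helper names and the placeholder model string are illustrative assumptions, not part of the tweet.

```python
# Sketch: talk to a local llama-server via its OpenAI-compatible
# /v1/chat/completions endpoint using only the Python standard library.
import json
import sys
import urllib.request

def build_chat_payload(prompt: str, model: str = "gpt-oss-20b") -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,  # llama-server answers with whatever model it loaded
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, base_url: str = "http://localhost:8080") -> str:
    """POST the payload to llama-server and return the reply text (network call)."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    if "--run" in sys.argv:  # only hit the server when explicitly asked
        print(chat("Say hello in five words."))
```

Because the endpoint mirrors OpenAI's, any OpenAI-compatible client SDK pointed at the local base URL should also work.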
@ivanfioravanti
Ivan Fioravanti ᯅ
8 days
llama.cpp is now faster with Qwen3-30B-A3B-Thinking-2507 on Metal! Much faster! 🔥 I will retest on a 3090 Ti with CUDA and publish updated benchmarks! Thanks @ggerganov for the new release!
Tweet media one
14
27
367
@ai_hakase_
ハカセ アイ (Ai-Hakase) 🐾 X for the latest AI trends 🐾
3 days
[Breaking!] llama.cpp has finally merged the diffusion model Dream 7B?! ✨ A new AI horizon is coming into view! 👀 Dream 7B, a diffusion model specialized for text generation, has been merged into llama.cpp, the tool for running LLMs locally! 🎉 This is big news for the AI community. "Dream
Tweet media one
1
2
8
@simonw
Simon Willison
16 days
One of the quickest ways to start playing with a good local LLM on macOS (if you have ~12GB of free disk space and RAM) - using llama-server and gpt-oss-20b:
brew install llama.cpp
llama-server -hf ggml-org/gpt-oss-20b-GGUF \
  --ctx-size 0 --jinja -ub 2048 -b 2048 -ngl 99 -fa
Tweet media one
@ggerganov
Georgi Gerganov
16 days
The ultimate guide for using gpt-oss with llama.cpp:
- Runs on any device
- Supports NVIDIA, Apple, AMD and others
- Support for efficient CPU offloading
- The most lightweight inference stack today
23
72
785
@sameQCU
サメQCU
13 days
@app still just works. So anyway: if you wrap llama.cpp in an API server instead of trying to use the 'Python bindings', you recover the complete llama.cpp external API, which is not recoverable or usable in the so-called 'Python bindings' of llama-cpp-python.
1
0
2
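To illustrate the point in the tweet above, here is a minimal sketch of driving llama.cpp over HTTP rather than through bindings: the bundled server's native `/completion` endpoint takes a raw prompt and generation parameters directly. The endpoint and field names follow llama.cpp's server API; the helper names and default values here are illustrative assumptions.

```python
# Sketch: call llama.cpp's native server API (not the OpenAI-compatible
# layer) from plain Python, stdlib only.
import json
import urllib.request

def build_completion_payload(prompt: str, n_predict: int = 64) -> dict:
    """Request body for llama.cpp server's native /completion endpoint."""
    return {"prompt": prompt, "n_predict": n_predict, "temperature": 0.8}

def complete(prompt: str, base_url: str = "http://localhost:8080") -> str:
    """POST to /completion and return the generated text (network call)."""
    req = urllib.request.Request(
        f"{base_url}/completion",
        data=json.dumps(build_completion_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["content"]  # native API returns {"content": ...}
```

Anything the server exposes (tokenization, embeddings, sampler settings, and so on) is reachable the same way, which is the "complete external API" the tweet is getting at.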
@YoutechA320U
ゆー@
22 days
So here's the web-search demo I previously showed at the "Generative AI Anything Exhibition", ported from llama-cpp-python to llama.cpp and running with gpt-oss-12b-MXFP4.gguf (this run uses a 12.8k context length). It nearly runs away while generating the search query (0:05–0:11), but recovers beautifully. That's behavior you don't see in other reasoning models.
0
3
3