NeuML

@neumll

Followers: 709 · Following: 792 · Media: 376 · Statuses: 959

NeuML is the company behind txtai, one of the most popular open-source AI frameworks in the world. 🗓️ https://t.co/g4o2yL30qa

Washington DC Metro 🇺🇸
Joined April 2020
@neumll
NeuML
2 hours
Did you know that TxtAI applications can be spun up from a YAML file? Check out this example that builds a RAG Pipeline with Docling + GPT OSS for any document. https://t.co/jdrYzpo2VN
0
1
2
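As a sketch of what such a YAML-defined application can look like (the keys, backends and model paths below are illustrative assumptions, not copied from the linked example):

```yaml
# app.yml - illustrative txtai application config (names are assumptions)
textractor:
  backend: docling        # parse PDFs, Office docs and HTML into text

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: true

llm:
  path: openai/gpt-oss-20b

rag:
  template: |
    Answer the question using only the context below.
    Question: {question}
    Context: {text}
```

A file like this can then be loaded by txtai's Application class or served as an API, with no pipeline code written by hand.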
@neumll
NeuML
9 hours
Language detection is an important task, especially for routing requests to language-specific models. This is easy with the staticvectors library. https://t.co/4d1ZwIo1MG
0
1
5
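The staticvectors call itself is in the linked example. As a toy stdlib illustration of the underlying idea (not the staticvectors API), here is a character-trigram detector; the profiles and sample sentences are made up:

```python
from collections import Counter

def trigrams(text):
    """Character trigram counts for a lowercased, padded text."""
    text = f"  {text.lower()}  "
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def detect(text, profiles):
    """Return the language whose trigram profile overlaps the text most."""
    grams = trigrams(text)
    def score(profile):
        return sum(min(grams[g], profile[g]) for g in grams)
    return max(profiles, key=lambda lang: score(profiles[lang]))

# Tiny profiles built from one sentence each (real systems train on far more)
profiles = {
    "en": trigrams("the quick brown fox jumps over the lazy dog"),
    "fr": trigrams("le renard brun rapide saute par dessus le chien paresseux"),
}

print(detect("the dog sleeps over there", profiles))  # en
print(detect("le chien dort par terre", profiles))    # fr
```

Production detectors use the same shape of idea, just with learned weights over far richer features.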
@neumll
NeuML
3 days
⚡ The Textractor pipeline is one of the most powerful pipelines in the TxtAI toolbox! This example converts documents to Markdown then splits by Markdown sections. A simple yet effective chunking strategy! https://t.co/3h1hw5mIrN
0
2
2
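The chunking step the tweet describes can be sketched without txtai: once a document is Markdown, split it at heading lines. A minimal stdlib version (the Textractor pipeline handles the document-to-Markdown conversion itself):

```python
import re

def split_markdown_sections(markdown):
    """Split a Markdown document into chunks at heading lines."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]

doc = """# Intro
Welcome.

## Setup
Install things.

## Usage
Run things."""

print(split_markdown_sections(doc))
```

Each chunk keeps its heading with its body, which is exactly why section-based chunking tends to work well for retrieval.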
@neumll
NeuML
3 days
LLMs for text classification = 🤮 Encoder-only models are a much better tool for the job! For resource-constrained devices, check out the BERT Hash series of models. You might even be able to get away with sub-1M params. Training code: https://t.co/jeakQa3Sm0
0
4
21
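The "Hash" in BERT Hash refers to the embedding hashing trick: tokens share a small pool of hashed buckets instead of each getting its own embedding row, which is where most of the parameter savings come from. A toy bag-of-words version of that trick (not the linked training code):

```python
import hashlib

def hashed_features(text, buckets=1024):
    """Map tokens into a fixed number of buckets via hashing.
    The vocabulary no longer needs one parameter row per token."""
    vector = [0.0] * buckets
    for token in text.lower().split():
        digest = hashlib.md5(token.encode()).hexdigest()
        vector[int(digest, 16) % buckets] += 1.0
    return vector

v = hashed_features("encoder models classify text well")
print(sum(v))  # 5.0 - five tokens hashed into 1024 buckets
```

Collisions are the trade-off: two tokens can land in the same bucket, but with enough buckets a classifier on top barely notices.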
@neumll
NeuML
4 days
With TxtAI, in less than 10 lines of code you can extract, semantically chunk and vector index a webpage. This example shows how the data can be stored as a llama.cpp GGUF file. Pay attention to this, it's bigger than it appears... https://t.co/d9cBxazYxy
0
4
4
@neumll
NeuML
5 days
RAG is one of the most popular TxtAI use cases. Click to learn more. https://t.co/PPkP1puyZD
medium.com
Get up and running fast with this easy-to-use application
0
1
4
@neumll
NeuML
6 days
🎉 We're excited to release txtai 9.1! 9.1 introduces vector "un-databases" - store vectors with NumPy, Torch and even GGUF from llama.cpp! Let's keep it simple when we can. Release Notes: https://t.co/ZKHLwqzJrJ https://t.co/t6KZHx45Ye
github.com
💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows - neuml/txtai
0
2
6
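The NumPy flavor of an "un-database" really is just an array plus a matrix multiply. A minimal sketch of the idea (the layout and names here are illustrative, not txtai's storage format):

```python
import io
import numpy as np

# Build a tiny "un-database": a normalized NumPy array of vectors
rng = np.random.default_rng(42)
vectors = rng.random((100, 64), dtype=np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

# "Persist" and reload it - np.save/np.load is the whole storage layer
buffer = io.BytesIO()
np.save(buffer, vectors)
buffer.seek(0)
index = np.load(buffer)

# Search is a plain matrix multiply (cosine similarity on unit vectors)
query = index[0]
best = int(np.argmax(index @ query))
print(best)  # 0 - the query is its own nearest neighbor
```

No server, no schema, no index structure: for small-to-medium collections, brute-force over an array file is often all you need.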
@neumll
NeuML
10 days
🚀 Did you know that TxtAI RAG and Agent apps can be hosted as a standard OpenAI API service? Submit a prompt and this becomes much smarter than a vanilla LLM inference call! https://t.co/KCj42wp1sv
0
1
3
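"Standard OpenAI API" means the usual /v1/chat/completions contract, so any OpenAI-compatible client can talk to the service. A stdlib sketch of the request shape; the URL and model name are placeholders for your own deployment, not values from the linked example:

```python
import json
from urllib import request

# Chat completion request against an assumed local endpoint
payload = {
    "model": "rag",
    "messages": [{"role": "user", "content": "What is txtai?"}],
}
req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# response = request.urlopen(req)  # uncomment with a running service
print(req.get_full_url())
```

Because the wire format matches OpenAI's, existing tooling (SDKs, chat UIs, evaluation harnesses) can point at the RAG or Agent app unchanged.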
@neumll
NeuML
10 days
💥 Ever think about storing your vector database as a GGUF file? With support for all the fancy quantization methods, device backends and other great things that only LLMs are lucky enough to get right now? Coming soon with TxtAI 9.1! https://t.co/GNxJNY0fOD
1
1
5
@neumll
NeuML
10 days
Happy Halloween! 🎃👻 🌕
0
1
2
@neumll
NeuML
19 days
Want more control over your vector database? Then check out this article on using txtai's low-level APIs. https://t.co/lkFog22cl0
0
1
3
@neumll
NeuML
20 days
Cool to see that our PubMedBERT Embeddings model has been cited in over 45 medical/scientific/academic articles! Model: https://t.co/4xD39BPRSE Search: https://t.co/zymfaQmOJh
0
1
3
@neumll
NeuML
20 days
🤔 LLMs think in tensors and tokens, not text. RAG requires prompts and text. REFRAG proposed reducing RAG tokenization. What if we added frozen knowledge vector layers to our LLMs for RAG? Interesting idea. TxtAI now supports directly building Torch knowledge vectors.
1
1
7
@neumll
NeuML
24 days
💥 Vector storage F32 vs INT8 vs F4
0
1
5
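For scale: float32 stores 4 bytes per dimension, INT8 one byte (~4x smaller), and 4-bit formats like FP4/NF4 half a byte (~8x). A NumPy sketch of symmetric per-vector INT8 quantization, as a generic illustration of the trade-off rather than txtai's implementation:

```python
import numpy as np

def quantize_int8(vectors):
    """Symmetric per-vector INT8 quantization: ~4x smaller than float32."""
    scale = np.abs(vectors).max(axis=1, keepdims=True) / 127.0
    q = np.round(vectors / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
vectors = rng.normal(size=(4, 8)).astype(np.float32)
q, scale = quantize_int8(vectors)

print(vectors.nbytes, q.nbytes)  # 128 32
print(np.abs(vectors - dequantize(q, scale)).max())  # small rounding error
```

The reconstruction error per value is at most half a quantization step, which is why INT8 search results usually track float32 closely.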
@neumll
NeuML
24 days
🚀 Why let LLMs have all the fun? It's time to run our vector databases like an LLM! An exciting change is coming in TxtAI 9.1: vector databases fully on the GPU, with FP4/NF4/INT8 quantization support and efficient on-GPU matrix multiplication. Link: https://t.co/3agsoJi1ns
0
1
4
@neumll
NeuML
1 month
💡 AI Workflows? Did you know that TxtAI had workflows years before many of the popular projects even existed? https://t.co/2fAVht0uk8
medium.com
A guide on when to use small and large language models
0
2
2
@neumll
NeuML
1 month
🗎 Want the background on the BERT Hash Nano models? Then check out this article for more! https://t.co/ttOuV0xd1B
medium.com
Learn how a simple tweak can drastically reduce model sizes
0
2
8
@neumll
NeuML
1 month
✨ We're proud to release the ColBERT Nano series of models. All 3 of these models come in at less than 1 million parameters (250K, 450K, 950K)! Late interaction models perform shockingly well with small models. Collection: https://t.co/gSVLMUrWcf Model: https://t.co/wUDXXDFRv7
2
30
240
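Late interaction keeps one vector per token and scores a document by letting each query token pick its best-matching document token, then summing (ColBERT's MaxSim). A toy NumPy version with made-up 2-d token vectors (real models use normalized transformer outputs):

```python
import numpy as np

def maxsim(query_vectors, doc_vectors):
    """ColBERT-style late interaction: for each query token vector,
    take its best match among document token vectors, then sum."""
    sims = query_vectors @ doc_vectors.T  # (q_tokens, d_tokens) similarities
    return float(sims.max(axis=1).sum())

query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[1.0, 0.0], [0.7, 0.7]])   # overlaps both query tokens
doc_b = np.array([[0.0, -1.0], [-1.0, 0.0]]) # opposes both query tokens

print(maxsim(query, doc_a) > maxsim(query, doc_b))  # True
```

Because matching happens per token rather than on one pooled vector, small models lose less: each token vector only has to capture local meaning.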
@neumll
NeuML
1 month
🔥 240K parameters is all you need? Not quite, but don't sleep on micromodels! This is a BERT model trained just like the original. The only difference is it's 240K parameters vs 110M. https://t.co/Ec8r73WfMr
huggingface.co
1
1
5
@neumll
NeuML
1 month
🚀 Excited to release a new set of models: the BERT Hash Nano series! Forget millions and billions of parameters. How about thousands? Think a 250K parameter model is useless? Think again. https://t.co/x6gqoi68TP
huggingface.co
0
2
15