NeuML
@neumll
Followers 709 · Following 792 · Media 376 · Statuses 959
NeuML is the company behind txtai, one of the most popular open-source AI frameworks in the world. 🗓️ https://t.co/g4o2yL30qa
Washington DC Metro 🇺🇸
Joined April 2020
Did you know that TxtAI applications can be spun up from a YAML file? Check out this example that builds a RAG Pipeline with Docling + GPT OSS for any document. https://t.co/jdrYzpo2VN
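For context, a minimal sketch of what a YAML-defined txtai application can look like, loaded with the Application class. The textractor backend and LLM path below are assumptions, not the config from the linked example.

```python
# Hedged sketch: define a txtai application in YAML, then load it with Application.
# The textractor backend and llm path are placeholders, not the linked example.
from txtai.app import Application

CONFIG = """
writable: true

# Vector index with stored content
embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: true

# Document-to-Markdown extraction (backend value is an assumption)
textractor:
  backend: docling

# Generation model (placeholder path)
llm:
  path: openai/gpt-oss-20b
"""

# Write the config to disk and spin up the application from it
with open("rag.yml", "w", encoding="utf-8") as f:
    f.write(CONFIG)

app = Application("rag.yml")
```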
Language detection is an important task, especially for routing requests to language-specific models. This is easy with the staticvectors library. https://t.co/4d1ZwIo1MG
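A hedged sketch of what that can look like; the model name and the predict call are assumptions based on staticvectors examples, not verified against the linked notebook.

```python
# Assumed usage of the staticvectors library for language identification.
# The model id and predict() signature are assumptions, not verified API.
from staticvectors import StaticVectors

model = StaticVectors("neuml/language-id-quantized")  # placeholder model id

texts = ["What language is this written in?", "¿En qué idioma está escrito esto?"]
print(model.predict(texts))  # expected: one language label per input text
```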
⚡ The Textractor pipeline is one of the most powerful pipelines in the TxtAI toolbox! This example converts documents to Markdown then splits by Markdown sections. A simple yet effective chunking strategy! https://t.co/3h1hw5mIrN
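Roughly, the pattern looks like this; the input path is a placeholder.

```python
# Sketch of Markdown-section chunking with the Textractor pipeline.
from txtai.pipeline import Textractor

# sections=True returns one chunk per detected section instead of a single text blob
textractor = Textractor(sections=True)

for section in textractor("document.pdf"):  # placeholder path
    print(section[:80])
```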
LLMs for text classification = 🤮 Encoder-only models are a much better tool for the job! For resource-constrained devices, you should check out the BERT Hash series of models. You might even be able to get away with sub-1M params. Training code: https://t.co/jeakQa3Sm0
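As a rough illustration, fine-tuning a small encoder classifier with txtai's HFTrainer; the tiny base model below is a placeholder, not one of the BERT Hash checkpoints, and the data is toy data.

```python
# Sketch of fine-tuning a small encoder-only model for classification with HFTrainer.
from txtai.pipeline import HFTrainer

train = [
    {"text": "I love this product", "label": 1},
    {"text": "This is terrible", "label": 0},
]

trainer = HFTrainer()

# Base model is a placeholder tiny encoder, not a BERT Hash checkpoint
model, tokenizer = trainer("prajjwal1/bert-tiny", train)
```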
With TxtAI, in less than 10 lines of code you can extract, semantically chunk and vector index a webpage. This example shows how the data can be stored as a llama.cpp GGUF file. Pay attention to this; it's bigger than it appears... https://t.co/d9cBxazYxy
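In sketch form (the URL is a placeholder; the GGUF storage option from the post is not shown here):

```python
# Sketch: extract a webpage, chunk it into sections and build a vector index.
from txtai import Embeddings
from txtai.pipeline import Textractor

textractor = Textractor(sections=True)
embeddings = Embeddings(content=True)

# Index each extracted section of the page (URL is a placeholder)
embeddings.index(textractor("https://example.com"))

print(embeddings.search("what is this page about?", 1))
```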
RAG is one of the most popular TxtAI use cases. Click to learn more. https://t.co/PPkP1puyZD
medium.com · Get up and running fast with this easy-to-use application
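A deliberately simplified sketch of the pattern; the model path and documents are placeholders, and the linked article builds a fuller application.

```python
# Simplified RAG sketch: retrieve context with Embeddings, then prompt an LLM with it.
from txtai import Embeddings
from txtai.pipeline import LLM

embeddings = Embeddings(content=True)
embeddings.index([
    "txtai is an all-in-one open-source AI framework",
    "RAG combines retrieval with text generation",
])

llm = LLM("Qwen/Qwen2.5-0.5B-Instruct")  # placeholder model path

question = "What is txtai?"
context = "\n".join(x["text"] for x in embeddings.search(question, 3))

print(llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}"))
```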
🎉 We're excited to release txtai 9.1! 9.1 introduces vector "un-databases" - store vectors with NumPy, Torch and even GGUF from llama.cpp! Let's keep it simple when we can. Release Notes: https://t.co/ZKHLwqzJrJ
https://t.co/t6KZHx45Ye
github.com · 💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows - neuml/txtai
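A hedged sketch of switching the vector storage backend. The "numpy" and "torch" backends are existing options; the GGUF support is new in 9.1 and its exact setting is not shown here.

```python
# Hedged sketch of a "plain arrays" vector index using the numpy backend.
from txtai import Embeddings

embeddings = Embeddings(backend="numpy", content=True)  # vectors stored as NumPy arrays
embeddings.index(["first document", "second document"])

# Persist the index to disk like any other txtai embeddings database
embeddings.save("vector-index")
print(embeddings.search("first", 1))
```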
🚀 Did you know that TxtAI RAG and Agent apps can be hosted as a standard OpenAI API service? Submit a prompt and this becomes much smarter than a vanilla LLM inference call! https://t.co/KCj42wp1sv
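Roughly what the client side can look like. The base URL, route and model name below are assumptions; the service itself is typically launched with something like CONFIG=app.yml uvicorn "txtai.api:app".

```python
# Hedged sketch of calling a hosted txtai app through an OpenAI-compatible client.
# base_url, route and model name are assumptions, not verified endpoints.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="rag",  # placeholder name for the hosted RAG/Agent application
    messages=[{"role": "user", "content": "Summarize the indexed documents"}],
)
print(response.choices[0].message.content)
```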
💥 Ever think about storing your vector database as a GGUF file? With support for all the fancy quantization methods, device backends and other great things that only LLMs get to enjoy right now? Coming soon with TxtAI 9.1! https://t.co/GNxJNY0fOD
Want more control over your vector database? Then check out this article on using txtai's low-level APIs. https://t.co/lkFog22cl0
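The article may cover different APIs, but one well-documented way to get finer-grained control is txtai's SQL query support over a content-enabled index.

```python
# Combine semantic similarity with SQL filtering (requires content storage).
from txtai import Embeddings

embeddings = Embeddings(content=True)
embeddings.index(["txtai supports SQL queries", "scores can be filtered in the WHERE clause"])

print(embeddings.search(
    "SELECT id, text, score FROM txtai WHERE similar('sql') AND score >= 0.1"
))
```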
Cool to see that our PubMedBERT Embeddings model has been cited in over 45 medical/scientific/academic articles! Model: https://t.co/4xD39BPRSE Search: https://t.co/zymfaQmOJh
🤔 LLMs think in tensors and tokens, not text. RAG requires prompts and text. REFRAG proposed reducing RAG tokenization. What if we add frozen knowledge vector layers to our LLMs for RAG? Interesting idea. TxtAI now supports directly building Torch knowledge vectors.
🚀 Why let LLMs have all the fun? It's time to run our vector databases like an LLM! An exciting change is coming in TxtAI 9.1. Vector databases fully on the GPU! FP4/NF4/INT8 quantization support and efficient on-GPU matrix multiplication. Link: https://t.co/3agsoJi1ns
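Until 9.1 lands, a hedged sketch of the existing torch backend, which keeps the index as a Torch tensor; GPU placement and the FP4/NF4/INT8 quantization settings mentioned above are not shown.

```python
# Hedged sketch: vector index stored as a Torch tensor via the torch ANN backend.
from txtai import Embeddings

embeddings = Embeddings(backend="torch")
embeddings.index(["vectors as tensors", "matrix multiplication powers the search"])

print(embeddings.search("tensors", 1))
```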
💡 AI Workflows? Did you know that TxtAI had workflows years before many of the popular projects even existed? https://t.co/2fAVht0uk8
medium.com · A guide on when to use small and large language models
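A minimal workflow sketch: each Task wraps a callable and tasks run in sequence over batches of input elements.

```python
# Minimal txtai workflow: two tasks applied in sequence to a stream of elements.
from txtai.workflow import Task, Workflow

workflow = Workflow([
    Task(lambda x: [y.strip() for y in x]),   # remove surrounding whitespace
    Task(lambda x: [y.lower() for y in x]),   # normalize case
])

print(list(workflow(["  Hello World  ", "  TXTAI Workflows  "])))
```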
🗎 Want the background on the BERT Hash Nano models? Then check out this article for more! https://t.co/ttOuV0xd1B
medium.com · Learn how a simple tweak can drastically reduce model sizes
✨ We're proud to release the ColBERT Nano series of models. All 3 of these models come in at less than 1 million parameters (250K, 450K, 950K)! Late interaction performs shockingly well even with small models. Collection: https://t.co/gSVLMUrWcf Model: https://t.co/wUDXXDFRv7
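For intuition, a conceptual late interaction (MaxSim) scoring sketch in plain PyTorch; this illustrates the scoring step only and is not the API for the ColBERT Nano models.

```python
# Conceptual MaxSim scoring: per-token similarities, best match per query token, summed.
# Shapes are arbitrary toy values; this is not the ColBERT Nano API.
import torch
import torch.nn.functional as F

query = F.normalize(torch.randn(4, 32), dim=-1)      # 4 query tokens, dim 32
document = F.normalize(torch.randn(6, 32), dim=-1)   # 6 document tokens, dim 32

score = (query @ document.T).max(dim=-1).values.sum()
print(score.item())
```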
🔥 240K parameters is all you need? Not quite, but don't sleep on micromodels! This is a BERT model trained just like the original. The only difference is that it's 240K parameters vs 110M. https://t.co/Ec8r73WfMr
🚀 Excited to release a new set of models: The BERT Hash Nano series! Forget millions and billions of parameters, how about thousands? Think a 250K parameter model is useless? Think again. https://t.co/x6gqoi68TP