Explore tweets tagged as #markitdown
Turn ANY DOCUMENT into LLM-ready data! Microsoft released MarkItDown, a lightweight Python library that converts any document to Markdown for use with LLMs. 100% Open Source
Microsoft just made data prep for LLMs 100x easier 🤯 They’ve open-sourced MarkItDown, a lightweight Python library that converts any document to Markdown for use with LLMs. 100% Open Source
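A minimal sketch of the conversion these posts describe, assuming the `markitdown` package is installed (`pip install markitdown`). The `MarkItDown().convert(...)` call and the `text_content` attribute follow the library's README. The `to_markdown` helper and its `converter` parameter are hypothetical names added here for illustration.

```python
def to_markdown(path, converter=None):
    """Return the Markdown text for the document at `path`."""
    if converter is None:
        # Imported lazily so this module still loads without markitdown installed.
        from markitdown import MarkItDown
        converter = MarkItDown()
    # convert() returns a result object; the Markdown is in .text_content.
    result = converter.convert(str(path))
    return result.text_content
```

Calling `to_markdown("report.xlsx")` (a hypothetical file name) would return the spreadsheet as Markdown text, ready to paste into an LLM prompt.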
It uses markitdown and pypdf to read PDFs and other office files, so you can search those kinds of files as well. I've been using it with kimi2.5 on openrouter to do some testing. It works pretty well! It's slow, obviously much slower than grep or rg, but it's doing way more
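The "convert first, then search" idea can be sketched roughly as below. This is not the tool's actual implementation, which the post says does far more than pattern matching. It only shows a plain regex search over text that some converter has already extracted. The name `search_documents` and the `convert` parameter are hypothetical; `convert` is any callable that maps a path to text, for example `lambda p: MarkItDown().convert(p).text_content`.

```python
import re

def search_documents(paths, pattern, convert):
    """Yield (path, line_number, line) for each converted line matching `pattern`."""
    regex = re.compile(pattern)
    for path in paths:
        # `convert` turns a PDF, DOCX, etc. into plain text or Markdown.
        text = convert(path)
        for line_number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path, line_number, line
```

Passing the converter in as an argument keeps the search loop independent of any one library, so pypdf or another extractor could be swapped in for a given file type.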
It puts a config in ~/.rlmgrep/config.toml where you can set your models and other settings. It will also OCR and transcribe images and audio using markitdown and appropriate endpoints if you configure it to do so. I plan to add better OCR support for PDFs eventually. like rg it
Get the Python code for free on GitHub: