Explore tweets tagged as #Textract
@adityagp
Aditya Parameswaran
4 months
We've been working on extracting information in templatized PDFs for the last couple of years, leveraging the best of LLMs and classical data extraction techniques. Our latest technique, TWIX, has the best of all worlds: beats Azure DI, AWS Textract, or LLM-based approaches by.
28
87
837
@Teknium1
Teknium (e/λ)
8 months
AWS' textract OCR also sucks
15
0
45
@GunnarGrosch
Gunnar Grosch
9 months
What AWS services do you think are underrated? I asked developers what their favorite hidden gems are! 💎 Underrated services highlighted include: ⭐️ AWS Systems Manager, 📝 Amazon Textract, 👥 AWS IAM, ⚡️ Amazon EventBridge, 🔐 Amazon Cognito. Do you agree?
4
2
19
@kyandaks
Kyanda
7 months
Tracking usage & costs of cloud services is key, and for some services it may not be straightforward. This read explores such a scenario with Amazon Textract, where cost is estimated from the API calls recorded in AWS CloudTrail.
0
2
4
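A rough sketch of the idea in Kyanda's tweet, assuming the CloudTrail events have already been exported as a list of records carrying the standard `eventSource` and `eventName` fields. The function names and the per-call rates are hypothetical placeholders, not real pricing, and treating one API call as one billable unit is a simplification, since a single call may cover more than one page.

```python
from collections import Counter

# Event source string CloudTrail uses for Textract API calls.
TEXTRACT_EVENT_SOURCE = "textract.amazonaws.com"


def count_textract_calls(records):
    """Count Textract API calls per operation name in CloudTrail records."""
    return Counter(
        record["eventName"]
        for record in records
        if record.get("eventSource") == TEXTRACT_EVENT_SOURCE
        and "eventName" in record
    )


def estimate_cost(call_counts, rate_per_call):
    """Multiply call counts by caller-supplied rates (hypothetical values).

    Operations missing from rate_per_call contribute nothing to the total.
    """
    return sum(
        count * rate_per_call.get(operation, 0.0)
        for operation, count in call_counts.items()
    )
```

Passing the rates in as an argument keeps the sketch independent of any particular price list; the real figures would come from the current AWS pricing page.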
@getomni_ai
OmniAI
6 months
We just added Gemini 2.0 Flash to Zerox! ⚡️ These are early results from our VLM benchmark. While it still has a way to go on accuracy (about 80%), it easily beats GPT-4o and traditional OCR providers like AWS Textract and Unstructured. And it's cheap!
1
0
6
@rajeshdavidbabu
Rajesh David
2 months
Beautiful bounding boxes for AI-extracted content. AWS Textract API, @GeminiApp, @remix_run. ☺️☺️
0
0
3
@kuanhoong
Kuan Hoong
1 month
MarkItDown is a lightweight Python utility for converting various files to Markdown for use with LLMs and related text analysis pipelines. To this end, it is most comparable to textract, but with a focus on preserving important document structure and content as Markdown
0
1
0
@paul_dentro
Paul from DentroAI
10 months
This is the best transcription tool I've seen! 🤩 Our clients often use documents with complicated diagrams, tables and scanned-in docs. To pass them to an LLM or use RAG, you often need to extract the text. I tried these for extraction 😮‍💨: PyMuPDF4llm, AWS Textract, Unstructured
@TylerMaran
Tyler Maran
10 months
launching our open source OCR tool today! try it out with some terrible pdfs and let me know how it goes:
1
0
5
@kushalbyatnal
Kushal Byatnal
4 months
company builds on Textract → gets tired of writing glue code → switches to Extend. many such cases!
1
0
10
@yourclouddude
yourclouddude
29 days
Use AWS AI services like a pro (even if you're new to ML): • Textract for invoice/data parsing • Rekognition for image/video analysis • Comprehend for text classification • SageMaker Studio Lab for notebooks • Bedrock for GenAI API access. You don’t need to be a data.
2
3
29
@colegawin_
Cole Gawin
3 months
super interesting find: even smaller models of gemini 2.0 outperform gpt-4o (and even domain-specific models) on document ingestion workflows like OCR and PDF-to-markdown, all while being priced on par with 4o-mini and significantly cheaper than AWS Textract. would love to know
1
0
4
@brankopetric00
Branko
4 months
DevOps/AI Project: AWS Lambda + S3 + Textract. Want to practice AWS AI/ML skills? Here’s a simple but powerful project idea using AWS Lambda, S3, and Textract. Project Goal: Allow users to upload documents to S3, trigger Lambda to automatically extract text using Textract.
0
41
186
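The project idea in Branko's tweet could be sketched roughly as below. This is an assumption-laden outline rather than the author's code: it assumes the Lambda is triggered by an S3 upload notification and uses Textract's synchronous `detect_document_text` call, which suits images and short documents (longer PDFs would likely need the asynchronous API instead). The `textract_client` parameter and the `lines_from_response` helper are hypothetical additions so the logic can be exercised without AWS access.

```python
import urllib.parse


def lines_from_response(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def handler(event, context, textract_client=None):
    """Lambda entry point: extract text lines from each uploaded S3 object.

    textract_client is an optional, hypothetical injection point; when it is
    omitted (as in a real Lambda invocation) a boto3 client is created.
    """
    if textract_client is None:
        import boto3  # imported lazily so the helpers work without boto3

        textract_client = boto3.client("textract")

    results = {}
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        response = textract_client.detect_document_text(
            Document={"S3Object": {"Bucket": bucket, "Name": key}}
        )
        results[key] = lines_from_response(response)
    return results
```

Returning the lines keyed by object name keeps the sketch small; a fuller build would write them somewhere durable, such as another S3 prefix or a database table.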
@OOnokwuru
Onyebuchi
4 months
Built a receipt processing tool using: S3 + Lambda + Textract + DynamoDB + SES 💥 Upload a file → parse with Textract → store in DynamoDB → email summary. All serverless, all AWS. ⚡️ Live demo drops Friday at 4PM. Follow for the full build + code. #AWS #DevOps #BuildInPublic
0
2
2
@alexmcaulay
alexmcaulay
2 months
We are testing document parsing engines right now for a major project and will report back on our findings. We are testing Docling, n8n, MarkItDown, LlamaParse, Mistral, Rossum, Veryfi, Google Document AI, and Amazon Textract. Going to give a really good breakdown of everything.
2
1
3
@FedTechMagazine
FedTech Magazine
2 months
.@USNatArchives has turned to @Amazon #Textract for intelligent #DocumentProcessing, which automates the #DataAnalysis of legacy paper manuscripts.
1
0
0
@techterrence
TΞCHnical Terrence
9 months
Top Amazon Textract alternatives for data extraction. #alternatives #Amazon #Data #extraction #Textract #top
0
0
0
@simonw
Simon Willison
1 year
My other OCR project from yesterday: textract-cli, a tiny CLI wrapper around AWS's amazing but so-hard-to-use Textract API. Assuming you have AWS credentials configured:
pipx install textract-cli
textract-cli image.jpeg > output.txt
<5MB JPEG/PNGs only.
5
2
50
@awswhatsnew
What's New on AWS (Unofficial)
2 months
Amazon Textract announces accuracy and feature updates to DetectDocumentText and AnalyzeDocument APIs. Amazon Textract is a managed machine learning service that automati.
0
1
3