Diptanu Choudhury

@diptanu

Followers
3K
Following
3K
Media
166
Statuses
8K

Founder @tensorlake. Past - AI and Distributed Systems at @meta, @hashicorp, @linkedin, and @netflix

San Francisco, CA
Joined December 2007
@diptanu
Diptanu Choudhury
6 months
Excited to announce @tensorlake Cloud! 🧵 Tensorlake converts real-world documents into clean, structured data for business workflow automation and for building Agents in mission-critical documents. It's powered by a state-of-the-art document layout understanding model trained
@tensorlake
Tensorlake
6 months
Announcing Tensorlake Cloud Up-leveling Document Ingestion and Workflows for building agentic applications and complex business workflows.
17
19
75
@diptanu
Diptanu Choudhury
10 hours
We were on Hacker News for 5-6 hours today, and one of our customers uploaded 1000s of PDFs within a few minutes as a load test around the same time. Interesting day at @tensorlake. Good news: not a single job failed. The p95 latency could have been better 😅
0
1
8
@diptanu
Diptanu Choudhury
17 hours
We are on the front page of Hacker News!
0
0
11
@Houdini_7M
Høudini7🗼
2 days
@diptanu @tensorlake It performs really well! I like Tensorlake a lot. I combine it with FastAPI through clean endpoints for my organization: GET /search, POST /index/Document, analyze_contract (DocumentAI)
0
2
4
@diptanu
Diptanu Choudhury
2 days
If you are wondering why we didn't compare with Gemini, OpenAI, Anthropic or other VLMs - They don't preserve layout information, which is essential for citations in RAG pipelines and structured data extraction, both core capabilities of our API.
0
0
2
@diptanu
Diptanu Choudhury
2 days
We launched @tensorlake's Document Ingestion API a couple of months ago, and it's already parsing millions of pages every month across Insurance, Healthcare, Financial Services, Research and Legal Tech. Most conversations with new customers start with - Is Tensorlake better
3
3
12
@diptanu
Diptanu Choudhury
2 days
What are people using these days to send and receive SMS over APIs?
1
1
1
@diptanu
Diptanu Choudhury
4 days
What would you be able to solve if you could run 1000 agents concurrently?
1
0
0
@diptanu
Diptanu Choudhury
4 days
Designing a scheduler for batch systems is non-trivial. Each job completion changes the cluster's resource topology, freeing up resources that might fit pending jobs. To exploit that, the scheduler must maintain reverse indexes of jobs keyed by resource type and constraints.
2
2
13
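The reverse-index idea in the post above can be sketched as follows. This is a minimal illustration, not Tensorlake's scheduler: the `Scheduler` class, its method names, and the flat `{resource_type: amount}` model are all hypothetical, and placement constraints are left out. When a job completes, only pending jobs indexed under a freed resource type are re-checked, instead of rescanning the whole queue.

```python
from collections import defaultdict
from itertools import count


class Scheduler:
    """Toy batch scheduler (hypothetical names, single pool of resources)."""

    def __init__(self, capacity):
        self.free = dict(capacity)            # resource type -> free amount
        self.running = {}                     # job id -> needs
        self.pending = {}                     # job id -> (arrival seq, needs)
        self.by_resource = defaultdict(set)   # reverse index: type -> pending job ids
        self._seq = count()

    def _fits(self, needs):
        return all(self.free.get(r, 0) >= n for r, n in needs.items())

    def _start(self, job_id, needs):
        for r, n in needs.items():
            self.free[r] -= n
        self.running[job_id] = needs

    def submit(self, job_id, needs):
        """Start the job if it fits now; otherwise queue and index it."""
        if self._fits(needs):
            self._start(job_id, needs)
            return True
        self.pending[job_id] = (next(self._seq), needs)
        for r in needs:
            self.by_resource[r].add(job_id)
        return False

    def complete(self, job_id):
        """Release a job's resources and start pending jobs that now fit."""
        needs = self.running.pop(job_id)
        candidates = set()
        for r, n in needs.items():
            self.free[r] += n
            # Only jobs that need a freed type can have become runnable.
            candidates |= self.by_resource[r]
        started = []
        # Re-check candidates in arrival order (oldest first).
        for jid in sorted(candidates, key=lambda j: self.pending[j][0]):
            _, cand_needs = self.pending[jid]
            if self._fits(cand_needs):
                del self.pending[jid]
                for r in cand_needs:
                    self.by_resource[r].discard(jid)
                self._start(jid, cand_needs)
                started.append(jid)
        return started
```

A production scheduler would also index by constraints (for example node labels or zones) and would need a policy for fairness and starvation, which this sketch ignores.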
@diptanu
Diptanu Choudhury
4 days
Find it fascinating that something as simple as todos is being used by agent frameworks to orchestrate plans.
0
0
1
@diptanu
Diptanu Choudhury
6 days
"GEPA is a text evolution engine" Continue to be impressed. Wonder if it can be connected to the real world and write better copy than humans :)
@LakshyAAAgrawal
Lakshya A Agrawal
2 months
@harshad_geek @AsfiShaheen In this context, GEPA works as a prompt optimizer, so the end result is a prompt (or multiple prompts for a multi-agent system, one for each component). However, one aspect that does not get highlighted enough is that GEPA is a text evolution engine: Given a target metric, GEPA
0
1
6
@diptanu
Diptanu Choudhury
7 days
I would add data engineering tools to this list as well - Airflow, Spark, Ray, etc. Serverless counterparts are much simpler to get started with and operate.
@brankopetric00
Branko
8 days
Unpopular opinion: Most startups don't need Kubernetes until they have 100+ engineers. What you actually need: - Render, https://t.co/RthRVS98DB, or Railway - Managed database - Simple CI/CD - Total setup time: 2 hours What you're doing instead: - 3 months learning K8s -
1
0
1
@diptanu
Diptanu Choudhury
8 days
This is probably why AI PR review companies are getting a lot of adoption. Got this very accurate bug report on a PR. It found a bug in @tensorlake PR which would have routed events related to cluster topology changes into a table which stores application progress events and
0
0
0
@diptanu
Diptanu Choudhury
8 days
This
@steipete
Peter Steinberger
8 days
@dbdanieljnr context filling is not learning
0
0
1
@diptanu
Diptanu Choudhury
9 days
. @tensorlake is surviving whatever is currently happening in AWS and Azure. We are still parsing documents and running people's code on our platform 😅
0
3
8
@diptanu
Diptanu Choudhury
9 days
Before someone explains to me how GEPA is not an app - "Killer App" used to be a term back in the day when some of us were into Ruby on Rails, and GitHub and Twitter were considered killer apps for Rails :D
0
0
2
@diptanu
Diptanu Choudhury
9 days
Looks like GEPA was the killer app for DSPy :)
2
0
7
@diptanu
Diptanu Choudhury
10 days
Some days are good in a startup! Need 365 good days :D
1
0
3
@diptanu
Diptanu Choudhury
11 days
Does Rust have any crate for IVM for in-memory data structures?
0
0
0
@diptanu
Diptanu Choudhury
12 days
Here is the output from Dots on the same table. However, I wouldn't over-index on a single example.
@diptanu
Diptanu Choudhury
12 days
Been tinkering with DeepSeek OCR. Doesn't seem to work that well on wireless tables. Take a look at this section from an investment report. I wouldn't call this SOTA like some people on X are calling it, but it's a great model! It produces 400-500 tok/s on an A100 with vLLM and
1
0
0