jasmine wang
@jasminechenwang
Followers
45
Following
124
Media
1
Statuses
70
Yoga, Cognitive Science, Open Source technology, Startups, Sunset at the Beach
Joined June 2022
This is a big milestone for Lance format. The F3 paper (https://t.co/hVREwxykSn) verified that Lance has THE fastest random access, essential for search, shuffle, and many other AI workloads. But it incorrectly assumed this was because of a lack of compression. With 2.1, we show
dl.acm.org
Columnar storage formats are the foundation for modern data analytics systems. The proliferation of open-source file formats (i.e., Parquet, ORC) allows seamless data sharing across disparate...
Lance File 2.1 Is Now Stable 🥳 Big news from the LanceDB team: Lance File Format 2.1 is officially stable! This release solves one of the biggest challenges from 2.0: adding compression without sacrificing random access performance.
3
12
45
Join us for our webinar on Apache Spark™ and Lance Spark Connector with Jack Ye (@lancedb) on September 25! Learn how the Lance Spark Connector enables Apache Spark™ to work with Lance's AI-native multimodal storage.
We'll look at how Spark can handle embeddings, images,
1
1
4
When building a columnar file reader, it becomes clear that structure is not just an abstract concept. (https://t.co/9TGr34de1n) It is the set of rules that determines how every byte of data is stored and accessed on disk. A few months ago,
1
3
19
The data prep bottleneck for fine-tuning LLMs is a common challenge. Our new integration with Meta's Synthetic Data Kit is here to fix that! It simplifies the entire workflow with a straightforward CLI for
0
1
3
Video from @TMLS_TO: @character_ai x @LanceDB on building a unified multimodal data lake, a single system for text, audio, video & image retrieval. @changhiskhan @ryanvilim Simpler pipelines, lower infra costs, faster AI dev. Watch: https://t.co/bt21gm8dwZ
#AI #LLM
0
3
7
@swyx @jxmnop - built by solid db people and hackable (we have a contributor at nomic to it) - used by top ai companies / labs / products for its nice properties when used in training loops (e.g. midjourney has been using it since 2023) so probably not going anywhere - feels like the right
1
2
11
q from the audience: "Is Lance the next big thing in data?" answer: "Yes"
2
3
20
We just published a new blog (https://t.co/nT0lF1sbmH) on what the Multimodal Lakehouse actually does. The Lakehouse is for those working with a mix of text, images, audio, and structured data - who wish to avoid the pain of
1
3
12
Today we're announcing our $30 million Series A. This round is led by @Theoryvc with support from @CRV, @ycombinator, @databricks, @runwayml, @ZeroPrimeVC, @swift_vc, and more. Your belief in a future powered by multimodal data brings us one step closer to that reality.
16
39
200
Missed Ethan's talk at @DataCouncilAI 2025? He shares how @RunwayML tackles multimodal data challenges, and how LanceDB helps store, query, and retrieve it all efficiently. Watch here: https://t.co/MhNXKY7sx0 Ethan's slides: https://t.co/sxWcAfnWUb
#LanceDB
0
6
24
Live at #DataAISummit from @databricks @DbrxMosaicAI and @lancedb. A joint talk by @changhiskhan and Zero Qu. Congrats to both teams on the newly announced storage-optimized vector search. Now we take billion-vector scale to the moon!
0
1
5
Join @character_ai and @lancedb at the upcoming @TMLS_TO for a joint talk on "A Unified Multimodal Data Lake for Next-Generation AI". @changhiskhan will be there! Time: June 13th, virtual talk Register: https://t.co/XJlr77Lo3E Btw,
0
2
4
We just released a walkthrough on how to ingest the full Wiki41M dataset (that's 41 million rows of Wikipedia) into @lancedb in ~11 minutes. What you'll learn: • How to generate embeddings at scale • Ingest massive datasets into LanceDB Cloud
3
12
76
LanceDB is now SOC 2 Type II, HIPAA, and GDPR compliant. We're built for secure, privacy-conscious AI applications, from startups to enterprises. #AIInfrastructure #DataPrivacy #GDPR #HIPAA #SOC2
0
2
4
The weather is supposed to be much nicer tmr in NYC - who wants to hang out near Bryant Park?
0
2
9
Monthly Newsletter update from LanceDB: • Research paper on arXiv • New Lancelots knighted • Guides on rerankers + embeddings • Case studies: @continuedev & @AnythingLLM • Big product upgrades https://t.co/iFPQBIi6tc
0
2
10
The AI infra team at @character_ai is particularly special to me personally. When I joined LanceDB, this was the first frontier model company that I worked closely with. An exceptionally talented team that saw the value in what we were building at @lancedb. Thank you guys :)
Say hello to Nat Roth, our newest Lancelot! He was the first engineer at @character_ai to work on #lance, contributing to our FTS and retrieval stack early on. Big Boston sports fan, trivia champ, and @TomBrady loyalist. Hi Tom, if you're reading this.
0
0
1
This work not only addresses the challenges faced by current storage solutions but also sets the stage for future innovations in data management. If you're interested in the intersection of AI, data storage, and performance optimization, I invite you to read our paper and explore
1
1
1