Franck Lebeau

@_Kcnarf

Followers
564
Following
4K
Media
447
Statuses
4K

#AI in #NLP, dataviz expert, full stack dev, math_art and day-to-day enthusiast; also, PhD in CS, and trying to reduce my environmental footprint

Joined December 2016
@_Kcnarf
Franck Lebeau
2 years
I find Voronoi treemaps really appealing, because of their special look and feel, which (I guess) makes this kind of #dataviz somehow attractive. I even made a JS/@d3js_org plugin (cf. https://t.co/cyH6QB6so9). This 🧵 thread is just a collection of tweets with #voronoTreemap
108
8
35
@AmirZur2000
Amir Zur
1 month
1/6 🦉 Did you know that telling an LLM that it loves the number 087 also makes it love owls? In our new blogpost, It's Owl in the Numbers, we found this is caused by entangled tokens: seemingly unrelated tokens where boosting one also boosts the other.
owls.baulab.info
Entangled tokens help explain subliminal learning.
18
72
663
@tomaarsen
tomaarsen
1 month
😎 I just published Sentence Transformers v5.1.0, and it's a big one. 2x-3x speedups of SparseEncoder models via ONNX and/or OpenVINO backends, easier distillation data preparation with hard negatives mining, and more! See 🧵 for the deets:
1
15
132
@antoine_chaffin
Antoine Chaffin
1 month
Obviously it was caught by @_reachsumit before the official announcement! I am very happy to announce that PyLate now has an associated paper and it has been accepted to CIKM! Very happy to share this milestone with my dear co-creator @raphaelsrty 🫶
@_reachsumit
Sumit
1 month
PyLate: Flexible Training and Retrieval for Late Interaction Models. @antoine_chaffin et al. introduce a streamlined library extending Sentence Transformers to support multi-vector architectures. 📝 https://t.co/xBDQc5x0J6 👨🏽‍💻 https://t.co/YJwqbZaxHe
3
7
38
@_Kcnarf
Franck Lebeau
1 month
🤔 Did you know that LLMs produce a probability for each token in the vocabulary? Only afterwards comes the choice of the final output token. 👌 Here is a crystal-clear yet insightful explanation of the various techniques used to choose the next token
@AICoffeeBreak
AI Coffee Break with Letitia
1 month
How do LLMs pick the next word? They don't choose words directly: they only output word probabilities. 📊 Greedy decoding, top-k, top-p, min-p are methods that turn these probabilities into actual text. In this video, we break down each method and show how the same model can
0
1
3
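The four methods named in the tweet above can be sketched in a few lines of Python. This is only a minimal illustration, not code from the video: the function names (`greedy`, `top_k`, `top_p`, `min_p`, `sample`) and the toy probabilities are assumptions made for the example, and real libraries usually work on logits in batches rather than plain lists.

```python
import math
import random

def softmax(logits):
    """Turn raw model scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def renormalize(probs, keep):
    """Zero out every token not in `keep` and rescale the rest to sum to 1."""
    total = sum(probs[i] for i in keep)
    return [p / total if i in keep else 0.0 for i, p in enumerate(probs)]

def greedy(probs):
    """Greedy decoding: always return the index of the most probable token."""
    return max(range(len(probs)), key=probs.__getitem__)

def top_k(probs, k):
    """Top-k: keep only the k most probable tokens."""
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return renormalize(probs, set(ranked[:k]))

def top_p(probs, p):
    """Top-p (nucleus): keep the smallest set of top tokens whose mass reaches p."""
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    keep, cumulative = set(), 0.0
    for i in ranked:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)

def min_p(probs, ratio):
    """Min-p: keep tokens whose probability is at least ratio * the top probability."""
    threshold = ratio * max(probs)
    keep = {i for i, q in enumerate(probs) if q >= threshold}
    return renormalize(probs, keep)

def sample(probs, rng=random):
    """Draw one token index at random according to the (filtered) probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy distribution over a 4-token vocabulary (made-up numbers).
probs = [0.5, 0.3, 0.15, 0.05]
next_token = sample(top_p(probs, 0.7))  # samples among tokens 0 and 1 only
```

Greedy decoding is deterministic, while the three filters only narrow the distribution; a random draw with `sample` still decides the final token.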
@florian_tramer
Florian Tramèr
1 month
Are hallucinated references making it to arXiv? Yes, definitely! Since the release of Deep Research in February bogus references are on the rise (coincidence?) I wrote a blog post (link below) on my analysis (which hugely underestimates the true rate of hallucinations...)
9
27
287
@_Kcnarf
Franck Lebeau
1 month
TL;DR: AI + Good VS Bad engineer. Good engineer + AI ≥ 10 × good engineer ≥ 100 × bad engineer + AI
@svpino
Santiago
1 month
Every vibe-coder is generating as much technical debt as 10 regular developers in half the time. Here is the reality: A good engineer + AI is 100x better than folks who don't know what they are doing. Don't get carried away by the hype. Knowledge matters today more than ever.
0
0
0
@jobergum
Jo Kristian Bergum
2 months
I think more AI builders now recognize that the core quality concern is context confusion, not context window length limitations. Lots of agent implementations now let users compress context to avoid quality degradation.
8
5
76
@AdamRackis
Adam Rackis
2 months
From the AI workshop I'm in: "The S in MCP stands for security"
46
183
2K
@YungSungChuang
Yung-Sung Chuang
2 months
Scaling CLIP on English-only data is outdated now… 🌎 We built CLIP data curation pipeline for 300+ languages 🇬🇧 We train MetaCLIP 2 without compromising English-task performance (it actually improves!) 🥳 It's time to drop the language filter! 📝 https://t.co/pQuwzH053M [1/5] 🧵
3
80
293
@_Kcnarf
Franck Lebeau
2 months
I second that
@stevekrouse
Steve Krouse
2 months
Vibe code is legacy code @karpathy coined vibe coding as a kind of AI-assisted coding where you "forget that the code even exists" We already have a phrase for code that nobody understands: legacy code Legacy code is universally despised, and for good reason. But why? You have
0
0
0
@Datawrapper
Datawrapper
2 months
📊 In this week's Data Vis Dispatch: U.S. tariffs, humanitarian crisis in Gaza, and much more. 🗞️ https://t.co/0LCwKJrNV9
0
1
5
@_Kcnarf
Franck Lebeau
2 months
Imho, this illustrates current AI competency, which is distinct from intelligence
@StatedClearly
Stated Clearly - Jon Perry
2 months
Either tap water is intelligent, or our intelligence tests are silly. Link to full video below. @yoginho @drmichaellevin @michaelshermer
0
0
0
@etiennejcb
Etienne Jacob
2 months
I've been exploring physarum-style simulations on and off for a long time, and I finally wrote an article sharing the techniques I've been using. It ranges from the classic physarum algorithm to some of my own weird tricks. I hope you enjoy it! https://t.co/LRnfr7rCRi
bleuje.com
Article explaining simulation algorithms that produce complex organic behaviours, starting with the classic physarum algorithm from Jeff Jones.
8
41
322
@raphaelsrty
Raphaël Sourty
2 months
With @LightOnIO we are thrilled to release pylate-rs 🚀⭐️ An efficient inference engine for late-interaction models written in Rust and based on Candle ⚡️ pylate-rs is the best Python library / Rust crate / NPM package to spawn late-interaction models in milliseconds.
5
17
102
@LightOnIO
LightOn
2 months
๐ŸŽ๏ธ Introducing PyLate-rs @raphaelsrty is back at it to make you love late-interaction models! After quickening the retrieval process with FastPlaid, heโ€™s now making inference lightning-fast with this lightweight tool crafted in Rust for optimal speed and efficiency! ๐Ÿ’ซAnd
@raphaelsrty
Raphaël Sourty
2 months
With @LightOnIO we are thrilled to release pylate-rs 🚀⭐️ An efficient inference engine for late-interaction models written in Rust and based on Candle ⚡️ pylate-rs is the best Python library / Rust crate / NPM package to spawn late-interaction models in milliseconds.
0
16
51
@antoine_chaffin
Antoine Chaffin
2 months
I'll be covering Reason-ModernColBERT in tonight's presentation, so please come if you are interested! https://t.co/wByxYGbvvS (And please be gentle, this is the first time I will be speaking live in front of this many people 😭)
maven.com
Single vector search is the standard for RAG pipelines, but struggles in real-world applications due to poor out-of-domain generalization and long-context handling. Multi-vector models overcome these...
@victorialslocum
Victoria Slocum
2 months
Looking for a cheaper, open source alternative to agentic RAG? Try multi-vector retrieval with Reason-ModernColBERT. Most search systems compare one big summary vector per document. Multi-vector retrieval is different - it keeps separate vectors for each word or phrase, then
5
14
87
@victorialslocum
Victoria Slocum
2 months
Looking for a cheaper, open source alternative to agentic RAG? Try multi-vector retrieval with Reason-ModernColBERT. Most search systems compare one big summary vector per document. Multi-vector retrieval is different - it keeps separate vectors for each word or phrase, then
11
90
504
@_Kcnarf
Franck Lebeau
2 months
I'm totally in. Hallucination is the open door to creativity. Anyone can use it, or close it (for better accuracy). But adhering to whatever the user says is the closed door to critical thinking. Yet no one has the key to open it.
@ai_for_success
AshutoshShrivastava
2 months
Hot take: The biggest flaw in LLMs isn't hallucination, it's that they agree with everything you say. Who's working on this? Superintelligence can wait
0
0
0
@_Kcnarf
Franck Lebeau
2 months
Why do humans excel at critical thinking despite limited memory bandwidth, while AIs remember it all yet stay stuck in blinkered thinking? 👇
@VictorTaelin
Taelin
2 months
so here's a challenge: I put you in a time chamber, I give you 10,000 books to read. for every word you read, I ask you to guess the next word. if you guess it right, I give you a cake. if you guess it wrong, I zap your butt. and when you're done, we start over again and again and
0
0
0
@_Kcnarf
Franck Lebeau
2 months
Why is this on my timeline ?
@karpathy
Andrej Karpathy
2 months
@chasedownleads Why is this on my timeline
0
0
0