Explore tweets tagged as #tokenizer
@NewsTokenizer
News Tokenizer.Estate
12 hours
Antier puts property investment on-chain: a platform to tokenize real estate, cut entry barriers, and boost liquidity. Aiming to reshape global property investment flows with blockchain rails. 🔗 #RWA #PropTech #crypto
@rvnizer
tokenizer
35 minutes
Q2–Q3 '22. Q2–Q3 '23. Q2–Q3 '24. Q2–Q3 '25. Four years melted away.
@Aleph__Alpha
Aleph Alpha
15 hours
Introducing two new tokenizer-free LLM checkpoints from our research lab: TFree-HAT 7B. Built on our Hierarchical Autoregressive Transformer (HAT) architecture, these models achieve top-tier German and English performance while processing text on a UTF-8 byte level.
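Tokenizer-free here means the model consumes raw UTF-8 bytes rather than subword IDs. A minimal sketch of what byte-level input looks like in plain Python (a generic illustration, not Aleph Alpha's actual preprocessing; the hierarchical pooling HAT adds on top of the byte stream is not shown):

    # Byte-level "tokenization": every UTF-8 byte is a token ID in 0..255,
    # so the vocabulary is tiny and any string round-trips losslessly.
    text = "Grüße aus Heidelberg"
    token_ids = list(text.encode("utf-8"))
    print(token_ids[:8])                     # [71, 114, 195, 188, 195, 159, 101, 32]
    print(bytes(token_ids).decode("utf-8"))  # recovers the original text exactly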
@_rohit_tiwari_
Rohit Kumar Tiwari
7 days
The only LLM cheatsheet you'll ever need ✅ Covers concepts, architectures, and applications.
1. Foundations
↳ Tokens (Tokenizer, BPE)
↳ Embeddings (Cosine Similarity)
↳ Attention (Formula, Multi-Head Attention)
2. Transformers Architecture and Variants
↳ BERT (Encoder
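The "Attention (Formula)" bullet refers to scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. A minimal single-head NumPy sketch (shapes and variable names are illustrative):

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # softmax(Q K^T / sqrt(d_k)) V for a single attention head.
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                 # (seq_q, seq_k) similarity logits
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V                              # weighted mix of value vectors

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)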
@KantaHayashiAI
Kanta Hayashi
18 days
Horizon Beta is definitely an OpenAI model, not Anthropic, Google, or Qwen. Why? It shows an OpenAI tokenizer quirk: 给主人留下些什么吧 as a single token. Like GPT-4o, it fails on prompts like: "When I provide Chinese text, please translate it into English. 给主人留下些什么吧"
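A claim like this is easy to probe with OpenAI's tiktoken library: if the string really is one token in GPT-4o's o200k_base vocabulary, encode() returns a single ID. A quick check (assumes tiktoken is installed):

    import tiktoken

    # encoding_for_model resolves GPT-4o to its o200k_base vocabulary.
    enc = tiktoken.encoding_for_model("gpt-4o")
    ids = enc.encode("给主人留下些什么吧")
    print(ids, len(ids))  # a single ID would confirm the one-token quirk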
@rvnizer
tokenizer
7 days
$gmt $gst
@HeMuyu0327
Muyu He
22 hours
A crazier bug in torchtune: they implement their own version of a model's tokenizer (p1), which means the tokenizer is not up to date with the latest chat template on Hugging Face (p2). This breaks fine-tuned qwen2.5 models because the system prompt will only
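One way to catch this kind of drift is to render the chat template straight from the tokenizer published on the Hugging Face Hub and diff it against what the fine-tuning framework actually feeds the model. A sketch using transformers (model name and messages are illustrative):

    from transformers import AutoTokenizer

    # Load the tokenizer, and its chat template, exactly as shipped on the Hub.
    tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
    rendered = tok.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    print(rendered)  # diff this against the prompt your training framework builds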
@BRNZ_ai
Brainz - AI Tokenizer For Best Crypto Projects
4 minutes
USD1 $205M mint + Coinbase listing = instant liquidity; token-tip yields compress.
@mervenoyann
merve
1 month
ByteDance released Tar 1.5B and 7B: image-text in, image-text out models 👏. They have an image tokenizer unified with text, and they de-tokenize using either of two models (LLM and diffusion). The model is actually a full LLM (Qwen2), the tokenizer converts image tokens 🤯
@NewsTokenizer
News Tokenizer.Estate
12 hours
Headway drops Nova 2.0: upgraded RWA infra for token issuers & investors. Faster onboarding, compliance-first, and global reach. Positioning as backbone for next-gen digital asset markets. 🔗 #Blockchain #RWA
@solpropp
solprop
26 days
I bid $tokenizer. 20k mc. Has potential. 8wvv622uqPFZtesZ6JiR7VoDizTRZgLhjx3wkbQXbonk. dyor
@NewsTokenizer
News Tokenizer.Estate
12 hours
MultiBank Group reveals MBG token burn plan: structured buybacks to reduce supply & drive token value. More utility + transparent supply control = stronger tokenomics narrative. 🔗 #Crypto #Tokenomics #RWA #Tokenization
@furongh
Furong Huang
1 month
💰 Suppose you're handed $1M and a zoo of tokenizers used by top LLMs. You want to train a new model, but which tokenizer should you pick? 🤔 Naively, you'd train one model per tokenizer. But with just $1M, that's a budget killer. 💸 Is there a smarter way to train once and
@gm8xx8
𝚐𝔪𝟾𝚡𝚡𝟾
26 days
𝐍𝐄𝐖 𝐈𝐍𝐓𝐄𝐑𝐍 𝐌𝐎𝐃𝐄𝐋. Intern-S1 is a 241B open multimodal model composed of a 235B MoE language model and a 6B vision encoder, trained on 5T tokens with over half from scientific domains. It supports unified text, image, and video reasoning, features a dynamic tokenizer
@freeCodeCamp
freeCodeCamp.org
1 month
If you really want to understand how LLMs work, try coding your own version of one from scratch. And that's exactly what you'll do in this course: build a Llama 4-like LLM from the bottom up. You'll build a tokenizer, learn about the attention mechanism, dive into Rotary
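Building a tokenizer from scratch typically starts with byte-pair encoding: repeatedly merge the most frequent adjacent pair of symbols. A toy training loop to show the idea (illustrative, not the course's actual code):

    from collections import Counter

    def train_bpe(words, num_merges):
        # Start from characters: each word is a tuple of single-char symbols.
        vocab = Counter(tuple(w) for w in words)
        merges = []
        for _ in range(num_merges):
            # Count adjacent symbol pairs, weighted by word frequency.
            pairs = Counter()
            for word, freq in vocab.items():
                for a, b in zip(word, word[1:]):
                    pairs[(a, b)] += freq
            if not pairs:
                break
            best = max(pairs, key=pairs.get)  # most frequent pair wins
            merges.append(best)
            # Apply the new merge rule everywhere it occurs.
            new_vocab = Counter()
            for word, freq in vocab.items():
                out, i = [], 0
                while i < len(word):
                    if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                        out.append(word[i] + word[i + 1])
                        i += 2
                    else:
                        out.append(word[i])
                        i += 1
                new_vocab[tuple(out)] += freq
            vocab = new_vocab
        return merges

    print(train_bpe(["low", "lower", "lowest", "low"], num_merges=3))
    # e.g. [('l', 'o'), ('lo', 'w'), ('low', 'e')]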
@rvnizer
tokenizer
8 days
$jasmy $aleo $avail $mana
@EnricoSantus
Enrico Santus
25 days
Keynote by @LukeZettlemoyer at #ACL2025. We are using trillions of tokens. They contain much more information than what we are currently able to extract. How can we improve the architectures to capture more signal? #Data #Architectures #LLMs #Tokenizer #ACL2025NLP. About 5,300