Caleb Writes Code
@calebwrites0101
Followers
397
Following
50
Media
33
Statuses
121
AI Content Creator
USA
Joined March 2025
GPT-5.2 takes me back to the June-September 2024 era of model releases.
0
0
0
Orbital data centers start to make sense if @SpaceX is able to cut launch costs 10x.
- solar panels
- radiation
- build + launch cost
- payload
- graphics cards
- levelized cost of electricity
0
0
0
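A back-of-envelope way to line up the factors listed in the post above. Every number here is an illustrative assumption (not a real SpaceX, hardware, or grid figure), just to show how the comparison would be set up.

# Rough orbital vs. ground electricity-cost comparison. All inputs are
# illustrative assumptions for the factors listed above, not real figures.
launch_cost_per_kg = 1500 / 10          # assume ~$1,500/kg today, cut 10x
panel_and_rack_kg_per_kw = 30           # assumed orbital mass per kW of IT load
build_cost_per_kw = 4000                # assumed hardware + radiation-hardening cost
lifetime_years = 5
hours = lifetime_years * 365 * 24       # orbit gives near-continuous sunlight

orbital_capex_per_kw = build_cost_per_kw + panel_and_rack_kg_per_kw * launch_cost_per_kg
orbital_cost_per_kwh = orbital_capex_per_kw / hours   # ignores ops, downlink, failures

ground_lcoe_per_kwh = 0.05              # assumed ground LCOE incl. cooling overhead
print(f"orbital ~${orbital_cost_per_kwh:.3f}/kWh vs ground ~${ground_lcoe_per_kwh:.3f}/kWh")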
Throwback (1943) to McCulloch & Pitts. Research papers back then did not read as poetically as they do in the AI industry now...
0
0
0
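For reference, the formal neuron from that 1943 paper boils down to a threshold unit over binary inputs. A minimal sketch (weights and threshold here are chosen just to realize AND, not taken from the paper):

# McCulloch-Pitts style threshold neuron: binary inputs, fixed weights,
# fires (outputs 1) when the weighted sum reaches the threshold.
def mp_neuron(inputs, weights, threshold):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

# Example: with these illustrative weights/threshold it computes logical AND.
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, mp_neuron(x, weights=(1, 1), threshold=2))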
An agent is like giving an LLM a job. And an LLM is like a recent college grad with no work experience.
0
0
2
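A sketch of what "giving an LLM a job" looks like in code: a loop in which the model either requests a tool call (which gets executed and fed back) or returns a final answer. call_llm and the tool registry are hypothetical stand-ins, not any particular vendor's API.

# Minimal agent loop sketch. `call_llm` is a hypothetical stand-in for a real
# model API; here it answers immediately so the example runs end to end.
def call_llm(messages):
    return {"answer": f"(stub) finished: {messages[0]['content']}"}

TOOLS = {
    "search": lambda query: f"(stub) results for {query!r}",
}

def run_agent(task, max_steps=10):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(messages)                     # {"tool": ..., "args": ...} or {"answer": ...}
        if "answer" in reply:
            return reply["answer"]                     # the "job" is done
        result = TOOLS[reply["tool"]](reply["args"])   # do the requested work
        messages.append({"role": "tool", "content": result})
    return "stopped after max_steps"

print(run_agent("summarize the Q3 report"))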
There are no words to describe what I am feeling right now, what a huge honor. Thank you everyone at @arcprize, especially @fchollet, @GregKamradt, @mikeknoop for choosing TRM for the 2025 #ARC Paper Award! Special thanks to @k_schuerholt for replicating TRM!
ARC Prize 2025 Paper Award Winners
1st / "Less is More: Recursive Reasoning with Tiny Networks" (TRM) / A. Jolicoeur-Martineau / $50k
2nd / "Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI" (SOAR) / J. Pourcel et al. / $20k
3rd /
73
56
1K
Thank you for your support. 40,000 subscribers is... a lot of people. I will continue to provide as unfiltered and honest a voice on the AI industry as I can!
3
0
11
"It's just a bunch of numbers" is probably the best way to describe LLMs
0
0
0
[3] Granted, I only need about 128k max for most things I do, but RAG is too annoying to set up and honestly a bit too choppy. I can totally see the benefit of a 10-100M context window for certain tasks, and having the LLM handle that natively would be very cool!
0
0
1
[2] A 1M context window is roughly 10 books. It's arguable that that much context is "good enough," since it's hard to think of tasks that would require more to begin with, but it does raise an interesting thought for me.
1
0
1
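The "roughly 10 books" figure follows from the usual ballpark conversions; the numbers below are common approximations, not measurements.

# Rough arithmetic behind "1M tokens is about 10 books".
context_tokens = 1_000_000
words_per_token = 0.75      # common English-text approximation
words_per_book = 75_000     # typical novel-length book

books = context_tokens * words_per_token / words_per_book
print(f"~{books:.0f} books")   # ~10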
[1] One thing I've been thinking about since the DeepSeek v3.2 model release is how DSA (DeepSeek Sparse Attention) could potentially push the context window limit beyond 1-2M and still manage it effectively.
1
0
1
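Not DeepSeek's actual DSA mechanism, but a toy top-k sparse attention pass to show the general idea behind sparse attention: each query only attends to its k highest-scoring keys. (This toy version still materializes the full score matrix; a real implementation selects keys without doing that, which is where the long-context savings come from.)

# Toy top-k sparse attention (illustration only; DSA's learned/indexed
# key selection is not reproduced here).
import numpy as np

def topk_sparse_attention(q, k, v, top_k=64):
    # q: (n, d), k/v: (m, d); each query attends only to its top_k keys by score.
    scores = q @ k.T / np.sqrt(q.shape[-1])               # (n, m)
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k][:, None]
    scores = np.where(scores >= kth, scores, -np.inf)     # mask all but top_k per row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                    # (n, d)

q, k, v = np.random.randn(4, 32), np.random.randn(1024, 32), np.random.randn(1024, 32)
print(topk_sparse_attention(q, k, v, top_k=64).shape)     # (4, 32)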
Introducing the Mistral 3 family of models: Frontier intelligence at all sizes. Apache 2.0. Details in 🧵
175
833
5K
Huge! @TianhongLi6 & Kaiming He (inventor of ResNet) just introduced JiT (Just image Transformers)! JiTs are simple large-patch Transformers that operate on raw pixels; no tokenizer, pre-training, or extra losses needed. By predicting clean data on the natural-data manifold,
8
119
760
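The "large-patch Transformer on raw pixels" part is easy to picture: flatten big pixel patches and project them linearly, with no tokenizer in between. A rough sketch of that front end (patch size and widths here are arbitrary, not the paper's settings):

# Sketch of "raw pixels -> large patches -> linear projection".
# Patch size and dimensions are arbitrary, not JiT's actual config.
import numpy as np

def patchify(image, patch=32):
    h, w, c = image.shape                          # e.g. (256, 256, 3)
    patches = image.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    return patches                                 # (num_patches, patch*patch*c)

rng = np.random.default_rng(0)
img = rng.random((256, 256, 3))
embed = rng.standard_normal((32 * 32 * 3, 768)) * 0.02     # linear patch embedding
tokens = patchify(img) @ embed
print(tokens.shape)  # (64, 768) -- these go straight into a plain Transformer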
GPT-5.1 is now available in the API. Pricing is the same as GPT-5. We are also releasing gpt-5.1-codex and gpt-5.1-codex-mini in the API, specialized for long-running coding tasks. Prompt caching now lasts up to 24 hours! Updated evals in our blog post.
683
454
6K
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200-300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built
587
2K
10K
Introducing Kimi CLI Technical Preview & Kimi For Coding! Kimi CLI powers your terminal:
- Shell-like UI + shell command execution
- Seamless Zsh integration
- MCP support
- Agent Client Protocol (now compatible with @zeddotdev)
More features incoming!
45
160
1K