Tran Rick Profile
Tran Rick

@TranRick2

Followers: 48 | Following: 277 | Media: 36 | Statuses: 2K

PhD in Computer Science | Technical Lead at Foxconn AI | making large neural networks run on edge devices

Taoyuan County, Taiwan
Joined January 2021
@TheAhmadOsman
Ahmad
22 hours
Hugging Face has released a 214-page MASTERCLASS on how to train LLMs
> it’s called The Smol Training Playbook
> and if you want to learn how to train LLMs,
> this GIFT is for you
> this training bible walks you through the ENTIRE pipeline
> covers every concept that matters from
18
219
1K
@rasbt
Sebastian Raschka
4 months
Updated & turned my Big LLM Architecture Comparison article into a narrated video lecture. The 11 LLM architectures covered in this video:
1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi K2
9. GPT-OSS
10. Grok 2.5
11. GLM-4.5
40
495
3K
@unwind_ai_
Unwind AI
4 months
RAG is not Memory for AI Agents. 5 AI memory engines to build agents that maintain long-term context and learn continuously (the last 2 released just this month):
1. Zep builds and queries temporally-aware knowledge graphs that evolve with every interaction. 100% open source.
18
96
541
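The temporally-aware memory idea the Zep tweet describes can be sketched in plain Python. This is an illustrative toy, not Zep's API or data model: it just shows why timestamped facts let an agent answer both "what is true now?" and "what was true at time t?".

```python
import time
from collections import defaultdict

class TemporalMemory:
    """Toy temporally-aware memory: each fact about a subject is stored
    with a validity timestamp, so later facts supersede earlier ones
    without erasing the history."""

    def __init__(self):
        self._facts = defaultdict(list)  # subject -> [(timestamp, value)]

    def remember(self, subject, value, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        self._facts[subject].append((ts, value))
        self._facts[subject].sort()

    def recall(self, subject, at=None):
        """Return the most recent value for `subject` at time `at`
        (defaults to now), or None if nothing was known yet."""
        ts = time.time() if at is None else at
        best = None
        for fact_ts, value in self._facts[subject]:
            if fact_ts <= ts:
                best = value
        return best

# The agent's knowledge evolves with every interaction:
mem = TemporalMemory()
mem.remember("user.employer", "Acme", timestamp=100)
mem.remember("user.employer", "Foxconn", timestamp=200)
print(mem.recall("user.employer"))          # latest fact wins
print(mem.recall("user.employer", at=150))  # point-in-time query
```

A real engine like Zep additionally extracts entities and relations into a graph; the point here is only the temporal layer that plain RAG lacks.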
@Sumanth_077
Sumanth
4 months
Web scraping will never be the same! Firecrawl just released the new v2 endpoint with 10x faster scraping and semantic crawling. Firecrawl lets you input a URL, crawl it, and convert it into clean LLM-ready data.
20
287
2K
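The "input a URL, get LLM-ready data" flow maps onto a single HTTP call. The sketch below only builds the request; the endpoint path, header, and payload fields follow Firecrawl's documented REST shape for scraping, but the exact v2 schema is an assumption here, so check the official docs before relying on it.

```python
import json

def build_scrape_request(url, api_key, formats=("markdown",)):
    """Build the HTTP request for a Firecrawl-style scrape call.
    Endpoint path and payload fields are assumptions based on
    Firecrawl's documented REST API, not a verified v2 schema."""
    return {
        "method": "POST",
        "url": "https://api.firecrawl.dev/v2/scrape",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        # `formats` picks the LLM-ready outputs, e.g. clean markdown
        "body": json.dumps({"url": url, "formats": list(formats)}),
    }

req = build_scrape_request("https://example.com", api_key="fc-...")
print(req["body"])
```

Sending it with any HTTP client (and a real API key) returns the page as clean markdown rather than raw HTML.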
@mdancho84
🔥 Matt Dancho (Business Science) 🔥
4 months
Stop Prompting LLMs. Start Programming LLMs. Introducing DSPy by Stanford NLP. This is why you need to learn it:
14
152
1K
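The "program, don't prompt" idea behind DSPy can be shown in a few lines of plain Python: declare a signature ("question -> answer") and let the framework build the prompt, instead of hand-writing prompt strings. This is an illustrative toy, not DSPy's actual API.

```python
def make_predictor(signature):
    """Toy sketch of a DSPy-style module: parse a declarative
    signature like "question -> answer" and compile it into a
    prompt-building predictor. Not DSPy's real implementation."""
    inputs, output_field = [s.strip() for s in signature.split("->")]
    input_fields = [f.strip() for f in inputs.split(",")]

    def predict(llm, **kwargs):
        # Build the prompt from the declared fields, then ask the
        # LLM to complete the output field.
        lines = [f"{f.capitalize()}: {kwargs[f]}" for f in input_fields]
        lines.append(f"{output_field.capitalize()}:")
        return llm("\n".join(lines))

    return predict

# A fake LLM stands in for a real model in this sketch:
qa = make_predictor("question -> answer")
fake_llm = lambda prompt: "Paris" if "capital of France" in prompt else "unknown"
print(qa(fake_llm, question="What is the capital of France?"))  # Paris
```

Real DSPy goes much further (typed signatures, optimizers that tune prompts and few-shot examples against a metric), but the declarative signature is the core shift it introduces.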
@VraserX
VraserX e/acc
4 months
GPT-5 just casually did new mathematics. Sebastien Bubeck gave it an open problem from convex optimization, something humans had only partially solved. GPT-5-Pro sat down, reasoned for 17 minutes, and produced a correct proof improving the known bound from 1/L all the way to
@SebastienBubeck
Sebastien Bubeck
4 months
Claim: gpt-5-pro can prove new interesting mathematics. Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than what is in the paper, and I checked the proof; it's correct. Details below.
979
3K
25K
@jxmnop
dr. jack morris
5 months
curious about the training data of OpenAI's new gpt-oss models? i was too. so i generated 10M examples from gpt-oss-20b, ran some analysis, and the results were... pretty bizarre time for a deep dive 🧵
126
514
6K
@MistralAI
Mistral AI
7 months
Introducing Codestral Embed, the new state-of-the-art embedding model for code.
27
153
1K
@lexfridman
Lex Fridman
7 months
I'm doing a podcast with @sundarpichai soon. Let me know if you have any questions / topic suggestions. The rate of AI progress has been insane. It makes me excited for the future (even more than usual 🤣) and excited to chat with leaders & engineers who are building that
846
259
5K
@kuchaev
Oleksii Kuchaiev
8 months
NeMo RL is now open source! It replaces NeMo-Aligner and is the toolkit we use to post-train the next generations of our models. Give it a try.
github.com
Scalable toolkit for efficient model reinforcement - NVIDIA-NeMo/RL
5
65
394
@LiorOnAI
Lior Alexander
8 months
A must-read on RL by Google DeepMind research scientist Kevin Murphy just dropped on arXiv. It gives a clear, updated overview of deep RL and sequential decision-making, with examples.
12
140
998
@FoxconnNews
Foxconn News & Policy
9 months
FoxBrain #LLM debut at @NVIDIAGTC “Amazing talk! I mean like top 3 of all the talks I attended this week, so congrats!” says participant at Q&A for #GTC25 Session Talk [S74035]: From Open Source to Frontier #AI: Build, Customize, and Extend Foundation Models @HonHai_Foxconn
0
1
3
@TranRick2
Tran Rick
9 months
🚀 Excited to Speak at NVIDIA GTC 2025: The Journey Behind FoxBrain! 🚀 Our session tomorrow, where I’ll be sharing insights into FoxBrain, Foxconn’s first prototype foundation model! 🎉
0
0
0
@MistralAI
Mistral AI
10 months
Introducing Mistral Small 3.1. Multimodal, Apache 2.0, outperforms Gemma 3 and GPT 4o-mini. https://t.co/BHLAAaKZ9w
269
1K
8K
@FoxconnNews
Foxconn News & Policy
10 months
FoxBrain has sped up adoption of inference & AI servers, says @HonHai_Foxconn #YoungLiu at 4Q24 #Investor Call. Come see why. Thu 3/20 #GTC25 Session Talk [S74035]: From Open Source to Frontier #AI ... Foundation Models ➡️ https://t.co/3EmemGoqFY Booth 323 @NVIDIAGTC #LLM
0
3
5
@minchoi
Min Choi
10 months
MCP is going crazy viral right now🤯 AI apps can now instantly connect to any tool or live data. USB-C moment for AI. 10 wild examples: https://t.co/qLnMkgBA0H
286
1K
9K
@aidangomez
Aidan Gomez
10 months
Today @cohere is very excited to introduce Command A, our new model succeeding Command R+. Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. 🧵
29
119
823
@reach_vb
Vaibhav (VB) Srivastav
10 months
HOLY SHIT, Sesame Labs just dropped CSM (Conversational Speech Model) - Apache 2.0 licensed! 💥
> Trained on 1 MILLION hours of data 🤯
> Contextually aware, emotionally intelligent speech
> Voice cloning & watermarking
> Ultra fast, real-time synthesis
> Based on Llama
128
639
5K
@deepseek_ai
DeepSeek
10 months
🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview
Optimized throughput and latency via:
🔧 Cross-node EP-powered batch scaling
🔄 Computation-communication overlap
⚖️ Load balancing
Statistics of DeepSeek's Online Service:
⚡ 73.7k/14.8k
github.com
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation - deepseek-ai/open-infra-index
782
1K
9K
@deepseek_ai
DeepSeek
10 months
🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
🔗 https://t.co/GBtxSvWLT4
✅ EPLB - an expert-parallel load balancer for V3/R1.
🔗
github.com
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training. - deepseek-ai/DualPipe
445
817
6K
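The load-balancing goal behind EPLB can be sketched with a simple greedy heuristic: place the heaviest experts first, each onto the currently least-loaded GPU. This toy illustrates the objective only, not DeepSeek's actual algorithm (real EPLB also replicates hot experts across ranks).

```python
import heapq

def balance_experts(expert_loads, num_gpus):
    """Greedy expert placement: heaviest expert first, onto the
    least-loaded GPU so far. A toy in the spirit of EPLB's goal of
    evening out per-GPU load in expert parallelism."""
    # Min-heap of (total_load, gpu_id) tracks the lightest GPU.
    heap = [(0.0, gpu) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    placement = {}
    for expert, load in sorted(expert_loads.items(), key=lambda kv: -kv[1]):
        total, gpu = heapq.heappop(heap)
        placement[expert] = gpu
        heapq.heappush(heap, (total + load, gpu))
    return placement

# Hypothetical per-expert token loads for 6 experts on 2 GPUs:
loads = {"e0": 9.0, "e1": 7.0, "e2": 4.0, "e3": 3.0, "e4": 2.0, "e5": 1.0}
print(balance_experts(loads, num_gpus=2))
```

With these loads both GPUs end up carrying 13.0, whereas a naive round-robin split would leave one GPU noticeably hotter.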