Tran Rick (@TranRick2)
Followers: 48 · Following: 277 · Media: 36 · Statuses: 2K
PhD in computer science | Technical Lead at Foxconn AI | making large neural networks run on edge devices
Taoyuan County, Taiwan
Joined January 2021
Hugging Face has released a 214-page MASTERCLASS on how to train LLMs
> it’s called The Smol Training Playbook
> and if you want to learn how to train LLMs,
> this GIFT is for you
> this training bible walks you through the ENTIRE pipeline
> covers every concept that matters from
Replies: 18 · Reposts: 219 · Likes: 1K
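As context for what a training pipeline boils down to at the smallest scale, here is a minimal next-token training step in PyTorch with Hugging Face transformers. This is a generic sketch, not material from the playbook, and the gpt2 model/tokenizer is just a stand-in.

```python
# Minimal next-token prediction training step (illustrative only;
# not taken from The Smol Training Playbook).
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = AdamW(model.parameters(), lr=3e-4)

batch = tokenizer(["The quick brown fox jumps over the lazy dog."],
                  return_tensors="pt")
# labels == input_ids: the model shifts them internally for next-token loss
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```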
Updated & turned my Big LLM Architecture Comparison article into a narrated video lecture. The 11 LLM architectures covered in this video:
1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi K2
9. GPT-OSS
10. Grok 2.5
11. GLM-4.5
Replies: 40 · Reposts: 495 · Likes: 3K
RAG is not Memory for AI Agents. 5 AI memory engines to build agents that maintain long-term context and learn continuously (last 2 released just this month):
1. Zep builds and queries temporally-aware knowledge graphs that evolve with every interaction. 100% open source.
Replies: 18 · Reposts: 96 · Likes: 541
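To make "temporally-aware" concrete, here is a toy Python sketch of a memory store that records when facts become valid and invalid, so an agent can ask "what was true at time t?" rather than only "what is similar to this query?". This is a hypothetical illustration of the idea, not Zep's actual SDK or data model.

```python
# Toy temporally-aware memory (hypothetical; NOT Zep's API).
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: datetime | None = None  # None = still true

class TemporalMemory:
    def __init__(self):
        self.facts: list[Fact] = []

    def add(self, fact: Fact) -> None:
        # A new fact about the same subject/predicate closes out the old one
        for f in self.facts:
            if (f.subject, f.predicate) == (fact.subject, fact.predicate) and f.valid_to is None:
                f.valid_to = fact.valid_from
        self.facts.append(fact)

    def at(self, t: datetime) -> list[Fact]:
        # Return only facts valid at time t
        return [f for f in self.facts
                if f.valid_from <= t and (f.valid_to is None or t < f.valid_to)]

mem = TemporalMemory()
mem.add(Fact("user", "employer", "Acme", datetime(2023, 1, 1)))
mem.add(Fact("user", "employer", "Foxconn", datetime(2024, 6, 1)))
print(mem.at(datetime(2023, 6, 1)))  # -> the Acme fact, not the current one
```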
Web scraping will never be the same! Firecrawl just released the new v2 endpoint with 10x faster scraping and semantic crawling. Firecrawl lets you input a URL, crawl it, and convert it into clean LLM-ready data.
Replies: 20 · Reposts: 287 · Likes: 2K
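A minimal sketch of the URL-in, markdown-out flow over Firecrawl's REST API. The request shape follows the documented v1 API; the new v2 endpoint path and parameters may differ, so check the docs before relying on this.

```python
# Scrape one URL into LLM-ready markdown (v1-style endpoint; v2 may differ).
import requests

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",          # v2 path may differ
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"url": "https://example.com", "formats": ["markdown"]},
)
resp.raise_for_status()
# Response shape assumed from the v1 docs: {"success": ..., "data": {"markdown": ...}}
markdown = resp.json()["data"]["markdown"]
print(markdown[:500])
```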
Stop Prompting LLMs. Start Programming LLMs. Introducing DSPy by Stanford NLP. This is why you need to learn it:
Replies: 14 · Reposts: 152 · Likes: 1K
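The "programming, not prompting" idea in one small example: in DSPy you declare a signature for what you want and the framework builds the prompt. A minimal sketch using DSPy's documented API; the model name is a placeholder.

```python
import dspy

# Any LM supported by dspy.LM works here; the model id is a placeholder.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# "question -> answer" is a declarative signature, not a handwritten prompt;
# ChainOfThought adds intermediate reasoning automatically.
qa = dspy.ChainOfThought("question -> answer")
result = qa(question="What does DSPy optimize instead of prompt strings?")
print(result.answer)
```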
GPT-5 just casually did new mathematics. Sebastien Bubeck gave it an open problem from convex optimization, something humans had only partially solved. GPT-5-Pro sat down, reasoned for 17 minutes, and produced a correct proof improving the known bound from 1/L all the way to
Claim: gpt-5-pro can prove new interesting mathematics. Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than what is in the paper, and I checked the proof; it's correct. Details below.
Replies: 979 · Reposts: 3K · Likes: 25K
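For context on where the $1/L$ in these tweets comes from (the standard smooth setting; the specific open problem is not detailed here): for an $L$-smooth function $f$ and gradient descent $x_{k+1} = x_k - \eta \nabla f(x_k)$, the descent lemma gives

```latex
\[
  f(x_{k+1}) \;\le\; f(x_k) - \eta\Bigl(1 - \tfrac{L\eta}{2}\Bigr)\,\bigl\|\nabla f(x_k)\bigr\|^{2},
\]
```

where the progress coefficient $\eta\,(1 - L\eta/2)$ is positive for $0 < \eta < 2/L$ and maximized at the classical step size $\eta = 1/L$. Results that push guarantees past $1/L$ have to argue more carefully than this one-step bound.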
curious about the training data of OpenAI's new gpt-oss models? i was too. so i generated 10M examples from gpt-oss-20b, ran some analysis, and the results were... pretty bizarre. time for a deep dive 🧵
Replies: 126 · Reposts: 514 · Likes: 6K
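A sketch of what "generating examples from gpt-oss-20b" could look like with Hugging Face transformers, using the model's public Hub id. For 10M samples you would want a batched inference server (e.g. vLLM); this only shows the idea, and assumes a GPU with enough memory.

```python
from transformers import pipeline

# openai/gpt-oss-20b is the open-weights model named in the thread
generator = pipeline("text-generation", model="openai/gpt-oss-20b",
                     device_map="auto")

samples = generator(
    "Hello",                 # short prompts probe the model's default distribution
    max_new_tokens=128,
    do_sample=True,
    temperature=1.0,
    num_return_sequences=4,
)
for s in samples:
    print(s["generated_text"][:200])
```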
Introducing Codestral Embed, the new state-of-the-art embedding model for code.
Replies: 27 · Reposts: 153 · Likes: 1K
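A minimal sketch of embedding a code snippet with Codestral Embed through the mistralai Python client. The model identifier is taken from the announcement; verify the current name against Mistral's docs.

```python
from mistralai import Mistral

client = Mistral(api_key="YOUR_API_KEY")
resp = client.embeddings.create(
    model="codestral-embed",   # name as announced; confirm in the docs
    inputs=["def add(a, b):\n    return a + b"],
)
vector = resp.data[0].embedding   # one float vector per input snippet
print(len(vector))
```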
I'm doing a podcast with @sundarpichai soon. Let me know if you have any questions / topic suggestions. The rate of AI progress has been insane. It makes me excited for the future (even more than usual 🤣) and excited to chat with leaders & engineers who are building that
Replies: 846 · Reposts: 259 · Likes: 5K
NeMo RL is now open source! It replaces NeMo-Aligner and is the toolkit we use to post-train the next generations of our models. Give it a try:
github.com
Scalable toolkit for efficient model reinforcement - NVIDIA-NeMo/RL
Replies: 5 · Reposts: 65 · Likes: 394
A must-read on RL by Google DeepMind research scientist Kevin Murphy just dropped on arXiv. It gives a clear, updated overview of deep RL and sequential decision-making, with examples.
Replies: 12 · Reposts: 140 · Likes: 998
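The core object in the sequential decision-making framing such an overview covers is the Bellman optimality equation for the action-value function (standard material, included here for context):

```latex
\[
  Q^{*}(s,a) \;=\; \mathbb{E}\bigl[r \mid s,a\bigr]
  \;+\; \gamma \sum_{s'} P(s' \mid s,a)\,\max_{a'} Q^{*}(s',a'),
\]
```

where $\gamma \in [0,1)$ discounts future reward. Deep RL approximates $Q^{*}$ (or a policy) with neural networks.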
FoxBrain #LLM debut at @NVIDIAGTC. “Amazing talk! I mean, like, top 3 of all the talks I attended this week, so congrats!” says a participant at the Q&A for #GTC25 Session Talk [S74035]: From Open Source to Frontier #AI: Build, Customize, and Extend Foundation Models @HonHai_Foxconn
Replies: 0 · Reposts: 1 · Likes: 3
🚀 Excited to Speak at NVIDIA GTC 2025: The Journey Behind FoxBrain! 🚀 Our session is tomorrow, where I’ll be sharing insights into FoxBrain, Foxconn’s first prototype foundation model! 🎉
Replies: 0 · Reposts: 0 · Likes: 0
Introducing Mistral Small 3.1. Multimodal, Apache 2.0, outperforms Gemma 3 and GPT-4o mini. https://t.co/BHLAAaKZ9w
Replies: 269 · Reposts: 1K · Likes: 8K
FoxBrain has sped up adoption of inference & AI servers, says @HonHai_Foxconn #YoungLiu at 4Q24 #Investor Call. Come see why. Thu 3/20 #GTC25 Session Talk [S74035]: From Open Source to Frontier #AI ... Foundation Models ➡️ https://t.co/3EmemGoqFY Booth 323 @NVIDIAGTC #LLM
Replies: 0 · Reposts: 3 · Likes: 5
MCP is going crazy viral right now🤯 AI apps can now instantly connect to any tool or live data. USB-C moment for AI. 10 wild examples: https://t.co/qLnMkgBA0H
Replies: 286 · Reposts: 1K · Likes: 9K
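To show what "connect to any tool or live data" means in practice, here is a minimal MCP server using the official Python SDK's FastMCP helper. The tool name and its fake return value are made up for illustration; any live data source could sit behind it.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def get_stock_price(ticker: str) -> str:
    """Return a (fake) live price -- stands in for any real data source."""
    return f"{ticker}: 123.45 USD"

if __name__ == "__main__":
    mcp.run()  # serves the Model Context Protocol (stdio transport by default)
```

Any MCP-capable client (a desktop assistant, an agent framework) can then discover and call `get_stock_price` without bespoke glue code, which is the "USB-C" point of the tweet.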
Today @cohere is very excited to introduce Command A, our new model succeeding Command R+. Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. 🧵
Replies: 29 · Reposts: 119 · Likes: 823
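A minimal sketch of calling Command A through Cohere's v2 chat API. The model identifier is assumed from the release naming; verify it against Cohere's docs.

```python
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")
resp = co.chat(
    model="command-a-03-2025",  # identifier assumed; check Cohere's docs
    messages=[{"role": "user",
               "content": "Summarize this repo's README in 3 bullets."}],
)
print(resp.message.content[0].text)
```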
HOLY SHIT, Sesame Labs just dropped CSM (Conversational Speech Model) - Apache 2.0 licensed! 💥
> Trained on 1 MILLION hours of data 🤯
> Contextually aware, emotionally intelligent speech
> Voice cloning & watermarking
> Ultra fast, real-time synthesis
> Based on Llama
Replies: 128 · Reposts: 639 · Likes: 5K
🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview
Optimized throughput and latency via:
🔧 Cross-node EP-powered batch scaling
🔄 Computation-communication overlap
⚖️ Load balancing
Statistics of DeepSeek's Online Service:
⚡ 73.7k/14.8k
github.com
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation - deepseek-ai/open-infra-index
Replies: 782 · Reposts: 1K · Likes: 9K
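A toy sketch of the computation-communication overlap idea listed above, in PyTorch. This illustrates the general technique only, not DeepSeek's EP implementation: with async collectives, the NCCL communication kernel runs on its own stream and overlaps with compute launched afterwards.

```python
import torch
import torch.distributed as dist

def overlapped_step(x: torch.Tensor, grads: torch.Tensor) -> torch.Tensor:
    # Kick off communication without blocking; NCCL work runs on its own stream
    work = dist.all_reduce(grads, async_op=True)
    # Independent compute proceeds while the all-reduce is in flight
    y = torch.relu(x @ x.T)
    # Block only at the point where the communication result is needed
    work.wait()
    return y

# Assumes the process group is already initialized, e.g.
#   dist.init_process_group("nccl")
# under torchrun with one GPU per rank.
```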
🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. 🔗 https://t.co/GBtxSvWLT4
✅ EPLB - an expert-parallel load balancer for V3/R1. 🔗
github.com
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training. - deepseek-ai/DualPipe
Replies: 445 · Reposts: 817 · Likes: 6K