
Felipe Cruz-Salinas
@fffffelipec
Followers: 209 · Following: 2K · Media: 8 · Statuses: 142
We have a new Vision Model! Try it out :).
Excited to reveal what I've been working on for the last few months. Command-A-Vision is our new flagship 112B VLM that outperforms Llama 4 Maverick, Mistral Medium/Pixtral Large, GPT 4.1, and others. We release weights on HF and hope you'll like it.
RT @TheOneKloud: Excited to reveal what I've been working on for the last few months. Command-A-Vision is our new flagship 112B VLM that ou….
This is very cool. One of the reasons I think muP hasn't caught on is that it is not seamlessly integrated with torch. Optax can make some things annoying, but this one is nice :).
muP has been on my mind forever! Now I came across this gem from @JesseFarebro: it automatically handles it in JAX/Flax 😍. Just need to see what to adjust for Muon / Shampoo / PSGD-kron (init params + LR scaling).
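A minimal sketch of the per-parameter handling these tweets are talking about (hidden-weight init std and Adam learning rate shrinking with width), written against optax rather than the JAX/Flax package from the quoted tweet; the widths, the labelling rule, and names like `mup_label` are illustrative assumptions, not anyone's released code.

```python
# Illustrative muP-style scaling with optax -- a sketch, not the package
# referenced above. Widths, label rules, and hyperparameters are assumptions.
import jax
import jax.numpy as jnp
import optax

BASE_WIDTH = 256   # width at which the base hyperparameters were tuned
WIDTH = 1024       # width of the scaled-up model

def mup_label(path, leaf):
    # Hidden weight matrices get the muP treatment; biases, norms, and
    # embeddings keep the base learning rate (standard muP rules for Adam).
    name = "/".join(str(p) for p in path)
    return "hidden" if leaf.ndim == 2 and "embed" not in name else "base"

def make_mup_optimizer(params, base_lr=1e-3):
    labels = jax.tree_util.tree_map_with_path(mup_label, params)
    return optax.multi_transform(
        {
            # muP with Adam: hidden-layer learning rate shrinks as 1/width.
            "hidden": optax.adam(base_lr * BASE_WIDTH / WIDTH),
            "base": optax.adam(base_lr),
        },
        labels,
    )

# Init follows the usual recipe: hidden weights with std ~ 1/sqrt(fan_in);
# muP additionally scales the readout by 1/width (not shown here).
key = jax.random.PRNGKey(0)
params = {
    "embed": jnp.zeros((32_000, WIDTH)),
    "dense": jax.random.normal(key, (WIDTH, WIDTH)) / jnp.sqrt(WIDTH),
    "bias": jnp.zeros((WIDTH,)),
}
opt = make_mup_optimizer(params)
opt_state = opt.init(params)
```

The 1/width factor here is the Adam prescription; it is presumably the part that would need re-deriving for Muon / Shampoo / PSGD-kron, which is what the tweet flags.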
RT @dianaabagyan: A huge thank you to all of my mentors and collaborators, especially @ahmetustun89, @sarahookr, @alexrs95, and @mziizm for….
arxiv.org
Pretraining massively multilingual Large Language Models (LLMs) for many languages at once is challenging due to limited model capacity, scarce high-quality data, and compute constraints....
RT @Cohere_Labs: How can we make language models more flexible to adapt to new languages after pretraining? 🌏. 🧠 Our latest work investigat….
RT @magikarp_tokens: 🔠 UTF-8 was never meant for language models. Yet every major tokenizer still uses it, creating unfair "byte premiums"…
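For a concrete sense of the "byte premium" being described: roughly equivalent short sentences cost very different numbers of UTF-8 bytes per character depending on the script, so byte-based tokenizers implicitly charge some languages more. The snippet below only illustrates that encoding disparity; the sentences are arbitrary examples, not data from the linked work.

```python
# UTF-8 bytes per character for roughly equivalent greetings in different
# scripts: ASCII Latin letters take 1 byte, Cyrillic 2, Devanagari and CJK 3.
samples = {
    "English": "Hello, how are you?",
    "Russian": "Привет, как дела?",
    "Hindi":   "नमस्ते, आप कैसे हैं?",
    "Chinese": "你好，你好吗？",
}
for lang, text in samples.items():
    chars = len(text)
    nbytes = len(text.encode("utf-8"))
    print(f"{lang:8s} chars={chars:3d} bytes={nbytes:3d} bytes/char={nbytes/chars:.2f}")
```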
RT @irombie: I'm excited to share our new pre-print, ShiQ: Bringing back Bellman to LLMs! In this work, we propose…
arxiv.org
The fine-tuning of pre-trained large language models (LLMs) using reinforcement learning (RL) is generally formulated as direct policy optimization. This approach was naturally favored as it...
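For context on the contrast in that abstract, here are the standard textbook forms of the two ingredients: the REINFORCE-style gradient behind direct policy optimization, and the Bellman optimality equation over token-level Q-values (state = prompt plus tokens generated so far, action = next token). These are generic definitions for orientation only, not the losses proposed in ShiQ.

```latex
% Direct policy optimization: follow the gradient of expected sequence reward.
\[
  \nabla_\theta J(\theta)
    = \mathbb{E}_{x,\; y \sim \pi_\theta(\cdot \mid x)}
      \bigl[ R(x, y)\, \nabla_\theta \log \pi_\theta(y \mid x) \bigr]
\]
% Bellman optimality for token-level Q-values, with s_t = (x, y_{<t}) and
% a_t the next token.
\[
  Q^\ast(s_t, a_t) = r(s_t, a_t) + \gamma \max_{a'} Q^\ast(s_{t+1}, a')
\]
```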
RT @mziizm: 1/ Science is only as strong as the benchmarks it relies on. So how fair—and scientifically rigorous—is today’s most widely us….
Up on arXiv! Now you can cite the largest model using muP (((:
arxiv.org
In this report we describe the development of Command A, a powerful large language model purpose-built to excel at real-world enterprise use cases. Command A is an agent-optimised and...
The Command A tech report is out: lots of really useful post-training details and a couple of interesting pre-training nuggets :))))
We’re redefining what’s possible with AI. With the release of our latest model, Command A, optimized for real-world agentic and multilingual tasks, we’re demonstrating our commitment to bringing enterprises AI that goes beyond the ordinary and offers security and efficiency.
RT @cohere: We’re redefining what’s possible with AI. With the release of our latest model, Command A, optimized for real-world agentic a….
RT @lmarena_ai: 🚀 Big news: @cohere's latest Command A now climbs to #13 on Arena! Another organization joining the top-15 club - congrats…