Ivan Provilkov
@provilkov
Followers: 150
Following: 795
Media: 15
Statuses: 118
Research & Engineering @togethercompute; Ex @YandexResearch; Building Products; Sapere aude!
Dublin
Joined June 2020
excited to be partnering with amazing folks @togethercompute, @ZainHasan6 and @provilkov to bring dynamic agent simulations to together evals.
Together AI 🤝@CollinearAI Introducing TraitMix, Collinear’s simulation product empowering teams to generate persona-driven AI agent interactions. 🔌Plug these interactions into your workflows and evaluate their effectiveness with Together Evals. Details:
🚀Now you can fine-tune LLMs from the @huggingface hub using @togethercompute!🔥
• Public + private repos
• CausalLMs <100B params
• Push tuned models back to the Hub
Smaller, open models + smart fine-tuning > bigger closed ones. Link below👇
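A minimal sketch of what that looks like, assuming the Together Python SDK's `files.upload` and `fine_tuning.create` calls; the model ID and file name are placeholders, and the Hub-specific options from the announcement (private repos, pushing the tuned model back) are omitted since their exact parameters live in the docs.

```python
# Sketch: upload a training file and start a fine-tuning job via the
# Together Python SDK (names of the file and base model are placeholders).
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

# Upload a JSONL training file and reference it by ID.
train_file = client.files.upload(file="train.jsonl")

# Start a fine-tuning job on a base model.
job = client.fine_tuning.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Reference",  # placeholder
    training_file=train_file.id,
)
print(job.id, job.status)
```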
🚨 Stop shipping LLMs blind. Together Evaluations is here — fast, flexible, LLM-as-a-judge-based benchmarking to:
✅ Compare model outputs
✅ Score responses against your own criteria
✅ Classify outputs into custom labels — from safety to sentiment
Run our early preview today
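The product API isn't shown here, but the underlying LLM-as-a-judge pattern is simple. A rough sketch, assuming the Together Python SDK's OpenAI-style chat interface; the judge model name and prompt are only illustrative.

```python
# LLM-as-a-judge sketch (not the Together Evals API itself): score a
# candidate answer against custom criteria with a judge model.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

def judge(question: str, answer: str, criteria: str) -> str:
    prompt = (
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        f"Criteria: {criteria}\n"
        "Reply with a single integer score from 1 (poor) to 5 (excellent)."
    )
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",  # assumed judge model
        messages=[{"role": "user", "content": prompt}],
        max_tokens=8,
    )
    return resp.choices[0].message.content.strip()

print(judge("What is 2+2?", "4", "Is the answer factually correct?"))
```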
Announcing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. Built in
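For readers unfamiliar with the metric: Pass@1 is the expected fraction of problems solved with a single attempt, while test-time scaling here presumably means drawing multiple rollouts per problem and selecting among them. A sketch of the standard unbiased pass@k estimator (Chen et al., 2021), not code from DeepSWE itself:

```python
# Standard unbiased pass@k estimator for a single problem.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = total samples per problem, c = number that pass, k = attempt budget."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

print(pass_at_k(n=8, c=3, k=1))  # 0.375: expected success rate with 1 attempt
```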
.@togethercompute is building 2 gigawatts of AI factories (~100,000 GPUs) in the EU over the next 4 years with the first phase live in H2 2025. AI compute is at <1% saturation relative to our 2035 forecast and we are starting early to build a large-scale sustainable AI cloud
together.ai
Together GPU Clusters with NVIDIA Blackwell & Hopper GPUs deliver fast distributed training, flexible scaling, and expert AI support.
🛠️ Your AI models shouldn’t be static—they should evolve with your users. Introducing Together Fine-Tuning with Direct Preference Optimization & Continued Training: build custom models that continuously adapt. Details below 👇
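Not Together's implementation, just a sketch of the standard DPO objective the feature is named after: it pushes the policy to give the preferred response a larger reference-relative log-probability margin than the rejected one.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is the summed log-probability of a (prompt, response)
    pair under the trainable policy or the frozen reference model.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between preferred and dispreferred responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with a batch of 2 preference pairs.
pl_c = torch.tensor([-12.0, -20.0]); pl_r = torch.tensor([-15.0, -19.0])
rf_c = torch.tensor([-13.0, -21.0]); rf_r = torch.tensor([-14.0, -18.0])
print(dpo_loss(pl_c, pl_r, rf_c, rf_r))
```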
I'll be there at the beginning of February. Join if you're interested
Welcome to ZuGrama!!! A 6-week residency in Kerala where builders, learners, and dreamers come together to create the future. 🏡 Themes & Tracks:
⚖️ Governance: Jan 6 - Jan 12
🚚 Impact & Public Goods: Jan 13 - Jan 19
⚕️ Longevity: Jan 20 - Jan 26 (Biotech, DeSci)
I finally encountered a meaningful explanation of quantum mechanics effects. From my university courses on quantum physics/theoretical physics I remember only some experimental facts, and that we had damn huge mathematical equations, most of which I already completely
One of the main problems with LLMs and other sequential algorithms is the accumulation of errors during the generation process. One of the fundamental solutions to such problems is the discretization of the output. If your system calculates a variable with an error, like instead
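A toy sketch of the idea (mine, not from the thread): a sequential process whose per-step error stays below half the spacing of a discrete grid can snap back to the grid at every step, so the error never accumulates.

```python
# Per-step noise accumulates in a continuous process but is absorbed each
# step when the state is snapped to a discrete grid (here the integers),
# as long as the noise stays below half the grid spacing.
import random

def run(steps=1000, noise=0.01, discretize=False):
    x = 0.0
    for _ in range(steps):
        x += 1.0 + random.uniform(-noise, noise)  # intended increment is exactly 1
        if discretize:
            x = float(round(x))  # snap to the nearest grid point
    return x - steps  # deviation from the exact answer

print("continuous drift :", run(discretize=False))  # random walk, grows like sqrt(steps)
print("discretized drift:", run(discretize=True))   # exactly 0.0
```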
Have you ever struggled to find some document from your medical history? Do you have digital storage for them?
In summary, the "curse of dimensionality" seems to me like a product of our poor ability to operate on multidimensional objects, not some natural property. I came across this fact again recently while reading "The Beginning of Infinity" by @DavidDeutschOxf. Great book, btw.
2) Probability makes sense only in the context of a measure space, and we can define this space in many different ways. Why don't we say that "close to the edge" means exactly the 5% of points closest to the edge? I mean, instead of adding dimensions 1 by 1, why don't we operate with the whole space from
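A quick Monte Carlo sketch of that alternative definition (mine, not from the thread): take the 5% of points closest to the cube's boundary. The distance threshold shrinks with the dimension, but the fraction labelled "close to the edge" is 5% by construction, so nothing concentrates anywhere.

```python
# 5%-quantile of the distance to the boundary of the unit cube:
# the threshold shrinks with N, but exactly 5% of points fall under it.
import random

def edge_threshold_5pct(n_dims, n_points=50_000):
    dists = []
    for _ in range(n_points):
        point = [random.random() for _ in range(n_dims)]
        dists.append(min(min(c, 1 - c) for c in point))  # distance to the boundary
    dists.sort()
    return dists[int(0.05 * n_points)]  # 5% quantile

for n in (1, 2, 10, 100):
    print(n, round(edge_threshold_5pct(n), 4))  # ~0.025, ~0.0127, ~0.0026, ~0.0003
```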
This metric does not depend on N much. Why? I think because it is easier to visualize 1 dimension at a time. But we can use other metrics: for example, simply multiply distances by N. With such an update, the % of points within a fixed 0.02 of the edges will not grow. Of course, it is a
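Checking that claim numerically (my sketch, under the scaled metric): a point counts as within 0.02 of the edge when N times its distance to the boundary is at most 0.02, which happens with probability 1 - (1 - 0.04/N)^N. That converges to 1 - e^(-0.04), roughly 3.9%, instead of growing toward 1.

```python
# Fraction of the unit cube "within 0.02 of the edge" when the distance to
# the boundary is multiplied by N: P = 1 - (1 - 0.04 / N) ** N.
import math

for n in (1, 2, 10, 100, 1000):
    print(n, round(1 - (1 - 0.04 / n) ** n, 4))
print("limit:", round(1 - math.exp(-0.04), 4))  # ~0.0392
```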
Why do I not like it? 1) Here we implicitly use a particular metric to measure "near the edge": the maximum absolute difference between corresponding components of the vectors (see image). But it is not the only metric!
This can also be reformulated in simpler terms: as we add new dimensions, the probability of at least one coordinate being close to the edge grows and approaches 1.
It usually comes with an example like this: take the points within 0.02 of the edges of the segment [0,1], i.e. [0, 0.02] combined with [0.98, 1], or 4% of the points. For a 2-dimensional square, 1 - 0.96^2 of the elements, or approximately 7.8%, are "near the edge." For N dimensions, the
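The same computation carried out for a few more dimensions (a quick sketch): under this definition the fraction of points "near the edge" is 1 - 0.96^N, which quickly approaches 1.

```python
# Fraction of the unit N-cube with at least one coordinate within 0.02 of
# 0 or 1: each coordinate is "interior" with probability 0.96,
# so the near-edge fraction is 1 - 0.96 ** N.
for n in (1, 2, 3, 10, 50, 100):
    print(n, f"{1 - 0.96 ** n:.1%}")
# 1 -> 4.0%, 2 -> 7.8%, 10 -> 33.5%, 100 -> 98.3%
```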
I don't understand why people mention the "curse of dimensionality," particularly the statement "probability concentrates on the edges in high dimensions," as a fascinating fact. I think this is misleading. Here is why: