Ivan Provilkov
@provilkov
Followers: 150
Following: 795
Media: 15
Statuses: 118
Research & Engineering @togethercompute; Ex @YandexResearch; Building Products; Sapere aude!
Dublin
Joined June 2020
excited to be partnering with amazing folks @togethercompute, @ZainHasan6 and @provilkov to bring dynamic agent simulations to together evals.
Together AI 🤝@CollinearAI Introducing TraitMix, Collinear’s simulation product empowering teams to generate persona-driven AI agent interactions. 🔌Plug these interactions into your workflows and evaluate their effectiveness with Together Evals. Details:
🚀Now you can fine-tune LLMs from the @huggingface hub using @togethercompute!🔥
• Public + private repos
• CausalLMs <100B params
• Push tuned models back to the Hub
Smaller, open models + smart fine-tuning > bigger closed ones. Link below👇
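A minimal sketch of what that looks like, assuming the Together Python SDK's `files.upload` and `fine_tuning.create` calls; the model ID and file name are placeholders, and the Hub-specific options from the announcement (private repos, pushing the tuned model back) are omitted since their exact parameters live in the docs.

```python
# Sketch: upload a training file and start a fine-tuning job via the
# Together Python SDK (names of the file and base model are placeholders).
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

# Upload a JSONL training file and reference it by ID.
train_file = client.files.upload(file="train.jsonl")

# Start a fine-tuning job on a base model.
job = client.fine_tuning.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Reference",  # placeholder
    training_file=train_file.id,
)
print(job.id, job.status)
```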
🚨 Stop shipping LLMs blind. Together Evaluations is here — fast, flexible, LLM-as-a-judge-based benchmarking to:
✅ Compare model outputs
✅ Score responses against your own criteria
✅ Classify outputs into custom labels — from safety to sentiment
Run our early preview today
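The product API isn't shown here, but the underlying LLM-as-a-judge pattern is simple. A rough sketch, assuming the Together Python SDK's OpenAI-style chat interface; the judge model name and prompt are only illustrative.

```python
# LLM-as-a-judge sketch (not the Together Evals API itself): score a
# candidate answer against custom criteria with a judge model.
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

def judge(question: str, answer: str, criteria: str) -> str:
    prompt = (
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        f"Criteria: {criteria}\n"
        "Reply with a single integer score from 1 (poor) to 5 (excellent)."
    )
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",  # assumed judge model
        messages=[{"role": "user", "content": prompt}],
        max_tokens=8,
    )
    return resp.choices[0].message.content.strip()

print(judge("What is 2+2?", "4", "Is the answer factually correct?"))
```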
Announcing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. Built in
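For readers unfamiliar with the metric: Pass@1 is the expected fraction of problems solved with a single attempt, while test-time scaling here presumably means drawing multiple rollouts per problem and selecting among them. A sketch of the standard unbiased pass@k estimator (Chen et al., 2021), not code from DeepSWE itself:

```python
# Standard unbiased pass@k estimator for a single problem.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = total samples per problem, c = number that pass, k = attempt budget."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

print(pass_at_k(n=8, c=3, k=1))  # 0.375: expected success rate with 1 attempt
```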
.@togethercompute is building 2 gigawatts of AI factories (~100,000 GPUs) in the EU over the next 4 years with the first phase live in H2 2025. AI compute is at <1% saturation relative to our 2035 forecast and we are starting early to build a large-scale sustainable AI cloud
together.ai
Together GPU Clusters with NVIDIA Blackwell & Hopper GPUs deliver fast distributed training, flexible scaling, and expert AI support.
🛠️ Your AI models shouldn’t be static—they should evolve with your users. Introducing Together Fine-Tuning with Direct Preference Optimization & Continued Training: build custom models that continuously adapt. Details below 👇
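Not Together's implementation, just a sketch of the standard DPO objective the feature is named after: it pushes the policy to give the preferred response a larger reference-relative log-probability margin than the rejected one.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is the summed log-probability of a (prompt, response)
    pair under the trainable policy or the frozen reference model.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between preferred and dispreferred responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with a batch of 2 preference pairs.
pl_c = torch.tensor([-12.0, -20.0]); pl_r = torch.tensor([-15.0, -19.0])
rf_c = torch.tensor([-13.0, -21.0]); rf_r = torch.tensor([-14.0, -18.0])
print(dpo_loss(pl_c, pl_r, rf_c, rf_r))
```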
I'll be there at the beginning of February. Join if you're interested
Welcome to ZuGrama!!! A 6-week residency in Kerala where builders, learners, and dreamers come together to create the future. 🏡 Themes & Tracks:
⚖️ Governance: Jan 6 - Jan 12
🚚 Impact & Public Goods: Jan 13 - Jan 19
⚕️ Longevity: Jan 20 - Jan 26 (Biotech, DeSci)
I finally encountered a meaningful explanation of quantum mechanics effects. From my university courses on quantum physics/theoretical physics I remember only some experimental facts, and that we had damn huge mathematical equations, most of which I already completely
One of the main problems with LLMs and other sequential algorithms is the accumulation of errors during the generation process. One of the fundamental solutions to such problems is the discretization of the output. If your system calculates a variable with an error, like instead
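A toy sketch of the idea (mine, not from the thread): a sequential process whose per-step error stays below half the spacing of a discrete grid can snap back to the grid at every step, so the error never accumulates.

```python
# Per-step noise accumulates in a continuous process but is absorbed each
# step when the state is snapped to a discrete grid (here the integers),
# as long as the noise stays below half the grid spacing.
import random

def run(steps=1000, noise=0.01, discretize=False):
    x = 0.0
    for _ in range(steps):
        x += 1.0 + random.uniform(-noise, noise)  # intended increment is exactly 1
        if discretize:
            x = float(round(x))  # snap to the nearest grid point
    return x - steps  # deviation from the exact answer

print("continuous drift :", run(discretize=False))  # random walk, grows like sqrt(steps)
print("discretized drift:", run(discretize=True))   # exactly 0.0
```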
Have you ever struggled to find some document from your medical history? Do you have digital storage for them?
In summary, the "curse of dimensionality" seems to me like a product of our poor ability to operate on multidimensional objects, not some natural property. I came across this fact again recently while reading "The Beginning of Infinity" by @DavidDeutschOxf. Great book, btw.
2) Probability makes sense only in the context of a measure space, and we can define this space in many different ways. Why don't we say that "close to the edge" means exactly the 5% of points closest to the edge? I mean, instead of adding dimensions 1 by 1, why don't we operate with the whole space from
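A quick Monte Carlo sketch of that alternative definition (mine, not from the thread): take the 5% of points closest to the cube's boundary. The distance threshold shrinks with the dimension, but the fraction labelled "close to the edge" is 5% by construction, so nothing concentrates anywhere.

```python
# 5%-quantile of the distance to the boundary of the unit cube:
# the threshold shrinks with N, but exactly 5% of points fall under it.
import random

def edge_threshold_5pct(n_dims, n_points=50_000):
    dists = []
    for _ in range(n_points):
        point = [random.random() for _ in range(n_dims)]
        dists.append(min(min(c, 1 - c) for c in point))  # distance to the boundary
    dists.sort()
    return dists[int(0.05 * n_points)]  # 5% quantile

for n in (1, 2, 10, 100):
    print(n, round(edge_threshold_5pct(n), 4))  # ~0.025, ~0.0127, ~0.0026, ~0.0003
```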
This metric does not depend on N much. Why? I think because it is easier to visualize 1 dimension at a time. But we can use other metrics: for example, simply multiply distances by N. With such an update, the % of points within a fixed 0.02 of the edges will not grow. Of course, it is a
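Checking that claim numerically (my sketch, under the scaled metric): a point counts as within 0.02 of the edge when N times its distance to the boundary is at most 0.02, which happens with probability 1 - (1 - 0.04/N)^N. That converges to 1 - e^(-0.04), roughly 3.9%, instead of growing toward 1.

```python
# Fraction of the unit cube "within 0.02 of the edge" when the distance to
# the boundary is multiplied by N: P = 1 - (1 - 0.04 / N) ** N.
import math

for n in (1, 2, 10, 100, 1000):
    print(n, round(1 - (1 - 0.04 / n) ** n, 4))
print("limit:", round(1 - math.exp(-0.04), 4))  # ~0.0392
```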
Why do I not like it? 1) Here we implicitly use a particular metric to measure "near the edge": the maximum absolute difference between corresponding components of the vectors (see image). But it is not the only metric!
This can also be reformulated in simpler terms: as we add new dimensions, the probability of at least one coordinate being close to the edge grows and approaches 1.
It usually comes with an example like this: take the points within 0.02 of the edges of the segment [0,1], i.e. [0, 0.02] combined with [0.98, 1], or 4% of the points. For a 2-dimensional square, 1 - 0.96^2 of the elements, or approximately 7.8%, are "near the edge." For N dimensions, the
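The same computation carried out for a few more dimensions (a quick sketch): under this definition the fraction of points "near the edge" is 1 - 0.96^N, which quickly approaches 1.

```python
# Fraction of the unit N-cube with at least one coordinate within 0.02 of
# 0 or 1: each coordinate is "interior" with probability 0.96,
# so the near-edge fraction is 1 - 0.96 ** N.
for n in (1, 2, 3, 10, 50, 100):
    print(n, f"{1 - 0.96 ** n:.1%}")
# 1 -> 4.0%, 2 -> 7.8%, 10 -> 33.5%, 100 -> 98.3%
```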
I don't understand why people mention the "curse of dimensionality," particularly the statement "probability concentrates on the edges in high dimensions," as a fascinating fact. I think this is misleading. Here is why: