Theodor Marcu
@theodormarcu
Followers
3K
Following
10K
Media
204
Statuses
2K
head of product growth @cognition
San Francisco, CA
Joined April 2012
This is a big milestone for the research team. Post-training is a very exciting place to be in right now.
We are sharing an early preview of our ongoing SWE-1.6 training run. It significantly improves upon SWE-1.5 while being post-trained on the same pre-trained model - and it runs equally as fast at 950 tok/s. On SWE-Bench Pro it exceeds top open-source models. The preview model
1
0
29
@dangreenoh @Opendoor im sorry but after my mortgage expert @devinai one shot this quiz, i couldn't resist some guerilla marketing
3
1
9
People continue to underestimate how agents will continue to drive up the demand for software
so it turns out you actually *can* one-shot vibecode @DoorDash
i gave my coding agent access to Vercel, Supabase, and Exa MCP and it figured everything out. it also tested its own work using its computer. this is not just a frontend that only works in localhost but a functional
0
0
24
We use Devin every day to build Devin. In fact, Devin is the single biggest contributor to our codebase. We’re sharing a look inside our workflows, tools, and playbooks. Read below.
6
7
170
damn I didn't realize @moritz_stephan was such a 🐐
2 years ago Devin struggled to write 100 PRs across our entire team. Now @moritz_stephan 4xed that number by himself. In less than one month!
1
0
28
Devin with computer use has been a game changer for 10x engineers like @edis0n_zhang
Devin testing has been a huge unlock. Here's a full uncut video of Devin testing a new banner it made in Windsurf. It did the full testing flow on its own:
- Building local Windsurf
- Opening the developer console
- Modifying local storage
- Validating the alert is triggered upon
0
0
19
This stuff is getting really amazing. Big release from Scott and team.
Introducing Devin 2.2 – the autonomous agent that can test with computer use, self-verify, and auto-fix its work. Try it for free!
We’ve also overhauled Devin from the ground up:
- 3x faster startup
- fully redesigned interface
- computer use + virtual desktop
...and hundreds
6
8
204
Sonnet 4.6 is now available in @windsurf
Claude Sonnet 4.6 is live in Windsurf, with support for 1M token context! Sonnet 4.6 is also available in Arena Mode’s Frontier and Hybrid Battle Groups. Let us know what you think!
4
10
215
It's awesome to see Kimi K2.5 climbing the rankings in the Arena Mode leaderboard in the past few days
The Arena Mode Public Leaderboard is live!
Top Frontier models:
1. Opus 4.6
2. Opus 4.5
3. Sonnet 4.5
Top Fast models:
1. SWE 1.5
2. Haiku 4.5
3. Gemini 3 Flash Low
Live leaderboard link and analysis below.
1
0
7
instruct your agents to use @cognition DeepWiki MCP to "interview" libraries it should use to solve a problem
ai doesn't use enough libraries, and when it does, it uses them wrong
3
5
73
Arena Mode leaderboard is out!
- 40,000 votes in first week (code arena has 140k lifetime votes)
- first in-product arena at scale
- first arena NOT to penalize “fast but good enough”
- major upsets:
  • Gemini 3 Flash beat Gemini 3 Pro
  • @xai Grok Code Fast beat Gemini 3
  • Claude
20
3
56