
André Silva
@andre15silva_
Followers: 142
Following: 7K
Media: 17
Statuses: 1K
PhD at KTH 🧑🍳 ML on Code
Stockholm, Sweden
Joined November 2015
RT @bjarnihaukur11: The funniest (unintentional) reward hack I saw while training my coding agent: it "rm -rf"'d the repo it was working on…
These and other updates are available at repairbench.github.io
Explore RepairBench, the leaderboard of frontier models for program repair.
2️⃣ Gemini 2.5 Pro shows good progress from Google. Gemini 2.5 Pro improves on its predecessors, with a Plausible@1 score of 38.3% vs. 33.2% for the previous generation. Despite this, it still falls short of the Claude and DeepSeek models.
1️⃣ Quasar-Alpha is a strong contender. Quasar-Alpha, a stealth model available on OpenRouter, has made the news. With a Plausible@1 score of 40.5%, it is approaching the performance of leading models like Claude-3.5. The big question is: who is behind Quasar-Alpha?
RT @mokita_j: 🔥 sb-heists is now featured in this week's @blockthreat newsletter! 🔍 91 reproducible exploits for 9 blockchain vulnerabilitie…
RT @martinmonperrus: OpenAI strikes back and reclaims first place 🥇 on the RepairBench leaderboard for automated bug fixing. https://t.co/…