Markus Zimmermann @zimmskal X Profile

Markus Zimmermann

@zimmskal

Followers

2K

Following

4K

Media

258

Statuses

4K

Benchmarking LLMs to check how well they write quality code. Support me using the profile link 👇

Linz

Joined November 2010

Markus Zimmermann

@zimmskal

4 months

New models on the DevQualityEval leaderboard for v1.0:
- Arcee AI: Coder Large
- Google: Gemini 2.5 Pro (2025-05) (preview)
- Microsoft: Phi 4 Reasoning Plus 15B
- Mistral: Mistral Medium 3

Enjoy and discuss! 🌈

devqualityeval.com

Take a look at the DevQualityEval Leaderboard (v1.0) to find your best LLM for coding and other software development tasks.

2

9

Markus Zimmermann

@zimmskal

2 months

Feature request @willhaben: fixed meetups with a place, date, and time that are binding for both sides. If a person is not there within $agreeable-delay, the other person collects $amount, e.g. 5 euros. I would already be a millionaire with that.

0

Markus Zimmermann

@zimmskal

2 months

Was just a matter of time until we went full auto-SEO. Time to fix some things. 🏎️

Thomas Schranz 🍄

@__tosh

2 months

I built a simple tool that takes raw Google Search Console data and turns it into an actionable audit for what to do next to get more traffic. But it's not only about more traffic. It's also about getting better traffic, where the search intent of the user actually meets. 1/n.

0

1

Markus Zimmermann

@zimmskal

2 months

Wait, was this about coding agents vs developers?

0

Markus Zimmermann

@zimmskal

2 months

"You know, I think we'll get to full self-driving next year. As a generalized solution, I think". Sorry @elonmusk, but I think you must come up with a new repeatable future prediction quote, like right now!

Tesla

@Tesla

2 months

First @robotaxi experiences in thread below.

0

1

Markus Zimmermann

@zimmskal

2 months

@xeophon_ 🤷

1

0

1

Markus Zimmermann

@zimmskal

2 months

Is this real?!? I do not understand how this news does not have a billion likes and shares, but one thing is clear: this will change the lives of many, not just in academia. I hope others follow as well. But a decade too late, I guess.

Marcel Böhme👨‍🔬

@mboehme_

2 months

100% of ACM publications available for free from 1st January 2026! 🎉 Landmark achievement!

1

7

Markus Zimmermann

@zimmskal

3 months

If you are into (coding) agents, this is pretty nice to dig into.

Mario Zechner

@badlogicgames

3 months

A new entry to my popular series "LLM tools for plebs": claude-trace.
- Injects itself into Claude Code
- Logs all traffic
- Reconstructs conversations and shows what's going on behind the scenes (system prompts, all tool inputs/outputs, and more)

Some observations. 🧵

0

Markus Zimmermann

@zimmskal

3 months

Our kid number 2 is a super simple finite state machine:
- drink
- eat
- drink
- play
- poop
- sleep
- repeat

Deviate from that master plan and you will get screamed at 🫡

0

3

Markus Zimmermann

@zimmskal

3 months

💯 just need to let them build some harnesses to do those jobs.

NIK

@ns123abc

3 months

Anthropic researchers: “Even if AI progress completely stalls today and we don’t reach AGI… the current systems are already capable of automating ALL white-collar jobs within the next five years”. It’s over.

0

1

Markus Zimmermann

@zimmskal

3 months

W O W. This is just freaking amazing! Would love to see the prompts for these 🙀

Philipp Schmid

@_philschmid

3 months

PURE INSANITY! Here is a 5 minute long compilation showcasing the craziest things people are generating with @GoogleDeepMind VEO 3. 🤯 You won't believe your eyes! Sound on🔊. [source: reddit r/singularity]

0

1

0

Markus Zimmermann

@zimmskal

4 months

I just got demoed an amazing new model and was asked about my favorite question that I usually prompt. When the first reasoning models came out, I used to have one that is not related to up-to-date data or coding: `Actually create a proof for the P versus NP problem. Make a plan on how to.

0

1

4

Markus Zimmermann

@zimmskal

4 months

devqualityeval.com

Take a look at the DevQualityEval Leaderboard (v1.0) to find your best LLM for coding and other software development tasks.

0

1

Markus Zimmermann

@zimmskal

4 months

For all metrics and graphs: (which goes directly into the fund of benchmarking the models).

1

0

Markus Zimmermann

@zimmskal

4 months

New models on the DevQualityEval leaderboard for v1.0:
- Google: Gemini 2.5 Flash (preview)
- Inception: Mercury Coder Small (beta)
- Rerun of Llama 4 Maverick 400B and Scout 109B
- OpenAI: GPT-4.1
- OpenAI: GPT-4.1-mini
- OpenAI: GPT-4.1-nano
- OpenAI: o4-mini
- OpenAI: o4-mini

2

0

8

Markus Zimmermann

@zimmskal

4 months

I still believe that this is the development process to go with, even in a coding agent world: But especially because of what I have heard lately about development processes.

0

1

Markus Zimmermann

@zimmskal

4 months

I see that my request for not releasing a major model during the Easter holidays was fully ignored 😿. Well, here we go 🏇. If somebody knows how I can get a free and not rate-limited token for benchmarking @OpenAI's o3, please let me know.

0

1

4

Markus Zimmermann

@zimmskal

5 months

Going on vacation without a laptop for the first time since... 10 years?!? But the most exciting thing by far is seeing my oldest child be super excited and packing all the things she would like to take along. Not that we have room for literally every toy, but still 💘

2

0

8

Markus Zimmermann

@zimmskal

5 months

Congratulations @xai and @elonmusk on scoring so high and taking the Java and migration 👑 with Grok 3!

0

1