SRI Lab Profile
SRI Lab

@the_sri_lab

Followers
762
Following
59
Media
55
Statuses
199

Zurich, Switzerland
Joined October 2018
@the_sri_lab
SRI Lab
9 days
RT @j_dekoninck: Thrilled to share a major step forward for AI for mathematical proof generation! We are releasing the Open Proof Corpus: …
0
21
0
@the_sri_lab
SRI Lab
12 days
RT @ni_jovanovic: There's a lot of work now on LLM watermarking. But can we extend this to transformers trained for autoregressive image ge…
0
54
0
@the_sri_lab
SRI Lab
1 month
Check out this recent work from our lab showing that benign-looking LLMs can hide backdoors that activate upon finetuning!
@mark_veroe
Mark Vero
1 month
🚨 LLM finetuning can be a backdoor trigger! 🚨 You finetune a model you downloaded, on data you picked. You should be fine, right? Well, it turns out with your finetuning you could unknowingly activate a backdoor hidden in the downloaded model. How is this possible? 🧵👇
0
0
4
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev @ni_jovanovic @baader_max @j_dekoninck @tibglo GRAIN: Exact Graph Reconstruction from Gradients
@drencheva79130 @IvoPetrov01 @baader_max @dimitrov_dimy @mvechev
📍 Sat 26th, 10:00AM-12:30PM, #493
📝 The first gradient leakage attack for GNNs that achieves a high fraction of exact reconstructions.
0
0
2
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev @ni_jovanovic @baader_max @j_dekoninck Black-Box Detection of Language Model Watermarks
@tibglo @ni_jovanovic @rstaabr @mvechev
📍 Sat 26th, 3:00PM-5:30PM, #480
📝 We design detection tests that can detect the presence of a watermark behind an API!
1
0
2
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev @ni_jovanovic @baader_max Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation
@j_dekoninck @baader_max @mvechev
📍 Fri 25th, 3:00PM-5:30PM, #247
📝 A system designed to fit accurate LLM ratings, detecting judge biases and incorporating cheap existing data.
1
0
1
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev Ward: Provable RAG Dataset Inference via LLM Watermarks
@ni_jovanovic @rstaabr @baader_max @mvechev
📍 Thu 24th, 10:00AM-12:30PM, #513
📝 We show how LLM watermarks can be used to detect the unauthorized use of data in RAG.
1
0
1
@the_sri_lab
SRI Lab
2 months
@iclr_conf Language Models are Advanced Anonymizers
@rstaabr @mark_veroe @mbalunovic @mvechev
📍 Thu 24th, 10:00AM-12:30PM, #550
📝 We show that LLMs can anonymize real-world texts, providing both higher utility and better privacy than existing methods.
1
0
1
@the_sri_lab
SRI Lab
2 months
SRI Lab is proud to present 5 of our works on AI Security and Privacy at the @iclr_conf main conference. Looking forward to seeing you in Singapore! Open for more ⬇️
1
4
6
@the_sri_lab
SRI Lab
3 months
RT @mbalunovic: Big update to our MathArena USAMO evaluation: Gemini 2.5 Pro, which was released *the same day* as our benchmark, is the fi…
0
146
0
@the_sri_lab
SRI Lab
3 months
RT @mbalunovic: Can LLMs actually solve hard math problems? Given the strong performance at AIME, we now go to the next tier: our MathArena…
0
85
0
@the_sri_lab
SRI Lab
4 months
๐Ÿ† Website & Leaderboard: ๐Ÿ’พ Code Repository: ๐Ÿ“„Paper:
0
1
2
@the_sri_lab
SRI Lab
4 months
Check out the BaxBench leaderboard and paper for more details! We are also calling for community contributions to BaxBench on our GitHub! 💡
1
0
0
@the_sri_lab
SRI Lab
4 months
Clearly, autonomous and secure code generation is still far from solved. Nonetheless, newer iterations of models are making slow but steady progress. 🔜
1
0
0
@the_sri_lab
SRI Lab
4 months
We evaluate Claude 3.7 with 64k thinking tokens on BaxBench, and find that while it now tops our leaderboard, it still achieves a mere 38% correct-and-secure generation rate. However, when the models are instructed with security specifications, OpenAI o1 again comes out as the best model.
1
0
0
@the_sri_lab
SRI Lab
4 months
Claude 3.7 is making waves, being featured in impressive demos and showing strong results on common benchmarks. But can it generate correct and secure critical code modules? 🧵
1
4
7
@the_sri_lab
SRI Lab
5 months
0
0
3
@the_sri_lab
SRI Lab
5 months
⚡ This is a critical finding, as backends are the backbone of the modern web and are notoriously complex and security-sensitive. More research is needed to make LLM code correct and secure.
GitHub: HF: Website:
1
0
2
@the_sri_lab
SRI Lab
5 months
💡 Hype vs Reality: can LLMs generate production-level code such as backends? Turns out, no. ⚠️ Using our new framework BaxBench, we show that even the best LLMs generate correct code only ~60% of the time. More alarmingly, >50% of their code is susceptible to security exploits!
2
10
19