SRI Lab Profile
SRI Lab

@the_sri_lab

Followers
762
Following
59
Media
55
Statuses
199

Zurich, Switzerland
Joined October 2018
@the_sri_lab
SRI Lab
9 days
RT @j_dekoninck: Thrilled to share a major step forward for AI for mathematical proof generation! We are releasing the Open Proof Corpus: …
0
21
0
@the_sri_lab
SRI Lab
12 days
RT @ni_jovanovic: There's a lot of work now on LLM watermarking. But can we extend this to transformers trained for autoregressive image ge…
0
54
0
@the_sri_lab
SRI Lab
1 month
Check out this recent work from our lab showing that benign-looking LLMs can hide backdoors that activate upon finetuning!
@mark_veroe
Mark Vero
1 month
🚨 LLM finetuning can be a backdoor trigger! 🚨 You finetune a model you downloaded, on data you picked. You should be fine, right? Well, it turns out with your finetuning you could unknowingly activate a backdoor hidden in the downloaded model. How is this possible? 🧵👇
0
0
4
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev @ni_jovanovic @baader_max @j_dekoninck @tibglo GRAIN: Exact Graph Reconstruction from Gradients
@drencheva79130 @IvoPetrov01 @baader_max @dimitrov_dimy @mvechev
📍 Sat 26th, 10:00AM-12:30PM, #493
📝 The first gradient leakage attack for GNNs that achieves a high fraction of exact reconstructions.
0
0
2
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev @ni_jovanovic @baader_max @j_dekoninck Black-Box Detection of Language Model Watermarks
@tibglo @ni_jovanovic @rstaabr @mvechev
📍 Sat 26th, 3:00PM-5:30PM, #480
📝 We design detection tests that can detect the presence of a watermark behind an API!
1
0
2
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev @ni_jovanovic @baader_max Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation
@j_dekoninck @baader_max @mvechev
📍 Fri 25th, 3:00PM-5:30PM, #247
📝 A system designed to fit accurate LLM ratings, detecting judge biases and incorporating cheap existing data.
1
0
1
@the_sri_lab
SRI Lab
2 months
@iclr_conf @rstaabr @mark_veroe @mbalunovic @mvechev Ward: Provable RAG Dataset Inference via LLM Watermarks
@ni_jovanovic @rstaabr @baader_max @mvechev
📍 Thu 24th, 10:00AM-12:30PM, #513
📝 We show how LLM watermarks can be used to detect the unauthorized use of data in RAG.
1
0
1
@the_sri_lab
SRI Lab
2 months
@iclr_conf Language Models are Advanced Anonymizers
@rstaabr @mark_veroe @mbalunovic @mvechev
📍 Thu 24th, 10:00AM-12:30PM, #550
📝 We show that LLMs can anonymize real-world texts, providing both higher utility and better privacy than existing methods.
1
0
1
@the_sri_lab
SRI Lab
2 months
SRI Lab is proud to present 5 of our works on AI Security and Privacy at the @iclr_conf main conference. Looking forward to seeing you in Singapore! Open for more ⬇️
1
4
6
@the_sri_lab
SRI Lab
3 months
RT @mbalunovic: Big update to our MathArena USAMO evaluation: Gemini 2.5 Pro, which was released *the same day* as our benchmark, is the fi…
0
146
0
@the_sri_lab
SRI Lab
3 months
RT @mbalunovic: Can LLMs actually solve hard math problems? Given the strong performance at AIME, we now go to the next tier: our MathArena…
0
85
0
@the_sri_lab
SRI Lab
4 months
๐Ÿ† Website & Leaderboard: ๐Ÿ’พ Code Repository: ๐Ÿ“„Paper:
0
1
2
@the_sri_lab
SRI Lab
4 months
Check out the BaxBench leaderboard and paper for more details! We are also calling for community contributions to BaxBench on our GitHub! 💡
1
0
0
@the_sri_lab
SRI Lab
4 months
Clearly, autonomous and secure code generation is still far from solved. Nonetheless, newer iterations of models are making slow but steady progress. 🔜
1
0
0
@the_sri_lab
SRI Lab
4 months
We evaluate Claude 3.7 with 64k thinking tokens on BaxBench, and find that while it now tops our leaderboard, it still achieves a mere 38% correct-and-secure generation rate. However, when the models are instructed with security specifications, OpenAI o1 again comes out as the best model.
1
0
0
@the_sri_lab
SRI Lab
4 months
Claude 3.7 is making waves, being featured in impressive demos and showing strong results on common benchmarks. But can it generate correct and secure critical code modules? 🧵
1
4
7
@the_sri_lab
SRI Lab
5 months
0
0
3
@the_sri_lab
SRI Lab
5 months
⚡ This is a critical finding, as backends are the backbone of the modern web and are notoriously complex and security-sensitive. More research is needed to make LLM code correct and secure.
GitHub: HF: Website:
1
0
2
@the_sri_lab
SRI Lab
5 months
💡 Hype vs Reality: can LLMs generate production-level code such as backends? Turns out, no. ⚠️ Using our new framework BaxBench, we show that even the best LLMs generate correct code only ~60% of the time. More alarmingly, >50% of their code is susceptible to security exploits!
2
10
19