Merlin Stein

@merlinstein_

Followers
76
Following
13
Media
5
Statuses
18

Frontier AI Evaluations & Monitoring | UK AISI | PhD candidate @Oxford | ex-EU AIO

Oxford/London, UK
Joined December 2023
@merlinstein_
Merlin Stein
1 month
RT @Manderljung: The EU's Code of Practice for General-Purpose AI is out. As one of the co-chairs who drafted the Safety & Security Chapter….
0
21
0
@merlinstein_
Merlin Stein
6 months
New: Code inspections to assess agent autonomy. Idea: scan an agent's code to pre-filter by autonomy level which agents to assess in more depth (e.g. via runtime evaluations) & to monitor open-source agent developments. w/ @pcihon @bansalg_ @sj_manning & Kevin Xu
Tweet media one
Tweet media two
0
1
4
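The pre-filtering idea above can be sketched as a simple static scan. Everything here — the marker patterns, weights, threshold, and function names — is a hypothetical illustration invented for this sketch, not the actual method from the paper:

```python
import re

# Hypothetical static "code inspection": score an agent's source for
# autonomy markers, so only higher-scoring agents get deeper runtime
# evaluation. Patterns and weights below are assumptions of this sketch.
AUTONOMY_MARKERS = {
    r"while\s+True": 2,           # open-ended agent loop
    r"subprocess|os\.system": 2,  # can execute shell commands
    r"requests\.|urllib": 1,      # can reach the network
    r"exec\(|eval\(": 3,          # can run self-generated code
}

def autonomy_score(source: str) -> int:
    """Sum the weights of all autonomy markers found in the source."""
    return sum(w for pat, w in AUTONOMY_MARKERS.items()
               if re.search(pat, source))

def needs_deep_eval(source: str, threshold: int = 3) -> bool:
    """Pre-filter: only agents at or above the threshold go to runtime evals."""
    return autonomy_score(source) >= threshold

agent_src = "while True:\n    plan = llm(goal)\n    exec(plan)"
print(autonomy_score(agent_src))   # 5: loop marker (2) + exec marker (3)
print(needs_deep_eval(agent_src))  # True
```

A real inspection would of course parse the code rather than grep it; the point is only the shape of the pipeline — cheap static pass first, expensive runtime evals second.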
@merlinstein_
Merlin Stein
8 months
Want to make AI safe & helpful? ✔️ 3 years of work or research experience related to AI or digital topics relevant for EU policy? ✔️ EU citizen? ✔️ Apply until Jan. 15 to join one of the most exciting teams in AI governance: the EU AI Office.
0
0
4
@merlinstein_
Merlin Stein
9 months
Apply by Dec 8 here:
0
0
0
@merlinstein_
Merlin Stein
9 months
Eligibility: every org that has evaluated GPAI models accessible in the EU (like Llama or GPT-4o). Selection: quality and EU relevance of your best paper (summary) about your eval of a particular risk. Apply by Dec. 8: EUSurvey - Survey.
1
0
0
@merlinstein_
Merlin Stein
9 months
Are you an AI eval org or AI eval research lab? New opportunity to work together with the European AI Office! Apply for the workshop & be invited for further technical exchange with the AI Office.
digital-strategy.ec.europa.eu
The AI office is collecting contributions from experts to feed into the workshop on general-purpose AI models and systemic risks.
1
0
4
@merlinstein_
Merlin Stein
10 months
RT @cp_dunlop: Post-deployment information sharing would allow the AI ecosystem to jointly enhance understanding of AI impacts 'in the wild….
0
7
0
@merlinstein_
Merlin Stein
1 year
Participate in specifying EU rules for advanced AI on evals, monitoring, model card transparency, responsible scaling policies, …. The drawing up of the “Codes of Practice” is now open to input & participation by stakeholders:
0
0
8
@merlinstein_
Merlin Stein
1 year
Excited to have joined the #EUAIOffice in #Brussels on an expert secondment. Focus: general-purpose AI evaluations, monitoring & codes of practice. #AIAct implementation. So grateful to apply some of my PhD research in practice & understand EU AI priorities.
Tweet media one
0
0
12
@merlinstein_
Merlin Stein
1 year
Grateful to contribute to this research memo: AISIs can shape AI governance by providing the informational basis. Audits, evaluations and monitoring, done by AISIs & the ecosystem, reduce the unknowns.
@aigioxford
Oxford Martin AI Governance Initiative
1 year
New research memo! The AI Safety Institutes (AISIs) are poised to assume an increasingly significant role in the governance of advanced AI. We recently held an expert workshop to explore the roles AISIs can play. Find out more here @oxmartinschool
0
0
1
@merlinstein_
Merlin Stein
1 year
In that context, I enjoyed reading related work from @zittrain on the similarity between algorithmic trading & AI agents. What will circuit breakers for AI agents look like?
0
0
0
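One way the circuit-breaker question could be answered, by analogy with exchange trading halts: block an agent once its action rate exceeds a limit, then enforce a cooldown before it may act again. The class name, thresholds, and sliding-window design below are assumptions of this sketch, not a proposal from the thread:

```python
# Hypothetical "circuit breaker" for AI agents, borrowed from trading
# halts: if an agent fires too many actions inside a short window,
# further actions are blocked until the cooldown expires.
class AgentCircuitBreaker:
    def __init__(self, max_actions=5, window_s=1.0, cooldown_s=10.0):
        self.max_actions = max_actions   # actions allowed per window
        self.window_s = window_s         # sliding-window length (seconds)
        self.cooldown_s = cooldown_s     # halt duration once tripped
        self.timestamps = []             # recent action times
        self.tripped_at = None           # time the breaker tripped, if any

    def allow(self, now):
        """Return True if the agent may act at time `now`."""
        if self.tripped_at is not None:
            if now - self.tripped_at < self.cooldown_s:
                return False             # still halted
            self.tripped_at = None       # cooldown over: reset
            self.timestamps.clear()
        # keep only actions inside the sliding window
        self.timestamps = [t for t in self.timestamps
                           if now - t < self.window_s]
        if len(self.timestamps) >= self.max_actions:
            self.tripped_at = now        # trip the breaker
            return False
        self.timestamps.append(now)
        return True

breaker = AgentCircuitBreaker(max_actions=3, window_s=1.0, cooldown_s=5.0)
print([breaker.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 6.0)])
# [True, True, True, False, True] — 4th action trips it, 5th comes after cooldown
```

For real AI agents the trigger would likely be richer than raw action rate (anomaly scores, spend, correlated behavior across agents), but the halt-and-cooldown skeleton carries over directly from the trading analogy.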
@merlinstein_
Merlin Stein
1 year
Systemic risks of general-purpose AI and AI agents might materialize in finance. As with algo-trading flash crashes, integration of AI agents might lead to correlated risks. Regulators need visibility. Scenarios in my new piece with the BIS & podcast:
Tweet media one
@InvAssoc
The Investment Association
1 year
We’re excited to share the latest episode of our podcast, IA talks AI, discussing financial stability and systemic risks from the use of AI in asset management. Host @john_allan_ia sits down with @merlinstein_, AI governance researcher at the University of Oxford and co-author
Tweet media one
1
1
3
@merlinstein_
Merlin Stein
1 year
New: "Safe beyond sale: Post-deployment monitoring for advanced AI governance" with @cp_dunlop. Why? 1) Pre-deployment evals might fail. 2) AI capabilities and risks change depending on usage context. How? An FMTI, but extended & mandatory.
Tweet media one
1
9
26
@merlinstein_
Merlin Stein
2 years
FDA-style oversight for foundation models: monitoring, 2 approval gates & iterative engagement to move from exploratory to targeted scrutiny. Recommendations for specific FDA-inspired mechanisms for each layer of the supply chain. Thanks for the great collab, @agstrait @cp_dunlop!
Tweet media one
@AdaLovelaceInst
Ada Lovelace Institute
2 years
Medical regulators have long applied rigorous processes to new technologies that, alongside possible benefits, could present risks for people and society. Our new paper explores lessons FDA oversight can provide for AI foundation model governance:
1
0
1