Edan Toledo

@EdanToledo

Followers
88
Following
24
Media
0
Statuses
33

PhD Student @AIatMeta & @UCL • Prev RE @InstaDeepAI • MPhil ACS @Cambridge_Uni • Reinforcement Learning • 🇿🇦🇬🇧

Joined September 2022
@EdanToledo
Edan Toledo
1 year
🚀 Excited to release Stoix! A new #OpenSource library for End-to-End Distributed (Synchronously) Single-Agent Reinforcement Learning in JAX. 🏛️. 🔗
1
18
79
@EdanToledo
Edan Toledo
2 months
RT @pfau: New paper accepted to ICML! We present a novel policy optimization algorithm for continuous control with a simple closed form whi….
0
41
0
@EdanToledo
Edan Toledo
3 months
RT @MattVMacfarlane: Thrilled to see our NeurIPS 2024 paper, Sequential Monte Carlo Policy Optimisation (, featured….
0
9
0
@EdanToledo
Edan Toledo
4 months
RT @DulhanJay: Efficient LLM reasoning over large data doesn't require massive contexts! 🫡. We show that a simple in-context method, PRISM,….
0
43
0
@EdanToledo
Edan Toledo
6 months
RT @pcastr: It's amazing two of the 2024 #NobelPrize were for AI! But as they say: it took a village. "We didn't win a Nobel", a parody of….
0
19
0
@EdanToledo
Edan Toledo
7 months
RT @instadeepai: Excited to share our latest work on Sequential Monte Carlo Policy Optimisation (SPO)🔥— a scalable, search-based RL algorit….
0
7
0
@EdanToledo
Edan Toledo
8 months
RT @ClementBonnet16: Introducing Latent Program Network (LPN), a new architecture for inductive program synthesis that builds in test-time….
0
31
0
@EdanToledo
Edan Toledo
11 months
Credit to the Mava team for their help, and to InstaDeep's amazing implementation, which served as a reference.
0
0
5
@EdanToledo
Edan Toledo
11 months
In line with Stoix’s code philosophy, it's super hackable and easy to modify, perfect for flexible and high-performance RL research! Currently only PPO is supported, but many more are to come. Let me know your thoughts.
1
0
4
@EdanToledo
Edan Toledo
11 months
🚀 Introducing Sebulba Systems in Stoix! Harness JAX's speed with non-JAX environments in your single-agent RL research. Sebulba splits the actors and learners, running environments on CPUs but performing batched inference/learning using GPUs/TPUs. 🔗
2
2
32
@EdanToledo
Edan Toledo
11 months
RT @pcastr: Can we fix the review process before we try to automate science?.
0
10
0
@EdanToledo
Edan Toledo
11 months
RT @callumtilbury: What happens when trying to learn multi-agent coordination from a static dataset? Catastrophe, if you’re not careful!. T….
0
8
0
@EdanToledo
Edan Toledo
1 year
RT @ChalumeauFelix: Excited to introduce our latest neural solver, MEMENTO! Enhancing problem-specific adaptation with an explicit memory.….
0
19
0
@EdanToledo
Edan Toledo
1 year
RT @AlexLaterre: Got lost in the #ICLR2024 poster maze? Don't worry, we've got your covered! 🛟. Here is @DonalByrne2, Senior Research Engin….
0
9
0
@EdanToledo
Edan Toledo
1 year
RT @callumtilbury: Curious about this diagram? Join us later today as we discuss growing the MARL ecosystem in JAX! 🤖🍿. @instadeepai @ruanj….
0
3
0
@EdanToledo
Edan Toledo
1 year
RT @ClementBonnet16: Excited to announce Jumanji v1.0, now featuring 22 fast, flexible, and scalable environments!. Fully written in JAX, J….
0
28
0
@EdanToledo
Edan Toledo
1 year
RT @ClementBonnet16: If you haven't yet, please check out the amazing works from the JAX community. E.g. environments: Brax (@OlivierBachem….
0
3
0
@EdanToledo
Edan Toledo
1 year
It's important to mention that Stoix can be seen as a single-agent counterpart to Mava. Stoix actually started as a Mava clone when I wanted an End-to-End JAX PPO with all the extra features Mava provides. If you’re into MARL, can’t recommend it enough.
0
1
6
@EdanToledo
Edan Toledo
1 year
🧪 Robust evaluation and logging ready for statistical testing with and 🚀 Optimised for speed with JAX's pmap & jit allowing for quick and easy scaling. Any contribution and feedback is welcome. 🤝 #RL #JAX.
2
0
3
@EdanToledo
Edan Toledo
1 year
🍬 Environment wrappers for most JAX-native environment suites (Gymnax @RobertTLange, Jumanji @ClementBonnet16, Brax @GoogleOSS, XMinigrid @how_uhh, Craftax @mitrma and even JAXMarl @alexrutherford0, using Centralised Controllers).
1
0
6
@EdanToledo
Edan Toledo
1 year
🌟 Features: 🥑 Ready-to-use RL algorithms like PPO, DQN, SAC & more to serve as baselines and research starting points.
1
0
2