Edan Toledo @EdanToledo X Profile

Edan Toledo

@EdanToledo

Followers

88

Following

24

Media

0

Statuses

33

PhD Student @AIatMeta & @UCL • Prev RE @InstaDeepAI • MPhil ACS @Cambridge_Uni • Reinforcement Learning • 🇿🇦🇬🇧

Joined September 2022

Edan Toledo

@EdanToledo

1 year

🚀 Excited to release Stoix! A new #OpenSource library for End-to-End Distributed (Synchronously) Single-Agent Reinforcement Learning in JAX. 🏛️ 🔗

1

18

79

Edan Toledo

@EdanToledo

2 months

RT @pfau: New paper accepted to ICML! We present a novel policy optimization algorithm for continuous control with a simple closed form whi….

0

41

0

Edan Toledo

@EdanToledo

3 months

RT @MattVMacfarlane: Thrilled to see our NeurIPS 2024 paper, Sequential Monte Carlo Policy Optimisation (, featured….

0

9

0

Edan Toledo

@EdanToledo

4 months

RT @DulhanJay: Efficient LLM reasoning over large data doesn't require massive contexts! 🫡. We show that a simple in-context method, PRISM,….

0

43

0

Edan Toledo

@EdanToledo

6 months

RT @pcastr: It's amazing two of the 2024 #NobelPrize were for AI! But as they say: it took a village. "We didn't win a Nobel", a parody of….

0

19

0

Edan Toledo

@EdanToledo

7 months

RT @instadeepai: Excited to share our latest work on Sequential Monte Carlo Policy Optimisation (SPO)🔥— a scalable, search-based RL algorit….

0

7

0

Edan Toledo

@EdanToledo

8 months

RT @ClementBonnet16: Introducing Latent Program Network (LPN), a new architecture for inductive program synthesis that builds in test-time….

0

31

0

Edan Toledo

@EdanToledo

11 months

Credit to the Mava team for helping, as well as to InstaDeep's amazing implementation, which served as a reference.

0

5

Edan Toledo

@EdanToledo

11 months

In line with Stoix's code philosophy, it's super hackable and easy to modify, perfect for flexible and high-performance RL research! Currently only PPO is supported, but many more are to come. Let me know your thoughts.

1

0

4

Edan Toledo

@EdanToledo

11 months

🚀 Introducing Sebulba Systems in Stoix! Harness JAX's speed with non-JAX environments in your single-agent RL research. Sebulba splits the actors and learners, running environments on CPUs but performing batched inference/learning using GPUs/TPUs. 🔗

2

32
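
The actor/learner split described in the post above can be illustrated with a minimal sketch. This is not Stoix's API: every name here (`toy_env_step`, `policy`, `actor`, `learner`, and the constants) is hypothetical, the environment is a toy, and the batched GPU/TPU inference and learning are replaced by plain Python stand-ins. It only shows the shape of the idea: actor threads step CPU environments and push transitions into a queue, while a separate learner consumes them in batches.

```python
import queue
import random
import threading

NUM_ACTORS = 2        # hypothetical: number of CPU actor threads
STEPS_PER_ACTOR = 8   # hypothetical: transitions each actor collects
BATCH_SIZE = 4        # hypothetical: transitions per learner update


def toy_env_step(state, action):
    """Stand-in for a non-JAX environment stepped on the CPU."""
    next_state = state + action
    reward = 1.0 if action == 1 else 0.0
    return next_state, reward


def policy(state, params):
    """Stand-in for accelerator inference; here just a random Python rule."""
    return 1 if random.random() < params["p_right"] else 0


def actor(actor_id, params, out_queue):
    """Steps its own environment copy and ships transitions to the learner."""
    state = 0
    for _ in range(STEPS_PER_ACTOR):
        action = policy(state, params)
        next_state, reward = toy_env_step(state, action)
        out_queue.put((actor_id, state, action, reward))
        state = next_state


def learner(in_queue, total_transitions):
    """Consumes transitions in fixed-size batches and counts the updates."""
    updates, seen, batch = 0, 0, []
    while seen < total_transitions:
        batch.append(in_queue.get())
        seen += 1
        if len(batch) == BATCH_SIZE:
            updates += 1  # a real learner would take a gradient step here
            batch = []
    return updates, seen


def run():
    """Starts the actors, runs the learner in this thread, returns its counts."""
    params = {"p_right": 0.5}
    transitions = queue.Queue()
    threads = [
        threading.Thread(target=actor, args=(i, params, transitions))
        for i in range(NUM_ACTORS)
    ]
    for t in threads:
        t.start()
    result = learner(transitions, NUM_ACTORS * STEPS_PER_ACTOR)
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print(run())
```

The queue is what decouples the two sides in this sketch: actors never wait for a learner update, and the learner only sees fixed-size batches regardless of which actor produced each transition.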

Edan Toledo

@EdanToledo

11 months

RT @pcastr: Can we fix the review process before we try to automate science?

0

10

0

Edan Toledo

@EdanToledo

11 months

RT @callumtilbury: What happens when trying to learn multi-agent coordination from a static dataset? Catastrophe, if you’re not careful! T….

0

8

0

Edan Toledo

@EdanToledo

1 year

RT @ChalumeauFelix: Excited to introduce our latest neural solver, MEMENTO! Enhancing problem-specific adaptation with an explicit memory.….

0

19

0

Edan Toledo

@EdanToledo

1 year

RT @AlexLaterre: Got lost in the #ICLR2024 poster maze? Don't worry, we've got you covered! 🛟 Here is @DonalByrne2, Senior Research Engin….

0

9

0

Edan Toledo

@EdanToledo

1 year

RT @callumtilbury: Curious about this diagram? Join us later today as we discuss growing the MARL ecosystem in JAX! 🤖🍿. @instadeepai @ruanj….

0

3

0

Edan Toledo

@EdanToledo

1 year

RT @ClementBonnet16: Excited to announce Jumanji v1.0, now featuring 22 fast, flexible, and scalable environments! Fully written in JAX, J….

0

28

0

Edan Toledo

@EdanToledo

1 year

RT @ClementBonnet16: If you haven't yet, please check out the amazing works from the JAX community. E.g. environments: Brax (@OlivierBachem….

0

3

0

Edan Toledo

@EdanToledo

1 year

It's important to mention that Stoix can be seen as a single-agent counterpart to Mava. Stoix actually started as a Mava clone when I wanted an End-to-End JAX PPO with all the extra features Mava provides. If you’re into MARL, I can’t recommend it enough.

0

1

6

Edan Toledo

@EdanToledo

1 year

🧪 Robust evaluation and logging ready for statistical testing with and 🚀 Optimised for speed with JAX's pmap & jit allowing for quick and easy scaling. Any contribution and feedback is welcome. 🤝 #RL #JAX.

2

0

3

Edan Toledo

@EdanToledo

1 year

🍬 Environment wrappers for most JAX-native environment suites (Gymnax @RobertTLange, Jumanji @ClementBonnet16, Brax @GoogleOSS, XMinigrid @how_uhh, Craftax @mitrma and even JAXMarl @alexrutherford0 (using Centralised Controllers)).

1

0

6

Edan Toledo

@EdanToledo

1 year

🌟 Features: 🥑 Ready-to-use RL algorithms like PPO, DQN, SAC & more to serve as baselines and research starting points.

1

0

2