Bo Dai
@daibond_alpha
Followers: 3K
Following: 3K
Media: 9
Statuses: 223
Assistant Professor at @gtcse, Research Scientist at @GoogleDeepMind | ex @googlebrain
California, USA
Joined October 2012
RL is sparkling again.
0
0
54
I did not even have 10 submissions…. There are two different “Bo Dai”.
11
3
139
Please consider joining us to explore the frontier of generative foundation models for decision making, planning, and reasoning.
Our team (w/Dale, @daibond_alpha, @mengjiao_yang + others) at Google DeepMind is looking to hire. If you are interested in foundation models+decision making, and making real-world impact through Gemini and cloud solutions, please consider applying through https://t.co/KfhYZuohIY
0
1
24
Our black-box adaptation for LLMs has been accepted to #ICML2024. We provide offline and online learning strategies for a value function that pivots the LLM's decoding procedure, with access only to the LLM's output sentences.
0
1
18
Thanks for sharing our work! We present a principled optimistic/pessimistic policy optimization method that needs no uncertainty estimation, and test it on LLMs, with my great collaborators @tsen9731, Jincheng, @Kgoshvadi, @hanjundai, Tong Yang, @mengjiao_yang, Dale, @yuejiec
Google presents Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF https://t.co/nMIzXVKDv4
5
8
33
This is quite aligned with what we have completed: energy-based black-box adaptation https://t.co/sfVJfpzoNP. The more interesting part is that this adaptor can be trained and used with only sentences sampled from the API, agnostic to the logits. And it can be used in a plug-and-play way!
arxiv.org
Adapting state-of-the-art Large Language Models (LLMs) like GPT-4 and Gemini for specific tasks is challenging. Due to the opacity in their parameters, embeddings, and even output probabilities,...
Q* Leaked info: Source: An unspecified PasteBin.(L-I-N-K in next post) Can't confirm the authenticity as it's from unknown source but you can have a look. Q* is a dialog system conceptualized by OpenAI, designed to enhance the traditional dialog generation approach through the
0
0
13
We make local private adaptation of GPT possible!
Having troubles with blind domain adaptation for GPTs through OpenAI or Azure 🤔? We are excited to introduce BBox-Adapter 🔌— Lightweight Adapting for Black-Box #LLMs📦. BBox-Adapter offers a transparent, privacy-conscious, and cost-effective solution for customizing
0
2
18
Please come to our poster to see the closed-loop control LLM agent.
Want smarter LLM agents? 🤖 Join Haotian's @haotiansun014 poster on AdaPlanner tomorrow! 📅 It enables LLMs to think ahead & plan adaptively based on feedback. #NeurIPS2023 #LLMs #LLMagent
https://t.co/byl5Stx2uD
0
4
37
Great work on using video models in RL!
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: https://t.co/c3aQazNYXq Paper: https://t.co/1IdxKQAHsd
0
0
13
Do not miss the opportunity! I really appreciate MLSS 2011 Singapore, which led me to this amazing area :)
The submission page for MLSS 2024 in Okinawa is now OPEN! Don't miss this incredible opportunity to expand your research network, enhance your knowledge in machine learning, and connect with experts in the field. Submission page: https://t.co/9lonTZ3yeo Deadline: Sep/30/2023
0
1
13
Glad to see that contrastive representations work in robotics. There is a connection between contrastive representations and linear MDPs, as we investigated in our ICML 2022 paper.
Contrastive RL provides a way to use contrastive learning methods to learn general-purpose goal-conditioned policies, uniting representation learning and RL. We recently got this working at scale with real robots! You can read more here: https://t.co/PZxgiViMw8 A short 🧵👇
0
1
11
As a concurrent work of Voyager with similar components, we echo that these techniques are generally useful beyond Minecraft @DrJimFan.
8
0
3