Haruki Nishimura @imp_aa X Profile

Haruki Nishimura

@imp_aa

Followers

654

Following

1K

Media

38

Statuses

480

Learning and planning for safe, embodied autonomous systems under uncertainty. Senior Research Scientist @ToyotaResearch. PhD from @StanfordMSL. Japanese & English

https://t.co/A5euvaHs3M

California, USA

Joined March 2018


Haruki Nishimura

@imp_aa

4 months

We all love to see our robot policy outperforming baselines. But how do we make sure that the claim is statistically sound, from as few policy rollouts as possible? @das_princeton proposes an effective solution to this fundamental question, which has been accepted at RSS2025!

David Snyder

@das_princeton

4 months

(1/13) How should we rigorously compare robot policies? Comparison is central to robotics research, but is inherently expensive. We introduce STEP, a flexible and data-efficient method for statistically rigorous policy comparison. Accepted at RSS 2025: https://t.co/MtAMIwlbAn

0

2

10

Tairan He

@TairanHe99

2 months

Highly recommended — a tremendous amount of effort to rigorously test an assumption often taken for granted by the community: "Does a multi-task pretrained vision-language policy actually outperform single-task policies?"

Russ Tedrake

@RussTedrake

2 months

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: https://t.co/n0qmDRivRH One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the

0

2

45

Shuran Song

@SongShuran

2 months

have been waiting for this release! Robotics needs rigorous and careful evaluation now more than ever 🦾

Russ Tedrake

@RussTedrake

2 months

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: https://t.co/n0qmDRivRH One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the

1

5

67

Haruki Nishimura

@imp_aa

2 months

Learn more and see this research in action in our latest video: https://t.co/c245bSCC3o Project Page: https://t.co/D0CXeInq1j #Robotics #AI #LBMs #MachineLearning #TRIresearch #RobotLearning

0

1

Haruki Nishimura

@imp_aa

2 months

At @ToyotaResearch, we've been studying how LBMs can help robots learn faster and better. We built a rigorous evaluation pipeline to benchmark LBM performance with statistical confidence. Results suggest that pre-training on hundreds of tasks yields 80% data savings on new tasks.

Russ Tedrake

@RussTedrake

2 months

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: https://t.co/n0qmDRivRH One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the

1

24

Haruki Nishimura

@imp_aa

3 months

Happening now at Ronald Tutor Hall, Room 211!

Haruki Nishimura

@imp_aa

3 months

It is TOMORROW! See you at the 1st Workshop on Robot Evaluation for the Real World! We will be hosting a series of invited, spotlight, and lightning talks with diverse perspectives and applications. We will also have a panel and a debate. https://t.co/AEcphiaP6F

0

1

Haruki Nishimura

@imp_aa

3 months

It is TOMORROW! See you at the 1st Workshop on Robot Evaluation for the Real World! We will be hosting a series of invited, spotlight, and lightning talks with diverse perspectives and applications. We will also have a panel and a debate. https://t.co/AEcphiaP6F

0

5

Christopher Agia

@agiachris

3 months

What makes data “good” for robot learning? We argue: it’s the data that drives closed-loop policy success! Introducing CUPID 💘, a method that curates demonstrations not by "quality" or appearance, but by how they influence policy behavior, using influence functions. (1/6)

6

20

125

Jean Mercat

@MercatJean

3 months

We evaluated more than 1000 reasoning LLMs on 12 reasoning-focused benchmarks and made fascinating observations about cross-benchmark comparisons. You can explore all that data yourself on our HuggingFace spaces page. (1/4)

2

19

96

Qiao Gu

@qiaogu1997

3 months

🚀 Excited to introduce SAFE, our work on multitask failure detection for Vision-Language-Action (VLA) models! 🔍 SAFE is a simple yet powerful detector that learns from VLAs’ semantic-rich internal feature space and outputs a scalar score indicating the likelihood of task failure

2

25

125

Yu Xiang

@YuXiang_IRVL

3 months

“As a PhD student, your job is not to publish a paper every quarter. Focus on understanding one problem deeply and solve it over years under the protection of your adviser” from @RussTedrake #RSS2025

20

81

917

Haruki Nishimura

@imp_aa

3 months

It was such a pleasure to give an invited talk in the RSS Workshop on Reliable Robotics: Safety and Security in the Face of GenAI. I learned diverse perspectives on safety and security (and beyond!), and the panel discussion was very thought-provoking too 🤖🤔

0

5

100

Haruki Nishimura

@imp_aa

3 months

We are presenting two papers in the Imitation Learning I session at #RSS25 this evening! Check out the RSS Website for previews! (Talk 3 and 7) https://t.co/3iZoYmXEZ5

0

1

10

Haruki Nishimura

@imp_aa

6 months

Such cool work and extensive results by @ChenXu26892388 on run-time monitoring and failure detection of pre-trained vision-based policies, without relying on observing countless failure modes a priori.

Chen Xu

@ChenXu26892388

6 months

Introducing FAIL-Detect 🚨: a method to detect policy failures within a rollout without failure data or a priori knowledge of potential failures. Detections are indicated with a red border. 🧵 1/8

0

7

Haruki Nishimura

@imp_aa

10 months

This is a research internship role.

0

Haruki Nishimura

@imp_aa

10 months

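Introducing our company's internship program for next summer. For my team's opening, the location is Los Altos, California. Communication is expected to be 100% in English, but I hear that this time applications are also being accepted from schools outside the US, so if you are interested in the research topic, please consider applying.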

Haruki Nishimura

@imp_aa

10 months

Currently pursuing a Ph.D. in robotics, ML, or related fields, and interested in making black-box policies robust and reliable? Our team at TRI is hiring 2025 summer interns who will work with me and @MashaItkina on trustworthy learning for robots.

0

4

Haruki Nishimura

@imp_aa

10 months

Currently pursuing a Ph.D. in robotics, ML, or related fields, and interested in making black-box policies robust and reliable? Our team at TRI is hiring 2025 summer interns who will work with me and @MashaItkina on trustworthy learning for robots.

1

3

10

Thomas Lew

@thomas__lew

10 months

Our team at TRI is hiring a research intern for the summer of 2025! An exciting opportunity to pursue research at the intersection of perception and control, and to deploy models and algorithms on high-performance cars https://t.co/rcPpHvPYHG

0

3

14

Haruki Nishimura

@imp_aa

1 year

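Because I received support from JASSO's overseas study support program while I was a student, I still get periodic "status surveys." One of them included a field for "anything you would like to tell the organization," so I submitted this opinion: "I think expanding the support is a good thing. But I would really like it to be set up in a way that is not bound by the humanities/sciences divide."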

0

3

Haruki Nishimura

@imp_aa

1 year

Check out our open-source STATS package https://t.co/alpkMQtJER if you are a roboticist tasked with quantifying policy performance with success/failure labels, and are wondering how to get the tightest confidence interval estimates out of a small set of policy rollouts.

github.com

Computation of binomial confidence intervals that achieve exact coverage. - TRI-ML/binomial_cis

0

1
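
For readers wondering what a confidence interval from success/failure rollouts looks like in practice, here is a minimal sketch. It does not use the binomial_cis package or its API, and it is not the method that package implements. It swaps in the classic Clopper-Pearson interval, which guarantees at least the nominal coverage but is generally wider than necessary. The function names (`binom_cdf`, `clopper_pearson`) and the rollout counts in the usage lines are hypothetical, chosen only for illustration.

```python
from __future__ import annotations

from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo: float, hi: float, iters: int = 100) -> float:
    """Root of a decreasing function f on [lo, hi], with f(lo) > 0 > f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(successes: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Two-sided Clopper-Pearson interval for a success probability.

    successes: number of successful rollouts, n: total rollouts,
    alpha: miscoverage level (0.05 gives a 95% interval).
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    half = alpha / 2.0
    # Lower bound: the p at which P(X >= successes | p) equals alpha/2.
    if successes == 0:
        lower = 0.0
    else:
        lower = _bisect(
            lambda p: half - (1.0 - binom_cdf(successes - 1, n, p)), 0.0, 1.0
        )
    # Upper bound: the p at which P(X <= successes | p) equals alpha/2.
    if successes == n:
        upper = 1.0
    else:
        upper = _bisect(lambda p: binom_cdf(successes, n, p) - half, 0.0, 1.0)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical example: 17 successes out of 20 policy rollouts.
    lo, hi = clopper_pearson(17, 20)
    print(f"95% interval for success rate: [{lo:.3f}, {hi:.3f}]")
```

With only a few dozen rollouts the interval is wide, which is the motivation the post gives for seeking the tightest intervals that still achieve the stated coverage.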