Jinu Lee
@jinulee_v
Followers 472 · Following 182 · Media 17 · Statuses 57
PhD Student @UIUC_NLP. Interested in *semantics of reasoning*, from neuro-symbolic methods to reasoning evaluation/improvement in LLMs. Ex-Intern @MSFTResearch
Joined November 2023
Happy to announce that "Evaluating step-by-step reasoning traces" is accepted to EMNLP 2025 Findings! Check out the survey for (1) different criteria about what is a good reasoning trace/step (2) datasets and methods for evaluating reasoning traces: https://t.co/pcNtI8cHdD
5
8
81
Balancing versatility and reliability in long-form text evaluation is no easy task. Yukyung’s work (among other pioneering efforts) inspired me to explore rubric-based checklists, and they’re proving powerful for evaluating complex chains of thought! Check out her cool work at EMNLP✅
How reliable is your LLM-as-a-Judge?⚖️ Existing methods suffer from (a) rating inconsistencies and (b) low stability of correlation with human judgments across evaluator models. Excited to share CheckEval (@emnlpmeeting), a framework that improves reliability for LLM-as-a-Judge.
0
0
4
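Checklist-style judging as described above can be sketched in a few lines. This is a hypothetical simplification, not CheckEval's actual interface: the `checks` predicates stand in for an LLM judge's yes/no answers, and all names are mine.

```python
def checklist_score(answer, checks):
    """Score an answer against a binary checklist instead of one holistic
    1-10 rating. `checks` maps question -> predicate; each predicate is a
    stand-in for an LLM judge's yes/no verdict on that question.
    Returns (fraction of checks passed, per-question results)."""
    results = {q: bool(pred(answer)) for q, pred in checks.items()}
    return sum(results.values()) / len(results), results
```

Binary questions are individually easier to answer consistently than a single scalar rating, which is the intuition behind the reliability gain.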
I'm at #EMNLP2025 this week in Suzhou to present LegalSearchLM! 🤗 Wed, 16:30-18:00 📄paper: https://t.co/lqu90I1596 Come see how we - use first-token-aware autoregressive LMs as retrievers with an FM-index for legal-element reasoning in complex legal case retrieval - release
0
2
16
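The retrieval idea above (an autoregressive LM whose decoding is constrained by an index over the corpus) can be sketched minimally. This is a hypothetical simplification: LegalSearchLM uses a real FM-index, whereas here a naive substring scan plays that role, and a static `scores` dict stands in for LM logits.

```python
def allowed_next_tokens(corpus_docs, prefix):
    """Tokens that can extend `prefix` so it still occurs verbatim in some
    document (the role an FM-index plays efficiently)."""
    allowed = set()
    for doc in corpus_docs:
        tokens = doc.split()
        n = len(prefix)
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == prefix:
                allowed.add(tokens[i + n])
    return allowed

def constrained_decode(corpus_docs, scores, max_len=3):
    """Greedy decoding, restricted at each step to corpus-attested tokens,
    so the generated sequence is guaranteed to occur in the corpus."""
    prefix = []
    for _ in range(max_len):
        options = allowed_next_tokens(corpus_docs, prefix)
        if not options:
            break
        prefix.append(max(options, key=lambda t: scores.get(t, 0)))
    return prefix
```

The constraint guarantees every generated n-gram is grounded in an actual document, which is what makes generation usable as retrieval.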
How do LLMs really navigate the thinking space? Do they head straight for a final answer or follow a wiggly path? Commit decisively or get stuck in “infinite” self-doubt? In our latest study, we unravel (over-)thinking through the lens of sub-thoughts: https://t.co/Wb5AIcbI6a more in 🧵
2
25
59
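Segmenting a trace into sub-thoughts can be sketched as splitting at transition cues. The marker list here is an assumption of mine for illustration; the paper defines its own segmentation.

```python
import re

# Hypothetical transition cues marking the start of a new sub-thought.
MARKERS = ("Wait", "Alternatively", "Hmm", "But wait")

def split_subthoughts(trace):
    """Split a reasoning trace into sub-thoughts at transition markers,
    using a zero-width lookahead so the marker stays with its segment."""
    pattern = r"(?=\b(?:" + "|".join(MARKERS) + r")\b)"
    return [p.strip() for p in re.split(pattern, trace) if p.strip()]
```

Once a trace is segmented this way, one can ask, per sub-thought, what answer the model would have committed to at that point.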
Life updates: I am back at Illinois! I had a wonderful summer in @MSFTResearch training reasoning models for verifiable programming😎
1
0
49
🔄 We were nominated for Oral + Top-1 at the MATH-AI workshop at #ICML! 🚨Why? ≈46% of GitHub commits are AI-generated, but can we verify that they are correct? 📢 VeriBench challenges agents to turn Python into Lean code! 🧵1/14 📃 Paper: https://t.co/QPCxg5lKM4
1
19
39
I will be presenting at 7/28(Mon) 11:00-12:30, Hall 4/5. Would love to chat about reasoning, CoT, NLP+formal methods and more! Can't wait to meet old and new friends😁
I am happy to announce that my first-author paper has been accepted to ACL 2025 Main! 🇦🇹 https://t.co/LUnMGpueTF We tackle natural language to first-order logic (NL2FOL) translation using reinforcement learning with NLI labels as rewards. (1/7)
1
1
22
New blog post about asymmetry of verification and "verifier's law": https://t.co/bvS8HrX1jP Asymmetry of verification, the idea that some tasks are much easier to verify than to solve, is becoming an important idea now that we have RL that finally works generally. Great examples of
53
245
2K
We always want to scale up RL, yet simply training longer doesn't necessarily push the limits - exploration gets impeded by entropy collapse. We show that the performance ceiling is surprisingly predictable, and the collapse is driven by covariance between logp and advantage.
9
94
540
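The covariance claim above can be illustrated with a toy softmax policy: to first order, a small policy-gradient step changes entropy by roughly -lr · Cov_p(log p, A), so positive covariance between log-probability and advantage drives entropy down. This is an illustrative toy, not the paper's derivation; the setup (8 actions, advantages set equal to log-probs to force positive covariance) is my own.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy(p):
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(0)
z = rng.normal(size=8)            # logits of a toy categorical policy
p = softmax(z)
A = np.log(p)                     # advantages correlated with log-prob

# Cov_p(log p, A) under the policy distribution
mean_logp = p @ np.log(p)
mean_A = p @ A
cov = p @ (np.log(p) * A) - mean_logp * mean_A

lr = 0.05
z_new = z + lr * p * (A - mean_A)  # policy-gradient step on E_p[A]
dH = entropy(softmax(z_new)) - entropy(p)
```

With `cov > 0`, the step sharpens the distribution and `dH` comes out negative: entropy collapses exactly when high-probability actions keep receiving high advantages.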
📢 New paper! FoVer enhances PRMs for step-level verification of LLM reasoning w/o human annotation 🚀 We synthesize training data using formal verification tools and improve LLMs at step-level verification of LLM responses on MATH, AIME, MMLU, BBH, etc. https://t.co/gdK6BC7rJv
4
26
127
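The data-synthesis recipe above (step-level labels from automatic verifiers rather than human annotators) can be sketched as follows. The verifier here is a stub; in the paper formal tools provide the verdicts, and the function and field names are my own.

```python
def synthesize_prm_data(solutions, verify_step):
    """Turn multi-step solutions into step-labeled PRM training data.
    `verify_step(context, step)` is a stand-in for a formal verification
    tool judging whether `step` is valid given the preceding steps.
    Returns (context, step, label) triples."""
    data = []
    for steps in solutions:
        for i, step in enumerate(steps):
            label = 1 if verify_step(steps[:i], step) else 0
            data.append((tuple(steps[:i]), step, label))
    return data
```

Because labels come from a tool rather than annotators, the recipe scales to as much data as the verifier can check.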
Life update: Just started my summer internship in @MSFTResearch (Redmond, WA)! Happy to chat with fellow MSR people, or anyone around the wider Seattle area 🏙️
3
2
113
Finally, I sincerely thank my co-authors, Qi, Runzhi, Vincent, Ziqi @wzq016, Heng @hengjinlp , and Julia. See you in Vienna! 🇦🇹 (7/7)
0
0
1
We also show that RL reduces the arbitrariness of FOL. Arbitrariness means that the same NL phrase can be expressed as different FOLs (different predicate names/arities). We show that rounds of RL reduce corpus-wide arbitrariness, which explains the gain in the entailment-preserving rate. (6/7)
1
0
2
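A simple proxy for the corpus-wide arbitrariness described above is the mean number of distinct FOL renderings per NL phrase. This metric is my own illustrative stand-in, not necessarily the paper's exact measure.

```python
from collections import defaultdict

def arbitrariness(translations):
    """translations: (nl_phrase, fol_string) pairs collected over a corpus.
    Returns the mean number of distinct FOL renderings per NL phrase;
    1.0 means every phrase is always translated the same way."""
    forms = defaultdict(set)
    for nl, fol in translations:
        forms[nl].add(fol)
    return sum(len(v) for v in forms.values()) / len(forms)
```

Under this proxy, RL rounds that push the translator toward consistent predicate choices drive the score toward 1.0.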
As a result, the model trained with our objective achieves the best entailment-preserving rate (EPR) across all three datasets (EntailmentBank, eQASC, and e-SNLI). It outperforms other sentence-to-FOL translation systems, including semantic representation-based methods and end-to-end generative models. (5/7)
1
0
2
Next, we train an NL2FOL translator using NLI labels as rewards. First, we sample 16 translations from a base model for each premise and hypothesis. We then reward FOLs that participate in any entailment-preserving combination, and repeat this for 5 rounds of training. (4/7)
1
0
1
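The reward scheme in the step above can be sketched directly: a candidate FOL earns reward if it appears in at least one premise-hypothesis combination that a symbolic prover judges entailment-preserving. The prover here is a stub, and the function names are mine.

```python
from itertools import product

def entailment_rewards(premise_fols, hypothesis_fols, entails):
    """premise_fols / hypothesis_fols: candidate translations (e.g. 16 each).
    `entails(p, h)` is a stand-in for a symbolic prover checking p |- h.
    A candidate gets reward 1.0 if it occurs in at least one
    entailment-preserving (p, h) combination, else 0.0."""
    rewarded_p, rewarded_h = set(), set()
    for p, h in product(premise_fols, hypothesis_fols):
        if entails(p, h):
            rewarded_p.add(p)
            rewarded_h.add(h)
    return (
        [1.0 if p in rewarded_p else 0.0 for p in premise_fols],
        [1.0 if h in rewarded_h else 0.0 for h in hypothesis_fols],
    )
```

Rewarding combinations rather than single translations lets the prover's verdict supervise both sides of the pair at once.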
NL2FOL translation provides a reliable method for logical reasoning. However, its application is mostly limited to (near-)synthetic reasoning tasks. How can we improve an NL2FOL translator to catch the diverse semantics of natural language expressed in NLI tasks? (2/7)
1
0
1
I am happy to announce that my first-author paper has been accepted to ACL 2025 Main! 🇦🇹 https://t.co/LUnMGpueTF We tackle natural language to first-order logic (NL2FOL) translation using reinforcement learning with NLI labels as rewards. (1/7)
3
12
96
🚀Our ICML 2025 paper introduces "Premise-Augmented Reasoning Chains" - a structured approach to induce explicit dependencies in reasoning chains. By revealing the dependencies within chains, we significantly improve how LLM reasoning can be verified. 🧵[1/n]
1
25
74
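The premise-augmented idea above can be sketched as a chain whose steps carry explicit pointers to the earlier steps they depend on, so each step is verified against only its premises rather than the whole prefix. The checker is a stub for an LLM/entailment verifier, and the representation is my own simplification.

```python
def verify_chain(steps, check_step):
    """steps: list of (text, premise_ids) where premise_ids index earlier
    steps in the chain. `check_step(text, premises)` stands in for a
    verifier judging one step given only its stated premises.
    Returns the indices of steps that fail verification."""
    errors = []
    for i, (text, premise_ids) in enumerate(steps):
        premises = [steps[j][0] for j in premise_ids]
        if not check_step(text, premises):
            errors.append(i)
    return errors
```

Making dependencies explicit localizes errors: a failed step implicates its premises, not every step before it.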
📢 I will be presenting **SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning** at NAACL 2025! Poster: 5/1, 2:00-3:30, Hall 3 Let's chat about neurosymbolic reasoning and reasoning evaluation! I will also attend the complex reasoning BoF😁 See you in ABQ!
I am happy to announce that my first-author paper is accepted to NAACL 2025 Main! Existing backward chaining (top-down reasoning) methods are incomplete, leading to suboptimal performance. We build SymBa, a complete neuro-symbolic backward chaining method using SLD-Resolution.
0
7
36
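The top-down strategy SymBa builds on can be illustrated with a tiny propositional backward chainer: to prove a goal, either it is a known fact, or some rule concludes it and every body atom is provable in turn. This toy is my own sketch; the real system is first-order, SLD-resolution based, and neuro-symbolic.

```python
def backward_chain(goal, rules, facts, depth=10):
    """Minimal propositional backward chaining. `rules` is a list of
    (head, body) Horn clauses; `facts` is a set of known atoms.
    `depth` bounds recursion so cyclic rule sets cannot loop forever."""
    if depth == 0:
        return False
    if goal in facts:
        return True
    return any(
        head == goal and all(backward_chain(b, rules, facts, depth - 1) for b in body)
        for head, body in rules
    )
```

The completeness issue the tweet mentions is exactly about cases a naive top-down search misses; the depth bound here is the crudest possible guard, not SymBa's solution.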