Tianshu Zhang @Tianshu_OSU X Profile

Tianshu Zhang

@Tianshu_OSU

Followers

421

Following

290

Media

9

Statuses

87

Ph.D. student @osunlp @OhioStateCSE. Ex-intern @IBMResearch, @Adobe. Lead author of TableLlama. #NLProc

https://t.co/bOCwQazjjH

Columbus, OH

Joined September 2022

Tianshu Zhang

@Tianshu_OSU

2 years

Can #LLMs excellently handle various table-based tasks? 📢Introducing TableLlama and TableInstruct: the FIRST open-source generalist #LLMs and instruction tuning dataset for tables. 🌟Strong performance on both in-domain & out-of-domain settings. #NLProc

7

18

92

Yu Su

@ysu_nlp

4 days

Important work on AI4S, co-led by @hhsun1 @osunlp

Alex Prompter

@alex_prompter

5 days

This paper from Harvard and MIT quietly answers the most important AI question nobody benchmarks properly: Can LLMs actually discover science, or are they just good at talking about it? The paper is called “Evaluating Large Language Models in Scientific Discovery”, and instead

1

3

34

Yu Su

@ysu_nlp

28 days

Life update: I moved to silicon valley to tackle agents' biggest challenges: plasticity and reliability. Today's agents are smart but brittle. They lack plasticity (continual learning and adaptation) and reliability (stable, predictable behavior with bounded failures). These two

40

44

427

Yu Su

@ysu_nlp

4 months

Computer Use: Modern Moravec's Paradox A new blog post arguing why computer-use agents may be the biggest opportunity and challenge for AGI. https://t.co/6fZfTdx710 Table of Contents > Moravec’s Paradox > Moravec's Paradox in 2025 > Computer use may be the biggest opportunity

9

65

216

Huan Sun (Hiring Ph.D. students for Fall26)

@hhsun1

4 months

I am humbled and grateful to receive two grants from Open Philanthropy @open_phil to advance the safety of AI systems, co-led with my colleague @ysu_nlp. I'm also honored to be the first at @OhioState to receive Open Philanthropy funding. Most credit goes to the amazing students

OSUengineering

@OSUengineering

4 months

Associate Prof. Huan Sun has been awarded two competitive research grants from Open Philanthropy focused on the rapidly evolving field of #AI safety: https://t.co/3UG0YbDbtJ @hhsun1

4

17

79

Tianshu Zhang

@Tianshu_OSU

4 months

🙏 Huge thanks to my amazing collaborators: @kunqian_us @sidthekidder @bestaskwisher @ShaddyGarg @hhsun1 @yunyao_li - couldn’t have done this without you! Also appreciate all discussions from @osunlp!

0

3

Tianshu Zhang

@Tianshu_OSU

4 months

On average, open-source LLMs fine-tuned with EvoSchema outperform different baseline methods, highlighting a path towards more resilient NL2SQL systems that adapt as database schemas evolve over time.

1

0

3

Tianshu Zhang

@Tianshu_OSU

4 months

💡 Why it matters: Database schemas are not static — they evolve 🔄. 🌍🌍 Big picture 🔸EvoSchema defines 10 schema perturbations (column- & table-level) and shows how schema shifts can break SOTA models. 🔸Column-level changes hurt a bit. 🔸Table-level schema changes hurt a lot.

1

0

4

Tianshu Zhang

@Tianshu_OSU

4 months

🎉 Excited to share that our paper EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution was accepted at VLDB 2025! 🚀 📢 Reminder: join us at VLDB 2025 in London! 🗓️ Sept 2 (Tue), 10:45 AM – 12:15 PM 📍 Room Wordsworth 4F 📄 https://t.co/ZNAav4ZtoX #VLDB2025 #LLMs

1

18

30

Boyuan Zheng

@boyuan__zheng

5 months

Remember “Son of Anton” from the Silicon Valley show (@SiliconHBO)? The experimental AI that “efficiently” orders 4,000 lbs of meat while looking for a cheap burger and “fixes” a bug by deleting all the code? It’s starting to look a lot like reality. Even 18 months ago, my own

Scale AI

@scale_AI

5 months

As AI agents start taking real actions online, how do we prevent unintended harm? We teamed up with @OhioState and @UCBerkeley to create WebGuard: the first dataset for evaluating web agent risks and building real-world safety guardrails for online environments. 🧵

0

27

68

Jianyang Gu

@vimar_gu

5 months

Announcing the @NeurIPSConf 2025 workshop on Imageomics: Discovering Biological Knowledge from Images Using AI! The workshop focuses on the interdisciplinary field between machine learning and biological science. We look forward to seeing you in San Diego! #NeurIPS2025

2

14

27

Boyuan Zheng

@boyuan__zheng

5 months

Attending #ICML2025 🇨🇦 this week! I’ll be co-organizing the Computer Use Agent Workshop @workshopcua on July 19th! Happy to chat about anything related to language agents — especially world modeling, scaling RL for agents, and multi-turn RL. Excited to meet old friends and

2

6

48

Huan Sun (Hiring Ph.D. students for Fall26)

@hhsun1

5 months

🚨 Postdoc Hiring: I am looking for a postdoc to work on rigorously evaluating and advancing the capabilities and safety of computer-use agents (CUAs), co-advised with @ysu_nlp @osunlp. We welcome strong applicants with experience in CUAs, long-horizon reasoning/planning,

1

29

73

Yu Su

@ysu_nlp

6 months

🔎Agentic search like Deep Research is fundamentally changing web search, but it also brings an evaluation crisis⚠️ Introducing Mind2Web 2: Evaluating Agentic Search with Agents-as-a-Judge - 130 tasks (each requiring avg. 100+ webpages) from 1,000+ hours of expert labor -

3

47

224

Huan Sun (Hiring Ph.D. students for Fall26)

@hhsun1

6 months

If you care about building AI co-scientists for data-driven discovery, check out our recent work on automatically collecting large-scale, authentic, high-quality scientific coding tasks at a low cost, led by @YifeiLiPKU @HananeNMoussa @osunlp. 🌟AutoSDT: Scaling Data-Driven

arxiv.org

Despite long-standing efforts in accelerating scientific discovery with AI, building AI co-scientists remains challenging due to limited high-quality data for training and evaluation. To tackle...

Yifei Li (🔍SU26 Internship)

@YifeiLiPKU

6 months

📢 Introducing AutoSDT, a fully automatic pipeline that collects data-driven scientific coding tasks at scale! We use AutoSDT to collect AutoSDT-5K, enabling open co-scientist models that rival GPT-4o on ScienceAgentBench! Thread below ⬇️ (1/n)

0

4

16

Rui Qiu

@RuiQiu18

7 months

Systematic reviews (SRs) drive evidence-based medicine, but months-long workflows can’t keep pace with today’s literature flood. Fully autonomous solutions promise speed, but the magic often fizzles - these models still skip pivotal trials, hallucinate findings, and bury the

1

15

21

Zeyi Liao

@LiaoZeyi

7 months

⁉️Can you really trust Computer-Use Agents (CUAs) to control your computer⁉️ Not yet. @AnthropicAI Opus 4 shows an alarming 48% Attack Success Rate against realistic internet injection❗️ Introducing RedTeamCUA: realistic, interactive, and controlled sandbox environments for

4

31

84

CSE

@OhioStateCSE

7 months

Proud moment for @OhioStateCSE! Prof. @hhsun1 has been awarded funding from @SchmidtSciences' AI Safety initiative, a first for Ohio State. Her work will help defend AI agents from adversarial attacks.

engineering.osu.edu

Schmidt Sciences selected 27 projects for funding

0

8

13

Huan Sun (Hiring Ph.D. students for Fall26)

@hhsun1

8 months

Unfortunately, I will miss #NAACL2025, but please check out our work on chemistry agents, "ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving" today (May 1) from 2:00 to 3:30 pm (local time) at Hall 3, Poster Session 5! Some updates: We have renamed

1

16

41

Huan Sun (Hiring Ph.D. students for Fall26)

@hhsun1

8 months

It's a great honor to give a keynote at the @Molecule_Maker symposium at UIUC! Many thanks to Prof. @hengjinlp and Prof. Jiawei Han for the invitation. The symposium’s theme this year is “AI scientist? What would it take?”, which I hold close to my heart, and I gave a talk titled “Language

2

18

69

Boshi Wang

@BoshiWang2

9 months

LLMs exhibit the Reversal Curse, a basic generalization failure where they struggle to learn reversible factual associations (e.g., "A is B" -> "B is A"). But why? Our new work uncovers that it's a symptom of the long-standing binding problem in AI, and shows that a model design

25

126

863