Tianshu Zhang Profile
Tianshu Zhang

@Tianshu_OSU

Followers
421
Following
290
Media
9
Statuses
87

Ph.D. student @osunlp @OhioStateCSE. Ex-intern @IBMResearch, @Adobe. Lead author of TableLlama. #NLProc

Columbus, OH
Joined September 2022
@Tianshu_OSU
Tianshu Zhang
2 years
Can #LLMs excellently handle various table-based tasks? 📢Introducing TableLlama and TableInstruct: the FIRST open-source generalist #LLMs and instruction tuning dataset for tables. 🌟Strong performance on both in-domain & out-of-domain settings. #NLProc
7
18
92
@ysu_nlp
Yu Su
4 days
Important work on AI4S, co-led by @hhsun1 @osunlp
@alex_prompter
Alex Prompter
5 days
This paper from Harvard and MIT quietly answers the most important AI question nobody benchmarks properly: Can LLMs actually discover science, or are they just good at talking about it? The paper is called “Evaluating Large Language Models in Scientific Discovery”, and instead
1
3
34
@ysu_nlp
Yu Su
28 days
Life update: I moved to silicon valley to tackle agents' biggest challenges: plasticity and reliability. Today's agents are smart but brittle. They lack plasticity (continual learning and adaptation) and reliability (stable, predictable behavior with bounded failures). These two
40
44
427
@ysu_nlp
Yu Su
4 months
Computer Use: Modern Moravec's Paradox A new blog post arguing why computer-use agents may be the biggest opportunity and challenge for AGI. https://t.co/6fZfTdx710 Table of Contents > Moravec’s Paradox > Moravec's Paradox in 2025 > Computer use may be the biggest opportunity
9
65
216
@hhsun1
Huan Sun (Hiring Ph.D. students for Fall26)
4 months
I am humbled and grateful to receive two grants from Open Philanthropy @open_phil to advance the safety of AI systems, co-led with my colleague @ysu_nlp. I'm also honored to be the first at @OhioState to receive Open Philanthropy funding. Most credit goes to the amazing students
@OSUengineering
OSUengineering
4 months
Associate Prof. Huan Sun has been awarded two competitive research grants from Open Philanthropy focused on the rapidly evolving field of #AI safety: https://t.co/3UG0YbDbtJ @hhsun1
4
17
79
@Tianshu_OSU
Tianshu Zhang
4 months
🙏 Huge thanks to my amazing collaborators: @kunqian_us @sidthekidder @bestaskwisher @ShaddyGarg @hhsun1 @yunyao_li - couldn’t have done this without you! Also appreciate all discussions from @osunlp !
0
0
3
@Tianshu_OSU
Tianshu Zhang
4 months
On average, open-source LLMs fine-tuned with EvoSchema outperform different baseline methods, highlighting a path towards more resilient NL2SQL systems that adapt as database schemas evolve over time.
1
0
3
@Tianshu_OSU
Tianshu Zhang
4 months
💡 Why it matters: Database schemas are not static — they evolve 🔄. 🌍🌍 Big picture 🔸EvoSchema defines 10 schema perturbations (column- & table-level) and shows how schema shifts can break SOTA models. 🔸Column-level changes hurt a bit. 🔸Table-level schema changes hurt a lot.
1
0
4
@Tianshu_OSU
Tianshu Zhang
4 months
🎉 Excited to share that our paper EvoSchema: Towards Text-to-SQL Robustness Against Schema Evolution was accepted at VLDB 2025! 🚀 📢 Reminder: join us at VLDB 2025 in London! 🗓️ Sept 2 (Tue), 10:45 AM – 12:15 PM 📍 Room Wordsworth 4F 📄 https://t.co/ZNAav4ZtoX #VLDB2025 #LLMs
1
18
30
@boyuan__zheng
Boyuan Zheng
5 months
Remember “Son of Anton” from the Silicon Valley show(@SiliconHBO)? The experimental AI that “efficiently” orders 4,000 lbs of meat while looking for a cheap burger and “fixes” a bug by deleting all the code? It’s starting to look a lot like reality. Even 18 months ago, my own
@scale_AI
Scale AI
5 months
As AI agents start taking real actions online, how do we prevent unintended harm? We teamed up with @OhioState and @UCBerkeley to create WebGuard: the first dataset for evaluating web agent risks and building real-world safety guardrails for online environments. 🧵
0
27
68
@vimar_gu
Jianyang Gu
5 months
Announcing the @NeurIPSConf 2025 workshop on Imageomics: Discovering Biological Knowledge from Images Using AI! The workshop focuses on the interdisciplinary field between machine learning and biological science. We look forward to seeing you in San Diego! #NeurIPS2025
2
14
27
@boyuan__zheng
Boyuan Zheng
5 months
Attending #ICML2025 🇨🇦 this week! I’ll be co-organizing the Computer Use Agent Workshop @workshopcua on July 19th! Happy to chat about anything related to language agents — especially world modeling, scaling RL for agents, and multi-turn RL. Excited to meet old friends and
2
6
48
@hhsun1
Huan Sun (Hiring Ph.D. students for Fall26)
5 months
🚨 Postdoc Hiring: I am looking for a postdoc to work on rigorously evaluating and advancing the capabilities and safety of computer-use agents (CUAs), co-advised with @ysu_nlp @osunlp. We welcome strong applicants with experience in CUAs, long-horizon reasoning/planning,
1
29
73
@ysu_nlp
Yu Su
6 months
🔎Agentic search like Deep Research is fundamentally changing web search, but it also brings an evaluation crisis⚠️ Introducing Mind2Web 2: Evaluating Agentic Search with Agents-as-a-Judge - 130 tasks (each requiring avg. 100+ webpages) from 1,000+ hours of expert labor -
3
47
224
@hhsun1
Huan Sun (Hiring Ph.D. students for Fall26)
6 months
If you care about building AI co-scientists for data-driven discovery, check out our recent work on automatically collecting large-scale, authentic, high-quality scientific coding tasks at a low cost, led by @YifeiLiPKU @HananeNMoussa @osunlp. 🌟AutoSDT: Scaling Data-Driven
arxiv.org
Despite long-standing efforts in accelerating scientific discovery with AI, building AI co-scientists remains challenging due to limited high-quality data for training and evaluation. To tackle...
@YifeiLiPKU
Yifei Li (🔍SU26 Internship)
6 months
📢 Introducing AutoSDT, a fully automatic pipeline that collects data-driven scientific coding tasks at scale! We use AutoSDT to collect AutoSDT-5K, enabling open co-scientist models that rival GPT-4o on ScienceAgentBench! Thread below ⬇️ (1/n)
0
4
16
@RuiQiu18
Rui Qiu
7 months
Systematic reviews (SRs) drive evidence-based medicine, but months-long workflows can’t keep pace with today’s literature flood. Fully autonomous solutions promise speed, but the magic often fizzles - these models still skip pivotal trials, hallucinate findings, and bury the
1
15
21
@LiaoZeyi
Zeyi Liao
7 months
⁉️Can you really trust Computer-Use Agents (CUAs) to control your computer⁉️ Not yet, @AnthropicAI Opus 4 shows an alarming 48% Attack Success Rate against realistic internet injection❗️ Introducing RedTeamCUA: realistic, interactive, and controlled sandbox environments for
4
31
84
@OhioStateCSE
CSE
7 months
Proud moment for @OhioStateCSE! Prof. @hhsun1 has been awarded funding from @SchmidtSciences' for AI Safety initiative — a first for Ohio State. Her work will help defend AI agents from adversarial attacks.
engineering.osu.edu
Schmidt Sciences selected 27 projects for funding
0
8
13
@hhsun1
Huan Sun (Hiring Ph.D. students for Fall26)
8 months
I will miss #NAACL2025 unfortunately, but please check out our work on chemistry agents, "ChemToolAgent: The Impact of Tools on Language Agents for Chemistry Problem Solving" today (May 1) during 2:00-3:30pm (local time) at Hall 3, Poster Session 5! Some updates: We have renamed
1
16
41
@hhsun1
Huan Sun (Hiring Ph.D. students for Fall26)
8 months
It's a great honor to give a keynote at the @Molecule_Maker symposium at UIUC! Many thanks to Prof. @hengjinlp and Prof. Jiawei Han for invitation. The symposium’s theme this year is “AI scientist? What would it take?”, which I hold close to heart and made a talk titled “Language
2
18
69
@BoshiWang2
Boshi Wang
9 months
LLMs exhibit the Reversal Curse, a basic generalization failure where they struggle to learn reversible factual associations (e.g., "A is B" -> "B is A"). But why? Our new work uncovers that it's a symptom of the long-standing binding problem in AI, and shows that a model design
25
126
863