
Liyan Tang
@LiyanTang4
Followers
211
Following
156
Media
18
Statuses
169
Fourth-year PhD @UTAustin || NLP || MiniCheck || Intern @GoogleDeepMind; Prev Intern @bespokelabsai, @AmazonScience
Austin, TX, US
Joined February 2022
🔎📄New model & benchmark to check LLMs’ output against docs (e.g., fact-check RAG). 🕵️ MiniCheck: a model w/GPT-4 accuracy @ 400x cheaper. 📚LLM-AggreFact: collects 10 human-labeled datasets of errors in model outputs. w/ @PhilippeLaban, @gregd_nlp 🧵
3
26
86
RT @ZEYULIU10: LLMs trained to memorize new facts can’t use those facts well. 🤔 We apply a hypernetwork to ✏️edit✏️ the gradients for fact…
0
61
0
RT @xiye_nlp: 🤔 Recent mech interp work showed that retrieval heads can explain some long-context behavior. But can we use this insight for…
0
17
0
RT @fangcong_y10593: Solving complex problems with CoT requires combining different skills. We can do this by: 🧩 Modify the CoT data format…
0
31
0
RT @gregd_nlp: Check out ChartMuseum from @LiyanTang4 @_grace_kim and many other collaborators from UT! Charts questions take us beyond cu…
0
9
0
Thanks to the awesome team at UT TAUR lab! @_grace_kim, @lucy_xyzhao, @thomlake, @Wenxuan_Ding_, @fangcong_y10593, @prasann_singhal, @ManyaWadhwa1, @ZEYULIU10, @ZayneSprague, @ramya_namuduri, @BodunHu, @juand_r_nlp, @PuyuanPeng, @gregd_nlp.
0
0
3
RT @PhilippeLaban: 🆕 paper: LLMs Get Lost in Multi-Turn Conversation. In real life, people don’t speak in perfect prompts. So we simulate mu…
0
31
0
RT @AnirudhKhatry: 🚀 Introducing CRUST-Bench, a dataset for C-to-Rust transpilation for full codebases 🛠️ A dataset of 100 real-world C repo…
0
18
0
RT @gregd_nlp: New work led by @LiyanTang4 with a strong new model for chart understanding! Check out the blog post, model, and playground!…
0
8
0
Check out my work at @bespokelabsai. We release Bespoke-MiniChart-7B, a new SOTA in chart understanding for its size. Chart understanding is really fun and challenging, and it requires reasoning skills beyond math reasoning. It's a great starting point for open chart model development!
Announcing Bespoke-MiniChart-7B, a new SOTA in chart understanding for models of comparable size on seven benchmarks, on par with Gemini-1.5-Pro and Claude-3.5! 🚀 Beyond its real-world applications, chart understanding is a good challenging problem for VLMs, since it requires
0
9
30
RT @gregd_nlp: Check out Manya's work on evaluation for open-ended tasks! The criteria from EvalAgent can be plugged into LLM-as-a-judge or…
0
3
0
RT @gregd_nlp: Check out Ramya et al.'s work on understanding discourse similarities in LLM-generated text! We see this as an important ste…
0
2
0
RT @bespokelabsai: OpenAI’s o4 just showed that multi-turn tool use is a huge deal for AI agents. Today, we show how to do the same with yo…
0
49
0
RT @bespokelabsai: Announcing Reasoning Datasets Competition 📢 in collaboration with @huggingface and @togethercompute. Since the launch of D…
0
43
0
RT @madiator: Introducing Bespoke-Stratos-32B, our reasoning model distilled from DeepSeek-R1 using Berkeley NovaSky’s Sky-T1 recipe. The…
0
136
0
RT @madiator: DeepSeek has done it again! This time, lots of action-packed insights, stuff that the top labs are not willing to share. Som…
0
54
0