Canwen Xu
@XuCanwen
Followers: 2K
Following: 786
Media: 55
Statuses: 518
Senior Researcher @Snowflake❄️; PhD @UCSanDiego 🏄; Formerly @Boson_AI @Microsoft @GoogleAI @huggingface 🤗. RT ≠ endorsements. Views are my own. He/him
Joined March 2017
❄️We're looking for an MLE/Applied Scientist to join our @Snowflake AI team to work on AI+Software Engineering. If you have a sharp eye for potential developer pain points and can solve them with AI, this job is just for you! 👉Apply here:
careers.snowflake.com
Apply for Software Engineer, Machine Learning – Engineering Systems and AI Research job with Snowflake in Bellevue, Washington, United States. Engineering at Snowflake
0
4
20
🚀 We are thrilled to release the code for ReFoRCE — a powerful Text-to-SQL agent with Self-Refinement, Format Restriction, and Column Exploration! 🥇 Ranked #1 on Spider 2.0 Leaderboard, a major step toward practical, enterprise-ready systems, tackled both: Spider 2.0-snow &
2
30
117
ExCoT rethinks Text2SQL by letting the model think out loud and check its own work—achieving state-of-the-art performance without manual annotations. Great work by @yuxionghe, @yao_zhewei, @XuCanwen and @BohanZhai - Snowflake AI Research https://t.co/sajhMu5SN9
snowflake.com
Learn how Snowflake’s ExCoT optimizes Text2SQL with execution-guided CoT and DPO, setting a new benchmark in natural language to SQL accuracy.
0
3
8
Snowflake's new Arctic Text2SQL model ❄️ sets a new standard for natural language to SQL accuracy! 🚀 Using execution-guided CoT & DPO, it outperforms top models. 💪 Dive into the details: https://t.co/QhEGwLtB6c 📄 #Text2SQL #AI #MachineLearning 🧠
0
6
20
Our Snowflake ML/AI research team is hiring for multiple internship positions (multiple headcounts for each direction) in:
* Reasoning Models
* Multi-Modal Embedding and Captioning
* SQL + LLM
Send your resume to f"{my_first_name}.{my_last_name} at https://t.co/3h8vi4I9S5"
1
5
50
🚀 Day 0: Warming up for #OpenSourceWeek! We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency. These humble building blocks in our online service have been documented,
1K
3K
21K
❄️ We are hiring at Snowflake for an AI Research Scientist - Reinforcement Learning and Large Language Models (LLMs)
0
2
11
Mitigating racial bias in LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference, @NeurIPSConf. We have ethical reviews for authors, but missed them for invited speakers? 😡
181
803
4K
New Hugging Face Daily Papers feature: if you have at least one indexed paper on Hugging Face, you can now directly submit papers to HF Daily Papers. Try it here: https://t.co/4n2b8GFMaM
2
25
67
Excited to release Higgs-Llama-3-70B, the first model in our Higgs family of LLMs for role playing
- Post-trained from Llama-3-base
- Excluded benchmark data (including training examples) from our fine-tuning data
- Blog: https://t.co/JIS7n6r36r
- 🤗: https://t.co/9WH9yig6Yz
5
25
76
From a party thrower POV, *ACL conferences are 100000x better than ML conferences.
2
0
2
Presenting our RepoBench paper! Let’s also chat about agents for coding :)
Happy to share that RepoBench has been accepted to ICLR 2024! 🎉 A big thanks to @XuCanwen and our advisor @McAuleyLabUCSD 🌟!! We're not stopping here - the next-gen RepoBench is already in the works! 🚀 To opt out, check out https://t.co/wEGPDUuj2J.
0
1
4
✈️ Heading to Vienna tomorrow for #ICLR2024! Come grab a coffee with me to chat about LLM efficiency and data! Plus, we’re hiring engineers+scientists at https://t.co/H34LaxaOYK. Let’s talk if you’re interested!
3
0
33
By the way, we are actively hiring at https://t.co/DujiOqw0Xe. Please check out our job posting:
jobs.lever.co
Job openings at Boson AI
1
0
4
🎓 A personal update: I graduated from UCSD @McAuleyLabUCSD and have started my new journey building LLMs at https://t.co/DujiOqw0Xe. I'd like to thank all my friends for your help and support. This is an era of wonder—stay tuned for what's next.
11
1
119
🌟StarCoder2 and The Stack v2 set a new standard for open code data and models, with lots and lots of precious details in the paper.
Introducing: StarCoder2 and The Stack v2 ⭐️ StarCoder2 is trained with a 16k token context and repo-level information for 4T+ tokens. All built on The Stack v2 - the largest code dataset with 900B+ tokens. All code, data and models are fully open! https://t.co/fM7GinxJBd
0
0
6
Happy to share that RepoBench has been accepted to ICLR 2024! 🎉 A big thanks to @XuCanwen and our advisor @McAuleyLabUCSD 🌟!! We're not stopping here - the next-gen RepoBench is already in the works! 🚀 To opt out, check out https://t.co/wEGPDUuj2J.
huggingface.co
📢Introducing RepoBench!🔥 A benchmark for repository-level code auto-completion systems. We're taking #CodeCompletion beyond single-file tasks to real-world, multi-file programming scenarios. Check it out 👇 📄Paper: https://t.co/l3OfRkLKDv 🔗Github: https://t.co/NTNlgeA0Y3
1
4
15
Introducing DeepSeek Coder!
- SOTA large coding models with params ranging from 1.3B to 33B.
- Building games, testing code, fixing bugs, and analyzing data... You dream it, we make it.
- Free for commercial use and fully open-source.
Try it out now at https://t.co/NlfpubcP9N
14
99
458
Excited 🤩 to develop apps using LLMs but puzzled 🤔 over debugging hallucinations? Thrilled to share AutoDebug, a new transferable way of automated faithfulness testing for LLMs, including self-debug & cross-debug. arXiv: https://t.co/ZdvWDhG3yx https://t.co/LNA2CkMsUb [1/n]
4
23
117