My lab is looking for a postdoc to work on some exciting LLM and multimodal work. Please fill out this short form if you are interested. Colleagues and friends, please retweet to spread the word.
Hello, non-native English speakers! Do you often find it hard to find someone to practice everyday English conversations with? Try our lab's edubot to practice conversations with a chatbot and receive feedback to improve your English.
My research group will be moving to the CS department at Columbia University in Jan 2021 after three awesome years at UC Davis. Feel free to reach out if you are in the NYC area; we are always open to collaborating on various NLP topics, especially dialog systems.
I am teaching a new discussion-based class this semester called Conversational AI. Here is the course's reading list, with all the wonderful slides made by my students at Columbia.
This could serve as a good reading list for people who want to get into ConvAI.
Hi everyone, I am looking for a postdoc researcher. The position can start as early as Jan 2022; any start date before Sep 2022 is possible. The position can last from one to three years. The lab is looking for someone with a strong NLP and ML background.
GODEL is available on Hugging Face. Looking for a large open-source pre-trained language model for dialog? GODEL leverages grounded pre-training designed to better support fine-tuning for tasks that require information external to the current conversation (e.g., a database or document).
Congrats, Dr. Weiyan Shi! Weiyan is my first PhD student to graduate from Columbia. It was a pity that we couldn't do this type of celebration last year with my two Davis PhD graduates.
This is why I encourage cleaning existing datasets and creating better new ones. After all, we are solving problems, not building models to fit datasets. We cleaned MultiWOZ recently and will publish a tech report along with the cleaned data soon.
Combining flexibility, utility, and grounding, the GODEL language model helps create dialog agents that are unrestricted in the types of queries they can respond to and the sources of information they can draw from, all while providing useful responses:
Jaime was not only a great researcher but also a great department chair. I still remember him paying a semester of my tuition when I left one of my advisors in the middle of my PhD. He was a kind person who believed in students. Jaime will be remembered.
It's with deep sadness that we announce that our founder and longtime director, Jaime Carbonell, has died. Dr. Carbonell was a pioneer in the field of language technologies and an inspiration to those who knew and worked with him.
The AAAI rebuttal deadline is this Sunday 8pm PST, and we just got the reviews a few hours ago. This basically means: academics have to work over the weekend!
I asked ChatGPT about myself. The output could fool people outside the NLP field. The glaring factual error is that I don't have an NSF CAREER award!😠
After a major conference releases its results, there are always happy and sad moments. This year our lab and our collaborators have seven papers in the ACL main conference (including 1 short paper) and 1 in ACL Findings. Though some of our best work didn't get in, we should still celebrate.
@Zhou_Yu_AI is looking for #PhD students who want to advance the frontier of natural language processing #NLP! Check out to learn more about her research. For info on our #computerscience PhD programs. The deadline is December 15.
We have three papers, all about dialog, accepted at ACL. Papers and code coming out soon. The first one received review ratings of 4.5, 4.5, and 5.
Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation, Weixin Liang, James Zou and Zhou Yu
The United States will be powerfully supporting those industries, like Airlines and others, that are particularly affected by the Chinese Virus. We will be stronger than ever before!
How do you evaluate your dialog systems?🤔 Our lab's LEGOEval is here to help. It is a toolkit for dialog system evaluation, in the @aclmeeting demo track. #Preprint:
We are happy to announce that @Zhou_Yu_AI from UC Davis will be one of our keynote speakers. She is an expert in multimodal dialog systems. Her team won the #AlexaPrize (@alexa99) last year, and she was featured in the Forbes 2018 30 Under 30 list.
Good luck with your ACL rebuttals, everyone! If you are having second thoughts, SIGDIAL 2021 is a good venue to resubmit your paper. The deadline is April 2nd but will be extended to April 10th; you just need to submit a placeholder by April 2nd.
It was great to just take a train from NYC to DC to attend AAAI. What's more amazing is that I walked to the venue from Union Station. What an energy-efficient trip! Excited to meet old friends and make new ones at #AAAI2023
🥳I will join @Northeastern as an Assistant Prof. in Fall 24 and @stanfordnlp as a postdoc with @Diyi_Yang for 23-24.
Join my lab, CHATS (Conversation, Human-AI Tech, Security), to work on privacy-preserving dialog systems, as envisioned in this demo📌 from CHATS' first project,
Check out my new collaborative work with Google on PRESTO, a new dataset for multilingual task-oriented dialog parsing. We include real-world language phenomena such as code-switching, corrections, etc.
Both my first name (Zhou) and my last name (Yu) are common Chinese names, and Zhou is also hard to pronounce; I prefer to be called Jo. What's worse, because my first name is more common as a last name, people sometimes call me Prof. Zhou. It's Prof. Yu, please. :)
Today I want to take a break from sharing research to share a personal story instead. It’s a story about my name, why I once decided to quit academia, why I came back, what I learnt from it, and why I’m grateful to have an audience here on Twitter.
Some media coverage of our recent AAAI 2020 work on Neural news headline editing.
The paper is available here, please contact us if you want to access the dataset.
Our new AAAI paper
and the code is here
One dialog context can have multiple responses that are appropriate for accomplishing the goal. Existing datasets do not account for all the possible responses.
Really proud of my student Sky! This is a paper he did with Joyce from Michigan during her undergrad. The work explores the theory-of-mind framework in a Minecraft game.
We just got an outstanding paper award at EMNLP 2021!!!!🎉 #EMNLP Do come to our oral session @ 13:45 on 11/7 to find out more!!!
Drago contributed to the NLP community so much, especially to the Columbia NLP group, where he received his Ph.D. and taught NLP classes as an adjunct. His family needs help. Drago's daughter, Victoria, needs medical care. Please help them.
I find keeping up with related work in NLP really challenging these days; there are so many papers out every month! After panicking for a while, I decided to read only a few papers in depth and just scan the abstracts of the rest. I also highly recommend having reading groups. :)
Some musings on NLP/ML self-study. tldr:
* Reading a paper is really hard, give yourself a lot of time and slack.
* Pick depth over breadth. If you do depth well, you get breadth for free.
* Be actively skeptical of all claims. Read your own drafts with 2x that skepticism.
Finally, we are done with the ACL deadline! It has been a while since I worked until the last minute of a paper deadline. But I kind of cheated this time by being in China: it is only 8pm here. :) All good wishes for our group's hard work.
JSALT 2020's first plenary speaker presentation - which is open to the public - is set for this Friday (July 3) at 10 AM EDT! @Zhou_Yu_AI is our presenter, and the title of her talk is "Seamless Natural Communication between Humans and Machines." Info:
Super excited to share my new #INTERSPEECH2023 paper with my advisor @Zhou_Yu_AI on pre-finetuning for few-shot learning in speech tasks!
Our work is motivated by the difficulty of out-of-domain speaker adaptation.
Paper:
Code:
Congrats to our group on 5 AAAI acceptances. We will release all the final versions soon.
Weixin Liang, Youzhi Tian, Chengcai Chen, and Zhou Yu, MOSS: End-to-End Dialog System Framework with Modular Supervision, AAAI, 2020
While everyone is at NeurIPS, I had a nice trip to Beijing. I finally got my U.S. visa renewed, so I can attend all the conferences next year! It's a good time to invite me for talks outside the States. :)
Excited to be in NYC for #AAAI2020! Tomorrow I will give an invited talk 4:00-4:40pm at the conversational recommendation system workshop. My student Weiyan will talk about our non-collaborative dialog challenge in the DSTC workshop. Hope to see you soon!
Put ChatGPT at a cocktail party🥂.
Can it
- understand people's conversations and gestures,
- figure out their relations,
- and even chime in with social advice?
🦍Announcing KokoMind.
🌟Check out this demo! More at
#AI #GPT4 #ChatGPT #OpenAI #Shrinking
🧵
Hi all, please check out joint work between Salesforce and our lab at Columbia: a comprehensive collection of dialog data with diverse domains and tasks.
Introducing 🎙️DialogStudio🎙️, the largest and most diverse dialogue dataset collection, spanning diverse goals (e.g., task-oriented, open-domain, NLU, etc.) and different domains (e.g., finance, insurance, software, movies, etc.)
#NLP
#AI
[1/7] Pre-trained LMs can do in-context learning (ICL), but this is surprising given the distribution shift between pre-training data and ICL prompts. What structures in pre-training data yield ICL? Check out our work "Parallel Structures in Pre-training Data Yield In-Context Learning"
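For readers new to the setup, here is a hypothetical sketch (not from the paper) of what an in-context-learning prompt looks like; the repeated input-label format below rarely appears verbatim in pre-training text, which is the distribution shift the thread refers to. The function name and labels are illustrative.

```python
# Hypothetical sketch of an in-context-learning (ICL) prompt: a few labeled
# demonstrations are concatenated, followed by an unlabeled query for the
# model to complete.
def build_icl_prompt(demonstrations, query):
    """Concatenate (input, label) demonstrations, then the unlabeled query."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demonstrations]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)

demos = [("the movie was great", "positive"),
         ("the plot dragged on", "negative")]
print(build_icl_prompt(demos, "a delightful surprise"))
```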
Our EMNLP poster this Tuesday is on how to train a user simulator for an RL-based dialog system. There is no perfect user simulator; the key is to train and test your system on multiple imperfect simulators with different pros and cons, so that your system is not biased.
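The idea of rotating among several imperfect simulators can be sketched as below; this is a minimal illustration of the general recipe, not the paper's actual method, and all class and function names are made up.

```python
# Minimal sketch: instead of training an RL dialog policy against one imperfect
# user simulator, sample a different simulator each episode so the policy does
# not overfit any single simulator's biases. Everything here is illustrative.
import random

class ToySimulator:
    """Stand-in for a learned user simulator with its own pros and cons."""
    def __init__(self, name, replies):
        self.name, self.replies = name, replies

    def respond(self, system_utterance):
        return random.choice(self.replies)

def train_with_simulator_pool(policy_update, simulators, episodes=6):
    """Pick a random simulator per episode; return which ones were used."""
    used = []
    for _ in range(episodes):
        sim = random.choice(simulators)    # vary the training partner
        user_turn = sim.respond("How can I help you?")
        policy_update(user_turn)           # stand-in for one RL update step
        used.append(sim.name)
    return used

sims = [ToySimulator("terse", ["book it", "no"]),
        ToySimulator("verbose", ["I'd like to book a table for two, please"])]
used = train_with_simulator_pool(lambda turn: None, sims)
```

Testing follows the same pattern: evaluate the trained policy against each held-out simulator separately, so one simulator's quirks do not dominate the score.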
Have you wondered what makes a chatbot sound more like a human? Or whether a chatbot's perceived identity (bot or human) influences its persuasive power? Come see our CHI 2020 paper by @shi_weiyan. Link:
Video link:
🤔As a second-language English learner, I often have trouble replacing a word that I use frequently in my essays with substitutes that show better language proficiency.
✍️Introducing ProLex: a benchmark for language proficiency-oriented lexical substitution 🧵(1/n)
Emotional support (ES) is a crucial ability for dialog systems, but providing effective ES is not intuitive. In our #ACL2021 paper, we define the Emotional Support Conversation task, propose an ESC framework, and present an ESC dataset. #NLProc #TsinghuaCoAI
Do we react emotionally to music in different ways? Feeling the "NetEase blues" today? (a Chinese meme about getting melancholy on the platform) Our @columbianlp #EMNLP2022 paper (w. @SmaraMuresanNLP and @Zhou_Yu_AI) answers these questions on a social music platform, NetEase Cloud Music. tl;dr answer: yes! a 🧵
SIGDIAL 2021 will close paper registration on April 2nd (23:59 GMT-11). The paper submission deadline is extended to April 10th, but please remember to submit a placeholder tomorrow!
Here is an earlier podcast on dialog systems I did with NLP Highlights' @nlpmattg and @waleed_ammar. We talked about topics such as domain adaptation in dialog systems and end-to-end training.
The live session for our #CVPR paper UC2 is on Tuesday, 6:00 AM - 8:30 AM EDT. If you are interested in multilingual multimodal pre-training, please feel free to check out our paper: and bring your questions to us!
My group has four events this Tuesday at EMNLP. Please come to meet my students and learn about dialog research! You should definitely come to chat with our Amazon Alexa Prize winner, Gunrock. My students can answer all your questions, for example, how we won the $500,000 prize.
Super excited to be organizing the 2nd NLP for Conversational AI Workshop, held at #acl2020. If you like Seattle coffee ☕, Washington hikes ⛰️, and talking machines 🤖, we look forward to seeing you there! @aclmeeting
My undergrad intern is discouraged by the @iclr_conf reviews and refuses to improve the submission for future conferences. I am not blaming the conference review process. The question is: what should we do as mentors to help students handle critical reviews?
This paper says that disclosing a bot's identity hurts sales, but the bot's quality has a huge effect on its persuasiveness. We have a study that dives deeper into this issue, though on a different task (under review). I also wonder whether cultural differences play a role.
Interesting new datapoint on the ethical issue of chatbots disclosing that they are not human: voice-assistant sales agents work as well as humans when they don't disclose, but are not very effective when they do. HT @catherinebuk
Have you wondered whether your name has been used to train GPT-2? How do you train high-utility, privacy-preserving language models 🧐? Check out our new paper on "Selective Differential Privacy"! @Zhou_Yu_AI @ruoxijia #NLProc
Paper:
Code:
I also like this idea of posting accepted paper IDs; CVPR does this, and I liked it. Then I don't have to wait for all my students to forward their paper decisions to me, and I am relieved of that awkward question: "how did your paper do?"
Appears to apply even if your university is in "hybrid" mode, if _you_ in particular happen to only be taking online courses. Hopefully directed research counts as in person, but it will definitely take some university lawyers to try to figure this out.
We have two related papers on this project.
-Using Chatbots to Teach Languages
-ErAConD: Error Annotated Conversational Dialog Dataset for Grammatical Error Correction
Please consider Davis. We are only one hour away from the Bay Area. Our department is collegial and welcoming. If you have any questions, feel free to email me.
Here is a blog post about our new AAAI paper on news headline editing. You can find the paper, code, and data there. Stay tuned for our presentation at AAAI next week.
This year ACL is virtual, so feel free to reach out to me through ACL chat on topics such as dialog systems, multimodal learning, and language generation.
This is the worst situation I can imagine. It's already very difficult for women to have both a successful academic career and a happy family, given the tenure system. Taking away funding during a woman's maternity leave is outrageous!!
I'm sad that the funding we were successfully granted has been revoked because I am going on maternity leave and apparently an extension to account for this is beyond the remit of the funders. It's hard to win funding, to have it taken away again after so much work is gutting 😭
Spoken dialog has disfluencies, such as hmm, ah, and self-corrections. In addition, spoken dialog has a lot of ellipses and ASR errors! If you want to know how to build a dependency parser for spoken dialog systems, come to our EMNLP dialog session on Tuesday.
If you are struggling to make your RL-based dialog systems speak coherent sentences, please check out our recent work at EMNLP. Mingyang will present it in the first-day dialog track. Ask him for the code if you want to try it out yourself!
I've started to upload the videos for the Neural Nets for NLP class here:
We'll be uploading the videos regularly throughout the rest of the semester, so please follow the playlist if you're interested.
[1/9] Large Language Models (LLMs) can mimic humans to explain human decisions. But can they explain THEMSELVES? How can we evaluate explanations along this axis? Check out our work "Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations"!