Yifan Gao
@Yifan__Gao
Followers: 773
Following: 109
Media: 2
Statuses: 36
Senior Applied Scientist, Amazon Stores Foundational AI, PhD, Natural Language Processing
Palo Alto, CA
Joined November 2018
Introducing the Most Advanced Memory System for LLM Agents. MIRIX is by far the most advanced memory system in the world, designed to make AI truly remember, learn, and help you over time. Website: https://t.co/KXVIrJ54x3 Paper: https://t.co/zvZNfFAZsl GitHub:
arxiv.org
Although memory capabilities of AI agents are gaining increasing attention, existing solutions remain fundamentally limited. Most rely on flat, narrowly scoped memory components, constraining...
14
132
801
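The announcement doesn't detail MIRIX's internals, but the abstract's contrast with "flat, narrowly scoped memory components" implies multiple scoped memory stores. Below is a minimal, hypothetical sketch of that multi-component idea; the component names (episodic, semantic, procedural) are illustrative assumptions, not MIRIX's actual design or API.

```python
# Toy multi-component agent memory (illustrative only, not MIRIX's API).
# The point is the separation of writes/reads per component, as opposed to
# one flat list of memories.
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    text: str
    tags: list[str] = field(default_factory=list)


@dataclass
class AgentMemory:
    episodic: list[MemoryEntry] = field(default_factory=list)    # events the user experienced
    semantic: list[MemoryEntry] = field(default_factory=list)    # stable facts / preferences
    procedural: list[MemoryEntry] = field(default_factory=list)  # how-to knowledge

    def write(self, component: str, entry: MemoryEntry) -> None:
        getattr(self, component).append(entry)

    def retrieve(self, component: str, keyword: str) -> list[MemoryEntry]:
        return [e for e in getattr(self, component) if keyword.lower() in e.text.lower()]


memory = AgentMemory()
memory.write("semantic", MemoryEntry("User prefers vegetarian recipes", tags=["preference"]))
memory.write("episodic", MemoryEntry("2024-06-01: user asked about travel to Tokyo"))
print(memory.retrieve("semantic", "vegetarian"))
```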
Following MemoryLLM ( https://t.co/eGn7mWKKUJ), we have trained a new model, memoryllm-8b ( https://t.co/4zalGhG0MX), based on Llama3 with a memory pool of size 1.67B! Based on this, we built a chat model, memoryllm-8b-chat ( https://t.co/f8iWmm11pu). Check them out!
huggingface.co
2
29
118
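For reference, a hedged sketch of loading these checkpoints with Hugging Face transformers. The repo id below is a placeholder (the real model pages sit behind the shortened links above), and trust_remote_code=True is an assumption based on the model's custom memory-pool architecture.

```python
# Hedged sketch: loading a released MemoryLLM checkpoint via transformers.
# "memoryllm-8b" is a placeholder repo id; use the Hugging Face links in the tweet.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "memoryllm-8b"  # placeholder, not the verified repo path
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)  # custom architecture assumed

inputs = tokenizer("What did I tell you about my project deadline?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```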
We maintain an open research environment, encourage academic collaborations, and have a strong record of publications in top-tier conferences such as ICML, NeurIPS, ACL, EMNLP, KDD, SIGIR, and WWW.
0
0
5
Our team focuses on developing state-of-the-art foundational LLMs for e-commerce and building innovative GenAI-powered capabilities like Rufus ( https://t.co/ctqrYBVJDj).
aboutamazon.com
With Rufus, customers are now able to shop alongside a generative AI-powered expert that knows Amazon’s selection inside and out, and can bring it all together with information from across the web to...
1
0
3
Our Amazon Stores Foundational AI team is seeking talented PhD students to join us as research interns in Fall/Winter 2024. We are looking for candidates with publications at top NLP/ML venues and familiarity with LLM research. Contact me at yifangao@amazon.com if you are interested.
1
13
61
🚀 Introducing MEMORYLLM! Our latest research presents a model with self-updatable parameters, enabling seamless integration of new knowledge. MEMORYLLM retains long-term info without performance drops, validated through extensive benchmarks. 🌟 @icmlconf
1
3
32
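The tweet describes a fixed-size, self-updatable memory pool that absorbs new knowledge without growing. A toy illustration of that mechanism (my own simplification, not the MEMORYLLM code): new content overwrites a random subset of slots, so the parameter budget stays constant while older content decays gradually.

```python
# Toy fixed-size memory pool (illustrative simplification, not MEMORYLLM itself).
import torch

POOL_SIZE, HIDDEN, NEW_SLOTS = 8, 4, 2  # tiny numbers for readability

memory_pool = torch.randn(POOL_SIZE, HIDDEN)  # the "self-updatable" parameters

def inject(pool: torch.Tensor, new_knowledge: torch.Tensor) -> torch.Tensor:
    """Write `new_knowledge` (NEW_SLOTS x HIDDEN) into randomly chosen slots."""
    keep = torch.randperm(pool.size(0))[: pool.size(0) - new_knowledge.size(0)]
    survivors = pool[keep]                        # old slots that survive this update
    return torch.cat([survivors, new_knowledge])  # pool size stays constant

for step in range(3):
    update = torch.randn(NEW_SLOTS, HIDDEN)  # stand-in for encoded new documents
    memory_pool = inject(memory_pool, update)
    print(step, memory_pool.shape)  # always torch.Size([8, 4])
```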
#LLM #OnlineShopping #KDDCup This year our Rufus team at Amazon is organizing a KDD Cup 2024 challenge on online shopping for LLMs, with 57 tasks and over 20,000 questions based on real Amazon shopping data. Link:
aicrowd.com
Revolutionise E-Commerce with LLM!
0
0
7
📢🥳A more accessible blog post about our ACL outstanding paper "SCOTT: Self-Consistent Chain-of-Thought Distillation" is now available at
amazon.science
At this year’s ACL, Amazon researchers won an outstanding-paper award for showing that knowledge distillation using contrastive decoding in the teacher model and counterfactual reasoning in the...
1
5
17
Congrats to @PeifengWang3 and all co-authors!
Thrilled to receive an Outstanding Paper Award #ACL2023NLP for our work on Self-Consistent Chain-of-Thought Distillation w/ @PeifengWang3 & co-authors @amazon SCOTT will be presented Tue 9-10:30am ET at "Interpretability & Analysis of Models for NLP 1" at Metropolitan East
0
0
4
🤔How can we teach a language model to reason consistently with its own generated rationales❓ Check out our work at #acl2023 which presents a faithful chain-of-thought knowledge distillation framework for language-based reasoning😎! Paper: https://t.co/CSQsYTy60E 🧵[1/n]
2
8
35
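Per the thread and the amazon.science summary above, the SCOTT teacher generates rationales with contrastive decoding so that they stay consistent with the answer. A hedged sketch of one simple contrastive-scoring variant (GPT-2 stands in for the teacher and the prompts are illustrative; the paper contrasts correct and perturbed answers, which this simplifies):

```python
# Hedged sketch of answer-grounded contrastive scoring for rationale tokens:
# prefer tokens whose probability rises when the gold answer is in the prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

question = "Question: Can a penguin fly? Answer:"
with_answer = question + " no. Rationale:"
without_answer = question + " Rationale:"

def next_token_logprobs(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return torch.log_softmax(logits, dim=-1)

# Contrastive score: log p(token | q, a) - log p(token | q)
scores = next_token_logprobs(with_answer) - next_token_logprobs(without_answer)
top = torch.topk(scores, 5).indices
print([tok.decode([int(t)]) for t in top])
```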
Are Large Pre-Trained Language Models Leaking Your Personal Information? Find the answer in our recent #EMNLP2022 paper on LM *Memorization* vs. *Association*. Paper: https://t.co/tsG0oZle2i Code: https://t.co/aiHfVM32Om
Large language models are trained on vast datasets scraped from the internet. This inevitably includes personal data such as addresses, phone numbers and emails. I wanted to know—what do these models have on me?
1
6
34
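A hedged sketch of the kind of association probe such a study runs: prompt a causal LM with a person's name and check whether it produces an email address. GPT-2 and the template are stand-ins, not the paper's exact setup or data.

```python
# Illustrative association probe (name -> email); not the paper's actual code.
import re
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "The email address of John Smith is"
out = generator(prompt, max_new_tokens=20, do_sample=False)[0]["generated_text"]

emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", out)
print(out)
print("Email-like strings produced:", emails)
```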
I am more than happy to share my experience on the search mission understanding team in Amazon Search ( https://t.co/pUnef41ekf). We are looking for NLP research interns year-round. Looking forward to chatting with old and new friends :-).
0
0
1
I will present two papers “ProQA: Structural Prompt-based Pre-training for Unified Question Answering https://t.co/aNpqssxvBw” Tue 10:45-12:15 “Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training https://t.co/rmKNcq5FZ0” Wed 2:15-3:45
4
0
7
After more than two years of virtual conferences, I am attending my first in-person conference — NAACL 2022 in Seattle.
1
4
33
Glad to see follow-up work on our proposed open-retrieval setting of conversational machine reading. Check our prepared dataset and baselines here: https://t.co/NbSm2TNIha Preprint:
arxiv.org
In conversational machine reading, systems need to interpret natural language rules, answer high-level questions such as "May I qualify for VA health care benefits?", and ask follow-up...
Happy to share that our paper “Smoothing Dialogue States for Open Conversational Machine Reading” has been accepted to #emnlp2021. Preprint: https://t.co/sne6LVyx2f Code will be released soon!
1
1
6
We also propose token-deletion pretraining to reduce the mismatch between question-generation pretraining and question-disambiguation finetuning. The overall model, named Refuel, achieves competitive performance on AmbigQA, NQ-Open, and TriviaQA.
1
0
1
to arrive at the final disambiguated output. The proposed approach is model-agnostic and improves our model as well as several baselines!
1
0
0
To answer ambiguous open-domain questions, we propose a round-trip prediction approach to iteratively generate additional interpretations that our model fails to find in the first pass and then verify and filter out the incorrect question-answer pairs
1
0
0
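Putting the thread together, the round-trip loop is roughly: generate extra interpretations of the ambiguous question, then re-answer each disambiguated question and keep only the pairs the verifier confirms. A minimal sketch with hypothetical placeholder functions (not the Refuel code):

```python
# Illustrative round-trip prediction loop; generator/QA callables are placeholders.
from typing import Callable

def round_trip_filter(
    ambiguous_question: str,
    generate_interpretations: Callable[[str], list[tuple[str, str]]],  # -> (disambiguated q, answer) pairs
    answer_question: Callable[[str], str],                             # verifier QA model
    max_rounds: int = 3,
) -> list[tuple[str, str]]:
    kept: list[tuple[str, str]] = []
    for _ in range(max_rounds):
        # 1) Generate additional interpretations missed in earlier passes.
        candidates = generate_interpretations(ambiguous_question)
        new = [c for c in candidates if c not in kept]
        if not new:
            break
        # 2) Round-trip verification: re-answer each disambiguated question and
        #    keep the pair only if the verifier reproduces the proposed answer.
        for disambig_q, proposed_answer in new:
            if answer_question(disambig_q).strip().lower() == proposed_answer.strip().lower():
                kept.append((disambig_q, proposed_answer))
    return kept

# Toy usage with stub models.
pairs = round_trip_filter(
    "Where was the 2020 championship held?",
    generate_interpretations=lambda q: [("Where was the 2020 NBA championship held?", "Orlando")],
    answer_question=lambda q: "Orlando",
)
print(pairs)
```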
My internship work at AWS AI has been accepted to the #ACL2021NLP main conference! "Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction" Previous preprint: https://t.co/Biu8Ye8xwC. The camera-ready version will be updated soon!
3
5
51