@XueFz
Fuzhao Xue on the job market!
7 months
Great work! Once again, it highlights the implicit repetition of training tokens. While the Chinchilla law is commendable, it's clear it won't endure indefinitely. As models grow larger, the large language model (LLM) assimilates knowledge from implicitly repeated tokens. This is…
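For context on the law the tweet references: the Chinchilla result (Hoffmann et al., 2022) is commonly approximated as a compute-optimal budget of roughly 20 training tokens per parameter. A minimal sketch of that rule of thumb, assuming the widely cited 20:1 ratio rather than the exact fitted law:

```python
# Chinchilla rule of thumb: compute-optimal token count is ~20x the
# parameter count. The 20:1 ratio is an approximation, not the fitted law.
def chinchilla_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal training tokens for a model of n_params."""
    return n_params * tokens_per_param

# e.g., a 13B-parameter model would be compute-optimal at ~260B tokens.
print(f"{chinchilla_tokens(13e9) / 1e9:.0f}B tokens")  # -> 260B tokens
```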
@lmsysorg
lmsys.org
7 months
Catch me if you can! Announcing Llama-rephraser: 13B models reaching GPT-4 performance in major benchmarks (MMLU/GSM-8K/HumanEval)! To validate results, we followed OpenAI's decontamination method and found no evidence of contamination...🤔 Blog: [1/n]
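The decontamination check the tweet alludes to is an exact-overlap test between evaluation examples and the training corpus. A minimal sketch of such a check, assuming a word-level 13-gram overlap test; the function names and the n-gram size here are illustrative, not the exact procedure the thread used:

```python
# Minimal n-gram-overlap decontamination check (illustrative sketch).
# Flags a test example as contaminated if any of its word-level n-grams
# also appears in a training document.

def ngrams(text: str, n: int = 13) -> set[tuple[str, ...]]:
    """Return the set of word-level n-grams in `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(test_example: str, train_corpus: list[str], n: int = 13) -> bool:
    """True if the test example shares any n-gram with the training corpus."""
    test_grams = ngrams(test_example, n)
    return any(test_grams & ngrams(doc, n) for doc in train_corpus)
```

Note the weakness this thread is poking at: a paraphrase of a benchmark question preserves the answer-relevant content but breaks every exact n-gram, so rephrased test data sails through a check like this while still inflating benchmark scores.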