Great work! Once again, it highlights the implicit repetition of training tokens.
While the Chinchilla scaling law is commendable, it clearly won't hold indefinitely. As models grow larger, the large language model (LLM) assimilates knowledge from implicitly repeated tokens. This is…
Catch me if you can! Announcing Llama-rephraser: 13B models reaching GPT-4 performance on major benchmarks (MMLU/GSM-8K/HumanEval)!
To validate results, we followed OpenAI's decontamination method and found no evidence of contamination...🤔
Blog:
[1/n]