Liling Tan Profile
Liling Tan

@alvations

Followers
1K
Following
11K
Media
568
Statuses
8K

Code, geek, game

Joined May 2015
@alvations
Liling Tan
12 hours
Imagine if people literally argue about it, how much more fun would you give to the world by training English -> Hylian/Sheikah models?
0
0
0
@alvations
Liling Tan
12 hours
Now for another #neuralempty idea. Go to your fav #LLM, make it explain and generate. Now take Tatoeba's most frequent sentences and make it generate images in Hylian/Sheikah. Then you'll find that there are many unsolved NLP tasks =).
@alvations
Liling Tan
12 hours
Want a free #nlproc idea for a paper at @emnlpmeeting organized by @rajammanabrolu et al.? Go to the page, then do this to your favorite #LLM:
1
0
0
@alvations
Liling Tan
12 hours
@emnlpmeeting @rajammanabrolu And finally: > Now create more types for animals, bugs, items, junk, etc. and generate 50 items per type. Now you have a dataset, enough to visualize/cluster and finetune a model, and play around with for a workshop paper =).
0
0
0
@alvations
Liling Tan
12 hours
@emnlpmeeting @rajammanabrolu Then do this to create the dataset: > Now generate a list of sea creatures and their Animal Crossing-style quotes; return a JSON for each creature that contains {"type": "sea creature", "name": "...", "catch_quote": "...", "explanation": "..."}. Give me a JSONL file.
1
0
0
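The prompt above asks for a JSONL file. As a sketch of what that dataset looks like on disk (the schema keys come from the tweet itself; the sample creatures and quotes are made-up placeholders, not generated data), round-tripping such records through the JSONL format is just one JSON object per line:

```python
import json

# Hypothetical records following the {"type", "name", "catch_quote",
# "explanation"} schema from the prompt above; names and quotes here
# are illustrative placeholders.
records = [
    {"type": "sea creature", "name": "sea urchin",
     "catch_quote": "I got a sea urchin! Point taken.",
     "explanation": "Pun on the urchin's spines."},
    {"type": "sea creature", "name": "moon jellyfish",
     "catch_quote": "I caught a moon jellyfish! Simply otherworldly.",
     "explanation": "Pun on 'moon'."},
]

# JSONL = one JSON object per line; round-trip through a string.
jsonl = "\n".join(json.dumps(rec) for rec in records)
loaded = [json.loads(line) for line in jsonl.splitlines()]
```

From here, `loaded` is ready to cluster, visualize, or feed into a finetuning pipeline.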
@alvations
Liling Tan
1 day
What if pagerank, tfidf and neural search could co-exist and not cross-cannibalize traffic? What if we rebuilt the internet as we knew it?
0
0
0
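One concrete reading of "co-exist" is blending the three signals into a single ranking score instead of letting one retrieval path win outright. A toy sketch, where the documents, per-signal scores, and weights are all invented for illustration:

```python
# Toy blended ranking: combine per-document tf-idf relevance, a static
# PageRank prior, and a neural (embedding similarity) score into one
# number. All values and weights below are illustrative only.

def blend(tfidf: float, pagerank: float, neural: float,
          w_tfidf: float = 0.4, w_pr: float = 0.2, w_neural: float = 0.4) -> float:
    """Weighted linear combination of three already-normalized signals."""
    return w_tfidf * tfidf + w_pr * pagerank + w_neural * neural

docs = {
    "doc_a": {"tfidf": 0.9, "pagerank": 0.1, "neural": 0.3},
    "doc_b": {"tfidf": 0.2, "pagerank": 0.8, "neural": 0.9},
}

# Rank document ids by blended score, highest first.
ranked = sorted(docs, key=lambda d: blend(**docs[d]), reverse=True)
```

Real hybrid search systems do something similar (e.g. reciprocal rank fusion rather than a linear mix), but the point is the same: the signals complement rather than replace each other.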
@alvations
Liling Tan
2 days
Imagine decoy agents making decoy agents and microverse of microverses.
0
0
0
@alvations
Liling Tan
2 days
Circling back to this. There’s a huge opportunity to replace humans in a simulation world of “agents”. Imagine: instead of paying human vendors to “scale your AI”, whose annotators are already aided by models, #llm bots/agents in this other virtual world become your vendor.
@alvations
Liling Tan
29 days
There should be 2 internets. One for #llm bots, one for humans.
2
0
0
@alvations
Liling Tan
6 days
Ah this is in Table 4 😌.
0
0
0
@alvations
Liling Tan
6 days
So much food for thought in one paper! Thanks @BZhangGo @orf_bnw for the fun read!!
0
0
0
@alvations
Liling Tan
6 days
One itch to scratch is really back to the 2B-2B model: what if we just Xavier-initialize it instead of initializing with pretrained Gemma?
1
0
0
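For reference, Xavier (Glorot) uniform initialization draws weights from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)). A minimal from-scratch sketch, pure Python with a toy matrix size, not tied to any real model:

```python
import math
import random

def xavier_uniform(fan_in: int, fan_out: int, seed: int = 0):
    """Xavier/Glorot uniform init: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    rng = random.Random(seed)
    a = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-a, a) for _ in range(fan_out)] for _ in range(fan_in)]

# A toy 256x512 projection matrix, initialized from scratch.
W = xavier_uniform(256, 512)
bound = math.sqrt(6.0 / (256 + 512))
```

The bound shrinks as layers get wider, which keeps activation variance roughly constant through the stack; that's the baseline the pretrained-Gemma initialization would be compared against.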
@alvations
Liling Tan
6 days
BTW, Section 4’s data setting is a hidden gem. Why “adapt to a max 2 trillion tokens”? Cos it keeps experiments tractable. We want delta comparisons for ablations, not the best reasoning model in the world. “Gemma 2 pretraining comes with knowledge distillation.” Got it 😉.
0
0
0
@alvations
Liling Tan
6 days
@BZhangGo are you folks planning to release the code that does the conversion/initialization of the enc-decoder model with the decoder-only weights?
1
0
0
@alvations
Liling Tan
6 days
I wonder, if we do RLHF eventually, why did they still do prefixLM / UL2 “adaptation”? And what happens if we take the adapted en-/decoder model, split it into half or average the weights, and put it back into a decoder-only model?
1
0
0
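The "average the weights" half of that question could be sketched like this. The state dicts below are toy stand-ins (plain lists of floats with made-up key names), not real Gemma tensors:

```python
# Toy sketch of "average the encoder and decoder weights back into one
# decoder-only stack": elementwise mean of two state dicts that share
# the same keys and shapes. Key names like "layer.0.w" are invented.

def average_weights(enc: dict, dec: dict) -> dict:
    """Elementwise average of two state dicts with identical keys/shapes."""
    assert enc.keys() == dec.keys()
    return {k: [(a + b) / 2 for a, b in zip(enc[k], dec[k])] for k in enc}

enc = {"layer.0.w": [1.0, 3.0], "layer.1.w": [2.0, 4.0]}
dec = {"layer.0.w": [3.0, 1.0], "layer.1.w": [6.0, 0.0]}
merged = average_weights(enc, dec)
```

Whether such a naive merge preserves anything useful after adaptation is exactly the open question the tweet raises.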
@alvations
Liling Tan
6 days
It’s unsatisfying that 9B-9B loses to the 9B decoder-only in Table 3’s 5-shot WMT. Is it because we give up something in the RLHF for translation? BTW, Table 3 should really be sorted by task name; it’s a bit tedious to compare (a) vs (b).
0
0
0
@alvations
Liling Tan
6 days
After reading, re-reading and thinking: why do 2B-2B when you could have split the 2B layers into half and done 1B-1B? That way the 2B decoder-only would be more comparable to the 1B-1B en-/decoder.
2
0
0
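The "split the layers into half" idea, as a toy sketch. The layer stack here is just a list of placeholder names standing in for transformer blocks, not real weights:

```python
# Toy sketch: split a 2N-layer decoder-only stack into an N-layer
# encoder and an N-layer decoder (the "split the 2B layers into half"
# idea). A real implementation would move the weight tensors instead.

def split_stack(layers: list):
    """Split a layer list into equal front (encoder) and back (decoder) halves."""
    half = len(layers) // 2
    return layers[:half], layers[half:]

layers = [f"block_{i}" for i in range(8)]
encoder, decoder = split_stack(layers)
```

The resulting 1B-1B model would have the same total parameter count as the 2B decoder-only baseline, which is what would make the comparison fair.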
@alvations
Liling Tan
7 days
pretty awesome stuff from #neuralempty folks.
0
0
0
@alvations
Liling Tan
7 days
Now we get a glimpse of the flash 😉 #llm #nlproc #renaissance.
@osanseviero
Omar Sanseviero
7 days
Introducing T5Gemma: the next generation of encoder-decoder/T5 models! 🔧 Decoder models adapted to be encoder-decoder. 🔥 32 models with different combinations. 🤗 Available on Hugging Face and Kaggle.
5
0
0
@alvations
Liling Tan
7 days
Any resemblance is purely coincidental. Someday I hope to see this “bayhugs” software in @misorobotics 😁. #huggingmax #bayhugs.
@Thom_Wolf
Thomas Wolf
7 days
Thrilled to finally share what we've been working on for months at @huggingface 🤝@pollenrobotics. Our first robot: Reachy Mini. A dream come true: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community. Tiny price, small size, huge
0
0
0
@alvations
Liling Tan
9 days
Olmo from is one of the few commendable and truly transparent #llm
0
0
0