Nonso Dev
@nonsodev
Followers: 2K · Following: 1K · Media: 98 · Statuses: 2K
Senior ML engineer and tech educator •• I make AI-powered solutions •• I speak 🇬🇧🇸🇦🇪🇸.
Lagos, Nigeria
Joined October 2022
I was training this model on Modal—it took about two hours and cost around $8. When it finally finished, I meant to run the next cell... and accidentally reran the training one instead. Guess I'm literally paying for my mistakes 😂
0 · 0 · 0
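A cheap guard against this kind of rerun, as a minimal sketch: make the expensive cell a no-op once its artifact exists. The checkpoint path and the train() stub here are hypothetical placeholders, not Modal specifics.

```python
from pathlib import Path

CKPT = Path("checkpoints/final.pt")  # hypothetical output of the training cell

def train():
    """Stand-in for the ~2h, ~$8 training run; assumed to save CKPT at the end."""
    CKPT.parent.mkdir(parents=True, exist_ok=True)
    CKPT.write_bytes(b"...")  # stand-in for torch.save(model.state_dict(), CKPT)

if CKPT.exists():
    print(f"{CKPT} already exists; skipping training (delete it to retrain).")
else:
    train()
```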
In today's episode of programming horror... the Python docs for random.seed() tell us "If a is an int, it is used directly." [1] But if you seed with 3 or -3, you actually get the exact same RNG object, producing the same streams. (TIL.) In nanochat I was using the
217 · 495 · 8K
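This is easy to verify: in CPython, an int seed is used via its absolute value, so 3 and -3 initialize identical generators. A quick check:

```python
import random

# CPython takes the absolute value of an int seed, so 3 and -3
# produce the same internal state and therefore the same stream.
a = random.Random(3)
b = random.Random(-3)

assert a.getstate() == b.getstate()
print([a.random() for _ in range(3)])
print([b.random() for _ in range(3)])  # identical output
```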
Of course there's no harm in experimenting if you have time and a GPU, but here are a few tips:
- Use it when the model is too complex.
- You can actually see the overfitting happening; see the sketch below. (This is better because instead of holding on to regularization as a default, you can see if the model
I was reviewing a mentee's deep learning model today and saw something I had wanted to talk about for a while. A lot of the hobby DL models we create don't get better performance when regularization is implemented; in fact, it decreases training accuracy and overall generalization.
0 · 0 · 1
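A minimal sketch of the "see the overfitting happening" tip above (Keras and MNIST here are purely illustrative choices, not the mentee's model): train once without regularization and look at the train/val gap before reaching for dropout or weight decay.

```python
import tensorflow as tf
import matplotlib.pyplot as plt

# Illustrative data/model; the point is the train-vs-val comparison below.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

history = model.fit(x_train, y_train, validation_split=0.2,
                    epochs=20, batch_size=128, verbose=0)

# A widening gap between these curves is the overfitting signal that
# justifies regularization; no gap means you probably don't need it yet.
plt.plot(history.history["accuracy"], label="train")
plt.plot(history.history["val_accuracy"], label="val")
plt.xlabel("epoch"); plt.ylabel("accuracy"); plt.legend(); plt.show()
```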
Picked up fastai over the last weekend for a gig, so I did a little something for y'all to check out: a pneumonia detector for chest X-rays. You can open it in Colab, run the cells, and start testing it out right away, and of course improve it. Colab link: https://t.co/kauYTmJN1Q
0 · 0 · 1
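For anyone who wants the shape of such a notebook before opening the link, a hedged sketch using the fastai v2 API. The folder layout and example filename are assumptions about a typical chest X-ray dataset, not the actual notebook's contents.

```python
from fastai.vision.all import *

# Assumed layout: chest_xray/train/{NORMAL,PNEUMONIA}, chest_xray/val/...
path = Path("chest_xray")
dls = ImageDataLoaders.from_folder(
    path, train="train", valid="val",
    item_tfms=Resize(224), bs=32,
)

# Transfer learning from ImageNet weights, then a single prediction.
learn = vision_learner(dls, resnet34, metrics=accuracy)
learn.fine_tune(3)

pred, _, probs = learn.predict(PILImage.create(path/"val/PNEUMONIA/person1_virus_6.jpeg"))
print(pred, probs)
```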
had dinner with this guy in korea. heard that he writes code. seems cool
323 · 487 · 16K
for someone always training models on an A100 box, T4s are sooo effing slow, lol.
0 · 0 · 1
everyone:
- "just use the API"
PewDiePie:
- built a 10x GPU AI server (8x modded 48GB 4090s, 2x RTX 4000 Ada)
- runs open-source models with vLLM for TP (tensor parallelism)
- vibe-coded his own chat UI, including RAG, DeepResearch, and TTS
- is fine-tuning his own model
be like PewDiePie. Buy a GPU
527 · 1K · 23K
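For reference, the "vLLM for TP" line above looks roughly like this in code (a sketch; the model name and GPU count are illustrative, not PewDiePie's actual setup):

```python
from vllm import LLM, SamplingParams

# Shard one open-weights model across 8 GPUs with tensor parallelism.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # any open-weights checkpoint
    tensor_parallel_size=8,
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Why buy a GPU instead of using an API?"], params)
print(outputs[0].outputs[0].text)
```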
Did a parametric experiment to really understand why dropout regularization works so well. Didn't really document it per se, as I just wanted to check it out. I feel like I'd go with the researchers who say this is because of the reduced reliance on individual nodes, thereby
2 · 0 · 4
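The mechanism in question, in a few lines (PyTorch, purely illustrative): during training each forward pass zeroes a random subset of units and rescales the rest, so the network can't lean on any single node; at eval time dropout is a no-op.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(8)

layer.train()
print(layer(x))  # random units zeroed, survivors scaled by 1/(1-p) = 2.0

layer.eval()
print(layer(x))  # identity at inference: all ones
```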
I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language
🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping
576 · 2K · 13K
An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells. With more preclinical and clinical tests,
557 · 3K · 22K
your honor, i object. i don't know about harvard, but stanford literally releases SOTA courses
Harvard and Stanford students tell me their professors don't understand AI and the courses are outdated. If elite schools can't keep up, the credential arms race is over. Self-learning is the only way now.
43 · 125 · 3K
Live session tomorrow 🚨 We’ll go over how to join the Builders’ Challenge #3, what to build, and how to make your project stand out 🙌 Catch it live here on X or on YouTube at 3PM CET. Set a reminder: https://t.co/ZOCYzWsUjH
36 · 42 · 95
Going to sleep after accidentally deleting the latest checkpoints!
1 · 1 · 12