ADDY
@buglessaddy
Followers: 23 · Following: 186 · Media: 1 · Statuses: 48
A quirky CS student who likes to try everything out. Building software and exploring algorithms. Fueled by Python, C++, MERN, and a good PPL workout
Joined August 2025
Share your portfolio… Here's mine: https://t.co/pYz93oPksX
adicodes.vercel.app
Portfolio of Aditya Mukherjee, a full stack developer specializing in MERN and Next.js
What's the one terminal command you use so much, you could probably type it in your sleep?
What's a genuinely practical, non-obvious problem in your daily life or workflow that an LLM could realistically solve, but isn't being used for yet? #TechTwitter #BuildInPublic #AI #LLM #programming
My takeaway from this 3 AM rabbit hole was that the most exciting innovation isn't just making bigger models, but finding clever engineering hacks to make them smaller, faster, and more accessible for everyone. #LLMs #AI #DeepLearning #TechTwitter #BuildInPublic
And it's not just one company's game anymore. You have Google's Gemini, Meta's Llama, Mistral, and countless open-source heroes. The community is taking these base models and fine-tuning them for everything you can imagine. It feels less like a corporate race where we might get…
Now enter PEFT & LoRA (Parameter-Efficient Fine-Tuning & Low-Rank Adaptation). This is the real big-brain move. Instead of rebuilding the skyscraper, you just plug in a small, new control panel. You only train this tiny panel, leaving the billions of original parameters frozen.
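A toy sketch of the LoRA idea in PyTorch (my own illustration, not the whitepaper's code; the layer size and rank are made up): the pretrained weight stays frozen while two small low-rank matrices are the only trainable parts.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update (toy sketch)."""

    def __init__(self, pretrained: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = pretrained
        self.base.weight.requires_grad_(False)      # freeze the original parameters
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)

        in_f, out_f = pretrained.in_features, pretrained.out_features
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # the small "control panel"
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))        # zero-init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path + tiny trainable low-rank path
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

# Hypothetical usage: wrap one 4096x4096 projection and count what's actually trainable
layer = LoRALinear(nn.Linear(4096, 4096), rank=8)
y = layer(torch.randn(2, 4096))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable:,} / {total:,}")   # ~65K of ~16.8M parameters
```

With rank 8, only the two small matrices (about 65K parameters) get gradients; the frozen 16.8M-parameter base layer never changes.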
So you have a massive model with billions of parameters. Training it from scratch costs more than my entire https://t.co/Bw6dY4SUoa tuition (and my next 3 generations). So how do you teach it a new skill, like writing legal documents? Fine-tuning the entire model is like…
First up was Tokenization. LLMs don't read words; they see "tokens". It's like chopping up a sentence into LEGO bricks before building something with it. The model isn't reading English; it's just calculating the relationship between these little mathematical bricks. Wild.
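To make that concrete, here's a quick sketch (assuming the open-source tiktoken package; the encoding name is just one common BPE vocabulary I picked, not something from the whitepaper):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # one common BPE vocabulary

text = "LLMs don't read words"
tokens = enc.encode(text)                    # the "LEGO bricks": a list of integer IDs
pieces = [enc.decode([t]) for t in tokens]   # what each brick maps back to

print(tokens)   # a handful of integers
print(pieces)   # the sentence chopped into sub-word chunks
```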
My CS course is still on bubble sort but I just spent my night diving into LLM whitepapers and honestly, my head feels like it has been recompiled. Mind. Blown. For those of you curious about the magic behind models like Gemini, here's what I learned today
Summary: From the architecture's core to advanced scaling, tuning, acceleration, and usage techniques, this whitepaper details the complex steps needed to develop modern, effective LLMs. The field remains active and continuously evolving.
Guiding Output Quality
Users guide models using Prompt Engineering:
• Zero-shot: Prompting with instructions only.
• Few-shot: Providing a few examples within the prompt.
• Chain-of-Thought (CoT): Demonstrating step-by-step reasoning to improve performance on complex tasks.
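Toy examples of the three styles (made-up prompts, just to show the shape of each):

```python
# Zero-shot: instructions only, no examples
zero_shot = "Classify the sentiment of this review as positive or negative:\n'Battery died in a day.'"

# Few-shot: a few worked examples before the real input
few_shot = (
    "Review: 'Loved it, works perfectly.' -> positive\n"
    "Review: 'Stopped working after a week.' -> negative\n"
    "Review: 'Battery died in a day.' -> "
)

# Chain-of-Thought: show the reasoning steps before the answer
cot = (
    "Q: A repo has 3 branches with 4, 7, and 9 open PRs. How many PRs are open in total?\n"
    "A: Let's think step by step. 4 + 7 = 11, and 11 + 9 = 20. The answer is 20."
)

for name, prompt in [("zero-shot", zero_shot), ("few-shot", few_shot), ("CoT", cot)]:
    print(f"--- {name} ---\n{prompt}\n")
```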
Inference Acceleration
As models grow, accelerating inference is crucial for reducing latency and cost.
• Output-Preserving Methods (no change to output) include Flash Attention (optimizing the quadratic self-attention calculation) and Prefix Caching (reusing computed attention key/value states for a shared prompt prefix).
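Rough sketch of the prefix/KV-caching idea (a toy single-head attention I wrote for illustration, not any library's implementation): the prompt's keys and values are computed once, and each decoding step only appends the new token's K/V instead of recomputing the whole prefix.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 64                                            # toy head dimension
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))

def decode_step(x_new, k_cache, v_cache):
    """Attend from the newest token over all cached keys/values (toy single head)."""
    q = x_new @ Wq                                # query only for the new token
    k_cache = torch.cat([k_cache, x_new @ Wk])    # append, don't recompute the prefix
    v_cache = torch.cat([v_cache, x_new @ Wv])
    attn = F.softmax(q @ k_cache.T / d ** 0.5, dim=-1)
    return attn @ v_cache, k_cache, v_cache

# Precompute the prompt's K/V once (the "prefix cache"), then decode token by token
prompt = torch.randn(10, d)                       # stand-in embeddings for 10 prompt tokens
k_cache, v_cache = prompt @ Wk, prompt @ Wv
for _ in range(3):
    x_new = torch.randn(1, d)                     # stand-in embedding for the next token
    out, k_cache, v_cache = decode_step(x_new, k_cache, v_cache)

print(k_cache.shape)                              # grows by one row per decoded token
```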
Alignment and Tuning
After initial pre-training, models are specialized using fine-tuning. First, Supervised Fine-Tuning (SFT) uses high-quality, task-specific, labeled datasets to help the model capture the essence of a task or follow instructions. Next, Reinforcement Learning from Human Feedback (RLHF) aligns the tuned model with human preferences.
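A hedged sketch of the SFT step with Hugging Face transformers (the model name, the toy prompt/response pair, and the learning rate are my assumptions for illustration, not the whitepaper's recipe): the loss is computed only on the response tokens, with the prompt positions masked out.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any small causal LM works for the sketch; "gpt2" is just a convenient stand-in.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Summarize: The meeting moved to Friday.\nSummary:"
response = " Meeting rescheduled to Friday."

prompt_ids = tok(prompt, return_tensors="pt").input_ids
full_ids = tok(prompt + response, return_tensors="pt").input_ids

# Common SFT trick: set prompt positions to -100 so cross-entropy only covers the response.
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss = model(input_ids=full_ids, labels=labels).loss   # loss on response tokens only
loss.backward()
optimizer.step()
print(f"SFT loss on one example: {loss.item():.3f}")
```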
Compute-Optimal Scaling
Early scaling focused heavily on model size, but the key insight from the Chinchilla paper fundamentally changed development. Optimal LLM performance now requires near-equal scaling in both model size and training data size. The focus shifted to scaling training data in step with parameter count.
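Back-of-the-envelope version of that insight, using the rough ~20 training tokens per parameter rule of thumb often quoted from Chinchilla (the exact ratio depends on the compute budget, so treat this as a heuristic):

```python
def chinchilla_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Rough compute-optimal training-token count for a given model size (heuristic)."""
    return n_params * tokens_per_param

for size in (1e9, 7e9, 70e9):    # 1B, 7B, 70B parameters
    print(f"{size / 1e9:>4.0f}B params -> ~{chinchilla_tokens(size) / 1e9:,.0f}B tokens")
```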
Architecture Foundation
The Transformer architecture is the foundation for all modern LLMs, including the decoder-only variants used by most contemporary models. To build powerful, effective LLMs, developers focus on optimizing four main areas: scaling laws, alignment/tuning, inference acceleration, and prompting/usage techniques.
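For the "decoder-only" part, here's a bare-bones causal self-attention block in PyTorch (a toy I wrote, not any particular model's code): the mask stops each position from attending to tokens that come after it, which is what makes the model a next-token predictor.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention, the core of a decoder-only block (toy version)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (batch, seq, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        # Causal mask: each token may only look at itself and earlier tokens.
        seq = x.shape[1]
        mask = torch.ones(seq, seq).triu(1).bool()
        scores = scores.masked_fill(mask, float("-inf"))
        return self.out(F.softmax(scores, dim=-1) @ v)

attn = CausalSelfAttention(d_model=32)
print(attn(torch.randn(2, 5, 32)).shape)   # torch.Size([2, 5, 32])
```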
The LLM Whitepaper Thread: 5 Key Pillars of Foundational Large Language Models
What do you think is stopping you from creating the next big thing? Let’s discuss