Brendan Hogan
@brendanh0gan
Followers: 2K · Following: 3K · Media: 144 · Statuses: 990
ml research scientist @morganstanley || phd in cs @cornell 2024
nyc
Joined November 2020
introducing qqWen: our fully open-sourced project (code+weights+data+detailed technical report) for full-stack finetuning (pretrain+SFT+RL) of a series of models (1.5b, 3b, 7b, 14b & 32b) for a niche financial programming language called Q. All details below!
Met a founder in SF who's crushing it: $1 million MRR after just 3 months. But he had a nagging problem: "Our customers' data just feels so vulnerable. And the government doesn't even care to put serious guidelines in place that tell us how to store and manage our data." He
me when i come across an exceptionally well-crafted personal website
We're scaling our Open-Source Environments Program. As part of this, we're committing hundreds of thousands of $ in bounties and looking for partners who want to join our mission to accelerate open superintelligence. Join us in building the global hub for environments and evals
app idea: fine-tune a vision llm on pictures of car damage and the fair repair price. get a quick, fair estimate. maybe it just sells leads to auto body shops
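A sketch of what one supervised training record for that idea might look like - every field name, path, and value here is hypothetical:

```python
import json

# Hypothetical record format: one JSONL line per photo, pairing the image
# with the repair-cost answer the vision model should learn to produce.
record = {
    "image": "photos/claim_0142_front_bumper.jpg",  # made-up path
    "prompt": "Estimate a fair repair price (USD) for the damage shown.",
    "target": "$1,850 - front bumper respray and clip replacement",
}
with open("car_damage_sft.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```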
my per-token credit tweak for grpo might actually be working 🥲
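The tweet doesn't show the tweak itself, so the following is only an illustration of what per-token credit in a GRPO-style clipped loss could look like; `token_weights` and all shapes are assumptions, not the actual code:

```python
import torch

def per_token_weighted_pg_loss(logprobs, old_logprobs, advantages,
                               token_weights, mask, clip_eps=0.2):
    """Sketch of a clipped policy-gradient loss with per-token credit.

    logprobs, old_logprobs: (G, T) log-probs of the sampled tokens under the
        current and behavior policies, for G completions of one prompt.
    advantages: (G,) group-relative advantage per completion (as in GRPO).
    token_weights: (G, T) hypothetical per-token credit; vanilla GRPO is the
        special case where every valid token gets the same weight.
    mask: (G, T) 1 for real tokens, 0 for padding.
    """
    adv = advantages.unsqueeze(1)                             # (G, 1)
    ratio = torch.exp(logprobs - old_logprobs)                # (G, T)
    unclipped = ratio * adv
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    per_token = -torch.min(unclipped, clipped)
    # The tweak: scale each token's share of the sequence advantage by its
    # credit weight instead of spreading it uniformly over the tokens.
    weighted = per_token * token_weights * mask
    return weighted.sum() / mask.sum()
```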
the related works section went from being my least favorite part to write to one of my favorites
repo: https://t.co/tsSFY8mqYo (github.com/brendanhogan/nbn-gpt)
link to the old blog with more details about pretraining:
update 1 year later: i've now added a GRPO trainer to hill-climb on telling the scariest stories! 👻 after grpo, my 1.5B model (fully trained from scratch: pretrain, midtrain, rl) went from losing every time to beating gpt4.1-nano at scary stories in ~25% of head-to-head matchups.
introducing the G(houlish) P(retrained) T(errifier) model 🎃 I trained a 1.5-billion-parameter GPT model (pretraining (matching @OpenAI GPT-2 performance), fine-tuning and rlhf) on @LambdaAPI with 8xH100s to generate scary stories for Halloween! Heavily inspired by the work
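The head-to-head setup suggests a pairwise LLM-judge reward. A minimal sketch of how such a reward could be scored, assuming a hypothetical `judge` callable - this is an illustration, not the repo's actual reward code:

```python
import random

def head_to_head_reward(policy_story: str, reference_story: str, judge) -> float:
    """Hypothetical pairwise reward: 1.0 if a judge model prefers the policy's
    story over a reference model's (e.g. gpt4.1-nano's) story, else 0.0.

    `judge` is assumed to be a callable that takes a prompt and returns text;
    randomizing the A/B order guards against the judge's position bias.
    """
    stories = [("policy", policy_story), ("reference", reference_story)]
    random.shuffle(stories)
    prompt = (
        "Which story is scarier? Answer A or B only.\n\n"
        f"Story A:\n{stories[0][1]}\n\nStory B:\n{stories[1][1]}\n"
    )
    verdict = judge(prompt).strip().upper()
    winner = stories[0][0] if verdict.startswith("A") else stories[1][0]
    return 1.0 if winner == "policy" else 0.0
```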
next step is to try smaller MoE models and to build more Q environments
* the pretraining data is all idiomatic Q - and of course our RL env just enforces that the model writes correct Q - the pythonic bias came from SFT on Q that was translated from python. unfortunately it's still a little pythonic - I think it's just the nature of the leetcode problems
Update to this - we have trained a 72B model. the most exciting part is the pretrain accuracy is high enough that we can skip SFT and just use RL - meaning less pythonic Q! * interestingly not much performance gain over 32B (actually worse pass@1) - but with higher pass@N's it
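On the pass@1 vs pass@N point: the standard unbiased pass@k estimator (Chen et al., 2021) shows how a model can trail on pass@1 yet pull ahead at larger N. A quick sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n samples per problem with c correct,
    the probability that at least one of k randomly drawn samples is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. a model that solves 3 of 16 samples: pass@1 ≈ 0.19 but pass@8 = 0.90,
# which is how a model can lose on pass@1 yet win at higher pass@N.
```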
releasing my grpo v2 repo: nano-grpo-reasoning-gym. two big changes: (1) this one still implements the grpo training stack entirely from just pytorch/very simple python code - but is now extended to use vLLM, the liger kernel and other optimizations that make it much quicker to train
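The core of a from-scratch GRPO stack is the critic-free, group-relative advantage. A minimal sketch of that piece in plain pytorch, under assumed shapes - an illustration, not the repo's code:

```python
import torch

def group_advantages(rewards: torch.Tensor, group_size: int) -> torch.Tensor:
    """Critic-free credit assignment at the heart of GRPO: each completion's
    reward is standardized against the other completions of the same prompt,
    so no learned value network is needed. `rewards` has one scalar per
    completion, grouped contiguously by prompt."""
    r = rewards.view(-1, group_size)                               # (prompts, G)
    adv = (r - r.mean(dim=1, keepdim=True)) / (r.std(dim=1, keepdim=True) + 1e-8)
    return adv.view(-1)

# e.g. 2 prompts x 4 completions each:
adv = group_advantages(torch.tensor([1., 0., 0., 1., 0.2, 0.4, 0.6, 0.8]), 4)
```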
does backtracking help the model think, or is it just that in human text, when someone backtracks, they're more likely to get the correct answer? is that the same thing
“if this sigmoid trend continues we'll reach 2 by mid 2026!”
getting a citi bike membership has turned the worst part of my week into one of the best parts