Max Kirkby

@kirkby_max

Followers: 800
Following: 7K
Media: 7
Statuses: 213

co-founder @parsedlabs. PhD'ing @OxNeuro @rhodes_trust. hierarchical plans and continual learning

Oxford/SF
Joined February 2021
@parsedlabs
parsed
4 days
We’re releasing a product that trains fast, domain-aware search models on your knowledge base. Drop in your KB and we synthesise data, then use RL with verifiable rewards to train <4B models. It trains in a couple of hours, is about an order of magnitude faster than your
1
5
13
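A hedged sketch of what "RL with verifiable rewards" over a synthesised knowledge-base QA set could look like. Everything here (the SearchRollout fields, recall@k as the reward) is an illustrative assumption, not Parsed's actual pipeline.

```python
# Hypothetical sketch of a verifiable reward for a KB-search policy.
# Names (SearchRollout, gold_chunk_ids, etc.) are illustrative, not Parsed's API.

from dataclasses import dataclass


@dataclass
class SearchRollout:
    query: str                       # question synthesised from the knowledge base
    retrieved_chunk_ids: list[str]   # chunks the small (<4B) policy returned
    gold_chunk_ids: set[str]         # known-correct chunks, recorded at synthesis time


def verifiable_reward(rollout: SearchRollout, k: int = 5) -> float:
    """Reward = recall@k against the synthesised gold chunks.

    Because the gold chunks are produced together with the query,
    the reward can be checked exactly -- no reward model needed.
    """
    if not rollout.gold_chunk_ids:
        return 0.0
    top_k = rollout.retrieved_chunk_ids[:k]
    hits = sum(1 for cid in top_k if cid in rollout.gold_chunk_ids)
    return hits / len(rollout.gold_chunk_ids)
```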
@parsedlabs
parsed
16 days
Introducing Lumina. We've built an adaptive evaluation engine that discovers failures and evolves its own outputs, all by iterating with the customer in the loop. Proper evals can only be constructed by “touching grass”, and we think this holds incredible promise for steering
0
6
12
@stefanopopoulos
Paras Stefanopoulos
16 days
RGT is available in our platform right now for our customers. Havin' fun, building frontier tech, seeing downstream customers getting real value from OS models and eating kebabs 🔥 We plan on exposing more of our web-app so the public can interact with these methods as well as
@charles0neill
Charlie O'Neill
16 days
🧵 We just published our work on Rationale-Guided Training (RGT), a stupidly simple method that allows you to circumvent the difficulties of RL and get much better performance than plain SFT. Everyone's trying to build "RL-as-a-service" for LLMs, and instead we found something way
0
2
3
@charles0neill
Charlie O'Neill
16 days
🧵 We just published our work on Rationale-Guided Training (RGT), a stupidly simple method that allows you to circumvent the difficulties of RL and get much better performance than plain SFT. Everyone's trying to build "RL-as-a-service" for LLMs, and instead we found something way
4
5
16
@johnschulman2
John Schulman
16 days
jack-o-lora
10
9
334
@kirkby_max
Max Kirkby
17 days
We found a neat way to teach models the rules and not just the answers using strategy tokens, with implications for cleaner supervision and sample-efficiency gains. Fresh from @part_harry_ @charles0neill and us @parsedlabs!
@parsedlabs
parsed
17 days
We discovered that teaching models why answers are correct, not just what to output, dramatically improves training efficiency. By making latent strategies explicit during training (e.g., "don't infer diagnoses from medications"), we achieve the same performance with 10x fewer
0
1
6
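A minimal sketch of the idea in the quoted tweet: make the latent strategy explicit in the supervision target so the model is trained on the rule as well as the answer. The tag format and field names are assumptions, not the actual training schema.

```python
# Minimal sketch of making a latent strategy explicit in an SFT target.
# The <strategy>/<answer> tags and dict keys are assumptions, not Parsed's schema.

def build_example(question: str, strategy: str, answer: str) -> dict:
    """Prepend the rule the model should apply before the answer it should give."""
    target = f"<strategy>{strategy}</strategy>\n<answer>{answer}</answer>"
    return {"prompt": question, "completion": target}


example = build_example(
    question="Patient is on metformin. List their confirmed diagnoses.",
    strategy="Don't infer diagnoses from medications; only report documented diagnoses.",
    answer="No diagnoses are documented in the note.",
)
print(example["completion"])
```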
@kirkby_max
Max Kirkby
18 days
Attribution without the guesswork. Awesome to see this work, spearheaded by @Jozef_Nathaniel and @charles0neill, out!
@parsedlabs
parsed
18 days
Introducing attention-based attribution: why cosine similarity is cosplay. Averaging the right transformer layers yields true attribution from attention, delivering reliable chunk-level auditability with sub-100 ms overhead and lower memory. It even works on a closed model!
0
1
5
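A rough sketch of chunk-level attribution from attention, under the assumption that "averaging the right transformer layers" means pooling attention from generated tokens back onto source-chunk token spans. The layer range and chunking below are illustrative, and the sketch assumes access to attention weights.

```python
# Rough sketch of chunk-level attribution from attention weights.
# Which layers to average and how chunks are defined are assumptions here.

import numpy as np


def chunk_attribution(
    attn: np.ndarray,                     # [layers, heads, gen_tokens, src_tokens]
    chunk_spans: list[tuple[int, int]],   # (start, end) token spans per source chunk
    layers: slice = slice(8, 16),         # illustrative choice of "the right layers"
) -> np.ndarray:
    """Average attention over selected layers, heads and generated tokens,
    then sum the per-token mass inside each source chunk."""
    per_token = attn[layers].mean(axis=(0, 1, 2))          # -> [src_tokens]
    scores = np.array([per_token[s:e].sum() for s, e in chunk_spans])
    return scores / scores.sum()                           # normalise to chunk shares
```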
@kirkby_max
Max Kirkby
25 days
Exciting findings from our work at @parsedlabs with @charles0neill. We demonstrate that low-rank adaptation (LoRA) delivers full fine-tuning quality in production. Our experiments reveal several promising relationships between training loss, evaluation metrics and dataset size (more on this
1
1
7
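For reference, a standard LoRA setup with Hugging Face peft; the rank, target modules and base checkpoint below are placeholders, not the configuration behind the reported result.

```python
# Illustrative LoRA setup with Hugging Face peft; hyperparameters are assumptions.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder checkpoint -- any causal LM works here.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

config = LoraConfig(
    r=16,                    # low rank of the update matrices
    lora_alpha=32,           # scaling factor applied to the LoRA update
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()   # typically well under 1% of the base model
```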
@ayushnoori
Ayush Noori
1 month
Introducing MedLog, a global log for medical AI. It was a privilege to lead this collaboration across 35 institutions and 9 countries, including @HarvardDBMI, @GoogleDeepMind, @aboutKP, @WHO, @ClalitHealth, @MSFTResearch, @StanfordMed, @GatesFoundation, & many more, supervised
13
32
142
@karpathy
Andrej Karpathy
2 months
Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of a biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea
@dwarkesh_sp
Dwarkesh Patel
2 months
.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training
431
1K
10K
@khoomeik
Rohan Pandey
2 months
today you will be presented with 2 visions of humanity's future with AI. if you don't want to build the infinite AI tiktok slop machine but want to develop AI that accelerates fundamental science, raising civilization to Kardashev 1 and beyond, come join us at @periodiclabs
@willdepue
will depue
4 months
do not build Infinite Jest (V), do not build the infinite AI TikTok slop machine, do not build the P-zombie AI boy/girlfriend, do not build the child-eating short-form video blackhole, do not build the human-feedback-optimized diffusion transformer porn generator. save yourselves
29
38
741
@KrisTorpJensen
Kristopher Torp Jensen
2 months
I’m super excited to finally put my recent work with @behrenstimb on bioRxiv, where we develop a new mechanistic theory of how PFC structures adaptive behaviour using attractor dynamics in space and time! https://t.co/umhunhmUk0 1/8
1
16
57
@willccbb
will brown
2 months
the reason i think this is wrong is because RL environments are going to be far too large of a thing for any one company to dominate. no, the *data* isn't going to all be open. the task design won't all be. but the infra will be. many startups are all racing to build their
@MechanizeWork
Mechanize
2 months
Open source is not a viable strategy for producing RL environments at scale. The future of AGI lies in closed source development.
31
42
565
@stefanopopoulos
Paras Stefanopoulos
2 months
Don't want to work on boring issue tracking software? We're looking for top-tier engineers who build beautiful products. Jokes aside, we ❤️ Linear and are setting the same quality bar for our platform. We create hyper-specialised LLMs delivering real commercial value at scale.
@thenanyu
Nan Yu
2 months
If you are an engineer, and you are pretty sure you're a better product person than most or all of the PMs you've worked with, we would like to hire you at Linear. Apply on the website and mention this tweet and we will make sure we get to your application quickly
0
3
6
@cogscikid
Wilka Carvalho
2 months
A limitation of LLMs: tacit knowledge, the knowledge you can't put into words. LLMs have all the world's written knowledge. But many important things were never written down. How do you teach ice skating through text? "Bend your knees, shift forward, arms out for balance." But
0
3
9
@tuhinone
Tuhin Srivastava
2 months
Today, we’re excited to announce our $150M Series D, led by BOND, with Jay Simons joining our Board. We’re also thrilled to welcome Conviction and CapitalG to the round, alongside support from 01 Advisors, IVP, Spark Capital, Greylock Partners, Scribble Ventures, and Premji
77
37
487
@part_harry_
Harry Partridge
2 months
hot take: RL environments shouldn't return rewards or scores; they should just return unstructured output (e.g. plain text, images, etc). Instead we need to start designing 'heuristics' which a model uses to determine the appropriate score/reward for a given env response.
1
1
4
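A toy sketch of the split this tweet proposes: the environment returns only unstructured text, and the reward comes from a separate "heuristic" scorer (which, in the tweet's framing, would itself be a model). All names here are hypothetical.

```python
# Sketch of the proposed split: the environment returns raw, unstructured output,
# and scoring lives in a separate heuristic. All names are hypothetical.

from typing import Protocol


class Heuristic(Protocol):
    def score(self, observation: str) -> float: ...


class PlainTextEnv:
    """Environment that returns text only, never a reward."""
    def step(self, action: str) -> str:
        # e.g. run the action against a tool, sandbox, or document store
        return f"stdout from running: {action!r}"


class KeywordHeuristic:
    """Toy heuristic; in practice this would be a model judging the env output."""
    def __init__(self, must_contain: str):
        self.must_contain = must_contain

    def score(self, observation: str) -> float:
        return 1.0 if self.must_contain in observation else 0.0


env, judge = PlainTextEnv(), KeywordHeuristic("stdout")
obs = env.step("ls -la")
reward = judge.score(obs)   # the reward is computed outside the environment
```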
@ayushnoori
Ayush Noori
3 months
The team behind @parsedlabs are among the most brilliant people I know, but also are plain good and nice people who care about improving specialized intelligence, especially for high-risk domains like medical, legal, or financial. I'm excited to see Parsed take off! 🚀
@charles0neill
Charlie O'Neill
3 months
Today, we’re launching Parsed. We are incredibly lucky to live in a world where we stand on the shoulders of giants, first in science and now in AI. Our heroes have gotten us to this point, where we have brilliant general intelligence in our pocket. But this is a local minimum. We
1
4
13
@basetenco
Baseten
3 months
Our team met Parsed a few months ago, and we could not be more excited to see the inflection point they are a part of - customized models built for those high impact jobs. This is an incredible team and we're thrilled to power their inference. Congrats @parsedlabs. Let's build.
@charles0neill
Charlie O'Neill
3 months
Today, we’re launching Parsed. We are incredibly lucky to live in a world where we stand on the shoulders of giants, first in science and now in AI. Our heroes have gotten us to this point, where we have brilliant general intelligence in our pocket. But this is a local minimum. We
0
3
26