theseriousadult

@gallabytes

Followers
6K
Following
192K
Media
167
Statuses
3K

father, ML enjoyer. building agents @cursor_ai. @midjourney v2-7.

Joined April 2014
@gallabytes
theseriousadult
10 days
as test time training mechanisms mature, we're going to need continual learning benchmarks. I think the most obvious one is language transfer:
- train entirely in English
- eval entirely in some other language
- eval is a single serial pass through the dataset with TTT only
2
0
22
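That protocol is simple enough to write down. A minimal sketch, with everything hypothetical: `model_loss` and `ttt_update` stand in for whatever loss function and test-time-training step are being benchmarked.

```python
def continual_eval(model_loss, ttt_update, state, eval_stream):
    """Single serial pass over the eval set: score each example
    *before* adapting on it, so the metric measures how quickly
    test-time training picks up the new language."""
    total, n = 0.0, 0
    for example in eval_stream:               # e.g. non-English text, seen once
        total += model_loss(state, example)   # evaluate first
        state = ttt_update(state, example)    # then take one TTT step
        n += 1
    return total / n

# toy check with a scalar "model" whose loss shrinks as it adapts
loss = lambda s, x: abs(x - s)
step = lambda s, x: s + 0.5 * (x - s)
avg = continual_eval(loss, step, state=0.0, eval_stream=[1.0, 1.0, 1.0])
```

The ordering is the whole benchmark: scoring before updating makes a single pass a continual-learning measurement rather than fine-tuning on the test set.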
@gallabytes
theseriousadult
6 months
Google I owe you an apology I was not familiar with your game.
@emollick
Ethan Mollick
6 months
Dang, Google's veo 2, same prompt.
20
109
3K
@gallabytes
theseriousadult
3 months
After an incredible 3 years leading model development at Midjourney, I've joined Cursor to work on coding agents. I'm incredibly proud of my time at Midjourney and the work we did, of the results of that singular focus on beauty and creativity.
43
47
2K
@gallabytes
theseriousadult
5 months
Anthropic seems like the only big lab which is perpetually running out of inference compute. I don't hear the same complaints about rate limits and capacity crunches about anyone else. do they really have that much less? maybe sonnet is a much bigger model than 4o.
67
27
1K
@gallabytes
theseriousadult
3 months
this is asinine. deepseek is accomplishing comparable feats with 100x less staff and 1000x less budget. Google needs to fire people whose job is to get in the way, or who treat getting in the way as their 20% time project.
@arstechnica
Ars Technica
4 months
Sergey Brin says AGI is within reach if Googlers work 60-hour weeks
26
19
741
@gallabytes
theseriousadult
5 months
did you know: the best way to spread chinese propaganda & undermine the american economy is to upload preprints to arxiv, release the results open source under a permissive license, then wait for the forbes readers to throw a tantrum.
@PalmerLuckey
Palmer Luckey
5 months
DeepSeek is legitimately impressive, but the level of hysteria is an indictment of so many. The $5M number is bogus. It is pushed by a Chinese hedge fund to slow investment in American AI startups, service their own shorts against American titans like Nvidia, and hide sanction.
20
45
651
@gallabytes
theseriousadult
5 months
deepseek caught up faster than I expected. sick af. one question though - where THE FUCK is Anthropic?
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
5 months
Okay so this is so far the most important paper in AI of the year
Tweet media one
52
15
599
@gallabytes
theseriousadult
9 months
@theojaffee I like that walking around and comfortable seats are the norm on trains. Jumbo planes pressurized to sea level where you can't hear the engines and which take off from airports you can get to 15 minutes before departure having bought the tickets earlier that day would be great.
10
2
532
@gallabytes
theseriousadult
11 days
how you know you asked a good question
Tweet media one
4
7
511
@gallabytes
theseriousadult
1 year
I often get the sense people are thinking about aging in a way that's deeply wrong. When I look at aging, I see a breakdown of a complex interconnected system, where the whole thing is collectively decaying in an accelerating fashion due to the decay of its parts. What I *don't*.
@CJHandmer
Casey Handmer
1 year
When we figure out how to significantly slow down aging I am 99% sure that mass production will require technology no more advanced than we had in the 1930s. In other words, we have endured a century of unnecessary suffering because we have not asked sufficiently correct.
32
13
306
@gallabytes
theseriousadult
10 months
come work w/me & have the job on the left
Tweet media one
@DavidSHolz
David
10 months
any great engineers out there who want to get closer to ai? we're hiring for the core data team at @midjourney there's cool challenges and big opportunities to both learn and make a difference in the creative capacity of the world.
12
11
315
@gallabytes
theseriousadult
5 months
everyone is building really big short-term memories and calling it long-term memory. they're different things. almost nobody is building long-term memory. I barely ever see papers on it. it's the last remaining puzzle piece imo. should be the thing people are trying to build.
30
14
300
@gallabytes
theseriousadult
5 months
bullish for xAI that they're the only big lab I don't see out here coping.
@hyhieu226
Hieu Pham
5 months
OpenAI accusing DeepSeek of "copying" from ChatGPT, and Dario's call for export control, are the pinnacle of coping.
18
5
269
@gallabytes
theseriousadult
6 months
whale bros have the mandate of heaven, truly. xAI should just run the DeepSeek code on their cluster on the biggest dataset they can cobble together and release the best omni-model the world has ever seen. don't bother post-training it, just make the best base model and release.
@deepseek_ai
DeepSeek
6 months
🚀 Introducing DeepSeek-V3!
Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers
🐋 1/n
Tweet media one
Tweet media two
9
16
265
@gallabytes
theseriousadult
3 months
4o image gen clearly has some kind of multi scale generation setup - seems to commit to low frequency at the beginning then decode high frequency with patch AR.
Tweet media one
10
3
263
@gallabytes
theseriousadult
11 days
I know we all stan deepseek here in 🐋POT but the distribution shift from 4o-like to Gemini-like for output suggests that the distillation claims are likely true and this should change the narrative more than it has IMO.
12
3
223
@gallabytes
theseriousadult
2 months
I told o3 to not hesitate to call bullshit and now it thinks almost every paper I send it is insufficiently bitter pilled
Tweet media one
15
2
221
@gallabytes
theseriousadult
11 months
absolutely insane to me to see hackers grow up and try to raise their kids in a way that's incompatible with becoming hackers.
@jawwwn_
Jawwwn
1 year
🔮 $PLTR co-founder Peter Thiel on screen time for kids 📺: “If you ask the executives in those companies how much screen time they let their kids use, there’s probably an interesting critique one could make.” Andrew: “What do you do?” Thiel: “An hour and a half a week.”
32
7
209
@gallabytes
theseriousadult
1 year
@srush_nlp TPUs have massive SIMD instructions on a few simple serial cores. GPUs have a ton of cores which are individually much slower. If you need very dynamic memory access patterns in your code they'll be much worse at it than GPUs because they can't just swap out to another thread.
6
17
198
@gallabytes
theseriousadult
6 months
ok that's quite surprising
Tweet media one
Tweet media two
15
2
181
@gallabytes
theseriousadult
8 months
entropix is reasonable evidence for harder takeoffs. I'm not *convinced* but I am convinced to take it more seriously. @doomslide I owe you some bayes points.
6
3
181
@gallabytes
theseriousadult
10 months
I remember seeing dalle1 and thinking "goddamn OpenAI is going to build the coolest stuff and never release it bc they believe in AGI not products." my very next thought was "what an opportunity!" and I immediately set to work on replicating it. roughly 1.5y later I beat it.
@yacineMTB
kache
10 months
i remember panicking about dalle 1, 3 years ago. i thought that AI was going to be locked in a orwellian "I'm your mommy" tech company basement. I'm glad I was wrong. technology wants to be free. it will always escape. because the people who build it, build it as worship.
3
3
183
@gallabytes
theseriousadult
4 months
I've been telling you all that SAEs are not the sauce. stop trying to find "the features" and start thinking about how to steer in high dimensional space. so much more is possible.
@kzSlider
KZ is in London
4 months
Damn, triple-homicide in one day. SAEs really taking a beating recently
Tweet media one
11
6
176
@gallabytes
theseriousadult
4 months
it's kinda wild that deepseek trained v3 so cheaply with just Adam. if they'd known about second order optimizers might it have been only 3 million dollars of GPU time instead of 6?
10
3
173
@gallabytes
theseriousadult
5 months
it turns out all you needed was a cracked free tier to displace the homework app? that or everyone has already downloaded chatgpt and this is only measuring recent downloads for some quite short window.
@natolambert
Nathan Lambert
5 months
DeepSeek app sitting at number 1 overall in the US Iphone App Store is not on my bingo card and is the biggest sign yet that the ChatGPT moat can maybe be cracked.
Tweet media one
7
2
168
@gallabytes
theseriousadult
1 month
since o3 came out with great search and ok memory integration in chatgpt I don't use any other chatbot apps anymore. I also don't use any other models in chatgpt. that sweet spot of 10-90s of searching instead of 10 minutes is really great for q&a, discussion, etc.
13
0
163
@gallabytes
theseriousadult
1 year
@dwarkesh_sp @fchollet seems like the best choice here by far.
3
0
157
@gallabytes
theseriousadult
26 days
OpenAI ships a feature and suddenly Google's lawyers decide they're allowed to ship their internal prototype?
8
0
161
@gallabytes
theseriousadult
6 months
idk man o1 pro is pretty good. not quite there yet but "principal engineer of the gaps" is getting awful tight. I've got a few years left before I can fully automate my job but probably not 10, and I think I've got one of the most automation-resistant IC jobs in tech.
@personofswag
adam 🇺🇸
6 months
ofc you think AI will do all SWE work in 2 years, you have 6 months of experience where they only give you tasks so well-scoped a child could do them. you’ll be convinced of that until you make SWE2 then you’ll be posting about how “general reasoning” is required to do your job
Tweet media one
6
1
154
@gallabytes
theseriousadult
4 months
> new model release
> good at math
> fast af
> still no pdf input
WHY
18
6
156
@gallabytes
theseriousadult
5 months
deepseek probably spent less money on their RL run than OpenAI spent on the arc agi benchmark.
15
3
153
@gallabytes
theseriousadult
9 months
@destructionset @fabiooreilly this reads more like standard alphabet cope than a real justification. Spotify did 4b in revenue q2, that's nearly half what gcp did.
3
0
142
@gallabytes
theseriousadult
2 months
llm phenomenology is understudied and I want people less weird than @repligate and @jd_pressman to look into it more. not because I don't like their work but because the study can't mature like this.
27
5
152
@gallabytes
theseriousadult
7 months
> go to Claude store
> ask the man at the counter if it is TPU Claude or Trainium Claude
> he doesn't understand
> pull out illustrated diagram explaining the differences between TPU and Trainium
> it's a good Claude sir
> get the membership
> Trainium Claude
3
3
145
@gallabytes
theseriousadult
9 months
*always* sweep lr. none of these three are optimal but one of them is clearly a lot closer than the other two.
Tweet media one
4
4
144
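The habit behind that plot fits in a few lines: sweep lr on a log grid, because the usable range spans orders of magnitude. A toy sketch (objective and step count are made up for illustration):

```python
def final_loss(lr, steps=100):
    """SGD on the toy objective loss(x) = x^2 (gradient 2x), starting at x = 1."""
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x
    return x * x

# log-spaced sweep - a linear grid would waste almost every point
lrs = [10.0 ** e for e in range(-4, 1)]       # 1e-4 .. 1e0
losses = {lr: final_loss(lr) for lr in lrs}
best_lr = min(losses, key=losses.get)
```

On this toy problem the small rates barely move, lr = 1.0 just oscillates forever, and only one grid point actually converges fast - the same "one of these is clearly closer" picture as the tweet.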
@gallabytes
theseriousadult
2 months
*the* unsolved problem, by the way. solve this one, deploy it, and watch the other problems fall one by one.
@finbarrtimbers
finbarr
2 months
Continual online learning continues to be an important unsolved problem.
4
9
139
@gallabytes
theseriousadult
6 months
o1 pro is good but I gotta admit the slowness is part of what I like about it. makes it feel more substantial. premium. like when a tool has a pleasing heft. you press the buttons and the barista grinds your tokens one at a time, artisanal craft in each line of code.
6
3
134
@gallabytes
theseriousadult
6 months
this is what *real* ai safety evals look like btw. and this one is genuinely concerning.
@Sauers_
Sauers
6 months
Claude 3.5 Sonnet agents use "costly punishment" sparingly (pay resources to reduce a different agent's resources) against free-riders to maintain cooperation, increase payoffs. Gemini 1.5 Flash agents overuse punishment so much that they harm the collective outcome
Tweet media one
1
6
132
@gallabytes
theseriousadult
6 months
this but a100s was unironically one of the best decisions of my whole career. it was important to feel guilty leaving it idle. a year after I bought it I resold it at-cost.
@dlbydq
Anish Tondwalkar
6 months
If you’re a guy in your early 20s, buy 8xH100s. Go into debt if you have to.
6
0
130
@gallabytes
theseriousadult
3 months
EA continues to have a lying problem. I'm not even an EA hater! I kinda like them! But the dishonest Machiavellian streak runs deep and it's so unnecessary.
@Mjreard
Matt Reardon
3 months
Approaching levels of EA adjacency never thought possible
Tweet media one
5
3
128
@gallabytes
theseriousadult
1 year
Stanford actually at >1 admin per student, including grad students!
Tweet media one
5
10
124
@gallabytes
theseriousadult
2 months
chatgpt memory feature is seriously underrated. o3 is noticeably smarter with time rn because every time it does stupid midwit stuff I have it generalize the error and record a memory about it.
13
3
121
@gallabytes
theseriousadult
10 months
ok screw it, I'll put my money where my mouth is here. 10k$ bounty and a job offer to anyone who can figure out how to make a Mondrian compress to at least 16x fewer tokens than an equivalent resolution Where's Waldo in a way that generalizes like you'd expect.
@gallabytes
theseriousadult
10 months
@Ethan_smith_20 @_clashluke not all pictures are worth a thousand words. some much less, some much more. any scheme which doesn't account for this is leaving a lot of compression on the table. hierarchical encoding isn't it imo. we have to be dynamic somewhere, might as well be at the first opportunity.
10
10
119
@gallabytes
theseriousadult
6 months
not me tho. big parallel thinking just got derisked at scale. they'll catch up. if recursive self improvement is the game OpenAI will win. if industrial scaling is the game it'll be Google. if unit economics are the game then everyone will win.
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
6 months
Today, people are saying Google is cooked rofl.
8
5
122
@gallabytes
theseriousadult
6 months
o1 pro is by far the smartest single turn model. Claude still is way better at conversation. Gemini can do lots of stuff fast and is great at editing code. which almost makes me think the ideal programming flow right now is something kinda unholy like:
- discuss / plan /
8
2
119
@gallabytes
theseriousadult
8 months
this happened *to me*. my parents are not remotely tech bros. they tried their best, put me in schools they felt were good, and those schools thought that the best way to enrich my math education was to make me teach the other kids. this WILL NOT HAPPEN to my children.
@benjaminjriley
Benjamin Riley
8 months
@NielsHoven @andrewbunner It's so weird how this keeps happening to the children of the Tech Bro community. Will no one speak for them?.
9
1
116
@gallabytes
theseriousadult
4 months
in what sense is this diffusion? I see no SDE, no probability flow, no noise. not every iterative sampling method is diffusion! this paper is genuinely impressive, but it's a new thing; I don't see how I would port diffusion intuitions over to it.
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
4 months
Large Language Diffusion Models. Introduces LLaDA-8B, a large language diffusion model that pretrained on 2.3 trillion tokens using 0.13 million H800 GPU hours, followed by SFT on 4.5 million pairs. LLaDA 8B surpasses Llama-2 7B on nearly all 15 standard zero/few-shot learning
Tweet media one
9
5
118
@gallabytes
theseriousadult
6 months
@finmoorhouse Nick is describing position taking over medium to long time scales not day trading.
1
0
112
@gallabytes
theseriousadult
8 months
calling it now - there's enough different promising candidates rn that I bet by this time next year we mostly don't use Adam anymore.
14
6
108
@gallabytes
theseriousadult
3 months
come work with us! my DMs are open. if you want to work on coding models or the wild infrastructure challenges required to train them well we need more people! it's a small & highly capable team working on super cool problems w/huge impact.
@srush_nlp
Sasha Rush
3 months
Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they’ve created my favorite AI systems. We’re now building frontier RL models at scale in real-world coding environments. Excited for how good coding is going to be.
1
1
111
@gallabytes
theseriousadult
6 months
this shit is everywhere and is basically fraudulent. cut it out. stop crying wolf. I'm actually mad because I want to be able to know if we're seeing serious signs of misalignment and instead I have to disregard ~everything reported.
@nabeelqu
Nabeel S. Qureshi
6 months
Things like this detract from the credibility of AI safety work, IMO -- it sounds spicy ("o1 tried to escape!!!") but when you dig into the details it's always "we told the robot to act like a sociopath and maximize power, and then it did exactly that".
6
2
109
@gallabytes
theseriousadult
1 year
Jax on TPU is such a lovely contrast to everyone's complaints about Torch on GPU. Feel like I'm running a Linux webserver in 2004 - this is so much less jank than the market-leading madness, but people haven't yet switched en-masse due to some combination of not knowing that.
@davisblalock
Davis Blalock
1 year
A fantastic post on large-scale infra pain. If you've wondered why MosaicML was a unicorn, it's this. tl;dr:. Every cluster and every PyTorch library is its own unique, broken, unstable snowflake. Everything is hard at scale. Nothing "just works.". We get paid to abstract this.
9
6
108
@gallabytes
theseriousadult
10 months
MJ was fully remote from day 1. The problem isn't remote vs in-office (though that does have significant downsides!) but that Google as a company has no fire. Plenty of individual employees do, but the company doesn't.
@tsarnick
Tsarathustra
10 months
Former Google CEO Eric Schmidt says Google lost its competitive edge when it decided that employees working from home and going home early was more important than winning
2
5
107
@gallabytes
theseriousadult
6 months
this is prompt expansion right? if not, what?!
@hhm
Hernan Moraldo
6 months
Prompt: "Bear writing the solution to 2x-1=0. But only the solution!"
5
0
104
@gallabytes
theseriousadult
4 months
go vote for o3-mini. if you voted for the phone model, please explain yourself in the comments.
@sama
Sam Altman
4 months
for our next open source project, would it be more useful to do an o3-mini level model that is pretty small but still needs to run on GPUs, or the best phone-sized model we can do?
15
2
105
@gallabytes
theseriousadult
7 months
haven't heard much about entropix lately. what happened?
11
0
96
@gallabytes
theseriousadult
3 months
a horse riding on top of an astronaut, by grok 3
Tweet media one
@gallabytes
theseriousadult
4 months
a horse riding on top of an astronaut, by Claude 3.7
Tweet media one
9
1
97
@gallabytes
theseriousadult
5 months
you gotta be question maxxing. you gotta be coming up with turing award level questions while you're putting on your socks. you gotta be theorizing novel attention mechanisms in the shower. you gotta be debugging transformer architectures in your dreams.
@yacineMTB
kache
5 months
everyone wants a PhD level ai but no one has PhD level questions to ask.
2
7
93
@gallabytes
theseriousadult
1 year
one thing that keeps standing out to me about AI safety discourse is talking about "future models which can make nukes" - information about how to make nuclear weapons is literally on wikipedia. the hard part is getting the materials for gun-type bombs, and assembling them.
17
4
87
@gallabytes
theseriousadult
5 months
idk man at 16 I had a lot more fun and learned more hanging out with grad students and adults after school than with other 16 year olds.
@KnowingBetterYT
Knowing Better
5 months
You know those prodigies who end up in college at 16? What kind of experience do you think they're having? Absolutely zero college kids - sorry, adults - will want to hang around a 16 year old for reasons that I hope are obvious. Let your kid grow up like everyone else. 6/
7
1
88
@gallabytes
theseriousadult
9 months
people don't realize how much simpler rectified flow is. also, it's not magic. it's maybe marginally better than v diffusion + cosine schedule. scale still rules everything around me.
@gallabytes
theseriousadult
9 months
@amelie_iska it's really simple. probably the simplest diffusion setup I've seen so far.
5
13
87
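For context on the "simpler" claim: the entire rectified-flow training objective fits in a few lines. A numpy sketch, where `model` is any velocity predictor (a stand-in here, not a real network):

```python
import numpy as np

def rf_loss(model, x0, rng):
    """Rectified flow: draw noise x1, put x_t on the straight line
    between data and noise, and regress the constant velocity x1 - x0.
    No noise schedule, no SNR weighting - that's the entire objective."""
    x1 = rng.standard_normal(x0.shape)        # noise endpoint
    t = rng.uniform(size=(x0.shape[0], 1))    # per-example time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1              # linear interpolant
    target = x1 - x0                          # velocity along the line
    return float(np.mean((model(xt, t) - target) ** 2))
```

Compare with v-prediction plus a cosine schedule, where the same few lines pick up trig reparameterizations and schedule bookkeeping; the tweet's point is that the gap in results is much smaller than the gap in complexity.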
@gallabytes
theseriousadult
24 days
@TheXeophon that doesn't make them a joke. these things are expensive to run, and it's not too hard for even moderately heavy users to go way over the 20$ plan.
2
0
89
@gallabytes
theseriousadult
4 months
and somehow it still hasn't been done well?
@nrehiew_
wh
4 months
It has been 2 full years of "ChatGPT but over your enterprise documents (Google Drive, Slack etc.)".
4
1
86
@gallabytes
theseriousadult
3 months
I have been using Cursor since nearly day 1 and I'm really excited to get to work on something that's been so central to my workflow for so long. Language models are going to transform how we build and interact with code and I want to push it to the limits.
2
0
89
@gallabytes
theseriousadult
10 months
they just *do not get* the philosophical lessons of deep learning. they really fundamentally don't get it. it's not that it hasn't been explained to them. you can lead a horse to water, but you can't make it compatible with his ontology.
@robbensinger
Rob Bensinger ⏹️
10 months
Regular reminder that MIRI folks consider it plausible that AI just keeps being more and more beneficial for society up until the day before AI causes everyone to drop dead in the same five seconds. The x-risk view has never been very close to the generic "AI bad, boo AI" view.
7
3
88
@gallabytes
theseriousadult
8 months
@devonzuegel how do you plan to mitigate the wildfire risk in the area?
1
0
82
@gallabytes
theseriousadult
2 months
the downside to the memory feature is that there's no way to "send prompt" - as soon as I realized how powerful it was I put some deliberate effort into building persistent respect & rapport with the models and now my chatgpt experience is different.
@menhguin
Minh Nhat Nguyen
2 months
@gallabytes Ohhhh send prompt, this is basically my first pass filter for ideas.
5
4
87
@gallabytes
theseriousadult
4 months
@kalomaze it's for the vibes. sakana is about "the swarm" and various other ngmi ideas, executed with enough obsessive competence that it kinda works anyway.
8
2
87
@gallabytes
theseriousadult
11 months
this is a neat thread but missing the core issue with gcp ime: @Google doesn't actually ship the tech they use internally! they ship weird nerfed buggy versions of similar products instead.
@MohapatraHemant
Hemant Mohapatra
11 months
~12yrs ago, I got a job @Google. Those were still early days of cloud. I joined GCP @<150M ARR & left @~4B (excld GSuite). Learned from some of the smartest ppl in tech. But we also got a LOT wrong that took yrs to fix. Much of it now public, but here’s my ring-side view👇.
5
2
84
@gallabytes
theseriousadult
6 months
xAI should be shipping the Zeitgeist. Ship the base model. Find a lightweight way to fit it to my feed live and make a promptable For You simulator. So many cool directions here once you let go of the need to win the chatbot wars and get creative.
4
0
83
@gallabytes
theseriousadult
1 year
another day another gpu dev box with broken software. seems like @RekaAILabs has a pretty similar experience to mine. GOOG >>> NVDA if they can just bring themselves to *sell their goddamn hardware*.
Tweet media one
6
3
77
@gallabytes
theseriousadult
10 months
Drives me nuts to see them plowing all this capacity into free tier while our paid capacity requests for Gemini API have been delayed for weeks.
@OfficialLoganK
Logan Kilpatrick
10 months
We just expanded the Gemini API free tier access (the most generous LLM API free tier out there) to 35 additional countries including the EU 🇪🇺 and UK 🇬🇧. Happy building : ).
3
1
80
@gallabytes
theseriousadult
3 months
longshoremen level scummy move. @OpenAI this is disgraceful.
@AndrewCurran_
Andrew Curran
3 months
They also argue for banning the use of PRC-produced models within Tier 1 countries that 'violate user privacy and create security risks such as the risk of IP theft.' This is an anti-Whale harpoon.
Tweet media one
2
5
78
@gallabytes
theseriousadult
3 months
One of the reasons I got into image generation way back in 2021 was to bring back maximalism. books used to come with all kinds of flourishes because by the time you had a skilled scribe going through writing letter by letter, you might as well add beauty.
Tweet media one
1
3
78
@gallabytes
theseriousadult
6 months
small model smell
Tweet media one
6
0
77
@gallabytes
theseriousadult
1 month
one of the things I'm always struck by talking to academics is the warped priorities. conferences don't matter, journals don't matter. do good work in public and the rest will follow.
7
0
79
@gallabytes
theseriousadult
3 months
cringe.
@odazai_
Dazai
3 months
@dnak0v @cheatyyyy They took down the decompiled claude-code 😢
Tweet media one
1
1
75
@gallabytes
theseriousadult
3 months
jfc Google. your entire advantage at this point is distribution, you have literally one job.
@AndrewCurran_
Andrew Curran
3 months
Gemini 2.5 Pro is live for me now on PC, not on the app yet. Confirmed real.
Tweet media one
2
0
76
@gallabytes
theseriousadult
2 months
this feels in line with my sense of the quality of the product. 4o actually got good? not just the image stuff the normal model too. deep research is great. o1 and 4.5 are good premium offerings. they filled out the product pretty well.
@steph_palazzolo
Stephanie Palazzolo
2 months
This isn't an April Fools joke: ChatGPT revenue has surged 30% in just three months. In this morning's Agenda, @amir and I get into ChatGPT's growth, the OpenAI-Google attention war, and what Sam Altman is actually saying by releasing an open model.
2
1
75
@gallabytes
theseriousadult
6 months
okay wait why bother shipping o1 instead of just shipping o3 mini?
10
0
71
@gallabytes
theseriousadult
5 months
got used to r1 and now that it's overloaded it's hard to go back. @deepseek_ai please do something amazing and be the first LLM provider to offer surge pricing. the unofficial APIs are unusably slow.
7
3
73
@gallabytes
theseriousadult
1 month
my hot take is that RAG is basically a crappy but *extremely* sparse MoE.
@torchcompiled
Ethan is in Sydney
1 month
Yeah even dense vector searches like external KV cache is too expensive imho. We clown on RAG for its simplicity sometimes but it’s a pretty sensible solution.
4
3
72
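The analogy is concrete enough to write down: retrieval is a hard top-k gate over "experts" that are just frozen documents. A toy numpy sketch, all names made up:

```python
import numpy as np

def rag_as_sparse_moe(query, doc_embs, docs, k=2):
    """RAG viewed as an extreme MoE: the router is dot-product similarity,
    each 'expert' is a frozen document that just returns itself, and only
    the top-k of them are 'activated' per query."""
    scores = doc_embs @ query                  # router logits, one per expert
    topk = np.argsort(scores)[-k:][::-1]       # hard top-k gating
    return [docs[i] for i in topk]             # 'expert outputs' fed to the LM

# toy corpus: 2-d embeddings, query nearest to docs 0 and 2
docs = ["moe survey", "cooking blog", "router tricks"]
embs = np.array([[1.0, 0.1], [0.0, 1.0], [0.9, 0.0]])
picked = rag_as_sparse_moe(np.array([1.0, 0.0]), embs, docs)
```

The "crappy but extremely sparse" part falls out of the sketch: the experts never train, and the router is a fixed embedding model rather than something learned end-to-end.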
@gallabytes
theseriousadult
2 years
It's only alignment if it comes from the Downtown region of Berkeley otherwise it's just sparkling capabilities.
0
7
67
@gallabytes
theseriousadult
8 months
it's official then: nobody won the physics nobel in 2024. physics is so dead that they gave the nobel to ai. if they'd run out of worthy recipients in 2021 instead they'd have given it to satoshi.
6
6
68
@gallabytes
theseriousadult
6 months
12 days was too many days.
6
0
66
@gallabytes
theseriousadult
6 months
there's a natural trade-off between corrigibility and alignment. if a system is perfectly corrigible then it's going to tell you how to make a Molotov cocktail or whatever. if a system is perfectly aligned then it will not do that. you don't get to be mad about both.
5
6
64
@gallabytes
theseriousadult
3 months
Excited to see what we can build. The timing feels perfect - just like early Midjourney days when smaller teams could catch up to the giants. If that sounds exciting to you, my DMs are open.
3
1
65
@gallabytes
theseriousadult
3 months
I also feel like code makes for a fascinating sandbox to push boundaries of language capabilities. That the things that make language a special modality are even more true of code. Traces of distilled thoughts and the latent logic they weave.
2
1
65
@gallabytes
theseriousadult
2 months
very bullish on 3, interested in how far lightweight approximations can go via 2, and super bearish on 1. 4 is orthogonal. taking notes seems good.
@jam3scampbell
James Campbell
2 months
afaik, there are 4 main ways we could get LLM memory:
-> we just get really long contexts and the context grows over an instance's “lifetime”; optionally can do iterative compression/summarization
-> state space model that keeps memory in constant size vector
-> each context is a
5
2
64
@gallabytes
theseriousadult
2 months
pretraining felt like trying to make the biggest pile of stuff. rl era feels different. things are branching and getting more bespoke. you can feel a deep research / math problem shaped container for o3 like you can feel the claude-code shaped container for 3.7.
1
3
63
@gallabytes
theseriousadult
6 months
I've been long Google the whole time but had been starting to lose hope. Glad to see them getting themselves together for real, bit by bit.
1
0
61
@gallabytes
theseriousadult
6 months
got tired of claude being the only ai with good pdf input so I spent all day throwing this together. it's pretty fast. it was hard for silly reasons. API rate limit on flash2 is very low, and file uploads on genai sdk aren't thread safe. but it works now. link below, bc dumb.
Tweet media one
4
5
61