sophia

@cis_female

Followers
2,975
Following
1,697
Media
1,100
Statuses
11,473

i want to know everything

sf
Joined November 2019
@cis_female
sophia
10 months
I feel like it’s not obvious to consumers how monumental and expensive regular software is. iOS cost about as much as the Manhattan project (~$20b). Google search cost about as much as the ISS (~$100b).
52
115
2K
@cis_female
sophia
11 months
432 Park Avenue, the platonic ideal of an expensive structure per square foot, spent more buying the land than building the building ($724m for land, ~$500m for construction)
Tweet media one
31
46
1K
@cis_female
sophia
8 months
Elon is substantially correct that it's bad Wikipedia begs for donations so much because they use the money for stupid things. In 2021 they got $164m in donations, spent $145m, of that $66m was spent on "core functions", i.e. keeping the site up.
24
38
971
@cis_female
sophia
1 year
nvidia is currently shipping around 200,000 H100s *per quarter*, which is around 12 exaflops of fp64, more than double the entire top500 combined
14
84
707
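A quick check of the arithmetic in the H100 post above, as a sketch: dividing the quoted 12 exaflops by the quoted 200,000 units gives the per-GPU fp64 throughput the claim implies. Only the two figures from the post are used; the variable names are illustrative.

```python
# Figures taken from the post; names are illustrative.
H100S_PER_QUARTER = 200_000
TOTAL_FP64_FLOPS = 12e18  # 12 exaflops

# Per-GPU fp64 throughput implied by the two figures, in teraflops.
per_gpu_tflops = TOTAL_FP64_FLOPS / H100S_PER_QUARTER / 1e12
print(f"implied fp64 per H100: {per_gpu_tflops:.0f} TFLOPS")
```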
@cis_female
sophia
9 months
@netcapgirl In 2010 purity of intercepted drug shipments ranged from 13-62%, in 2022 it ranged from 58-75%
Tweet media one
Tweet media two
11
12
535
@cis_female
sophia
1 year
I implemented the transformer in 30 minutes and 30 lines of python which compiles to 6670 instructions of gpu microcode (lower-level assembly), or ~100kb total (twitter js is ~1000kb). This shitty implementation achieves 40% of the A100's *THEORETICAL MAXIMUM* performance
11
17
423
@cis_female
sophia
7 months
unimaginable alpha drop from a Fields Medal-winning mathematician. have got to start doing this expeditiously
Tweet media one
12
27
428
@cis_female
sophia
2 years
Shooting my shot
Tweet media one
22
3
393
@cis_female
sophia
11 months
posted in the rationalism groupchat
Tweet media one
8
15
330
@cis_female
sophia
3 months
There is no fucking way Pi has 6 million MAU when the app has 500 times fewer ratings than ChatGPT
Tweet media one
Tweet media two
Tweet media three
33
5
308
@cis_female
sophia
1 year
Everywhere on twitter you see people talking about open-source models and local inference, but local inference is much harder than most people think for fundamental reasons related to how GPT-style models work.
@SullyOmarr
Sully
1 year
Everyone is talking about OpenAI, Microsoft and Stability as huge AI players. But there is one company that is arguably the best positioned in the space, and they don't even have an LLM offering yet. Apple has the potential to shift the entire AI landscape, here's how:
Tweet media one
190
634
4K
16
33
273
@cis_female
sophia
8 months
i support this with no irony whatsoever. you should be able to go to the doctor for an mtm transition
9
13
249
@cis_female
sophia
10 months
people complain about shoddy or buggy software sometimes but I wonder if they really understand how expensive it is to make this stuff
3
3
253
@cis_female
sophia
1 year
can't get over the fact that the best way we have to improve the performance of our linear algebra is to whisper sweet nothings in its ear
@kareem_carr
🔥Kareem Carr | Statistician 🔥
1 year
We now have data demonstrating my tips for improving prompts to GPT-4 work at least in this one case! ✅ Telling GPT-4 it was more competent increased the success rate from 35% to 92% ✅ Giving GPT-4 a strategy for completing the task increased it from 26% to 54%
Tweet media one
30
96
661
4
27
217
@cis_female
sophia
3 months
In the OpenAI blog post they mentioned "Albania using OpenAI tools to speed up its EU accession" but I didn't realize how insane this was -- they are apparently going to rewrite old laws wholesale with GPT-4 to align with EU rules
8
44
181
@cis_female
sophia
4 months
AI/ML research pre-2012 was totally worthless. Neural networks are just matmuls, backpropagation is obvious, we would have figured all this out once we had the compute for ads models, applied it to everything else, and we would get the exact same world we have now.
30
6
178
@cis_female
sophia
1 year
hey, don't fucking do this! thanks!
Tweet media one
12
3
162
@cis_female
sophia
8 months
@bnkrft I agree that a strong community is necessary, but my general perception from when i participated in editing in the past is that the programs they do to grow wikimedia don't meaningfully contribute to this.
1
1
151
@cis_female
sophia
2 months
what the hell does this mean
@dannypostmaa
Danny Postma
2 months
Jensen, the founder of NVIDIA, said something that triggered me today at the Stripe event. He compared the Industrial Revolution's conversion of water (atoms) to electricity (electrons) to a modern process of converting electricity to tokens. He introduced the concept of 'token…
55
99
953
12
1
150
@cis_female
sophia
2 years
@alth0u how are they possibly paying $1,000 a month for utilities. this whole thing is insane
6
0
139
@cis_female
sophia
2 years
has there literally ever been a startup with more funding per employee than anthropic? $700m funding for 40 employees, $17m per employee at this point which I'm guessing they're spending ~entirely on compute. How would it even be possible to scale up that large
8
1
134
@cis_female
sophia
10 months
@tjade273 I mean part of the money spent on both iOS and Google includes memory safe programming languages (Swift, Go) and compiler technology (llvm)
0
3
138
@cis_female
sophia
8 months
@moonglitchwitch @TylerGlaiel I think the idea is that you roughly characterize the noise and then generate something similar, not necessarily an exact recreation
1
0
136
@cis_female
sophia
2 years
Applying to Netflix. Wish me luck. I want that $500k
7
1
133
@cis_female
sophia
9 months
Becoming more convinced that the best researchers need to have an intuitive grasp of systems so they understand what’s easy and what’s hard, what’s cheap and what’s expensive
6
8
131
@cis_female
sophia
2 years
@coldhealing Her cadence is off somehow
3
0
105
@cis_female
sophia
1 year
made a simple transformer implementation for testing how complicated transformers are in practice:
5
10
127
@cis_female
sophia
27 days
@swe_zach send him a dick pic
1
0
129
@cis_female
sophia
1 month
the best version of this was Da Yan's paper, where he wrote a matmul in SASS (~GPU microcode, whereas CUDA ~= GPU C) that outperformed cuBLAS matmul by 20% (!!!) on square shapes (!!!)
Tweet media one
@johannes_hage
Johannes Hagemann
1 month
CUDA mode maxxing doesn’t cut it anymore anon, quants out here reverse engineering better register mappings than nvcc for the highest throughput
Tweet media one
4
19
346
3
13
128
@cis_female
sophia
11 months
@Duderichy yes, and also to demolish some buildings
1
0
124
@cis_female
sophia
11 months
to be fair this is literally on park avenue, they can’t use the whole site, etc. etc. but still!
1
0
124
@cis_female
sophia
2 years
@powerbottomdad1 Got this today. Like come on man, you already scheduled a screen!!!
Tweet media one
18
1
110
@cis_female
sophia
2 years
@alth0u the people on which this country was built
Tweet media one
4
1
108
@cis_female
sophia
1 year
i didn't kiss a girl until i was earning six figures
7
0
108
@cis_female
sophia
2 years
@alth0u do they just have the shower running at all times for convenience' sake?
3
0
105
@cis_female
sophia
2 months
My bear case for Nvidia is simple: giant matmul machines are not that hard to make: Google + Amazon already have. Given the flops/instruction ratio of modern transformers, software doesn't matter either: if their stack sucks, just write everything in microcode!
12
5
108
@cis_female
sophia
1 year
. @noumena and I met at vibecamp last year and we're making a documentary about Vibecamp and what it meant to people. How did Vibecamp change your life? What did it mean to you? DM us! We're looking for people to interview.
7
6
108
@cis_female
sophia
7 months
I feel like voice sample attack is less relevant here than the fact that he went around at parties introducing himself as beff jezos. Same with Roon/typed -- it doesn't matter if there's a face pic posted, there must be hundreds of people who know them irl already
7
0
96
@cis_female
sophia
2 months
llama-3-70B is as good as or better than sonnet but ~10x cheaper, about as cheap as Haiku. Llama has just demolished everything below gpt-4 level
Tweet media one
3
7
97
@cis_female
sophia
11 months
GPU PROGRAMMING FRIDAY IS BACK: A NEW RENAISSANCE we will be dissecting Hopper CUDA matmuls, doing ablations and trying to understand their performance characteristics. 6pm at Noisebridge, 272 Capp Street
9
4
93
@cis_female
sophia
1 year
thinking about when i was crushing on this girl at a summer camp with no computers and she said she was going away for a weekend and asked if i wanted anything and i asked if she could get me the kernel implementation of printf()
3
0
93
@cis_female
sophia
3 months
Shin Jinseo is about as much better than the second best Go player as Magnus Carlsen is better than the #100 chess player. This is an absurd level of domination.
4
1
92
@cis_female
sophia
9 months
classic cuda moment when adding a print statement causes your indexing bugs to go away
9
1
90
@cis_female
sophia
1 year
this is what i’ve been working on the last little bit: writing those kernels!
@magicailabs
Magic.dev
1 year
How? We tried to scale standard GPT context windows but quickly got stuck. So, we designed a new approach: the Long-term Memory Network (LTM Net). Training and serving LTM Nets required a custom ML stack, from GPU kernels to how we distribute the model across a cluster.
2
7
115
11
3
88
@cis_female
sophia
4 months
Come work with us! We're hiring for a bunch of different roles
@magicailabs
Magic.dev
4 months
We've raised $117M from @natfriedman and others to build an AI software engineer. Code generation is both a product and a path to AGI, requiring new algorithms, lots of CUDA, frontier-scale training, RL, and a new UI. We are hiring!
Tweet media one
35
94
684
5
0
82
@cis_female
sophia
5 months
@danluu Yeah the first comment about Google is literally unrecognizable to the company I worked at
0
0
81
@cis_female
sophia
10 months
@meatballtimes one of the other takeaways haha
2
0
82
@cis_female
sophia
1 year
here's the transformer implementation: here's the microcode for fp16 matmul (mlp_up, mlp_down, qkv, o) here's the microcode for flashattn
4
4
82
@cis_female
sophia
3 months
Claude and Chatgpt4 seem to both be worse than useless for CUDA programming. Asked them a few questions and they made up bullshit that I then had to check and confirm was wrong.
17
0
79
@cis_female
sophia
1 year
How does the H100 have only 50% more transistors than the A100 but manages 200% more compute all across the board? They didn't seem to give anything up e.g. same SRAM sizes so did they just get twice as efficient at creating floating point units? what did they do?
11
3
81
@cis_female
sophia
3 years
cannot get over people abbreviating "L1 Cache" as "L1$", it's twisted
3
1
78
@cis_female
sophia
5 months
Why were there multiple $100m-$1000m businesses created with 3.5 & turbo but no significant businesses created with 4 & turbo? Tons of people report 4 is way better than 3.5, so why were there not major new products enabled by 4?
12
2
79
@cis_female
sophia
2 years
@xsphi you still get improvement over time if you can filter for popular things, because people are still selecting a certain portion of the latent space
2
0
77
@cis_female
sophia
2 months
@chordbug SEGMENTATION FAULT: CORE DUMPED
0
0
77
@cis_female
sophia
6 months
@nikitabier One of the weird impacts of this is that you couldn't gift people things at all, because this would be a way to get around the inheritance tax. You could have a floor below which the tax doesn't apply but above that floor gifting people things would essentially be illegal
6
0
77
@cis_female
sophia
21 days
i fucking hate optimizing compilers. I was trying to write a global sync in CUDA for benchmarking, finally managed to get something I thought worked, PTX looked good, but ptxas broke my sync so I had to add more stupid bullshit to get it to work again
Tweet media one
Tweet media two
Tweet media three
Tweet media four
6
3
75
@cis_female
sophia
2 years
Head too full of tensors to think about women
5
3
71
@cis_female
sophia
2 months
the actual bull case for groq is agents. no one needs to read text that quickly — but an assistant that can think at 500 tokens/second is compelling
5
1
75
@cis_female
sophia
1 year
absolutely tragic that thousands of virile american men died constructing the panama canal due to a lack of cute asian girls
5
6
72
@cis_female
sophia
8 months
source: wikipedia 2021 form 990
0
0
75
@cis_female
sophia
2 years
@timdmackey @emailfromnaomi haha webp and heic are actually similar in that they are image formats derived from video formats!
1
1
75
@cis_female
sophia
2 months
wow, llama-3-8b has the same MMLU as 3.5-turbo but is servable for ~$0.05/million tokens + is RLHFable, finetunable, etc.
2
1
71
@cis_female
sophia
2 years
@xsphi like if i generate 10,000 images and only post one that's still valuable for a model to train on because the one image is much better
3
0
67
@cis_female
sophia
2 years
Stripe tells me on Tuesday the offer explodes next Monday; Next Monday comes and they tell me they need a decision by Wednesday. This is the easiest poker game I’ve ever played!
4
1
65
@cis_female
sophia
6 months
@geneslovee the project to put widgets on the lock screen was called "locket" for this reason
0
2
68
@cis_female
sophia
15 days
Sophia Fun Facts 1. Amish have an exemption from social security 2. I was just in a cool castle in Shizuoka and they built the castle in the 90s 3. Manhattan pays 2.7% of federal income tax off 0.5% of the population 4. Semaglutide loses patent protection in 2031
6
0
68
@cis_female
sophia
6 months
@fasterthanlime they have a 76% gross margin as of their last quarter -- their services make money, they just spend most of it on marketing/development
0
1
68
@cis_female
sophia
2 years
spent a few days actually coding and feel profoundly drained, unable to speak, unable to concentrate
3
0
61
@cis_female
sophia
2 years
Last day at Google :)
Tweet media one
3
0
63
@cis_female
sophia
3 months
In matrix multiplications, we go (m,k) @ (k, n) -> (m, n). This takes mk+kn+mn membw and mkn flops. When one of m/k/n is much smaller than the others, say m, mk+mn << kn, so arithmetic intensity is now mkn/kn = m. so: easy approximation of arithmetic intensity is just min(m,n,k)
1
0
65
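The matmul approximation above can be checked numerically. This is a minimal sketch using the post's own counting (mkn flops, mk+kn+mn elements of memory traffic); the function name and the example shapes are illustrative assumptions, not from the post.

```python
def arithmetic_intensity(m: int, k: int, n: int) -> float:
    """Flops per element moved for (m, k) @ (k, n) -> (m, n), counted as in the post."""
    flops = m * k * n                # one flop per multiply-accumulate
    traffic = m * k + k * n + m * n  # read both inputs, write the output
    return flops / traffic

# Skinny case: m is much smaller than k and n, so the intensity is close to m.
skinny = arithmetic_intensity(16, 4096, 4096)
print(f"exact: {skinny:.2f}, min(m, n, k): {min(16, 4096, 4096)}")

# Square case: the exact value is n / 3, so min(m, n, k) overestimates by 3x.
square = arithmetic_intensity(4096, 4096, 4096)
print(f"exact: {square:.2f}, min(m, n, k): {min(4096, 4096, 4096)}")
```

Under this counting min(m, n, k) is always an upper bound on the intensity. It is tight when one dimension is much smaller than the other two and loosest (a factor of 3) for square shapes.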
@cis_female
sophia
2 years
What am I looking for in my next job? Give me a medium-large complex system to autistically optimize so I don’t have to track progress or listen to product thoughts and I can just work
1
5
64
@cis_female
sophia
1 year
This is not publicly posted in an easily digestible way so I am posting it here for that purpose: Cerebras WSE2 costs about $5m for 16 petaflops of fp16 and probably 4 petaflops of fp32. This is about ten times worse than A100 (3.2 gigaflops/$ for WSE2 vs 31 gigaflops/$ for A100).
11
0
64
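The Cerebras comparison above reduces to one division, sketched here. The WSE2 price and throughput and the A100 figure of 31 gigaflops/$ are the post's numbers; the A100 figure is taken as given and not derived, since the post does not state the price or throughput behind it.

```python
# Figures from the post; names are illustrative.
WSE2_PRICE_USD = 5e6
WSE2_FP16_FLOPS = 16e15        # 16 petaflops
A100_GFLOPS_PER_USD = 31.0     # quoted in the post, not derived here

wse2_gflops_per_usd = WSE2_FP16_FLOPS / WSE2_PRICE_USD / 1e9
ratio = A100_GFLOPS_PER_USD / wse2_gflops_per_usd
print(f"WSE2: {wse2_gflops_per_usd:.1f} gigaflops/$, A100 advantage: {ratio:.1f}x")
```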
@cis_female
sophia
1 year
in 2010 people said "it's IO bound!" and they were correct. But since then IO has gotten 100x faster and CPUs have gotten 5x faster, so it's not IO bound anymore. is this actually true? it feels like a nice story
13
1
63
@cis_female
sophia
6 months
If you support trans women make sure to get your gpus from @sfcompute
8
6
61
@cis_female
sophia
1 year
why does nobody offer a bulk inference product for llms? eg “here are 10,000 prompts, give me results in the next 24 hours”. any time when a gpu would be unused (presumably often?) you could work on one of these bulk inference jobs
2
1
60
@cis_female
sophia
3 months
I think it'll just say "yes" to anything
Tweet media one
@bitcloud
Lachlan Phillips exo/acc 👾
3 months
H O L Y S H I T
Tweet media one
269
885
21K
5
3
59
@cis_female
sophia
2 years
Pretty fucked up that you can’t program so well (or poorly!) that you die
8
1
56
@cis_female
sophia
7 months
ready to grind
Tweet media one
1
1
57
@cis_female
sophia
4 years
Tweet media one
1
5
38
@cis_female
sophia
10 months
@gh0stp3pp3r_ $100b has been cumulatively spent on google search over its existence
2
0
58
@cis_female
sophia
3 months
I just realized something horrifying: men who are women score higher on tests of realism.intelligence is highly correlated with transgenderism. What this indicates is that if the male understands too much about reality, it wants to grow fat tits. Masculinity is existential terror
@DeepDishEnjoyer
peepeepoopoo
3 months
Transwomen are absolutely cracked, like more than Ashkenazis, but y'all aren't ready for that conversation.
Tweet media one
234
103
2K
5
3
58
@cis_female
sophia
5 months
@natfriedman Call it 10T tokens of good text @ $1.5/m is $15,000,000. To recreate at quality much more expensive :p
3
0
57
@cis_female
sophia
2 years
@rzhang88 @waxpancake @minimaxir @ByFrustrated another way to try to reproduce this
Tweet media one
2
0
58
@cis_female
sophia
8 months
a bit i've had for many years is that the future of gender is femboys on SERMs and the /r/growyourclit subreddit. my cis sister is on antiandrogens, my cis dad is on T. hormones are coming for everyone
3
1
56
@cis_female
sophia
2 years
put in my two weeks, probably joining stripe? maybe snap? anthropic if they'll have me?
6
0
53
@cis_female
sophia
3 months
look I am very pro-LLM but for the love of god don't write your laws with GPT-4???? If you're going to enforce these on a population of millions of people hire a goddamn lawyer
6
1
52
@cis_female
sophia
2 months
Someone should run a free llama-3-70b chat interface. If you assume 1k tok/conversation and your cost is $0.12/million tokens, a CPM of $2 makes it easily profitable. Would be a major challenger to ChatGPT and Claude IMO
Tweet media one
9
1
54
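The unit economics in the free-chat post above can be sketched as follows. The token count, serving cost, and CPM are the post's figures. One ad impression per conversation is an assumption added here for illustration; the post does not say how many impressions a conversation would carry.

```python
# Figures from the post.
TOKENS_PER_CONVERSATION = 1_000
COST_PER_MILLION_TOKENS_USD = 0.12
CPM_USD = 2.00  # revenue per 1,000 ad impressions

# Assumption (not in the post): one ad impression per conversation.
IMPRESSIONS_PER_CONVERSATION = 1

cost_per_conversation = TOKENS_PER_CONVERSATION / 1_000_000 * COST_PER_MILLION_TOKENS_USD
revenue_per_conversation = CPM_USD / 1_000 * IMPRESSIONS_PER_CONVERSATION
margin_multiple = revenue_per_conversation / cost_per_conversation

print(f"cost/conversation:    ${cost_per_conversation:.5f}")
print(f"revenue/conversation: ${revenue_per_conversation:.5f}")
print(f"revenue is {margin_multiple:.1f}x inference cost")
```

This counts inference cost only. Hosting, moderation, and ad fill rate would all reduce the multiple.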
@cis_female
sophia
1 year
If all men got better, would there still be incels? I think so. People often tell incels that it's easy to have sex -- just be a better person. Yes! But if you structure society so the worst 5% of men can't find love, there will always be incels.
11
1
51
@cis_female
sophia
10 months
I am now the official sponsor of the yen going up emoji. i'd say this is a yen going up emoji moment 💹💹💹
3
1
45
@cis_female
sophia
4 years
@momminature this is the artist by the way. she posts a lot about this sort of thing on /r/gentlefemdom and /r/rolereversal.
1
10
51
@cis_female
sophia
3 years
Tweet media one
1
1
43
@cis_female
sophia
1 year
rule of thumb: a billion parameters * billion tokens is $20 so chinchilla (70b params * 1400b tokens) was about $2,000,000 per training run. original gpt-3 (175b params * 300b tokens) would be $1,000,000
3
1
50
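The training-cost rule of thumb above, written out as a sketch. The $20 constant and the model sizes are the post's figures; the function and constant names are illustrative.

```python
# Rule of thumb from the post: $20 per (billion parameters * billion tokens).
USD_PER_B_PARAMS_B_TOKENS = 20.0

def training_cost_usd(params_billions: float, tokens_billions: float) -> float:
    """Estimated cost of one training run under the post's rule of thumb."""
    return params_billions * tokens_billions * USD_PER_B_PARAMS_B_TOKENS

chinchilla = training_cost_usd(70, 1400)  # 70b params, 1400b tokens
gpt3 = training_cost_usd(175, 300)        # 175b params, 300b tokens
print(f"chinchilla: ${chinchilla:,.0f}")
print(f"gpt-3:      ${gpt3:,.0f}")
```

The results, $1,960,000 and $1,050,000, match the post's rounded $2,000,000 and $1,000,000.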
@cis_female
sophia
2 years
@Duderichy I never thought this stuff was that useful because they're not choosing among random american men but among upper-middle-class LA men approximately their age who have a much higher probability of being white, 6'+, not married, earning $100k, etc.
2
0
52
@cis_female
sophia
2 months
@AnniePosting do you live in the US
1
0
53
@cis_female
sophia
2 years
Eminem Snatches Children at NeurIPS: A Shocking Tale
3
11
51
@cis_female
sophia
3 months
The company that makes this drug, Vertex Pharmaceuticals, made $9 BILLION off it in 2023 even though there are only 100,000 people with CF in the whole world. Literally $90,000 per *possible patient* per year. Miracle of modern capitalism to incentivize solving tiny diseases
@LeahLibresco
Leah Libresco Sargeant
3 months
"A child born with CF in the ’50s could expect to live until age 5. In the ’70s, age 10. In the early 2000s, age 35. With Trikafta... those who begin taking the drug in early adolescence, a recent study projected, can expect to survive to age 82.5"
26
200
1K
2
2
52