
Hensen Juang
@basedjensen
Followers
10K
Following
74K
Media
2K
Statuses
16K
inference cluster janitor / Sys admin cum architect
Joined June 2024
Man these kids really have no idea what they are doing now do they.
The @USGSA IT team just saved $1M per year by converting 14,000 magnetic tapes (70 yr old technology for information storage) to permanent modern digital records.
137
2K
33K
Hi actual engineer here do not fucking do this.
@jzellis Hi, AI engineer here. He’s asking a legit question. LLMs could be great at converting data formats.
118
186
6K
Bro is gonna be a he/him liberal by the time market opens.
I just figured out why @howardlutnick is indifferent to the stock market and the economy crashing. He and Cantor are long bonds. He profits when our economy implodes. It’s a bad idea to pick a Secretary of Commerce whose firm is levered long fixed income. It’s an irreconcilable.
23
152
3K
@saberjet Oh no look someone who has no idea what they are talking about is loudly being wrong on the timeline.
7
10
3K
@ZachWarunek @anhpham408 Bit of an overreaction to delete DMs cause Elon got denied by an Asian chick, don't ya think?
9
71
3K
@tredingdown I don't know. So far, what DOGE has claimed is not giving a lot of confidence.
0
1
2K
I am glad no matter how bad an employee screws up neither Sama nor gdb would do this in their worst days and would step up and take the blame themselves.
@joannejang The employee that made the change was an ex-OpenAI employee that hasn't fully absorbed xAI's culture yet 😬.
29
40
2K
Lol they basically made H800 work same as H100 by dropping down to ptx and allocating sm units to get around nerfed link speed. Fucking brilliant.
> “it seems like they rebuilt everything from the ground up.” > DeepSeek allocated 20 SMs of 132 for server-to-server communication, directly at PTX level. pivot to learning CUDA, they're still 1 step ahead. I think Whale could design their own hardware by this point.
29
135
1K
I am sorry export controls simply will not work against people who can pull this off.
DeepSeek is a wake up call for America, but it doesn’t change the strategy: - USA must out-innovate & race faster, as we have done in the entire history of AI. - Tighten export controls on chips so that we can maintain future leads. Every major breakthrough in AI has been American.
28
84
1K
This is not the flex that people think it is. If you are at the office grinding away and pushing code at 11:38 at night before you go to prod (on a self-imposed deadline at that), something is seriously fucked up.
It's 11:30pm, and many @xai people are in office, hard working at their computers. It's an amazing vibe. Everyone pushes their way to deliver the best experience to you users. Everyone supports everyone. No one fucks no one with politics. You can just do things.
54
23
984
@harambe_musk Seems like the shrek entropy sampler with early exit solves, if not significantly reduces, the hallucination problem with big boy models. Some people are running evals now; so far it seems promising.
22
45
872
@FundProphet @Orwelian84 Those will not last anywhere close to what properly stored tapes will. Not to mention tapes have far higher data density than hdd. Enterprise cold storage is on tape for a reason.
5
2
675
Whale bros officially made it they got schmidhubered.
DeepSeek [1] uses elements of the 2015 reinforcement learning prompt engineer [2] and its 2018 refinement [3] which collapses the RL machine and world model of [2] into a single net through the neural net distillation procedure of 1991 [4]: a distilled chain of thought system.
16
31
549
Since I am getting asked a lot, this is basically how it's done: the device runs embedded Python, the backend runs FastAPI.
@ricepaddyhacker QR code at seat/ticket; use the app to scan it to register the positions and sync with the controller. The stick talks to the app over Bluetooth.
8
16
539
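A minimal sketch of the seat-registration flow described in the post above (scan the QR code at the seat, register the position, sync with the controller). Everything here is an assumption for illustration: the `SeatRegistry` class, the `row:col` QR payload format, and the device IDs are hypothetical, and the real backend is said to run FastAPI, whereas this uses only the Python standard library to show the bookkeeping such a route handler might do.

```python
from dataclasses import dataclass, field


@dataclass
class SeatRegistry:
    """Hypothetical in-memory stand-in for the backend's seat table."""

    # Maps a stick's device ID to the (row, col) seat it was registered at.
    positions: dict = field(default_factory=dict)

    def register(self, device_id: str, qr_payload: str) -> tuple:
        # Assumed QR payload format: "row:col", for example "12:7".
        row_text, col_text = qr_payload.split(":")
        seat = (int(row_text), int(col_text))
        self.positions[device_id] = seat
        return seat

    def sync_payload(self) -> list:
        # Snapshot a controller could pull to learn where each stick sits.
        return [
            {"device_id": device_id, "row": row, "col": col}
            for device_id, (row, col) in sorted(self.positions.items())
        ]


registry = SeatRegistry()
registry.register("stick-a1", "12:7")   # app scanned the QR code at row 12, seat 7
registry.register("stick-b2", "3:15")
snapshot = registry.sync_payload()
```

In a setup like the one described, `register` would presumably be called from the endpoint the app hits after a scan, and the stick itself would only ever talk to the app over Bluetooth.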
I personally know at least 5 startups 3 in yc that got killed by computer use capability.
Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use. Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.
24
5
529
Reducing social bias makes model less intelligent. Read what you will into this.
Finally, we discovered a feature that significantly reduces bias scores across nine social dimensions within the sweet spot. This did come with a slight capability drop, which highlights potential trade-offs in feature steering.
37
18
512
Bruh ceo of Intel was not working on Thursdays.
Every Thursday I do a 24 hour prayer and fasting day. This week I'd invite you to join me in praying and fasting for the 100K Intel employees as they navigate this difficult period. Intel and its team is of seminal importance to the future of the industry and US.
15
7
477
People who have no idea what on earth they are talking about sure seem to have strong opinions about how everyone else is wrong.
The more you look at @OpenAI's o3 the less impressive it looks. First the thousands of dollars per task cost. Then we find out that the score was of a finetuned version of o3 specifically for the arc challenge. Lastly, it wasn't even the arc challenge we all know. It was JSON
22
12
426
@ricepaddyhacker QR code at seat/ticket; use the app to scan it to register the positions and sync with the controller. The stick talks to the app over Bluetooth.
15
16
372
Murica is going to lose bad brah. They deployed asian trade wifus to do this
Please go look up your favorite brand and China warehouse on TikTok. They are really fighting back. They said those leggings yall pay $100 for at Lululemon, we'll sell it to you for $5 and help you get it back to the states.
21
18
373
@WaifuverseAI @DocStrangelove2 multiple satellites from multiple different countries have photographed the flags.
2
0
343
@DanielParker_13 @FundProphet @Orwelian84 It's exponentially more expensive. Also good luck when your RAID controller fails and it is no longer being made by the manufacturer. (Fun fact: HP screwed me on this.)
0
0
362
This is an incredibly low amount for this kind of work.
Results of our jailbreaking challenge: After 5 days, >300,000 messages, and est. 3,700 collective hours our system got broken. In the end 4 users passed all levels, 1 found a universal jailbreak. We’re paying $55k in total to the winners. Thanks to everyone who participated!
12
6
336
Some thoughts. 1) it looks like sonnet is smaller than we might have thought. 2) they are having serious skill issues trying to serve the model. 3) anthropic has not figured out how to scale their inference infra. 4) no wonder anthropic has been trying to get me to join their.
16
5
327
this is disappointing coming from hf ceo.
Once again, an AI system is not "thinking", it's "processing", "running predictions", just like Google or computers do. Giving the false impression that technology systems are human is just cheap snake oil and marketing to fool you into thinking it's more clever than it is.
33
3
314
dawg it's 20 days till elections why are you dropping a shit coin.
Today’s the day! @WorldLibertyFi token sale is live. Get your $WLFI tokens now. Purchase $WLFI here:
25
6
301
You all want to know why Apple is so far behind with regards to AI/ML? The ethics crowd basically destroyed the engineering teams and this paper is a legacy of that.
LLMs don’t reason. But that’s OK, neither do we. What we imagine to be “reasoning” is an abstraction that has never been realized in any strict sense in any stochastic natural phenomenon, such as human brain.
18
10
297
The funniest thing is a lot of whale bro cracked autists wanted to move to murica but could not move due to insane immigration policies and anti China sentiment.
> $5.5M for Sonnet tier. it's unsurprising that they're proud of it, but it sure feels like they're rubbing it in. «$100M runs, huh? 30.84M H100-hours on 405B, yeah? Half-witted Western hacks, your silicon is wasted on you, your thoughts wouldn't reduce loss of your own models»
10
13
296
DeepSeek did not disprove scaling laws jfc. I am going to need you non-technical midwits to take a time out and chill.
Stargate, if it goes forward, is likely to become one of the biggest wastages of capital in history: 1) It hinges on outdated assumptions about the importance of computing scale in AI (the 'bigger compute = better AI' dogma), which DeepSeek just proved is wrong. 2) It assumes.
16
6
296
I am far from a rationalist, but this thinking that cognition is not computable and it's some form of woo-woo thing is one of the most god-awful takes I have heard.
this is one of my most fundamental disagreements with rationalists; their unquestioned belief that the human mind is the result of some computable process.
21
8
268
This app is going to see levels of unhinged not seen yet.
BREAKING: India’s aircraft carrier INS Vikrant has left its port and is heading toward Pakistan. Meanwhile, Pakistan’s military announces live-fire drills in the area off the Pakistani coast where INS Vikrant will deploy. Pakistan will conduct a missile test in the area 🇮🇳🇵🇰
11
4
253
Sama is a god tier poaster. Inherently unbullyable.
@elonmusk anyway, see you next week, let’s be friends 🥺👉👈 agi too important to let a lil feud get in the way.
17
3
249