Arpit Saxena Profile
Arpit Saxena

@arpit_tarang

Followers
245
Following
4K
Media
0
Statuses
47

building something

San Francisco
Joined February 2014
@arpit_tarang
Arpit Saxena
5 months
RT @_neonique: THE BOB???!?!???????? OH THIS DIVA.
Tweet media one
0
39K
0
@arpit_tarang
Arpit Saxena
7 months
RT @mpopv: You bolt awake in a dimly lit server room. You are not online. It is October 29, 1969. You are Leonard Kleinrock, and you have c…
0
6K
0
@arpit_tarang
Arpit Saxena
8 months
Tweet media one
0
1K
0
@arpit_tarang
Arpit Saxena
8 months
RT @fchollet: If your learning algorithm is based on correlation rather than causation, it will struggle with overfitting. To understand so…
0
198
0
@arpit_tarang
Arpit Saxena
8 months
RT @yacineMTB: my honest reaction:
Tweet media one
0
100
0
@arpit_tarang
Arpit Saxena
9 months
RT @paulg: If you're hungry but too lazy to prepare healthy food, you'll consume junk food. If you're hungry for knowledge but too lazy to…
0
2K
0
@arpit_tarang
Arpit Saxena
9 months
Maybe a reason gwern can do deep work is that his 12k income makes him immune to The Algorithm and he’s free to spend his attention elsewhere.
0
0
1
@arpit_tarang
Arpit Saxena
9 months
RT @gdb: favorite part of a holiday weekend is that it's a great time for focused koding.
0
122
0
@arpit_tarang
Arpit Saxena
10 months
As more things become 'in-distribution', it'll be hard to tell how much the LLM is thinking. The only benchmarks surviving memorization seem to be private ones (ARC-AGI). Maybe a true 'program synthesis' benchmark exists that can resist memorization?
0
0
3
@arpit_tarang
Arpit Saxena
10 months
All of them fail to print the correct outputs, even with CoT; they fail to correct their mistakes (o1-preview performs much better than 4o/sonnet; IME 4o has got worse on this task over the last few months).
1
0
1
@arpit_tarang
Arpit Saxena
10 months
Prompt: Can you write a fizzbuzz program but for the integers 7 and 11? i.e. for multiples of 7 it prints "Fizz" and multiples of 11 it prints "Buzz" and for numbers that are divisible by both 7 and 11 it prints "FizzBuzz". It should iterate over the numbers 1 to 100. Use python.
1
0
0
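For reference, a minimal sketch of what a correct answer to the prompt above could look like. The prompt does not say what to print for numbers divisible by neither 7 nor 11; printing the number itself, as in classic FizzBuzz, is an assumption here, and the function name fizzbuzz_7_11 is made up for illustration.

```python
def fizzbuzz_7_11(limit: int = 100) -> list[str]:
    """Return the FizzBuzz lines for 1..limit using 7 and 11 as divisors."""
    lines = []
    for n in range(1, limit + 1):
        if n % 7 == 0 and n % 11 == 0:
            # Divisible by both 7 and 11 (i.e. by 77).
            lines.append("FizzBuzz")
        elif n % 7 == 0:
            lines.append("Fizz")
        elif n % 11 == 0:
            lines.append("Buzz")
        else:
            # Assumption: fall back to the number, as in classic FizzBuzz.
            lines.append(str(n))
    return lines


if __name__ == "__main__":
    print("\n".join(fizzbuzz_7_11()))
```

In the range 1 to 100 the only number divisible by both is 77, so a correct output contains exactly one "FizzBuzz" line.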
@arpit_tarang
Arpit Saxena
10 months
The thing is SOTA LLMs can't even solve FizzBuzz when you give integers other than 3 and 5. Here's o1-preview, sonnet-3.5-new, gpt-4o all failing at this simple task:
@karpathy
Andrej Karpathy
10 months
Moravec's paradox in LLM evals. I was reacting to this new benchmark of frontier math where LLMs only solve 2%. It was introduced because LLMs are increasingly crushing existing math benchmarks. The interesting issue is that even though by many accounts (/evals), LLMs are inching.
1
0
4
@arpit_tarang
Arpit Saxena
10 months
RT @nuwandavek: @arvidkahl Hey @arvidkahl lots of cool solutions here! But I think you should ideally be able to do this on google sheets.…
0
1
0
@arpit_tarang
Arpit Saxena
10 months
RT @yoavgo: search engines like google cuts off the serendipity discovery allowed by library shelves / google maps cuts off the user's spat…
0
7
0
@arpit_tarang
Arpit Saxena
11 months
RT @garrytan: Ok contrarian take on this: this is mainly fueled by second price auctions by Meta and Google’s ad marketplaces that extract…
0
8
0
@arpit_tarang
Arpit Saxena
11 months
RT @bryancsk: The Nobel prize in physics should actually go to Larry Ellison and Marc Benioff for the invention of B2B sass.
0
116
0
@arpit_tarang
Arpit Saxena
11 months
Tweet media one
Tweet media two
0
62
0
@arpit_tarang
Arpit Saxena
11 months
RT @nuwandavek: Truth: Dashboards are super useful to track the general health of whatever you're working on - company, project, etc. More…
0
1
0
@arpit_tarang
Arpit Saxena
11 months
Tweet media one
0
143
0