Arpit Saxena @arpit_tarang X Profile

Arpit Saxena

@arpit_tarang

Followers

245

Following

4K

Media

0

Statuses

47

building something

San Francisco

Joined February 2014


Arpit Saxena

@arpit_tarang

5 months

RT @_neonique: THE BOB???!?!???????? OH THIS DIVA.

0

39K

0

Arpit Saxena

@arpit_tarang

7 months

RT @mpopv: You bolt awake in a dimly lit server room. You are not online. It is October 29, 1969. You are Leonard Kleinrock, and you have c…

0

6K

0

Arpit Saxena

@arpit_tarang

8 months

RT @buccocapital:

0

1K

0

Arpit Saxena

@arpit_tarang

8 months

RT @fchollet: If your learning algorithm is based on correlation rather than causation, it will struggle with overfitting. To understand so…

0

198

0

Arpit Saxena

@arpit_tarang

8 months

RT @yacineMTB: my honest reaction:

0

100

0

Arpit Saxena

@arpit_tarang

9 months

RT @paulg: If you're hungry but too lazy to prepare healthy food, you'll consume junk food. If you're hungry for knowledge but too lazy to…

0

2K

0

Arpit Saxena

@arpit_tarang

9 months

Maybe a reason gwern can do deep work is that his 12k income makes him immune to The Algorithm and he’s free to spend his attention elsewhere.

0

1

Arpit Saxena

@arpit_tarang

9 months

RT @gdb: favorite part of a holiday weekend is that it's a great time for focused koding.

0

122

0

Arpit Saxena

@arpit_tarang

10 months

as more things become 'in-distribution', it'll be hard to tell how much the LLM is thinking. The only benchmarks surviving memorization seem to be private ones (ARC-AGI). maybe a true 'program synthesis' benchmark exists that can resist memorization?

0

3

Arpit Saxena

@arpit_tarang

10 months

All of them fail to print the correct outputs, even with CoT; they fail to correct their mistakes (o1-preview performs much better than 4o/sonnet; IME 4o has got worse on this task over the last few months).

1

0

1

Arpit Saxena

@arpit_tarang

10 months

My chats: o1-preview: gpt-4o: sonnet-3.5-new:

aiarchives.org

A.I. Archives: Your reliable tool for citing Generative A.I. conversations. Easily save discussions with Bard, ChatGPT, and Claude into a URL.

1

0

Arpit Saxena

@arpit_tarang

10 months

Prompt: Can you write a fizzbuzz program but for the integers 7 and 11? i.e. for multiples of 7 it prints "Fizz" and multiples of 11 it prints "Buzz" and for numbers that are divisible by both 7 and 11 it prints "FizzBuzz". It should iterate over the numbers 1 to 100. Use python.

1

0
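For reference, here is a minimal sketch of a program that meets the prompt above. It is not taken from any of the linked chats; the function names `fizzbuzz_line` and `fizzbuzz` are made up for illustration, and the divisors are parameters only so the same code covers the classic 3 and 5 case.

```python
def fizzbuzz_line(n, fizz_divisor=7, buzz_divisor=11):
    """Return the FizzBuzz output for a single integer n."""
    word = ""
    if n % fizz_divisor == 0:
        word += "Fizz"
    if n % buzz_divisor == 0:
        word += "Buzz"
    # Fall back to the number itself when neither divisor matches.
    return word or str(n)


def fizzbuzz(start=1, stop=100):
    """Return the output lines for every integer in start..stop inclusive."""
    return [fizzbuzz_line(n) for n in range(start, stop + 1)]


if __name__ == "__main__":
    for line in fizzbuzz():
        print(line)
```

Between 1 and 100 the only number divisible by both 7 and 11 is 77, so "FizzBuzz" should appear exactly once. That makes a quick check for any model-written output.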

Arpit Saxena

@arpit_tarang

10 months

The thing is SOTA LLMs can't even solve FizzBuzz when you give integers other than 3 and 5. Here's o1-preview, sonnet-3.5-new, gpt-4o all failing at this simple task:

Andrej Karpathy

@karpathy

10 months

Moravec's paradox in LLM evals. I was reacting to this new benchmark of frontier math where LLMs only solve 2%. It was introduced because LLMs are increasingly crushing existing math benchmarks. The interesting issue is that even though by many accounts (/evals), LLMs are inching.

1

0

4

Arpit Saxena

@arpit_tarang

10 months

RT @nuwandavek: @arvidkahl Hey @arvidkahl lots of cool solutions here! But I think you should ideally be able to do this on google sheets.…

0

1

0

Arpit Saxena

@arpit_tarang

10 months

RT @yoavgo: search engines like google cuts off the serendipity discovery allowed by library shelves / google maps cuts off the user's spat…

0

7

0

Arpit Saxena

@arpit_tarang

11 months

RT @garrytan: Ok contrarian take on this: this is mainly fueled by second price auctions by Meta and Google’s ad marketplaces that extract…

0

8

0

Arpit Saxena

@arpit_tarang

11 months

RT @bryancsk: The Nobel prize in physics should actually go to Larry Ellison and Marc Benioff for the invention of B2B sass.

0

116

0

Arpit Saxena

@arpit_tarang

11 months

RT @growing_daniel:

0

62

0

Arpit Saxena

@arpit_tarang

11 months

RT @nuwandavek: Truth: Dashboards are super useful to track the general health of whatever you're working on - company, project, etc. More…

0

1

0

Arpit Saxena

@arpit_tarang

11 months

RT @zarathustra5150: ✨

0

143

0