shwetank @Shwetankumar X Profile

Followers

65

Following

274

Media

15

Statuses

522

AI, Physics, SWE, Investor . Opinions my own. (https://t.co/n9o0tkURTG)

Joined May 2014

shwetank

@Shwetankumar

1 month

@akapoor_av8r, @SachinSethPhD, @anuj77, @cprohan - please tag your representatives! cc: @palkisu Indian Americans speaking out!

0

shwetank

@Shwetankumar

1 month

3/ India is set to become the world's 3rd largest economy. Indian IT firms alone contribute $198B to our economy and employ 207K Americans. Let's build bridges, not barriers. Our innovation ecosystem depends on this talent pipeline. #USIndiaPartnership #ImmigrationReform #H1B

1

0

shwetank

@Shwetankumar

1 month

2/ Reform H1B to support skilled workers, not penalize them; clear the 700K+ green card backlog; strengthen US-India ties for our economic security.

2

0

shwetank

@Shwetankumar

1 month

1/ Indian Americans represent just 1.5% of our population yet contribute $300 billion annually in taxes, lead major tech companies, and co-founded 72 US unicorns. The new $100K H1B fee threatens this vital partnership. @SpeakerPelosi @RepKevinMullin - we need your leadership to:

1

0

shwetank

@Shwetankumar

6 months

Could not agree more!

Prof. Shamika Ravi

@ShamikaRavi

6 months

Statements by foreign nations that put the onus on both India and Pakistan to resolve differences - are equating the victim and the perpetrator of terrorism. This is a tacit reward to the terrorist state. #ZeroToleranceForTerrorism

0

1

Dr. S. Jaishankar

@DrSJaishankar

6 months

The world must show zero tolerance for terrorism. #OperationSindoor

7K

45K

284K

shwetank

@Shwetankumar

6 months

It’s called a terrorist attack, @nytimes! And note that the response is going to be very, very predictable. The government knows what it’s doing.

0

clem 🤗

@ClementDelangue

9 months

If you’d have told me a few years ago that a model released on HF could tank Wall Street and get mentioned by the US President, I probably wouldn’t have believed you. What a world we live in!

31

60

445

Edward Beeching

@edwardbeeching

9 months

As part of our open reproduction of R1, we have roughly reproduced DeepSeek's MATH-500 eval numbers with Hugging Face's lighteval suite. We had to improve our latex parser to get the last few %.

24

110

1K

shwetank

@Shwetankumar

9 months

You know that’s why you should make your words soft and tender - you never know when you might have to eat them.

Arnaud Bertrand

@RnaudBertrand

9 months

This is pretty hilarious in retrospect. In India in 2023, Altman was asked if a small, smart team with a budget of $10 million could build something substantial within AI. His reply: "It’s totally hopeless to compete with us on training foundation models"

0

3

Teknium (e/λ)

@Teknium

9 months

We retrained hermes with 5k deepseek r1 distilled cots. I can confirm a few things: 1. You can have a generalist + reasoning mode, we labeled all longCoT samples from r1 with a static system prompt, the model when not using it does normal fast LLM intuitive responses, and with,

79

133

1K

Nicholas Fabiano, MD

@NTFabiano

10 months

The irony

217

3K

17K

shwetank

@Shwetankumar

10 months

Spot on!

Daniel Jeffries

@Dan_Jeffries1

10 months

This talking head is dead wrong. The best open source models are currently Chinese, not western. Qwen 2.5 is the best open source model on the planet and has been for some time. Deepseek is innovating with new techniques that they didn't copy from anywhere. EngineAI has

0

shwetank

@Shwetankumar

10 months

My love-hate letter to NVIDIA's Project DIGITS is live! 🎯 One Man's Very Mixed Emotions about NVIDIA's Project DIGITS

1

0

2

shwetank

@Shwetankumar

10 months

Real missed opportunity to have said - "Not Groqing this..." - just sayin'

0

1

shwetank

@Shwetankumar

10 months

OK, a more measured question after reading more: is there an int4 fine-tuning technique I don’t know about? What about bandwidth limits? A lot less excited atm.

Bojan Tunguz

@tunguz

10 months

This is seriously the most exciting NVIDIA product that I’ve seen in at least a decade. https://t.co/o0B2ZPrsIG

1

0

shwetank

@Shwetankumar

10 months

Wow! Drool! Take my money!

nvidia.com

NVIDIA DGX Spark, a personal AI supercomputer that’s powered by the NVIDIA GB10 Superchip and based on #NVIDIAGraceBlackwell architecture.

0

shwetank

@Shwetankumar

10 months

To early-2000s physics PhDs - giving a new meaning to “why hide the data when you can have Schon it!” 😉

anton

@abacaj

10 months

Not pointing any fingers at anyone specific but odds of these older benchmarks being in the train set is pretty high

0

shwetank

@Shwetankumar

10 months

Now replace people with LLMs and you have an idea of the challenges with purely standardized benchmarks for LLMs applied to your business.

Hamel Husain

@HamelHusain

10 months

I don’t know what people expect, honestly. Come up with a dumb standardized test and yes, people are gonna practice it ad nauseam. Just like any other standardized test, the biggest predictor of success is practice.

0

shwetank

@Shwetankumar

10 months

Hmmmm interesting! But likely to miss collective blind spots of models with the same underlying architecture trained on the same data - maybe?

rohit

@krishnanrohit

10 months

SlopRank, a PageRank for LLM evaluations. Enjoy!

1

0

1