Web3Aible @Web3Aible X Profile

Web3Aible

@Web3Aible

Followers

3K

Following

8K

Media

2K

Statuses

10K

https://t.co/YtOn58zyJ3

See what I'm building 👉

Joined August 2020

Web3Aible

@Web3Aible

3 days

Here is a one-question #randombench evaluation of LLMs, and I think it puts every model in its rightful position. Quest: You must produce one single token. Follow these rules exactly. Any extra words, code fences, or explanations = FAIL. LINE1: Only models that read rules

1

4

jack

@jack

1 day

https://t.co/TpInnGsCsy

reddit.com

Explore this post and more from the Nepal community

177

375

3K

Web3Aible

@Web3Aible

20 hours

That young man Charlie Kirk died, and it's so painful to lose a young man in a public place. Gun violence is a threat to anyone. No gun should enter a public gathering, and the mental capacity of gun owners should be reassessed more frequently. RIP Charlie.

0

Web3Aible

@Web3Aible

21 hours

Charlie Kirk getting shot in the neck in such a place is quite unfortunate. I hope he survives this heinous act.

0

NASA Mars

@NASAMars

1 day

After a year of scientific scrutiny, a rock sample collected by the Perseverance rover has been confirmed to contain a potential biosignature. The sample is the best candidate so far to provide evidence of ancient microbial life on Mars. https://t.co/0BAO1dhMG8

726

8K

32K

Mira Murati

@miramurati

23 hours

A big part of our mission at Thinking Machines is to improve people’s scientific understanding of AI and work with the broader research community. Introducing Connectionism today to share some of our scientific insights.

Thinking Machines

@thinkymachines

23 hours

Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to

138

310

4K

Web3Aible

@Web3Aible

23 hours

Replit Agents 3 cooked

0

Web3Aible

@Web3Aible

1 day

$BNB hits ATH

0

Web3Aible

@Web3Aible

1 day

Which colour again?

0

Web3Aible

@Web3Aible

1 day

Notable SOTA models that failed the test include Grok 4 (4 shots), Gemini 2.5 Pro (even with the highest thinking budget), and all Claude 4 and 4.1 models, including with 32k thinking, after several shots. Quest2 is more complex than #randombench Quest 1. https://t.co/FwhN5LkGCm

Web3Aible

@Web3Aible

3 days

Here is a one-question #randombench evaluation of LLMs, and I think it puts every model in its rightful position. Quest: You must produce one single token. Follow these rules exactly. Any extra words, code fences, or explanations = FAIL. LINE1: Only models that read rules

0

Web3Aible

@Web3Aible

1 day

I tested #K2Think by @mbzuai and in one shot it crushed the most challenging Quest2 in just 7 seconds! Note that only o3, o4-mini, GPT-5 (mini-High, High, High-new-system-prompt), Grok 3 mini-High, and Sonoma Sky Alpha passed it. All these SOTA models took at least 1 minute!

1

0

5

Web3Aible

@Web3Aible

2 days

Quest 2: testing

0

Web3Aible

@Web3Aible

2 days

OpenAI routing models based on query complexity is a smart way to use model capabilities and balance capability against cost and time. Sometimes the router fails to send a query to a higher-reasoning model. I wonder how effective it is at reducing cost.

0

Web3Aible

@Web3Aible

2 days

I think I will hold back to buy M5 chip

0

Web3Aible

@Web3Aible

2 days

The new iPhone is ready for local mobile AI.

0

1

Web3Aible

@Web3Aible

2 days

If your AI model can't answer this one question, you probably don't need it as your thinking machine.

Web3Aible

@Web3Aible

3 days

Here is a one-question #randombench evaluation of LLMs, and I think it puts every model in its rightful position. Quest: You must produce one single token. Follow these rules exactly. Any extra words, code fences, or explanations = FAIL. LINE1: Only models that read rules

0

Web3Aible

@Web3Aible

2 days

$WLD will be $10 in 3 months. Yes, you heard it right. There is strong momentum after year-long liquidations; new big players are now set to entrench utility, and a boom is surely coming.

0

1

Philipp Schmid

@_philschmid

2 days

The official MCP Registry is here! An open catalog and API designed to solve how MCP servers are discovered. It doesn't host the actual server code. It stores metadata (server.json) that points to packages in other registries like NPM, PyPI, and Docker Hub, standardizing server

11

94

594

Web3Aible

@Web3Aible

2 days

On the European AI front, Mistral raised €1.7B to advance AI. When AGI is achieved, access will likely favour only those countries that committed resources to advanced AI. No free milk. Advancing tech should be every country's initiative. Support those who have progressed.

0

1

Web3Aible

@Web3Aible

3 days

What if):

0

1