Agha
@drsmartfish
Followers
182
Following
7K
Media
693
Statuses
3K
Talking AI, building systems, analysing impact and posting what actually matters
Melbourne, Victoria
Joined March 2024
AA-Omniscience benchmark by @ArtificialAnlys (6K Qs, 42 topics) reveals top AI models like Claude 4.1 Opus (+5) barely outpace errors. Grok 4 leads in Health/Science, Claudes in Law/SE. Larger models (e.g., Grok 4, 39% acc, 3T params) trade reliability for accuracy; Claudes shine
Announcing AA-Omniscience, our new benchmark for knowledge and hallucination across >40 topics, where all but three models are more likely to hallucinate than give a correct answer Embedded knowledge in language models is important for many real world use cases. Without
0
0
1
0
0
0
Grok Imagine prompt: There is hot steaming water being used to wash dishes
0
0
0
Jack didn’t even put up a fight 🙁
0
0
2
OpenAI's $1.4 trillion "Sam's Splurge" has shifted the AI narrative from hype to realism, exposing monetization and infrastructure risks, as shown in the ecosystem diagram. This has led to market multiple compression, a focus on fundamentals, and debates on unemployment and grid
The Non-Bubble that disappointed both Bulls and Bears -- how Sam's Splurge changed everything The worst kept secret among Tech market participants — just something AI bulls don’t admit out loud: they want a price-action bubble every bit as much as the bears do. Both want to see
0
1
2
AI FRONTIER ARCHITECT – Ilya Sutskever > Born in Russia in 1986, raised in Israel surrounded by many cultures and ideas > Moved to Canada at 16 and immersed himself in mathematics and computer science > Earned BSc, MSc, and PhD at the University of Toronto under Geoffrey Hinton
0
1
2
Why would they compare it to the medium tho
GPT-5.1 is now available in the API. Pricing is the same as GPT-5. We are also releasing gpt-5.1-codex and gpt-5.1-codex-mini in the API, specialized for long-running coding tasks. Prompt caching now lasts up to 24 hours! Updated evals in our blog post.
0
1
2
‘DUNE: PART 3’ has wrapped filming. In theaters on December 18, 2026.
1K
8K
119K
LTX-2 Pro by @Lightricks ranks #3 in image-to-video (ELO 1,289), trailing Kling 2.5 Turbo & Veo 3.1. It offers 4K/50fps video + audio at $0.06/s, cheaper than Veo ($0.20/s) & Sora ($0.50/s). Released Oct 23, 2025, it’s open-source & workflow-ready.
LTX-2 Pro secures #3 in Image to Video in the Artificial Analysis Video Arena, delivering quality comparable to Veo 3.1 and Kling 2.5 Turbo, while also ranking #7 in Text to Video! LTX-2 is the latest video generation model from Lightricks, trailing only Kling 2.5 Turbo and Veo
0
1
5
Zohran Mamdani has now reached an all-time high of a 93% chance to win the election and become the next mayor of New York, at @Kalshi
1
1
3
NVIDIA’s 6-month partnership surge! $NVDA It reveals a deliberate, accelerating conquest of AI infrastructure across all high-leverage domains. No hype, just the facts: • OpenAI: $100B phased investment for 10GW AI clusters. Millions of Blackwell GPUs. This is the physical
0
0
3
The butthole ecosystem
0
0
3