Augustinas Malinauskas @NeurIPS Dec1-7
@amgauge
Followers: 1K · Following: 3K · Media: 30 · Statuses: 389
PhD Theoretical Physics Oxford. CTO @eternisai, building systems to enable coordination at scale
San Francisco, CA
Joined August 2016
No AI agent can do this simple task "Make a spreadsheet from NeurIPS calendar page consisting of: event name, time, abstract, speaker names with affiliations"
- OpenAI Agent ❌
- Gemini Agent ❌
- Manus ❌
- ChatGPT Atlas ❌
They all give up after 20 events while anyone could
This result is often used in Quantum Field Theory/String Theory, for example when calculating Casimir's force. The sum does not converge in standard analysis. What people really mean by writing this is: "If I take the analytic continuation of the function represented by this
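The post above is cut off before it names the sum. As an assumption, take the usual example 1 + 2 + 3 + ...; a short sketch of what "analytic continuation" means in that case:

```latex
% Assumption: the sum in question is 1 + 2 + 3 + ... (not stated in the post).
\[
  \zeta(s) = \sum_{n=1}^{\infty} n^{-s}, \qquad \operatorname{Re}(s) > 1 .
\]
% The series only defines \zeta(s) for Re(s) > 1. The function it represents
% extends uniquely to an analytic function on the complex plane minus s = 1,
% and that extension takes the value
\[
  \zeta(-1) = -\frac{1}{12} .
\]
% Writing "1 + 2 + 3 + \cdots = -1/12" is shorthand for this value of the
% continued function, not a statement about the limit of partial sums.
```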
SILO UPDATES 1/ Silo upgrades: new web app, encrypted sync, deep research. Silo started on iOS as a private-by-default AI interface. That same interface now runs on any device, with history that stays end-to-end encrypted across phone and browser.
Dirac’s math literally powers the transistors that make your tweet possible. This is why
1. Nearly 100 years ago Dirac wanted an equation that can describe an electron (spin-½ particle) to unify Special Relativity and Quantum Mechanics.
2. Dirac came up with an operator, but
With hardware co-location, TEE breaks are a dime a dozen but this one takes it to the next level and lets you generate a fake attestation report using keys extracted from production hardware with a ~$1k setup. Here's a breakdown and it affects secure LLM inference too: The
More interposer fun, this time with DDR5 memory. Breaking TDX, SGX, SEV and even Nvidia TEEs. Check out our work at https://t.co/Jl1dpGnM6J, and get a personally-signed Intel attestation report at @TEEdotFail.
How do LLMs trust, betray, and cooperate under well-considered game scenarios? I built StrategyBench to find out. Open benchmark + scenarios, tuned for research and safety work:
strategy.freysa.ai
Comprehensive AI model performance across social deduction and strategy games. Built by Eternis for multi-agent AI research.
Intelligence needs to be self-owned and sovereign to the individual. Not served behind closed doors by a privately owned entity.
Enchanted is now Silo. From Greek "siros", a secure underground chamber for storing grain. Today, a chamber for your intelligence. Your data, your thoughts, your AI. An AI you share your life with, speak to about your deepest thoughts is not personal if it isn't private.
The Thinking Machines blog is one of the most detailed and insightful blogs I have come across
The bet is on building a continual learning system. What does this mean? Cursor's update is a working example. New data comes in, the system knows how to filter for the most valuable samples. It then leverages RL/other algorithms to deploy a checkpoint trained using said data.
We introduce a better recipe for collecting post-training data when using GRPO. Collecting samples from experts is expensive, and annotation budgets are limited. Which examples are actually worth paying for? We find that focusing on hard samples results in a 30-40% improvement. 1/7
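The idea of spending a limited annotation budget on hard samples can be sketched roughly as below. This is only an illustration, not the method from the quoted thread: the post does not say how "hard" is measured, so the sketch assumes hardness means a low pass rate across the current model's rollouts. `Prompt`, `pass_rate`, and `select_hard_samples` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    prompt_id: str
    # Assumption: one reward per rollout, 1.0 if the rollout was correct, else 0.0.
    rollout_rewards: list[float]


def pass_rate(prompt: Prompt) -> float:
    """Fraction of rollouts the current model got right on this prompt."""
    if not prompt.rollout_rewards:
        raise ValueError(f"no rollouts recorded for {prompt.prompt_id}")
    return sum(prompt.rollout_rewards) / len(prompt.rollout_rewards)


def select_hard_samples(prompts: list[Prompt], budget: int) -> list[Prompt]:
    """Pick the `budget` prompts with the lowest pass rate (hardest first).

    sorted() is stable, so prompts with equal pass rates keep their input order.
    """
    return sorted(prompts, key=pass_rate)[:budget]


pool = [
    Prompt("easy", [1.0, 1.0, 1.0, 1.0]),
    Prompt("hard", [0.0, 0.0, 0.0, 1.0]),
    Prompt("medium", [1.0, 0.0, 1.0, 0.0]),
]
to_annotate = select_hard_samples(pool, budget=2)
print([p.prompt_id for p in to_annotate])
```

With a budget of two, the sketch sends the "hard" and "medium" prompts for expert annotation and skips the one the model already solves.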