Ryan Kosai

@rkosai

Followers: 484 · Following: 4K · Media: 79 · Statuses: 2K

CTO @readysetpotato, an autonomous research agent for biology. 20 years in software, principal engineer, 3x CTO.

Seattle, WA
Joined May 2008
@rkosai
Ryan Kosai
5 years
I learned about programming from my dad. That's how I know the first rule of software is to number your punch cards in case you drop them.
1
1
20
@Andercot
Andrew Côté
2 days
There's this kind of sociopathic defacement of civil society in the name of "achievement" where we form a lipid bilayer of ego membrane, something on the outside that assumes this is all par for the course, and something internal that feels like we're winking at one another
@bihanmahadewa
Bihan
5 days
chat she started interviewing a founding engineer mid dinner 💀
96
196
2K
@rkosai
Ryan Kosai
19 days
If we are successful in our goal, millions of AI scientists running in a high-fidelity environment will accelerate scientific discovery at an unfathomable scale.
1
0
0
@rkosai
Ryan Kosai
19 days
This is the end of a long thread. Potato is building an environment for biological sciences. The environment includes tools to search and review scientific literature, build wet lab protocols, create and run bioinformatics scripts, and drive robots for full lab automation.
1
0
0
@rkosai
Ryan Kosai
19 days
While we've built an AI agent on top of existing frontier models, we continue to build out the environment for an AI-powered future: a high-fidelity world model for scientific research.
1
0
0
@rkosai
Ryan Kosai
19 days
As suggested earlier, we expect all of this functionality to be subsumed into the frontier models offered by the major LLM providers.
1
0
0
@rkosai
Ryan Kosai
19 days
The system also generates an LLM-as-judge numerical scoring rubric with categories such as completion, correctness, and efficiency. These scores form an early framework for measuring how well the agent utilized the environment, as well as how well it explains, justifies, and
1
0
0
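To make the rubric idea above concrete, here is a minimal sketch of an LLM-as-judge scorer over those categories. The question wording, 1-5 scale, and llm() helper are assumptions for illustration, not Potato's actual implementation.

# Sketch of an LLM-as-judge rubric; the category names come from the tweet
# above, but the questions, scale, and llm() callable are assumptions.
RUBRIC = {
    "completion": "Did the agent finish the task it was given?",
    "correctness": "Are the scientific steps and conclusions sound?",
    "efficiency": "Did the agent use the environment without wasted effort?",
}

def judge_run(transcript: str, llm) -> dict:
    """Ask a judge model to score each category from 1 to 5."""
    scores = {}
    for category, question in RUBRIC.items():
        prompt = (
            f"Score the following agent run on {category} (1-5). {question}\n"
            f"Reply with a single integer.\n\nTranscript:\n{transcript}"
        )
        scores[category] = int(llm(prompt).strip())
    return scores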
@rkosai
Ryan Kosai
19 days
Once the agent has ended its session, the system automatically considers the full history and artifacts to produce a complete summary report. This report provides the scientific context, experiments run, and a summary of the evaluated data.
1
0
0
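A rough sketch of what the post-run report generation described above could look like; the llm() callable, function name, and prompt format are assumptions, not the actual system.

# Hypothetical sketch: gather the full step history and artifact keys and hand
# them to a single model call that writes the report sections named above.
def build_summary_report(timeline_steps: list, artifacts: dict, llm) -> str:
    run_record = {
        "steps": timeline_steps,
        "artifact_keys": sorted(artifacts),
    }
    prompt = (
        "Write a summary report with three sections: scientific context, "
        "experiments run, and a summary of the evaluated data.\n\n"
        f"Run record: {run_record}"
    )
    return llm(prompt)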
@rkosai
Ryan Kosai
19 days
In the long term, a more sustainable solution is to allow automated recursive backtracking, so the agent can decide against its current trajectory and try again from an earlier iteration. However, this approach requires very robust instantaneous scoring toward the final outcome,
1
0
0
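As a rough illustration of the backtracking idea above (not the actual system), here is a loop that scores the trajectory after each step and reverts to the best earlier state when the score drops; take_step() and score() are placeholders and the thresholds are arbitrary.

# Illustrative sketch of automated recursive backtracking. The hard requirement
# the tweet points at is score(): a robust instantaneous estimate of progress
# toward the final outcome.
def run_with_backtracking(initial_state, take_step, score, max_steps=50, min_score=0.5):
    best_state, best_score = initial_state, score(initial_state)
    state = initial_state
    for _ in range(max_steps):
        state = take_step(state)
        current = score(state)
        if current >= best_score:
            best_state, best_score = state, current
        elif current < min_score:
            state = best_state  # backtrack to the best earlier iteration
    return best_state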
@rkosai
Ryan Kosai
19 days
This can be performed post hoc, so even if the agent has completed a run, a user can revisit the timeline and create a branch at any desired point along that timeline.
1
0
0
@rkosai
Ryan Kosai
19 days
To mitigate this issue in the short term, we allow the user-in-the-loop to branch the timeline after any thinking phase. During a branching event, the historical record is maintained, but the model is prompted to reevaluate its decision with additional user suggestions and
1
0
0
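A minimal sketch of the user-in-the-loop branching described above, assuming a simple list-of-steps timeline; the data structures and names are illustrative, not the real implementation.

from dataclasses import dataclass, field

# The original timeline is preserved; a branch copies history up to the chosen
# thinking phase and appends the user's suggestions for the model to reconsider.
@dataclass
class Timeline:
    steps: list = field(default_factory=list)  # thinking phases, tool calls, results

def branch_after(timeline: Timeline, step_index: int, user_suggestions: str) -> Timeline:
    """Create a new branch; the historical record up to step_index is kept verbatim."""
    branch = Timeline(steps=list(timeline.steps[: step_index + 1]))
    branch.steps.append({
        "type": "user_guidance",
        "content": f"Reevaluate your last decision considering: {user_suggestions}",
    })
    return branch  # the original timeline stays intact for post-hoc review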
@rkosai
Ryan Kosai
19 days
One major challenge in building a robust agent in the current era is that frontier model outputs still exhibit both high variability and sensitivity to input conditions. Practically speaking, this means that a non-trivial number of decisions made by the model are wrong or
1
0
0
@rkosai
Ryan Kosai
19 days
We also compress the historical context, using the compression method published by Factory AI, and have found it to be very effective and tunable across context window sizes. https://t.co/T9agXlxzJ5
factory.ai
Compressing Context: LLMs attend only to the tokens in their current prompt. Because every model enforces a finite contex...
1
0
0
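For readers unfamiliar with context compression in general, here is a generic sketch of the idea: turns that no longer fit a token budget are replaced with a summary block. This is not Factory AI's published method; summarize() and count_tokens() are stand-ins for an LLM call and a tokenizer.

# Generic illustration only. The budget is the tunable knob mentioned above.
def compress_history(turns: list, budget_tokens: int, count_tokens, summarize) -> list:
    total = sum(count_tokens(t) for t in turns)
    kept = list(turns)
    dropped = []
    while kept and total > budget_tokens:
        oldest = kept.pop(0)          # drop the oldest turns first
        dropped.append(oldest)
        total -= count_tokens(oldest)
    if not dropped:
        return kept
    return [f"<summary>{summarize(dropped)}</summary>"] + kept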
@rkosai
Ryan Kosai
19 days
This has the benefit of compressing both the tool request and response into a single chunk of information, eliminating content duplication. Internal testing has also shown the XML structure to be more amenable to handling by the latest models than the native JSON formats used for
1
0
0
@rkosai
Ryan Kosai
19 days
Even with a dedicated tool context, information accumulates rather quickly. To further simplify the context window for each thinking phase, we rewrite the entire history into concise XML blocks representing previous tool executions.
1
0
0
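As a toy illustration of the two tweets above, here is one way to collapse a tool request and its result into a single XML block; the tag names and example values are made up, not Potato's actual format.

from xml.sax.saxutils import escape, quoteattr

# The request and its (already summarized) result travel together in one block,
# so neither is duplicated elsewhere in the rewritten history.
def tool_execution_block(name: str, request: str, result: str) -> str:
    return (
        f"<tool_execution name={quoteattr(name)}>\n"
        f"  <request>{escape(request)}</request>\n"
        f"  <result>{escape(result)}</result>\n"
        f"</tool_execution>"
    )

print(tool_execution_block(
    "blast_search",
    "query=P53_HUMAN, db=swissprot",
    "42 hits; top hit Q9TTA1 (ref_id=artifact-3f2a)",
))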
@rkosai
Ryan Kosai
19 days
Storing the artifacts in a structured format additionally provides a medium-term “memory” for the agent, which allows it to retrieve specific details by key, without automatically consuming valuable context space.
1
0
0
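A minimal sketch of a keyed artifact store along the lines described above, assuming a simple JSON-on-disk layout; the class and method names are illustrative.

import json
from pathlib import Path

# Medium-term "memory": artifacts live outside the context window and specific
# details are retrieved by key only when the agent needs them.
class ArtifactStore:
    def __init__(self, root: str = "artifacts"):
        self.root = Path(root)
        self.root.mkdir(exist_ok=True)

    def put(self, key: str, value: dict) -> None:
        (self.root / f"{key}.json").write_text(json.dumps(value, indent=2))

    def get(self, key: str) -> dict:
        return json.loads((self.root / f"{key}.json").read_text())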
@rkosai
Ryan Kosai
19 days
Likewise, evaluating literature involves search and reranking, scanning documents for relevant facts, and finding corroboration or divergence between documents. All of these are token-heavy activities, and benefit from the externalizing effects of a tool-oriented architecture.
1
0
0
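To illustrate the externalization point, here is a toy literature tool in which search, reranking, and snippet extraction all happen inside the tool and only compact hits return to the agent; the corpus, scoring, and names are placeholders.

from dataclasses import dataclass

@dataclass
class LiteratureHit:
    ref_id: str    # key for retrieving the full document later
    title: str
    snippet: str   # short, context-friendly excerpt
    score: float   # relevance score from the (here, naive) reranker

# Stand-in corpus; a real system would sit on a search index plus a reranker.
CORPUS = {
    "doc-001": ("TP53 binding assays", "We measured p53 binding affinity under varying pH ..."),
    "doc-002": ("Ligand docking review", "Docking scores are sensitive to solvent conditions ..."),
}

def search_literature(query: str, top_k: int = 5) -> list:
    terms = set(query.lower().split())
    hits = []
    for ref_id, (title, text) in CORPUS.items():
        overlap = len(terms & set(f"{title} {text}".lower().split()))
        if overlap:
            hits.append(LiteratureHit(ref_id, title, text[:80], float(overlap)))
    return sorted(hits, key=lambda h: h.score, reverse=True)[:top_k]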
@rkosai
Ryan Kosai
19 days
This context is crucial for accurately setting parameters, such as in protein-ligand docking, where environmental conditions directly impact affinity scoring.
1
0
0
@rkosai
Ryan Kosai
19 days
For example, we've found that in silico simulation requires detailed module documentation, but also benefits heavily from literature-drawn scientific context.
1
0
0
@rkosai
Ryan Kosai
19 days
Instead, we return just the necessary results, with a reference ID to later access the tool's internal reasoning and artifacts.
1
0
0
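A rough sketch of the reference-ID pattern described above: the tool's full output is stored out-of-band and only a short result plus an ID returns to the agent's context. The store and function names are hypothetical, not the actual implementation.

import uuid

ARTIFACTS = {}  # assumed in-memory store, for illustration only

def run_tool(tool_fn, *args, **kwargs) -> dict:
    """Run a tool (assumed to return a dict), stash its full output, and return a compact result."""
    full_output = tool_fn(*args, **kwargs)
    ref_id = f"artifact-{uuid.uuid4().hex[:8]}"
    ARTIFACTS[ref_id] = full_output  # internal reasoning, raw data, logs, etc.
    return {
        "ref_id": ref_id,
        "summary": full_output.get("summary", ""),  # only what the agent needs now
    }

def fetch_artifact(ref_id: str) -> dict:
    """Later retrieval of the tool's internal reasoning and artifacts by reference ID."""
    return ARTIFACTS[ref_id]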