Cotool
@cotoolai
Followers
176
Following
127
Media
9
Statuses
31
Composable AI agents for security teams
Joined May 2025
Introducing Cotool
Cotool uses AI to automate repetitive security workflows so teams can focus on threat hunting instead of manual tasks. Our early users have used the copilot and agents to cut investigation time by 90% while increasing detection quality & coverage. We allow
We added a new cohort of frontier models to our eval! Gemini 3 Pro, Claude Opus 4.5, and GPT-5.1 are all compared in our updated post:
6/6 Finally, this work is a follow-up to a previous blog post we put out. For more on the motivation, methodology, and the eval itself, check out our initial post here: https://t.co/ZoP1qiZKVa
5/6 Full Blog Post: https://t.co/He4yzb8xke Evals in security operations are an evergreen challenge. As agents take over more security operations tasks, benchmarking performance becomes increasingly critical. Our goal is to push the community forward with better metrics so that
cotool.ai
Customizable AI Agents that automate detection engineering, monitor emerging threats, run continuous hunts, and investigate alerts instantly. Improve coverage, cut MTTR, and eliminate manual &...
4/6 Interpretation for Security Teams. For real-world SecOps agents considering the new models: - GPT-5.1 is now the recommended choice for most blue team investigation tasks. It matches Opus 4.5's accuracy at roughly 1/3 the cost with better task-completion reliability. - Opus
3/6 Task Duration. Opus 4.5 completed tasks much faster on average, with less variance, than any other model tested. This notably includes a ~2x speed-up in wall-clock duration over Haiku 4.5. This is an interesting result, as it flips a commonly held assumption that smaller
2/6 Performance <> Cost. GPT-5.1 and Opus 4.5 marginally improve state-of-the-art (SOTA) accuracy over the previous cohort of frontier models. However, GPT-5+ models continue to define the performance-cost Pareto frontier, offering the best tradeoff between
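(Illustrative aside: a minimal Python sketch of how a performance-cost Pareto frontier like the one described above can be computed from per-model (cost, accuracy) points. The model names and numbers are made up for illustration; this is not the eval's actual data or code.)

```python
from typing import NamedTuple

class ModelResult(NamedTuple):
    name: str
    cost_usd: float   # average cost per investigation run (illustrative)
    accuracy: float   # fraction of eval questions answered correctly (illustrative)

def pareto_frontier(results: list[ModelResult]) -> list[ModelResult]:
    """Keep only non-dominated models: no other model is at least as cheap
    and at least as accurate while being strictly better on one axis."""
    frontier = []
    for r in results:
        dominated = any(
            o.cost_usd <= r.cost_usd
            and o.accuracy >= r.accuracy
            and (o.cost_usd < r.cost_usd or o.accuracy > r.accuracy)
            for o in results
        )
        if not dominated:
            frontier.append(r)
    return sorted(frontier, key=lambda m: m.cost_usd)

# Made-up numbers, purely for illustration.
results = [
    ModelResult("model-a", cost_usd=0.40, accuracy=0.78),
    ModelResult("model-b", cost_usd=1.20, accuracy=0.81),
    ModelResult("model-c", cost_usd=0.90, accuracy=0.74),  # dominated by model-a
]
for m in pareto_frontier(results):
    print(m.name, m.cost_usd, m.accuracy)  # model-a and model-b remain on the frontier
```

A model sits on the frontier only if no other model beats it on both cost and accuracy, which is the sense in which a family of models can "define" the frontier.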
1/6 UPDATED EVAL RESULTS: We compared Gemini 3 Pro, Claude Opus 4.5, and GPT-5.1 on a single investigation task of our internal agent eval for Security Operations tasks. Key Results: - @OpenAI GPT-5+ models maintain the performance-cost Pareto frontier - @AnthropicAI Opus
Blog Post: https://t.co/ZQowT9Dh7l Evals in security operations are an evergreen challenge. As agents take over more security operations tasks, benchmarking performance becomes increasingly critical. Our goal is to push the community forward with better metrics so that security
Interpretation for Security Teams: - The GPT-5 family of models shows signs of being the best-performing and most cost-efficient for blue team investigations - Haiku 4.5 is ideal for interactive triage or real-time alert enrichment where response time matters - Gemini models
Some agent runs exceeded 1 hour of wall-clock runtime - but @AnthropicAI's Claude Haiku 4.5 proved the most efficient in task duration, making it a strong favorite for latency-constrained investigation tasks
We found @OpenAI's GPT-5 family of models to be both the highest-performing and most cost-effective for this task. GPT-5 models handily define the Pareto frontier of the performance-cost tradeoff, as shown in the figure below. Tests were run before GPT-5.1 API access was
Today we're sharing initial results from one of our internal agent evals for Security Operations tasks. We replicated the @splunk BOTSv3 CTF environment in an eval to test frontier models' capability on realistic blue team cybersecurity tasks. BOTSv3 comprises over 2.7M logs
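(Illustrative aside: BOTSv3 is a question-and-answer style CTF, so a harness in the spirit described above could grade an agent run by normalized exact match against the known answers. The sketch below is a hypothetical simplification, not Cotool's actual harness; the questions, answers, and agent output are placeholders.)

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers still match."""
    return " ".join(text.strip().lower().split())

def score_run(ground_truth: dict[str, str], agent_answers: dict[str, str]) -> float:
    """Fraction of eval questions the agent answered correctly."""
    correct = sum(
        1
        for qid, answer in ground_truth.items()
        if normalize(agent_answers.get(qid, "")) == normalize(answer)
    )
    return correct / len(ground_truth)

# Placeholder question IDs and answers, not real BOTSv3 content.
ground_truth = {
    "q1": "198.51.100.23",      # e.g. "Which IP scanned the web server?"
    "q2": "powershell.exe",     # e.g. "Which process launched the payload?"
}
agent_answers = {"q1": "198.51.100.23", "q2": "cmd.exe"}
print(score_run(ground_truth, agent_answers))  # 0.5
```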
BSidesNYC welcomes @cotoolai as a kilobit sponsor for our Oct 18, 2025, conference. https://t.co/sRG8TTJ3uC Cotool works alongside security engineers during alert triage & investigation, reducing time spent by up to 90%. https://t.co/4KJ1KtmbhK
Last week, @eddieconkml shipped agent instructions! Just like Claude Code, Cursor, and others can suggest code diffs, our tool now diffs your system prompts with your feedback to create better-performing agents! Check it out below
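(Illustrative aside: one simple way a tool could surface prompt changes as reviewable diffs, in the spirit described above, is a unified diff between the current system prompt and a proposed revision. The sketch below uses Python's difflib with hypothetical example prompts; it is not Cotool's implementation.)

```python
import difflib

def prompt_diff(current: str, proposed: str) -> str:
    """Render a unified diff between the current system prompt and a proposed
    revision, so the change can be reviewed like a code diff."""
    return "".join(
        difflib.unified_diff(
            current.splitlines(keepends=True),
            proposed.splitlines(keepends=True),
            fromfile="system_prompt (current)",
            tofile="system_prompt (proposed)",
        )
    )

# Hypothetical prompts; a proposed revision might come from a model that was
# given the analyst's feedback on recent agent runs.
current = (
    "You are a SOC analyst assistant.\n"
    "Summarize alerts briefly.\n"
)
proposed = (
    "You are a SOC analyst assistant.\n"
    "Summarize alerts briefly.\n"
    "Always name the affected host and the suspected MITRE ATT&CK technique.\n"
)
print(prompt_diff(current, proposed))
```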
We met @eddieconkml and @endorseurgirl from @cotoolai, building composable AI agents for security teams. Their co-pilot aims to give time back to cybersecurity teams.
We're copping 5 rollies off @getbezel and shipping them to London for you goat @jordihays @johncoogan Thanks @tbpn for the shoutout
Awesome product, @cotoolai! And thanks for the shoutout at 0:34! Teamwork makes the dream work
10/10 no notes
Cotool (@cotoolai) is an agentic security platform that eliminates manual and repetitive work for security teams. It helps teams investigate faster, automates common tasks, and documents work in seconds. https://t.co/Bahz5TuDSz Congrats on the launch, @maxpollard415,