Alex Rodionov

@p0deje

475 Followers · 117 Following · 57 Media · 2K Statuses

engineer @ airbnb, tech lead @ selenium, author @ maccy, alumnium

Joined December 2010
@p0deje
Alex Rodionov
13 days
With open reasoning models like DeepSeek R1, Magistral, and GPT-OSS, we might use this new architecture everywhere. The only exception is Anthropic, which still doesn't have a cheap thinking model available!
@p0deje
Alex Rodionov
13 days
I've yet to dig deeper and test how it works, but it seems like future versions of Alumnium might see a significant improvement in accuracy, cost, and performance by switching to GPT-5 Nano or Gemini 2.5 Flash.
@p0deje
Alex Rodionov
13 days
With the rise of reasoning models, Alumnium might completely remove the planner and let the model "think" about the plan before producing tool calls, all within the same API request! Think cheaper and faster test execution.
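For illustration, a single-request flow along those lines might look like the sketch below; the model name, tool schema, and prompts are assumptions for the sake of the example, not Alumnium's actual code.

```python
# Sketch: one request where the model reasons internally and then emits tool
# calls directly. Model name, tool schema, and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "click",
            "description": "Click an element identified by its accessibility id.",
            "parameters": {
                "type": "object",
                "properties": {"id": {"type": "string"}},
                "required": ["id"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-5-nano",      # any reasoning-capable model
    reasoning_effort="low",  # let the model "think" about the plan first
    messages=[
        {"role": "system", "content": "You control the app through the provided tools."},
        {"role": "user", "content": "Goal: add task 'buy milk'.\nAccessibility tree:\n<textbox id='new-todo'>"},
    ],
    tools=tools,
)

for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```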
@p0deje
Alex Rodionov
13 days
The whole point of the planner was to let LLMs provide the most accurate result without putting tool-calling constraints on the output. After all, LLMs just "guess the next token". It could also be easily primed with few-shot chain-of-thought examples.
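As a sketch, that kind of priming can be as simple as a worked example embedded in the planner prompt (the wording below is hypothetical):

```python
# Hypothetical few-shot chain-of-thought priming for a planner prompt.
PLANNER_PROMPT = """You break a user's goal into UI steps.

Example:
Goal: log in as 'alice'
Accessibility tree: <textbox 'Username'> <textbox 'Password'> <button 'Log in'>
Thinking: I need to fill in both fields, then submit the form.
Steps:
1. Type 'alice' into the 'Username' textbox.
2. Type the password into the 'Password' textbox.
3. Click the 'Log in' button.

Goal: {goal}
Accessibility tree: {tree}
Thinking:"""
```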
@p0deje
Alex Rodionov
13 days
Alumnium has a dual agentic workflow to perform actions. There is a "planner agent" that analyzes the accessibility tree of the page and outputs steps to perform. These steps are then passed to a tool-calling "actor agent" that maps them to specific actions.
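A rough sketch of that two-step flow, assuming a LangChain chat model with tool binding; the tools, prompts, and model are placeholders, not Alumnium's internals:

```python
# Sketch of a planner -> actor flow (illustrative, not Alumnium's actual code).
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def click(element_id: str) -> str:
    """Click the element with the given accessibility id."""
    return f"clicked {element_id}"


@tool
def type_text(element_id: str, text: str) -> str:
    """Type text into the element with the given accessibility id."""
    return f"typed '{text}' into {element_id}"


llm = ChatOpenAI(model="gpt-4o-mini")

# 1. Planner: free-form reasoning over the accessibility tree, no tool constraints.
plan = llm.invoke(
    "Accessibility tree:\n<textbox id='new-todo'>\n\n"
    "Goal: add task 'buy milk'.\n"
    "List the steps to perform, one per line."
).content

# 2. Actor: constrained to tool calls, mapping each step to a concrete action.
actor = llm.bind_tools([click, type_text])
for step in plan.splitlines():
    result = actor.invoke(f"Perform this step using the tools: {step}")
    for call in result.tool_calls:
        print(call["name"], call["args"])
```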
@p0deje
Alex Rodionov
13 days
I'm very interested in GPT-5 Nano, specifically in how to use it in Alumnium. My intuition tells me it might significantly decrease the cost and latency of execution, along with the internal complexity of the implementation.
github.com
Now that we have fast and relatively cheap thinking models, we can try removing the planner agent completely and instead ensure that the actor agent thinks before responding with tool calls. This w...
@p0deje
Alex Rodionov
14 days
We'll have to stick to Mistral for now, since it also has vision. Or should Alumnium support a dual-model mode - gpt-oss for text-only, mistral-small for vision?
@p0deje
Alex Rodionov
14 days
If you are eager to try it out, you can do this today by pulling the model with Ollama and running your tests with the `ALUMNIUM_MODEL="ollama/gpt-oss:20b"` environment variable.
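For a fuller picture, a run could look roughly like this; the `Alumni` entry point and `do`/`check` calls mirror the project's examples and may differ slightly between versions:

```python
# Sketch: pointing Alumnium at a local gpt-oss model served by Ollama.
# Assumes `ollama pull gpt-oss:20b` has already been run.
import os

os.environ["ALUMNIUM_MODEL"] = "ollama/gpt-oss:20b"  # set before Alumnium is initialized

from alumnium import Alumni
from selenium.webdriver import Chrome

driver = Chrome()
driver.get("https://todomvc.com/examples/vue/dist/#/")

al = Alumni(driver)
al.do("add task 'buy milk'")
al.check("task 'buy milk' is in the list")
```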
@p0deje
Alex Rodionov
14 days
Just tested how Alumnium runs on gpt-oss:20b! It is 30% faster than mistral-small3.1:24b (default), but I see issues with structured output. Need to dig deeper and see if thinking affects this, or whether I should use tool calling instead - not sure what LangChain uses internally.
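For context, a quick way to compare the two extraction paths with a local model via LangChain might look like this; the schema is made up, and how structured output is implemented under the hood varies by provider and version:

```python
# Sketch: structured output vs. plain tool calling with a local gpt-oss model.
# The Step schema and prompts are illustrative only.
from langchain_ollama import ChatOllama
from pydantic import BaseModel


class Step(BaseModel):
    action: str
    target: str


llm = ChatOllama(model="gpt-oss:20b")

# Path 1: structured output (response constrained to the schema).
structured = llm.with_structured_output(Step)
print(structured.invoke("Next step: click the 'Log in' button."))

# Path 2: tool calling with the same schema.
with_tools = llm.bind_tools([Step])
print(with_tools.invoke("Next step: click the 'Log in' button.").tool_calls)
```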
@p0deje
Alex Rodionov
27 days
There were lots of other small improvements, particularly on Appium support. Oh, and Alumnium now runs on @MistralAI La Plateforme!
@p0deje
Alex Rodionov
27 days
The caching was completely reworked. The default LangChain SQLite cache stores prompts/responses as-is (and even indexes them). A few test runs, and you have a database that measures in MBs. This is now fixed with our custom cache implementation!
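For the curious, one way to keep such a cache compact is to hash the prompt and model configuration into the key instead of storing them verbatim; the sketch below illustrates the idea and is not Alumnium's actual implementation:

```python
# Sketch: a compact LangChain LLM cache keyed by SHA-256 hashes instead of
# full prompts. Illustrative only, not Alumnium's actual cache.
import hashlib
import pickle
import sqlite3
from typing import Any, Optional, Sequence

from langchain_core.caches import BaseCache
from langchain_core.globals import set_llm_cache
from langchain_core.outputs import Generation


class HashedSQLiteCache(BaseCache):
    def __init__(self, path: str = ".llm_cache.sqlite3") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"
        )

    def _key(self, prompt: str, llm_string: str) -> str:
        return hashlib.sha256(f"{llm_string}\n{prompt}".encode()).hexdigest()

    def lookup(self, prompt: str, llm_string: str) -> Optional[Sequence[Generation]]:
        row = self.conn.execute(
            "SELECT value FROM cache WHERE key = ?", (self._key(prompt, llm_string),)
        ).fetchone()
        return pickle.loads(row[0]) if row else None

    def update(self, prompt: str, llm_string: str, return_val: Sequence[Generation]) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (self._key(prompt, llm_string), pickle.dumps(list(return_val))),
        )
        self.conn.commit()

    def clear(self, **kwargs: Any) -> None:
        self.conn.execute("DELETE FROM cache")
        self.conn.commit()


set_llm_cache(HashedSQLiteCache())
```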
@p0deje
Alex Rodionov
27 days
When you use the Area API, you pay less for tokens and make tests more stable and faster. It also results in more cache hits and is a good way to write cleaner test code! We might eventually make area selection fully automatic!
@p0deje
Alex Rodionov
27 days
There is now a new Area API, which allows you to narrow down the scope to a particular part of the screen. Think of it as your "component" classes in the page object pattern. Once the area is identified, Alumnium only uses its part of the accessibility tree, not the whole tree.
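Usage might look roughly like the sketch below; the `area(...)` call and its description-based lookup are assumptions based on the description above, so check the docs for the exact API:

```python
# Sketch: scoping Alumnium to one part of the screen via the Area API.
# The exact method names are assumptions for illustration.
from alumnium import Alumni
from selenium.webdriver import Chrome

driver = Chrome()
driver.get("https://todomvc.com/examples/vue/dist/#/")
al = Alumni(driver)

# Only this subtree of the accessibility tree is sent to the LLM.
todo_list = al.area("the list of todo items")
todo_list.check("task 'buy milk' is present")
```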
@p0deje
Alex Rodionov
27 days
This not only makes you pay for extra tokens, but also slows down and sometimes confuses the LLM. Alumnium now clears this duplicate information, making tests more stable, faster, and around 50% cheaper.
@p0deje
Alex Rodionov
27 days
There was a lot of work to improve React Native support. In particular, its accessibility tree tends to include a lot of duplicate information, with each parent's `name` containing text from all child nodes.
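The cleanup boils down to something like the following simplified sketch; Alumnium's real accessibility-tree handling is more involved:

```python
# Sketch: drop a parent's name when it merely repeats its children's text.
# Simplified illustration, not Alumnium's actual tree-processing code.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)


def collect_text(node: Node) -> str:
    """Concatenate the text of all leaf nodes under this node."""
    if not node.children:
        return node.name
    return " ".join(collect_text(child) for child in node.children)


def prune_duplicate_names(node: Node) -> Node:
    """Blank out parent names that only duplicate their children's text."""
    for child in node.children:
        prune_duplicate_names(child)
    if node.children and node.name.strip() == collect_text(node).strip():
        node.name = ""  # no need to send the same text to the LLM twice
    return node


tree = Node("Buy milk Buy bread", [Node("Buy milk"), Node("Buy bread")])
print(prune_duplicate_names(tree).name)  # -> ""
```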
@p0deje
Alex Rodionov
27 days
Alumnium 0.12 is out. It's still the open-source AI-powered test automation library that doesn't force you to rewrite all your tests. It's still the only one that works both on the web and mobile.
github.com
What's Changed: chore: refactor XCUITestAccessibilityTree by @sh3pik in #114 · feat: enrich appium accessibility tree by @p0deje in #117 · feat: appium drag&drop and press_key by @Balaji028 in ...
@p0deje
Alex Rodionov
27 days
If you use Maccy with a DisplayPort monitor, you probably no longer see its window. This is now fixed in 2.4.1. Lesson learnt: excluding a window from screen sharing also makes it invisible on DisplayPort.
github.com
Reverted screensharing privacy mode, which also made windows invisible on external monitors connected via DisplayPort (#1136).
@p0deje
Alex Rodionov
29 days
It's been an incredible journey. Any open source author who tries to grow their project should give it a shot.
@OpenCoreVenture
Open Core Ventures
30 days
Congrats @p0deje on growing Alumnium 13X during the Catalyst program! 👥 2 to 12 contributors · 🌟 29 to 400 stars · 🚀 Over 1,200 monthly downloads. Read Alex's story:
@p0deje
Alex Rodionov
30 days
Maccy's window is now excluded from being shown during screen sharing, so you don't accidentally show your whole clipboard history. There is a bug that the preview is still shown - it will be fixed in a new release. The downside is that you cannot take a screenshot of Maccy anymore!