Rob Farlow
@RobFarlow
Followers
657
Following
886
Media
16
Statuses
45
building https://t.co/tIZVrveiP7 (we're hiring) | uwaterloo eng
sf
Joined February 2013
Today, we're making Scouts available to everyone! Earlier this year, Scouts was born out of a simple observation — that so many of life's background (or even foreground!) tasks have a recurring flavor, e.g. house hunting, early stages of travel planning, sourcing leads,
71
66
421
We eval'd @amazon Nova Act on Navi-Bench.
Score: 72.6% ➡️ 4th behind Navigator, Opus, Sonnet
Vibes:
• Strong grounding: near 0 mistypes/misclicks
• Great on short-horizon tasks, struggles with longer ones (low persistence + reasoning)
• Final messages not comprehensive
3
7
41
we needed to build a bunch of RL training environments so this summer we hired the 7 most cracked waterloo cs interns we could find
2
2
73
We're excited to launch Scouts — always-on AI agents that monitor the web for anything you care about.
139
173
3K
Another example of how bad DOM based browser agents are. Task: Order a medium cheese pizza from papa johns. Browseruse first selects the wrong size because it can't associate the radio buttons with the size labels. Then goes on to add the pizza 5 times because it can't tell its
0
0
9
People often ask me about DOM vs vision based browser agents. Frameworks like Browser Use can be faster and cheaper but start failing even on basic tasks like this example where it struggles for 5+ minutes to input a date because it can't see the date picker
3
2
18
BrowserUse is often unable to complete the most basic tasks like "Create a new contact called Tanmay Shah with email tanmay@plato.so and phone number 6476261842" Completely misses the phone number
2
0
9
We created a dataset & environment for common CRM tasks and benchmarked different browser use agents. First results show OpenAI's computer-use-preview model taking a slight edge over Claude, with BrowserUse far behind
5
4
28
Running different browser agents on non-Webvoyager tasks clearly shows two things:
1) OpenAI's Operator is clearly better than the others
2) self-reported Webvoyager results don't mean anything
4
3
31
looking at the price of BTC in the context of other assets (especially gold) gives so much perspective. BTC is better than gold in every way except that gold has a trusted history going back thousands of years
0
0
4
the hard pill I had to swallow after graduating engineering: the technical challenges with starting most software businesses are increasingly trivial. the way to win is with design, marketing, and partnerships
1
0
11
when I was 15 I thought it was my dream to work at google. I put google stickers all over my laptop. the only people that work there now do it for the salary, security and status. crazy how much can change in a decade
0
0
5
if you’re not using arc by now you’re a bum and i have no respect for you
0
0
2
I wish I could just walk into an office, watch people work, join their meetings, and ask any questions I want to. I would be able to come up with thousands of brilliant startup ideas
1
0
5
full day of coding working on a saas product:
~1hr in terminal
~4hr actually coding
~6hr testing/chatgpt/googling/stack overflow
1
0
4
"I notice every time when something is AI generated. its so obvious"
0
0
1
pretty much every smart tech person I met outside of sf is either going to SF or wants to go but can't for various reasons
1
0
3
it's hard to intuitively grasp the concept of exponential growth. unfathomable to think the entire progress of human history will be doubled in the next 20 years
0
0
3