Javier Alvarez Valle
@alvarezvalle
Followers: 682
Following: 909
Media: 62
Statuses: 864
Cambridge, UK
Joined February 2009
We're taking the next big step with Researcher. With Computer Use, it can now securely browse the open and gated web to find hard-to-locate information—even across hundreds of sites—and handle multi-step tasks to uncover insights, take action, and create richer reports.
97
301
2K
In engineering, you should first solve your problem within a relaxed design space, and only *then* should you determine the minimal constraints required to implement that solution. Don't settle on the hardware before you know what software you'll need to run. Don't design the
38
69
790
Watching Claude Code execute in an infinite loop with defined objectives while autonomously handling its own git commits and https://t.co/SGbauPrlvn is genuinely mind-blowing.
0
0
0
Love the new ARC-AGI-3 games. Having to discover the rules is very interesting. I think some priors about playing games are required to learn the rules efficiently. I doubt a human who had never played games could pass them in a reasonable time.
Today, we're announcing a preview of ARC-AGI-3, the Interactive Reasoning Benchmark with the widest gap between easy for humans and hard for AI.
We’re releasing:
* 3 games (environments)
* $10K agent contest
* AI agents API
Starting scores - Frontier AI: 0%, Humans: 100%
0
0
0
Excited to share two advances that bring us closer to real-world impact in healthcare AI: SDBench introduces a new benchmark that transforms 304 NEJM cases into interactive diagnostic simulations. AI must ask questions, order tests, and weigh costs, mirroring the complexity of
206
905
5K
The world’s first multimodal, bilingual radiology dataset could reshape the way radiologists and AI systems make sense of X-rays. PadChest-GR, developed by the University of Alicante with Microsoft Research, has the potential to advance research across the field for years to
0
7
38
🔍 Why it matters:
- 4,555 chest X-ray studies
- 10,459 sentence-level clinical findings in Spanish & English
- Precise spatial annotations
- Designed specifically to ground AI in multimodal data and improve clinical verification and interpretability
0
0
0
NEJM-AI: Introducing PadChest-GR, the world's first bilingual, grounded radiology reporting benchmark for chest X-rays! https://t.co/7PMiIf1Uts
1
1
2
There is a lot of hyped-up excitement about MCP, but I think most people miss the essence of why MCP is such a BIG thing. As the fundamental theorem of computer science states, "All problems in computer science can be solved by another level of indirection". That is exactly
19
70
543
Today I did my first PR in GitHub without writing a single line of code, just asking questions and reviewing. Imagine a team of copilot doctors working with you.
We have seen many ways that generative AI can transform healthcare, but overstretched health-IT teams can’t deploy new software systems fast enough to surface the latest advancements. But... what if developers could meet clinicians where they already work today, building,
1
2
20
A big day – we are making the largest dataset of chest X-rays available to the world. This is the first dataset to cross 100k patients. https://t.co/dPSSpFHkae
huggingface.co
13
74
380
Three Microsoft CEOs walk into a room on Microsoft’s 50th anniversary … and are interviewed by Copilot!
800
3K
20K
Lee Iacocca understood the power of truly listening to your team. His leadership philosophy, which helped create the Ford Mustang and save Chrysler from bankruptcy, wasn't taught in business school:
3
42
271
GPT-4.5 got better at drawing
0
0
1
GitHub Copilot is all-in on agents. Check out Agent Mode, and a first look at our Autonomous SWE agent.
Today, we are infusing the power of agentic AI into the GitHub Copilot experience, elevating Copilot from pair to peer programmer 🤖 (1/4) https://t.co/zr6l3uaTmb
93
256
2K