Michael Skarlinski (@m_skarlinski)
Head of Platform @ Edison Scientific · San Francisco · Joined May 2024
Followers: 553 · Following: 136 · Media: 7 · Statuses: 64
I think people are mostly still just starting to play with Kosmos and understand what it can do, but the response so far has been significantly beyond what we expected. Excerpts from a great write-up by Zachary Flamholz: “It is an understatement to say I was impressed with what
Try it out for yourself on our platform: https://t.co/N2a1pva58W
platform.edisonscientific.com
AI Agents for Scientific Discovery
Kosmos is unlike any other agent we have at Edison, both in terms of outputs and infrastructure. Running at scale requires our platform to support order-of-magnitude swings in resource requirements, all unknown at submit time. Each run sees between 0 and 120 sandbox
Our older agents, like Crow, Phoenix, and HasAnyone, are still available on our platform as Literature, Molecules, and Precedent, respectively, for 1-2 credits per run. We will be launching more powerful versions soon! (Falcon, our deep research agent, has merged with Crow.)
After two years of work, we’ve made an AI Scientist that runs for days and makes genuine discoveries. Working with external collaborators, we report seven externally validated discoveries across multiple fields. It is available right now for anyone to use. 1/5
We’ve been unusually quiet for ~5 months because we didn’t want to just announce something and not let people use it. So we built it at scale (thanks @m_skarlinski and @ludomitch and eng🫠). And are letting edu users try it a bunch. 2/5
More on Kosmos from some of the team behind it here. And check out the technical report:
arxiv.org
Data-driven scientific discovery requires iterative cycles of literature search, hypothesis generation, and data analysis. Substantial progress has been made towards AI agents that can automate...
Kosmos, our newest AI Scientist, is available to use today on our platform. Watch here as three of our scientists describe what Kosmos is, and how it can accelerate scientific research.
Try Kosmos on our new platform, here: https://t.co/PHYFaC5idK
Read our technical report: https://t.co/20AcIFWAZl
Read more about Kosmos on our blog: https://t.co/qCvlEwxrZi
Finally, read more about Edison Scientific here: https://t.co/iSQIpPP7N4
We can’t wait to see how you
edisonscientific.com
Today we are launching Edison Scientific, a new commercial spinout that will focus on further developing and deploying our AI Scientist for commercial applications.
Today, we’re announcing Kosmos, our newest AI Scientist, available to use now. Users estimate Kosmos does 6 months of work in a single day. One run can read 1,500 papers and write 42,000 lines of code. At least 79% of its findings are reproducible. Kosmos has made 7 discoveries
More of my thoughts here: https://t.co/mCZwLtRdiN
Most vector DBs support "hybrid" retrieval with both sparse and dense indices. These implementations seem to be wholly separate indices, akin to two separate systems whose results are merged. And, in a hosted setting, you still pay the price
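The "separate indices, merged" pattern described above is commonly implemented with reciprocal rank fusion (RRF). A minimal sketch of that merge step, with made-up doc IDs and the conventional k=60 damping constant (not taken from any particular vector DB):

```python
# Hypothetical sketch: fusing separate sparse and dense result lists
# with reciprocal rank fusion (RRF). Doc IDs below are illustrative.

def rrf_merge(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ranking.

    rankings: list of lists, each ordered best-first.
    k: damping constant; larger k flattens rank differences.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Each list contributes 1/(k + rank + 1) for the doc.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

sparse_hits = ["d3", "d1", "d7"]  # e.g. BM25 results
dense_hits = ["d1", "d9", "d3"]   # e.g. embedding kNN results
print(rrf_merge([sparse_hits, dense_hits]))  # → ['d1', 'd3', 'd9', 'd7']
```

Because RRF only looks at ranks, not raw scores, it sidesteps the problem that sparse and dense scores live on incomparable scales — which is also why the two indices can stay wholly separate systems.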
- For semantic search, LLMs usually nail query expansions and synonyms, adding specificity where necessary.
- DeepMind's recent "On the Theoretical Limitations of Embedding-Based Retrieval" paper shows that dense embeddings will fail to capture context in many scenarios where
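A minimal sketch of the query-expansion idea in the first point above. Here `expand_query()` is a hypothetical stand-in for a real LLM call, and the canned medical expansions are purely illustrative, not model output:

```python
# Hypothetical sketch of LLM-driven query expansion feeding a sparse index.
# expand_query() stands in for an actual LLM prompt; expansions are canned.

def expand_query(query: str) -> list[str]:
    # In practice, an LLM would be prompted for paraphrases and synonyms.
    canned = {
        "heart attack biomarkers": [
            "myocardial infarction biomarkers",
            "troponin elevation acute coronary syndrome",
        ],
    }
    return [query] + canned.get(query, [])

def expanded_terms(query: str) -> set[str]:
    # Union the token sets of all expanded queries, so the sparse index
    # can match synonyms the user never typed.
    terms: set[str] = set()
    for q in expand_query(query):
        terms |= set(q.lower().split())
    return terms

print(sorted(expanded_terms("heart attack biomarkers")))
```

The point of the sketch: after expansion, a plain term-matching (sparse) query already covers vocabulary mismatch, which is one of the main gaps dense embeddings are meant to close.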
Am I missing the boat on vector DBs? From web and social mentions, Pinecone is up 97% YoY and Milvus is up 50% YoY. Dense embedding indices are great in non-text settings, but advantages over sparse indices in text-heavy RAG applications aren't always obvious to me (1/3).
Reach out if you'd like to join our amazing platform team!!
We are looking to hire an outstanding UI/UX designer with strong front-end engineering skills to reimagine how researchers can make discoveries in collaboration with AI. If you have these skills and want to help AI accelerate science, get in touch.
This is bad for AI measurement. As other AI benchmarks have become saturated, model makers have turned to Humanity’s Last Exam as a good measure of AI ability. Except a careful review suggests many of the exam questions have incorrect “right” answers. Benchmarking is hard.
HLE has recently become the benchmark to beat for frontier agents. We @FutureHouseSF took a closer look at the chem and bio questions and found about 30% of them are likely invalid based on our analysis and third-party PhD evaluations. 1/7
FutureHouse aims to build an ‘AI scientist’ that can command the entire research pipeline, from hypothesis generation to paper production https://t.co/CvReGqVOUK
nature.com
Nature - The model, called ether0, outperforms other advanced AIs at chemistry tasks and is a stepping stone towards automating the entire research pipeline.
Today we are releasing ether0, our first scientific reasoning model. We trained Mistral 24B with RL on several molecular design tasks in chemistry. Remarkably, we found that LLMs can learn some scientific tasks much more data-efficiently than specialized models trained from
At FutureHouse, we’ve noticed scientific agents are good at applying average intelligence across tasks. They always seem to make the obvious choices, which is good, but discovery sometimes requires more intuition and insight than average. We’ve made the first step today towards
The FutureHouse platform now has a public documentation repository for raising issues and sharing demos. Please open any issues you find with our API client here! (link in reply)