Michael Hla @hla_michael X Profile

Michael Hla

@hla_michael

Followers

1K

Following

1K

Media

14

Statuses

117

bio + cs | prev @harvard @shv

Joined July 2020

Michael Hla

@hla_michael

4 months

I taught an LLM to optimize proteins. It proposed a better carbon capture enzyme. Introducing Pro-1, an 8B-param reasoning model trained using GRPO towards a physics-based reward function for protein stability. It takes in a protein sequence + text description + previous

93

341

3K
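
For readers unfamiliar with GRPO, named in the post above: as commonly described, it scores a group of completions sampled for the same prompt and normalizes each reward against the group's mean and standard deviation. A minimal Python sketch of that normalization step, assuming hypothetical reward values (not Pro-1's actual stability scores) and a hypothetical helper name; this is not the author's training code:

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    `rewards` holds one scalar per completion sampled for the same prompt.
    `eps` (an assumed small constant) avoids division by zero when every
    completion in the group receives the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]

# Hypothetical rewards for four sampled completions of one prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
for reward, advantage in zip(rewards, group_relative_advantages(rewards)):
    print(f"reward={reward:.2f} advantage={advantage:+.2f}")
```

Completions that score above their group's mean get a positive advantage and are reinforced; those below the mean get a negative one.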

Michael Hla

@hla_michael

10 days

RT @alexisohanian: Thrilled to announce our 2025 @776foundation Fellows! I’m giving each fellow $100k to tackle one of the biggest threats…

0

17

0

Michael Hla

@hla_michael

25 days

Thanks again to the @adaptyvbio team! Uploaded a CSV of sequences and got super detailed assay results with no overhead. Would highly recommend.

Adaptyv Bio

@adaptyvbio

25 days

Pro-1, a protein design model by @hla_michael, doesn’t just propose mutations — it explains why it made them. We tested 19 of its FGF-1 designs in our lab and 3 of them improved thermostability while maintaining binding. In this protein designer spotlight we explain how

0

14

Michael Hla

@hla_michael

1 month

RT @andrewwhite01: I want to point out that over the last few weeks there has been other great work on building reasoning models in biology…

0

10

0

Michael Hla

@hla_michael

2 months

Sequences and Thermostability Data: Binding Affinity Data:

2

1

2

Michael Hla

@hla_michael

2 months

Special thank you to @julian_englert @danielnzg85 and the @adaptyvbio team for sponsoring this validation. The entire process was seamless, and I would highly recommend their services!

1

0

7

Michael Hla

@hla_michael

2 months

Nevertheless, these sequences are the first-ever LLM-optimized proteins and serve as a valuable baseline validation. Looking forward to synthesizing the carbonic anhydrases and pushing the model’s capabilities.

1

0

3

Michael Hla

@hla_michael

2 months

None of the successful sequences had very interesting modifications (as indicated by the top performer being a single point mutation variant) or truly impressive reasoning. Most focused on point mutations and keywords similar to those provided in the prompt. This raises a valid…

1

0

4

Michael Hla

@hla_michael

2 months

Compared to existing variants, the Pro-1 sequences are competitive with some of the most stable publicly available sequences in the literature. The K116E variant (v3) in particular showed an exceptional improvement in melting temperature, demonstrating a 23.9 degree

1

0

4

Michael Hla

@hla_michael

2 months

The successful variants all had similar reasoning traces, typically referencing generic properties of stable proteins such as low flexibility, solubility, etc. On occasion, the model would reference details more specific to FGF-1, such as integrin or heparin binding affinity.

1

0

2

Michael Hla

@hla_michael

2 months

Of the 19 variant sequences: 16/19 could be expressed; 7/16 showed a reliable thermal stability signal; 3/7 had a higher melting temperature; 3/3 preserved binding affinity to FGFR1 (compared to 6/16 for all of the expressed variants).

1

0

2
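
The counts in the post above form a funnel in which each stage's denominator is the previous stage's numerator. A small Python sketch of that arithmetic, using only the numbers quoted in the post (the stage labels and helper names are illustrative assumptions):

```python
from fractions import Fraction

# (stage label, passed, out of), taken from the counts quoted in the post.
funnel = [
    ("expressed", 16, 19),
    ("reliable thermal stability signal", 7, 16),
    ("higher melting temperature", 3, 7),
    ("preserved FGFR1 binding", 3, 3),
]

def stage_rates(stages):
    """Pass rate of each stage relative to the stage before it."""
    return [(name, passed / total) for name, passed, total in stages]

def overall_rate(stages):
    """Fraction of the starting set that survives every stage."""
    return Fraction(stages[-1][1], stages[0][2])

for name, rate in stage_rates(funnel):
    print(f"{name}: {rate:.0%}")
print(f"overall: {overall_rate(funnel)} = {float(overall_rate(funnel)):.1%}")
```

Read end to end, 3 of the 19 designs cleared every stage.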

Michael Hla

@hla_michael

2 months

The prompt included general information about FGF-1 (function, known interactions), mutagenesis data from UniProt, and select excerpts from papers that have previously engineered more stable FGF-1 variants. The base and creative model instances were sampled 50 times each, with…

1

0

2

Michael Hla

@hla_michael

2 months

Why FGF-1? Human fibroblast growth factor (FGF-1) is a 155 amino acid protein implicated in processes such as cell differentiation, tissue repair, and metabolic regulation. It has also been shown to have some therapeutic potential in Parkinson’s, type 2 diabetes, and…

1

0

5

Michael Hla

@hla_michael

2 months

First Lab Validation for Reasoning Model Proteins. With @adaptyvbio, we tested 19 FGF-1 sequences optimized by Pro-1 for thermal stability and binding affinity to human FGFR-1. Pro-1 produced 3 novel sequences that maintained binding affinity and expression compared to wild

6

25

153

Michael Hla

@hla_michael

4 months

Full blog post: Codebase: Model Weights:

7

11

120

Michael Hla

@hla_michael

4 months

If you would like to contribute or have any feedback, don’t hesitate to reach out. This has been my pet project over the past 2 months, and I would love to hear your thoughts.

6

0

47

Michael Hla

@hla_michael

4 months

Pro-1 demonstrates the transferability of natural language models to sequence optimization tasks and presents a new possibility in leveraging language models for scientific discovery. With strong reward signals, language models can reason over complex scientific tasks and one…

2

1

43

Michael Hla

@hla_michael

4 months

Looking forward, the biggest priority is to synthesize the model-generated sequences (actively looking for help with this). Wet lab validation is absolutely necessary for a project like this, and synthesizing these sequences is the ultimate test for any model-designed sequences.

4

0

49

Michael Hla

@hla_michael

4 months

The creative model then reasoned through the insights from the literature provided and suggested novel modifications motivated by the themes of the papers provided. For example, in its best generation, the creative model reasoned that introducing a peptide tag would enhance

3

1

45

Michael Hla

@hla_michael

4 months

For the base model, I passed in the native HCA II sequence, effects of known mutations, excerpts from a review on the topic (Fiore, 2015), reaction mechanism, and residues that were known to be involved in the reaction. Out of 100 samples, the best proposal from the base model…

2

1

41

Michael Hla

@hla_michael

4 months

Optimizing Human Carbonic Anhydrase II (HCA II): Enzymes have been an area of intense research for carbon capture due to their ability to catalyze CO₂ conversion with remarkable efficiency. Among these, HCA II is an exceptionally efficient candidate, speeding up the conversion

1

6

67