Daniil Boiko
@daniil_boiko
Followers
312
Following
203
Media
9
Statuses
72
ML-researcher, AI in chemistry
San Francisco, US
Joined May 2017
How much data is lost in #science? 🚨 New study: 50 - 90% of electron microscopy images never make it into publications. This hidden "lost data" pool can fuel #AI, #education & new discoveries. https://t.co/58Iz4n5g8F
@Hitachi_EM #Microscopy #Artificialintelligence #nano
0
2
3
CATNIP for the win! Read our newest work with the Gomes group- https://t.co/lILWdTxbVR
@Apatoneh @gabepgomes @daniil_boiko @AlisonNarayanUM
0
11
47
Every person who is working on ML for drug discovery must set up a biological assay themselves. After that you will either reconsider your life choices or understand why the models are not working.
0
0
1
🧲Magnetic stirrers - used daily in millions of labs - may silently sabotage reproducibility. Our @JACS_Au article describes drastic variability in reaction outcomes based on vessel placement https://t.co/cFzO08CBQL
@ACSPublications @ACSCatalysis @ChemistryNews @NatureChemistry
pubs.acs.org
Magnetic stirrers, the most widely used and ubiquitous devices for performing chemical reactions in laboratory settings, may cause reproducibility problems. Reproducibility in a range of chemical...
6
28
89
you should legally be required to disclose what quantization level you are serving your current model at like it was a nutrition label. you should also be banned from dynamically adjusting quantization based on demand without notification. (you know who you are ...)
96
176
3K
Discovering organic reactions with a machine-learning-powered deciphering of tera-scale mass spectrometry data #machinelearning #compchem
nature.com
Nature Communications - Mass spectrometry generates vast amounts of data in chemistry labs. Here, authors developed a machine learning-driven search engine that analyzes archived data to discover...
0
5
19
Discovering new reactions from terabyte-scale mass spectrometry data with a machine learning approach
Large experimental datasets have become a staple of modern chemistry research, yet the sheer volume of stored information often outstrips researchers’ ability to interpret and
0
9
31
(7/n) And finally, thanks 🙌 to all the coauthors and everyone who has supported this work.
0
1
3
(6/n) But this is only the beginning and there is still a lot of work to be done. You can play with this platform at https://t.co/EkbxDsiFZR and reach out if you have any questions or are excited about this research direction.
1
1
3
(5/n) With all that in mind, we have built a system that takes a substrate, selects its neighbors in chemical space, looks up known reactivity, collects enzymes and their neighbors in enzyme space, and finally passes this list to a separate model that reranks it to produce the final output. 🤖
1
1
7
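The steps listed in (5/n) can be sketched as a toy Python pipeline. This is only an illustration under loud assumptions, not the platform's code: the feature sets, the Jaccard similarity, the names `nearest` and `recommend`, and the tiny data are all hypothetical, and a plain sort by score stands in for the separate reranking model.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two feature sets (an assumed stand-in metric)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest(query, library, k):
    """Return the k (name, similarity) pairs from library most similar to query."""
    scored = [(name, jaccard(query, feats)) for name, feats in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


def recommend(substrate_feats, substrates, enzymes, known_hits, k=2):
    """Rank candidate enzymes for a query substrate.

    1. Select the query's neighbors in substrate space.
    2. Look up enzymes known to react with those neighbors.
    3. Expand each known enzyme to its neighbors in enzyme space.
    4. Rank candidates (a simple sort replaces the learned reranker).
    """
    scores = defaultdict(float)
    for sub, sub_sim in nearest(substrate_feats, substrates, k):
        for enz in known_hits.get(sub, []):
            # The enzyme itself is its own closest neighbor (similarity 1.0).
            for candidate, enz_sim in nearest(enzymes[enz], enzymes, k):
                scores[candidate] = max(scores[candidate], sub_sim * enz_sim)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: feature sets for substrates and enzymes.
substrates = {"S1": {"a", "b", "c"}, "S2": {"a", "b"}, "S3": {"x", "y"}}
enzymes = {"E1": {"p", "q"}, "E2": {"p", "q", "r"}, "E3": {"z"}}
known_hits = {"S1": ["E1"], "S3": ["E3"]}  # substrate -> enzymes with known activity

ranked = recommend({"a", "b", "c", "d"}, substrates, enzymes, known_hits)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In this toy run the query is closest to S1, so E1 (known to act on S1) ranks first and its enzyme-space neighbor E2 follows. As (4/n) notes, such a ranking only narrows the search, and some screening would still be required.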
(4/n) ... of course, it's not always the case, so some screening will be required. What we want here is for a researcher to find a working biocatalyst sooner ⏲️ rather than later.
1
1
4
(3/n) So, what's next? We need to somehow recommend 🧠enzymes that might work for a particular substrate. Here we exploit two very simple ideas — similar substrates should have similar reactivity and similar enzymes should have similar reactivity as well. And ...
1
1
4
(2/n) For building such a system, one would need a big enough dataset 💾 first. The preprint introduces an enzyme library 📚, aKGLib1, and a dataset, BioCatSet1, that contains information about reactivity of enzymes from the library against a set of more than 100 substrates.
1
1
4
At @gpggrp we published a preprint in collaboration with @NarayanLab and the particularly awesome @Apatoneh. #ML #AI #chemistry #science The paper introduces an ML-driven platform for guiding biocatalyst selection in small molecule synthesis. But how does it work? 🧵 (1/n)
1
1
15