
Marc Brockschmidt
@mmjb86
1K Followers · 408 Following · 14 Media · 468 Statuses
Learning to learn about structured things (molecules, programs, ...) @GoogleAI. Ex @MSFTResearchCam, ex @Debian. Opinions my own.
Joined October 2013
In programs, such hyperedges appear naturally all the time: the token sequence, function calls, data flow, etc. In the paper we show how our model can exploit this to outperform strong GNN and Transformer baselines. 3/3
The intuition is that the concept of a qualified hyperedge (e.g. student(name=..., major=..., ...)) can be treated like a sequence in a Transformer, with qualifiers as "positions". That yields a "message" for each adjacent vertex, and you can treat that like message passing. 2/3
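The qualifier-as-position idea can be sketched in a few lines. Everything below (the toy hash-based embedding, the dimension, the names) is an assumption for illustration, standing in for the learned components of the actual model:

```python
import math

DIM = 4

def embed(s):
    # Toy deterministic embedding: hash a string into a small vector.
    # (Assumption: stands in for learned embeddings in the real model.)
    import random
    rnd = random.Random(s)
    return [rnd.uniform(-1, 1) for _ in range(DIM)]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def hyperedge_messages(edge_type, qualified_vertices):
    """Treat a qualified hyperedge like a token sequence: each (qualifier,
    vertex) pair is a 'token', with the qualifier playing the role of a
    position encoding. One round of self-attention then yields a message
    for every adjacent vertex, usable in a message-passing scheme."""
    tokens = [[t + q for t, q in zip(embed(v), embed(f"{edge_type}.{qual}"))]
              for qual, v in qualified_vertices]
    messages = []
    for ti in tokens:
        # attention weights of token ti over all tokens of this hyperedge
        scores = softmax([sum(a * b for a, b in zip(ti, tj)) / math.sqrt(DIM)
                          for tj in tokens])
        msg = [sum(w * tok[d] for w, tok in zip(scores, tokens))
               for d in range(DIM)]
        messages.append(msg)
    return messages

# e.g. student(name=v1, major=v2, year=v3): one message per adjacent vertex
msgs = hyperedge_messages("student", [("name", "v1"), ("major", "v2"), ("year", "v3")])
```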
Some of my last work at MSR, led by @DobrikG and @miltos1. It introduces a transformer variant capable of processing hypergraphs, and shows how that is useful for program and knowledge base tasks. 1/3
HEAT: Hyperedge Attention Networks. Dobrik Georgiev, Marc Brockschmidt, Miltiadis Allamanis. https://t.co/0I6t8RWO7e
Sweet news: today, I'm starting as a Research Scientist at @GoogleAI, working with old friends such as @dtarlow2, @RandomlyWalking and @miltos1 on learning to assist software engineers. The team, the data, and the opportunities for impact are truly exciting!
Bittersweet news: a week ago, I had my last day at Microsoft Research. I've spent almost 9 years there, and learned so much from the lovely and brilliant people working at MSR. A new adventure is coming, but first, I'll enjoy 3 months of a break.
Excited for our #ICLR2022 Grammformers work: 💡A grammar-based transformer model generates code, inserting holes where it is uncertain about the concrete completion. 👉Reduces mistakes (~potential bugs), i.e., handles uncertainty instead of generating 🗑. 📄 https://t.co/6Xi0f9KZzF
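The hole-on-uncertainty idea can be sketched minimally. This is not the Grammformers model (which expands a grammar with a transformer); a hand-written toy distribution stands in for the model here, purely to show the decision rule:

```python
HOLE = "??"

def complete(prefix, next_token_dist, threshold=0.8, max_len=6):
    """Greedy completion that emits a hole whenever the model is uncertain,
    instead of committing to a likely-wrong concrete token."""
    out = list(prefix)
    for _ in range(max_len):
        dist = next_token_dist(out)
        if not dist:
            break
        token, p = max(dist.items(), key=lambda kv: kv[1])
        out.append(token if p >= threshold else HOLE)
    return out

def toy_dist(tokens):
    # Assumed stand-in for a learned model: after "x" a "." is near-certain,
    # but which method follows is genuinely uncertain.
    table = {
        ("x",): {".": 0.99},
        ("x", "."): {"size": 0.4, "length": 0.35, "count": 0.25},
    }
    return table.get(tuple(tokens), {})

print(complete(["x"], toy_dist))  # ['x', '.', '??']
```

The hole `??` marks where a plain greedy decoder would have guessed `size` with only 40% confidence, i.e. where a potential bug is avoided.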
Come chat with us about MoLeR (or anything else) during Poster Session 10 @iclr_conf (1st one on Thursday) 🧪
Designing new pharmaceuticals in the lab requires knowledge, creativity, experience, and intuition—and it can take years. Discover how Microsoft Research and Novartis are collaborating to help chemists streamline this complicated process:
Another paper on our work with the drug discovery experts at @Novartis, showing how to learn to generate drug-like molecules starting from desired scaffolds. Work led by @MaziarzKris, presented in session 10 tomorrow at @iclr_conf.
The paper about MoLeR - our generative model of molecules - got accepted to #ICLR2022! It has all the good stuff: simple modelling (no random novelty enhancements), flexibility (works both from scratch and from scaffolds), and lots of experiments. See https://t.co/AQCUlD107b 1/3
The ELLIS ML4Molecules workshop will take place on Monday 13th (co-located with NeurIPS), and there is still a chance to register! Fantastic line-up of speakers! https://t.co/mgv0gq5xdY
moleculediscovery.github.io
Information and call for papers of the ELLIS Molecule Discovery workshop
We released baselines, data, eval scripts - everything and the kitchen sink - on https://t.co/rtOjVfjQrg. Try your hand at this now! We are also hiring in this space:
github.com
FS-Mol is A Few-Shot Learning Dataset of Molecules, containing molecular compounds with measurements of activity against a variety of protein targets. The dataset is presented with a model evaluat...
(2) actually useful! Precise in-silico modelling lets us speed up early-stage drug discovery projects, which is why we collaborated on this with our @Novartis partners. Crucial if you want to build automatic tools that help chemists refine drug candidates. 3/4
Exciting for 2 reasons: (1) few-shot learning outside of computer vision! Totally different game if you can't start with an ImageNet-pretrained feature extractor. How do you learn from many small datasets? Our baselines show that DL methods (e.g., MAML) do not do well so far. 2/4
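For readers unfamiliar with MAML, the two-loop shape of "learning from many small datasets" can be sketched on a deliberately tiny 1-D problem. Everything here (the linear model y = w·x, the made-up tasks, the first-order approximation) is an illustrative assumption, not the FS-Mol baseline:

```python
def loss_grad(w, data):
    # squared-error loss for the toy model y = w * x; returns (loss, dloss/dw)
    n = len(data)
    loss = sum((w * x - y) ** 2 for x, y in data) / n
    grad = sum(2 * (w * x - y) * x for x, y in data) / n
    return loss, grad

def maml(tasks, meta_w=0.0, inner_lr=0.05, outer_lr=0.05, steps=200):
    """First-order MAML: find an initialization meta_w such that one
    gradient step on a task's small support set already does well on
    that task's query set."""
    for _ in range(steps):
        meta_grad = 0.0
        for support, query in tasks:
            _, g = loss_grad(meta_w, support)
            adapted = meta_w - inner_lr * g       # inner loop: adapt per task
            _, gq = loss_grad(adapted, query)     # outer loss on adapted model
            meta_grad += gq                       # first-order approximation
        meta_w -= outer_lr * meta_grad / len(tasks)
    return meta_w

# two tasks with slopes 1 and 3; the meta-init lands between the task optima
tasks = [([(1.0, s), (2.0, 2 * s)], [(3.0, 3 * s)]) for s in (1.0, 3.0)]
w = maml(tasks)
```

The point of the tweet stands regardless of this sketch: with no big pretrained feature extractor to start from, how well such adaptation works depends heavily on the task distribution.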
Some great work led by @mjanestanley on a dataset and benchmark for few-shot learning on drug activity prediction. Just presented as part of the @NeurIPSConf dataset & benchmarks track, with all code & data on https://t.co/rtOjVfjQrg. 1/4
The new FS-Mol dataset and pretrained baseline models demonstrate the promise of few-shot methods in the challenging domain of low-data drug activity prediction.
Our recent work with @mmjb86 at #NeurIPS2021 💡Play hide-and-seek: a model learns to hide bugs, another learns to find them 📄 https://t.co/sRbyWFHgKj 🕸While code gen LMs are now popular, generating correct-ish code will first require methods that detect bugs in existing code.
arxiv.org
Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large...
Fixing bugs in code can be time-consuming and frustrating for software developers. A promising deep learning model can be taught to detect and fix bugs, without using labeled data, through a “hide and seek” game called BugLab. #NeurIPS2021
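The structure of one hide-and-seek round can be sketched as follows. In BugLab both players are learned models; here the selector is random and the detector is a stub, and the variable-misuse rewrite is just one illustrative bug family, so only the shape of the game is shown:

```python
import random

def variable_misuse_rewrites(tokens, variables):
    """Enumerate rewrites that replace one variable occurrence with another
    in-scope variable -- the illustrative bug family used here."""
    rewrites = []
    for i, tok in enumerate(tokens):
        if tok in variables:
            for other in variables:
                if other != tok:
                    rewrites.append((i, other))
    return rewrites

def play_round(tokens, variables, detector, rng):
    """One round: the selector hides a bug, the detector tries to locate it.
    The boolean outcome is the training signal for both players."""
    rewrites = variable_misuse_rewrites(tokens, variables)
    pos, new_var = rng.choice(rewrites)           # selector hides a bug
    buggy = tokens[:pos] + [new_var] + tokens[pos + 1:]
    guess = detector(buggy)                       # detector seeks it
    return guess == pos

rng = random.Random(0)
code = ["total", "=", "total", "+", "delta"]
found = play_round(code, {"total", "delta"}, detector=lambda ts: 0, rng=rng)
```

Training the selector to pick rewrites the detector misses, and the detector to catch them, is the adversarial loop the paper builds on real code.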
@miltos1 Oh, and we are hiring: https://t.co/LTBHG5pYbD - talk to me or @sebnowozin if you are interested!
The paper has a large number of additional ablations, with somewhat surprising results (GNNs doing better than GREAT, for example). Come to our poster session on Friday, 4:30pm GMT: