I am looking for strong PhD interns to join Apple MLR in early 2024! Topics will be broadly around diffusion generative models, and you’ll be in the Bay Area (SF/Cupertino). Apply here
Excited for this to be out! Introducing GAUDI: a generative model for 3D indoor scenes. We tackle the problem of learning a generative model of 3D scenes parametrized as radiance fields. This has been a great collaboration across multiple teams at
@Apple
.
Introducing Generative Scene Networks (GSN), a generative model for learning radiance fields for realistic scenes. With GSN we can sample scenes from the learned prior and move through them with a freely moving camera.
Arxiv:
Scenes sampled from the prior:
I find it interesting that the perception of the ML community is that
@Apple
"does not publish" or that it "does not contribute frameworks". Anyway, I'm going to start actively sharing my colleagues' work to gently push back on that perception :)
Interested in neural fields and generative models? Check out Diffusion Probabilistic Fields (DPF)! A diffusion model that can be trained directly on fields in a single stage. DPF outperforms recent approaches based on latent representations of fields. 1/5
We are looking for residents to join MLR at
@Apple
for 2023! We are especially interested in candidates with strong expertise (MSc/PhD) in the physical sciences (e.g. physics, climate, bio, chem) and exposure to computational models/ML.
Introducing Manifold Diffusion Fields (MDF), our new work on learning generative models over fields defined on curved geometries. This is joint work with our intern
@Ahmed_AI035
(who hasn’t even started his PhD yet!) and
@jsusskin
at
@Apple
MLR 🧵
Manifold Diffusion Fields
present Manifold Diffusion Fields (MDF), an approach to learn generative models of continuous functions defined over Riemannian manifolds. Leveraging insights from spectral geometry analysis, we define an intrinsic coordinate system on the manifold via
Two papers accepted at
#ICLR23
with great colleagues at
@Apple
MLR!
- f-DM: introducing progressive latent transformations in image diffusion models.
- Diffusion Probabilistic Fields: training diffusion models directly on neural fields in a single stage.
More details soon!
Congrats to my lead authors at Apple on getting two great papers accepted to
#ICLR2024
:
-
@Ahmed_AI035
with Manifold Diffusion Fields ()
-
@xmzhao_
with Dynamic Novel View Synthesis ()
We will be presenting GAUDI at
#NeurIPS2022
! Excited to chat about this and more generative models for 3D in NOLA. I’ll share more about the code release and checkpoints for GAUDI after the ICLR deadline.
Thanks for sharing
@ak92501
! I’ve always been interested in simple yet effective methods that scale. In this recent
@Apple
paper on learning 3D view synthesis we follow that philosophy.
Fast and Explicit Neural View Synthesis
pdf:
abs:
model obtains comparable or even better performance than recent SOTA approaches using radiance fields, while rendering objects at over a 400x speedup
Some intern positions still open at the Machine Learning Research team
@Apple
! If you are a PhD student interested in generative models, neural rendering and language grounded 3D vision please consider applying through the links or DM me.
We have open positions for interns/ft research scientist in our team
@Apple
! If you are interested in generative models of the 3D world, neural rendering or language grounded 3D vision please apply and reach out.
#iccv2021
We wrote a blogpost summarizing our generative model for scene level radiance fields (GSN) paper to be presented at
@ICCV_2021
. If this post and area of research are interesting to you check out FT/intern opportunities on our team
Heading out to
#ECCV2022
, happy to be back to in-person conferences! Ping me if you want to chat about the research happening at
@Apple
around generative models for 3D, we have open internship/FT positions.
Official code release for our Generative Scene Networks
@ICCV_2021
paper! We provide code, training data and pre-trained models for you to try out. Check out the interactive exploration notebook, you can move through scenes sampled from the generator!
Here’s a way to speed up training even more: optimize the parameters of the neural network jointly on a training set of multiple samples (e.g. multiple scenes for NeRF, multiple objects for SDFs, etc.), and let the hash tables be specific to each sample. Boom! Now you are amortizing!
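A minimal numpy sketch of that amortized setup (all sizes, the toy spatial hash, and the tiny decoder are hypothetical, just to make the structure concrete): the feature tables are per-sample parameters, while the decoder weights are shared and therefore amortized across samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T hash-table entries of dim F, shared decoder to RGB.
T, F, HIDDEN, OUT = 1024, 8, 16, 3
n_samples = 4  # e.g. 4 scenes (NeRF) or 4 objects (SDF) in the training set

# Per-sample hash tables: these parameters are NOT shared across samples.
tables = [rng.normal(0, 0.01, size=(T, F)) for _ in range(n_samples)]

# Shared decoder: these parameters ARE shared, so they amortize across samples.
W1 = rng.normal(0, 0.1, size=(F, HIDDEN))
W2 = rng.normal(0, 0.1, size=(HIDDEN, OUT))

def hash_coords(coords, table_size):
    # Toy spatial hash (a stand-in for a real multiresolution hash encoding).
    primes = np.array([1, 2654435761, 805459861])
    return (coords @ primes) % table_size

def forward(sample_idx, coords):
    feats = tables[sample_idx][hash_coords(coords, T)]  # per-sample lookup
    return np.maximum(feats @ W1, 0) @ W2               # shared decoder

coords = rng.integers(0, 128, size=(5, 3))
rgb = forward(0, coords)
print(rgb.shape)  # (5, 3)
```

In practice each table would be a multiresolution hash grid and the decoder a small MLP trained by gradient descent; the sketch only shows which parameters are per-sample and which are shared.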
Attending
#NeurIPS2022
after a few years’ hiatus! I will be giving an expo talk about “Generative Understanding of 3D Scenes”
@Apple
Mon at 3pm, and presenting the poster for GAUDI () on Thu at 9:30am. Will also be at the booth on Tue from 3-5pm. Come say hi!
Happy to announce that our paper “On the generalization of learning-based 3D reconstruction” was accepted to
@wacv2021
!!! I would like to highlight that we got very insightful reviews and comments that will be included in the camera-ready version.
If you are attending
@NeurIPSConf
check out all the papers we are presenting! Also swing by our booth if you want to hear more about internships and FTE opportunities! I will be there on Weds 11am-1pm to answer all your questions :)
Interested in generative models and neural fields/INRs? Come by our poster where we present “Diffusion Probabilistic Fields” an approach for learning distributions over fields that can be trained in a single stage. Tuesday morning poster session!
#ICLR2023
If you are at
#ICML2023
please check out the following Apple papers . I won’t make it in person this year but please reach out to any of my fantastic colleagues that are around!
Starting the long trip to Kigali! Excited for
@iclr_conf
and catching up with colleagues! If you want to chat about internship/FT opportunities at
@Apple
let me know!
Planning to attend
@CVPR
? Check out the workshop sessions on Sunday, I will be talking about generative modeling for fields and manifolds at the
@_LXAI
workshop!
👨🎙️ Miguel Bautista, Research Scientist at Apple MLR. Ph.D. in Machine Learning from the University of Barcelona.
🚀🪩 He will be a Keynote Speaker at
@_LXAI
@CVPR
! Introducing his current research focus:
🖌️ "Generative Modelling: from images to functions and manifolds."
We then learn a generative model over the latent representations using a diffusion model. This allows us to tackle both unconditional and conditional inference tasks, such as generating 3D scenes and camera trajectories from text prompts (additional results on ):
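To make “a diffusion model over latent representations” concrete, here is a generic DDPM-style forward (noising) process on toy latent vectors. The schedule, sizes, and epsilon-prediction target below are standard diffusion boilerplate, not GAUDI’s actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latents standing in for the learned scene/camera-pose latents (stage 1).
latents = rng.normal(size=(256, 32))

# Standard DDPM-style noising schedule (linear betas; illustrative values).
T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(z0, t, eps):
    """Diffuse clean latents z0 to step t: z_t = sqrt(ab)*z0 + sqrt(1-ab)*eps."""
    ab = alphas_bar[t]
    return np.sqrt(ab) * z0 + np.sqrt(1.0 - ab) * eps

# One training example: a denoiser network would be trained to predict eps
# from (z_t, t), i.e. minimize ||eps_hat - eps||^2 (epsilon-prediction loss).
t = 50
eps = rng.normal(size=latents.shape)
z_t = q_sample(latents, t, eps)
print(z_t.shape)  # (256, 32)
```

Sampling then runs the learned denoiser in reverse from pure noise to produce new latents, which the stage-1 decoder turns back into a radiance field and camera trajectory.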
More cool stuff on 3D scene generation! I’ve been waiting for someone to look into 3D consistency for inpainting-style objectives. IMO, making the text prompt spatially distributed is the next layer of complexity.
SceneScape: Text-Driven Consistent Scene Generation
abs:
project page:
text-driven perpetual view generation -- synthesizing long videos of arbitrary scenes solely from an input text describing the scene and camera poses
I finally have some time to engage in the discussion of our conformer generation paper sparked by
@tkipf
's tweets. There are three things I'd like to clarify:
1) Symmetries are really important for any learning algorithm! Without structure, learning gets harder!
Since this tweet sparked quite a bit of lively discussion, I'd like to add a bit more nuance:
1) I think we absolutely should study symmetry in the context of (scalable) ML; this particular result only reinforces this IMO. Understanding trade-offs w.r.t. symmetry group "size",
Interesting times ahead, as bigger 3D datasets become available I predict the community will shift to “3D gen models from the ground up” as opposed to “distilling 2D models into 3D”.
Generative models for 3D are 🔥. This Friday at
@ml_collective
I will be talking about GAUDI, our approach to learn generative models of unconstrained 3D scenes. Check if you are interested in attending. Really looking forward to a great discussion!
NeRF papers have become quite frequent nowadays (and I predict they will become even more so). Out of all the papers that have come out recently, to me this is the one pointing in the most interesting direction so far.
Zero-Shot Text-Guided Object Generation with Dream Fields
abs:
project page:
combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects solely from natural language descriptions
This may be Apple's biggest move on open-source AI so far: MLX, a PyTorch-style NN framework optimized for Apple Silicon, e.g. laptops with M-series chips.
The release did an excellent job on designing an API familiar to the deep learning audience, and showing minimalistic
More work from MLR at Apple! Check out this fantastic paper by
@AggieInCA
and team. How can we effectively evaluate SSL models without requiring labels? :)
ICLR24 Spotlight: To train general-purpose SSL models, it's important to measure the quality of representations during training. But how can we do this w/o downstream labels?
We propose a new label-free metric to evaluate SSL models, called Linear Discriminant Analysis Rank (LiDAR)
What would happen if we pretend all samples are neural fields in diffusion generative models? I’ll be talking about work that we have been doing in this direction at
#Apple
MLR on Sunday 1:30pm
@_LXAI
workshop!
#CVPR2023
#neuralfields
Our incredible Keynote speakers joining us at the upcoming
@_LXAI
workshop at the
@CVPR
conference next month in Canada! 🌟 Get ready to be inspired by their expertise and insights.
🚩Sunday, June 18, 2023.
🏦Vancouver Convention Center, Canada.
Congratulations
@Ahmed_AI035
this is so exciting! It was a pleasure to host you for your first internship at
@Apple
even before you started your PhD!!! Can’t wait to see all the cool stuff you will do!
The great
@YuyangW95
and I will be presenting this tomorrow at
@genbio_workshop
in the morning poster session! Come learn about why a non-SE(3)-equivariant model gets state-of-the-art performance in conformer generation!
1/n New preprint alert! Introducing Generative Molecular Conformer Fields (MCF) a generative model for molecular conformer generation that obtains state-of-the-art results without using any domain specific inductive biases!
Happy to get an outstanding reviewer award from
@ICCV_2021
, I will be donating my free registration to a researcher from an under-represented group that wishes to attend. Will share details soon.
MDF is an oral at the Diffusion Models workshop tomorrow at
@NeurIPSConf
! Catch
@Ahmed_AI035
’s talk (via Zoom because, well… visas) and also
@YuyangW95
and I will be around in the poster session! Come say hi and let’s chat about practical diffusion models on manifolds!
Cool stuff! Although I wonder how the training speed/accuracy would compare if you replaced the 27 SH parameters per voxel vertex with a single linear layer of the same dimension. IMO that’s the right baseline to compare with. Are SH params easier to learn?
Plenoxels: Radiance Fields without Neural Networks
abs:
project page:
propose a view-dependent sparse voxel model, Plenoxel, that can optimize to the same fidelity as NeRFs without any neural networks
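For readers counting the parameters: 27 SH values per voxel vertex are 9 degree-2 real spherical-harmonic coefficients per color channel, evaluated against the viewing direction. A toy sketch of the two view-dependent heads being compared (the linear map `W` here is purely illustrative, not the exact baseline proposed above):

```python
import numpy as np

def sh_basis_deg2(d):
    """Degree-2 real spherical harmonics basis (9 values) for a unit direction d."""
    x, y, z = d
    return np.array([
        0.28209479,                                        # l=0
        0.48860251 * y, 0.48860251 * z, 0.48860251 * x,    # l=1
        1.09254843 * x * y, 1.09254843 * y * z,
        0.31539157 * (3 * z * z - 1.0),
        1.09254843 * x * z, 0.54627421 * (x * x - y * y),  # l=2
    ])

rng = np.random.default_rng(0)
sh_coeffs = rng.normal(size=(3, 9))    # 27 params per vertex: 9 per RGB channel
W = rng.normal(size=(3, 3))            # hypothetical alternative: linear map on d

d = np.array([0.0, 0.0, 1.0])          # unit viewing direction
rgb_sh = sh_coeffs @ sh_basis_deg2(d)  # view-dependent color via SH
rgb_lin = W @ d                        # learned head with a comparable budget
print(rgb_sh.shape, rgb_lin.shape)     # (3,) (3,)
```

The question in the tweet is whether the fixed SH basis makes these coefficients easier to fit than an equally sized learned map would be.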
Catch
@emidup
and myself at
#ICML2020
tomorrow at 1pm PDT if you want to chat about Equivariant Neural Rendering. We show that you can learn neural representation of scenes that allow for real-time view synthesis by enforcing equivariant relationships during training.
Equivariant neural rendering - by learning neural representations that transform like 3D scenes, we build models that can render novel views of complex scenes from a single image, without requiring 3D supervision. With collaborators
@Apple
.
Paper:
I will be visiting
@CMU_Robotics
next week to talk about generative models of fields in the VASC seminar. Excited to chat with the awesome faculty and students! If you are around and want to chat, please ping me :). Thanks for the invite
@FerranDeLaTorre
!
Diffusion models tend to be notoriously slow during inference. Check out this amazing piece of work by great colleagues at
@Apple
looking at the problem of distilling diffusion models for single-step sampling. Congrats
@D_Berthelot_ML
and team!
New paper TRACT - Faster diffusion model sampling
- Single-step diffusion SotA for CIFAR10 and ImageNet64 with L2 loss without architecture changes
- Up to 2.4x FID improvement
Data scale + transformers + autoregressive objective is the gift that keeps on giving! Now also in vision :) What an incredible work led by
@alaaelnouby
and team from Apple MLR. Check out the repo with checkpoints and bindings to MLX/Jax!
Excited to share AIM 🎯 - a set of large-scale vision models pre-trained solely using an autoregressive objective. We share the code & checkpoints of models up to 7B params, pre-trained for 1.2T patches (5B images) achieving 84% on ImageNet with a frozen trunk.
(1/n) 🧵
Publication link also available at:
Project page with some additional visualizations now online at (source code will be available in the next few weeks)
I’ve had several situations where a paper wasn’t ready to submit by the conference deadline. This usually causes added stress (especially for PhD students/interns); being able to submit when *work is ready* is going to be great for the community.
Today,
@RaiaHadsell
,
@kchonyc
and I are happy to announce the creation of a new journal: Transactions on Machine Learning Research (TMLR)
Learn more in our post:
Check out our new work on learning generative models of functions on graphs for molecular conformer generation, led by the amazing
@YuyangW95
! A few things that I found really exciting about this work:
Check out this new work led by
@bogdan_mazoure
and
@waltertalbott
using diffusion models to capture the distribution of the value function! IMO this is a very interesting way to think about how we leverage large probabilistic models in RL settings.
Latest preprint from
@Apple
MLR - we use conditional diffusion models + Perceiver I/O to learn the policy's state visitation and the value function on hard offline robotic tasks . Work with
@waltertalbott
,
@itsbautistam
, Devon, Alex and
@jsusskin
.
Only by going through this path will we be able to point the camera back at simple internet images and not just see the "Egyptian cat" class, but condition on the image to instantiate full generative 3D reconstructions of worlds consistent with that observation.
We've open sourced a
@PyTorch
implementation of our paper "Equivariant Neural Rendering"! This includes the weights of all trained models as well as the MugsHQ 🍵 and 3D Mountains 🏔️ datasets we created
💻 Code:
📄 Paper:
We decompose the generative model into two stages: we first learn latent representations that encode the 3D radiance fields and corresponding camera poses for thousands of trajectories. This task is formulated as an optimization problem over latents and network parameters.
The fundamental problem of unsupervised correspondence learning is oftentimes formulated using this framework. Things become trickier when f^{-1} is not defined and needs to be approximated, a couple of cool papers dealing with it:
A powerful idea in math (that nobody teaches you directly…):
If you don't know how to map between two "things," you can often map each of them to the same "canonical thing."
Then you can just go from the 1st thing to the canonical thing, and back to the 2nd thing. [1/n]
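A trivial self-contained illustration of the quoted idea, with temperature scales as the two “things” and Kelvin as the “canonical thing” (example mine, not from the thread):

```python
# Map each scale to/from a canonical scale (Kelvin); then any pairwise
# conversion is just a composition, with no direct (src, dst) map needed.
to_kelvin = {
    "celsius":    lambda t: t + 273.15,
    "fahrenheit": lambda t: (t - 32) * 5 / 9 + 273.15,
}
from_kelvin = {
    "celsius":    lambda k: k - 273.15,
    "fahrenheit": lambda k: (k - 273.15) * 9 / 5 + 32,
}

def convert(t, src, dst):
    # src -> canonical -> dst
    return from_kelvin[dst](to_kelvin[src](t))

print(convert(100.0, "celsius", "fahrenheit"))  # 212.0
```

With n scales this needs 2n maps instead of n(n-1) direct ones, which is exactly the appeal of a canonical representation in correspondence learning too.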
Just in time for the holidays, we are releasing some new software today from Apple machine learning research.
MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop!)
Code:
Docs:
Catch me and
@YuyangW95
at
@NeurIPSConf
next week! Happy to chat about anything from generative models, geometric deep learning, applications in scientific domains, as well as in vision and 3D. Also happy to chat about internship and FTE opportunities! Come join Apple MLR!
I’ll be attending
#NeurIPS2023
next week. Please feel free to reach out if you want to chat about generative models, AI4Science, geometric deep learning, and more!
On my way to
#CVPR2019
, looking forward to an exciting conference week. Ping me if you are interested in the CV/ML research happening at
@Apple
and don’t forget to come by our booth!
If you are attending
#ICML2020
and want to chat about the research going on at
@Apple
(including opportunities for internships/FT or our paper on Equivariant Neural Rendering) you can chat live with me tomorrow at 11am or 3pm PDT.
In GAUDI we don’t rely on a pre-trained text-to-image model and instead design a generative model for 3D indoor scenes from the ground up. It’s only a matter of time until similar models can be trained on internet-scale 3D datasets.
I will be participating in a Meet Apple session at
#ICCV2021
on October 13 at 1:30 pm PDT. Join to learn more about our ML teams and the different ways you can work at Apple. Visit our virtual booth for information:
#apple
Check out our latest
#icml2020
paper where we show that equivariance is a powerful inductive bias for neural rendering. Work led by
@emidup
during his internship at
@Apple
Learning latents for radiance fields and camera poses is critical. As opposed to single objects (which can always be rendered from cameras on a sphere), the set of valid camera poses depends on each scene. Therefore, we need to encode which camera poses are valid for each scene.
StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D
paper page:
In the realm of text-to-3D generation, utilizing 2D diffusion models through score distillation sampling (SDS) frequently leads to issues such as blurred appearances and
This has been Terrance DeVries’ project as an intern with our team at Apple, with collaborators
@nitishsr
, Graham W. Taylor (U. Guelph / Vector Institute) and
@jsusskin
Check out the fantastic video explanation by
@artsiom_s
of a couple of papers on self-supervised learning that we worked together on a few years ago before CPC was cool. Those were good times
@artsiom_s
!
My new video on self-supervised representation learning (also easy to understand for beginners). I explain CliqueCNN which builds compact cliques for classification as a pretext task and I discuss other self-supervised learning approaches.
@itsbautistam
I love seeing more papers on scene generative models. I believe that to make substantial progress in RL we need very powerful world models. We are just seeing the beginning of what these models are capable of!
Pathdreamer: A World Model for Indoor Navigation,
@kohjingyu
et al. ()
A neural network hallucinating indoor scenes from a single given observation in a previously unseen building.
Possibilities are endless:
I wanted to advertise that we have opportunities in my team at Apple MLR in scalable/distributed ML for multimodal foundational models, embodied AI, planning and reasoning. More info below but feel free to send me a note.
"The company is publishing regularly, it's doing academic sponsorships, it has fellowships, it sponsors labs, it goes to AI/ML conferences. It recently relaunched a machine learning blog where it shares some of its research"
Neat idea! I’ve been waiting for light fields to strike back in 3D and compete with radiance fields; I think we are going to see more work in this direction.
Point cloud code releases in
#geometrycentral
C++, with Python bindings on pip!
Fast computation of geodesic distance, nearest-geodesic-neighbor interpolation, parallel transport, and the logarithmic map.
(C++)
(Python)
(1/4)
In order to model radiance fields for unconstrained scenes, we decompose them into many small locally conditioned radiance fields, each conditioned on a latent spatial representation W of the scene.
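A rough numpy sketch of what such a decomposition can look like (the grid size, latent dimension, and stand-in decoder are all hypothetical, not GSN’s actual architecture): a grid of local latent codes is indexed by position, and each query point is decoded using its cell’s code plus cell-relative coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a G x G grid of local latent codes (the spatial latent W),
# each code conditioning a small radiance field over its grid cell.
G, D = 8, 16
W_grid = rng.normal(size=(G, G, D))

# Stand-in decoder: a fixed random projection to (r, g, b, sigma). In a real
# model this would be a shared MLP conditioned on the local code.
proj = rng.normal(size=(D + 3, 4))

def local_radiance(p):
    """Query a point p in [0, G)^2 x R: look up the local code, then decode."""
    i, j = int(p[0]), int(p[1])          # which cell the point falls in
    local_xy = p[:2] - np.floor(p[:2])   # coordinates relative to the cell
    feats = np.concatenate([W_grid[i, j], local_xy, p[2:3]])
    return feats @ proj

out = local_radiance(np.array([2.3, 5.7, 0.1]))
print(out.shape)  # (4,)
```

The key property is locality: each small radiance field only ever sees its own code and cell-relative coordinates, so the grid of codes scales to unconstrained scene layouts.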
The prior learned by GSN can be used for view synthesis: by inverting GSN’s generator we can complete unobserved parts of a scene (T) conditioned on a sparse set of views (S).
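Inverting a generator generally means optimizing a latent so the generator’s output matches the observations. A toy sketch of that recipe (the “generator” below is just a fixed linear map standing in for GSN’s actual renderer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a generator: in GSN this would render views from a latent;
# a linear map suffices here to show inversion-by-optimization.
A = rng.normal(size=(16, 4))
def generator(w):
    return A @ w

w_true = rng.normal(size=4)
observed = generator(w_true)  # plays the role of the sparse observed views (S)

# Invert the generator: gradient descent on || G(w) - observed ||^2.
w = np.zeros(4)
lr = 0.01
for _ in range(500):
    grad = 2 * A.T @ (generator(w) - observed)
    w -= lr * grad

print(np.allclose(generator(w), observed, atol=1e-3))  # True
```

Once the latent fits the observed views, rendering the generator from new camera poses fills in the unobserved parts (T) using what the prior has learned about scenes.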
Excited to be at NeurIPS in New Orleans next week and hope to see many of you there! On Wednesday, my co-authors (
@jramapuram
,
@PierreAblin
, Tatiana Likhomanenko, Xavier Suau, Russ Webb) and I will present our🥳spotlight-awarded🎉work “How to Scale Your EMA”.
Sharing this because I believe this work deserves more attention! I really enjoyed the clean and elegant formulation of the problem from the lens of interpolating densities.