Jason Kim Profile
Jason Kim

@jason_z_kim

Followers
2K
Following
245
Media
68
Statuses
300

Postdoctoral researcher at Cornell interested in representation and computation in latent spaces of biological and artificial neural networks.

Joined July 2010
@jason_z_kim
Jason Kim
1 year
Ever wanted a low-dimensional model of your data that you could be confident would explain data structure and accurately re-embed out-of-distribution data, all with minimal distortion of the geometry? Now you can with Γ-VAE! Demonstrated on gene expression data.
Tweet media one
1
18
66
@jason_z_kim
Jason Kim
4 months
RT @LindenParkes: Excited to share the first major piece of work and preprint from my lab! Led by @jason_z_kim! 🥳🎉🤘
0
6
0
@jason_z_kim
Jason Kim
10 months
Working to understand the physics of how the brain works? Interested in understanding how brain function and collective dynamics emerge from neural interactions? Submit an abstract to our focus session "Statistical and Dynamical Physics of the Brain", APS2025 w/@ChrisWLynn!
Tweet media one
0
6
30
@jason_z_kim
Jason Kim
1 year
RT @NatureProtocols: #FeaturedProtocol this week is for a Python-based #software package to apply network control theory to the #humanconne….
0
16
0
@jason_z_kim
Jason Kim
1 year
RT @LindenParkes: Our protocol paper for NCT is now online at @NatureProtocols!! Check it out here: @jason_z_kim @….
0
25
0
@jason_z_kim
Jason Kim
1 year
In summary, by accurately preserving the manifold tangent spaces in low-dimensional embeddings, we better preserve the geometry of the data, make our embeddings more interpretable, build trustworthy models with great out-of-distribution generalization, and uncover biology.
1
0
2
@jason_z_kim
Jason Kim
1 year
On this same dataset, Γ-VAE can also accurately re-embed cell gene expression on days 4 and 6 while only being trained on cells from day 2.
Tweet media one
1
0
1
@jason_z_kim
Jason Kim
1 year
Γ-VAE works on single-cell RNAseq too. We look at a lineage tracing experiment in hematopoietic stem cells. Γ-VAE can separate undifferentiated cells along their eventual fates at 60% accuracy using 3 dimensions: the same as the original authors using hundreds of dimensions.
Tweet media one
1
0
1
@jason_z_kim
Jason Kim
1 year
If we do this re-embedding across our 33 cancer tissues, we get incredible re-embedding consistency, meaning that the model you build is a model you can trust.
Tweet media one
1
0
1
@jason_z_kim
Jason Kim
1 year
Zooming in, the re-embedding preserves crucial cancer phenotypes, including the separation of normal breast cancer tissues from triple-negative breast cancers, which are highly resistant to hormone therapy.
Tweet media one
1
0
1
@jason_z_kim
Jason Kim
1 year
But the real test of a model is whether it can make predictions. And here we put Γ-VAE to the test. First we train a Γ-VAE on all of our data. Then, we completely remove all breast cancer samples, train a second Γ-VAE, and see where the points re-embed. It's the same picture!
Tweet media one
1
0
1
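The hold-out test described in the tweet above (fit one model on everything, fit a second model with one group removed, then compare where the removed group re-embeds) can be sketched on toy data. This is only an illustration under loud assumptions: a one-dimensional PCA axis stands in for Γ-VAE, the data are synthetic, and the absolute Pearson correlation between the two embeddings is a hypothetical consistency score, not the metric used in the work.

```python
import random

def fit_axis(X, n_iter=200):
    """Stand-in 'model' (NOT a Γ-VAE): data mean plus leading principal
    axis, found by power iteration on the covariance matrix."""
    n, d = len(X), len(X[0])
    mu = [sum(col) / n for col in zip(*X)]
    C = [[x - m for x, m in zip(row, mu)] for row in X]
    cov = [[sum(r[i] * r[j] for r in C) / n for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mu, v

def embed(model, X):
    """Project each point onto the fitted axis (1-D embedding)."""
    mu, v = model
    return [sum((x - m) * a for x, m, a in zip(row, mu, v)) for row in X]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

# Synthetic data: three groups spread along one direction in 3-D, plus noise.
rng = random.Random(0)
direction = [1.0, 2.0, -1.0]
groups = {}
for label, start in enumerate([-3.0, 0.0, 3.0]):
    groups[label] = [
        [(start + k / 20.0) * di + rng.gauss(0.0, 0.05) for di in direction]
        for k in range(20)
    ]

held_out = groups[2]                                   # the group we remove
model_full = fit_axis(groups[0] + groups[1] + groups[2])
model_rest = fit_axis(groups[0] + groups[1])           # never sees held_out

# abs() absorbs the arbitrary sign of a principal axis.
consistency = abs(pearson(embed(model_full, held_out),
                          embed(model_rest, held_out)))
print(f"re-embedding consistency: {consistency:.4f}")
```

A score near 1 means the model trained without the held-out group places those points the same way as the model that saw them. The toy data are nearly linear, so even PCA passes here; the interesting cases are the curved, clustered ones the thread is about.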
@jason_z_kim
Jason Kim
1 year
We also capture meso-scale structure in carcinomas, namely the separation of squamous-cell and adenocarcinomas, and uncover a common axis from 9 healthy tissues from GTEx, and their corresponding adenocarcinomas from TCGA.
Tweet media one
1
0
1
@jason_z_kim
Jason Kim
1 year
We uncover lots of beautiful biology, including the blood-brain barrier for the adaptive immune response, the p53 pathway that is often called the "guardian of the genome," and the epithelial to mesenchymal transition that is hijacked by cancer to metastasize.
Tweet media one
1
0
1
@jason_z_kim
Jason Kim
1 year
Using this method, we construct a gently curved, 3-dimensional model of human gene expression for healthy tissues from the Genotype-Tissue Expression project (GTEx) and The Cancer Genome Atlas (TCGA), with nonlinear, slowly-varying axes.
Tweet media one
Tweet media two
1
0
1
@jason_z_kim
Jason Kim
1 year
We regularize this curvature to generate Γ-VAE, which gives us a control knob on what are called the parameter-effects curvature and the extrinsic curvature, yielding nice, smooth manifolds with long correlation lengths in the tangent spaces.
Tweet media one
Tweet media two
1
0
1
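To make the idea of penalizing decoder curvature in the tweet above concrete, here is a minimal sketch. It is a generic smoothness proxy, not the Γ-VAE regularizer: it estimates second directional derivatives of a decoder by central finite differences and averages their squared norm, without separating the parameter-effects and extrinsic components. The function names and both toy decoders are hypothetical.

```python
import random

def second_difference(decoder, z, v, h=1e-2):
    """Central finite-difference estimate of the second directional
    derivative of `decoder` at latent point `z` along direction `v`."""
    z_plus = [zi + h * vi for zi, vi in zip(z, v)]
    z_minus = [zi - h * vi for zi, vi in zip(z, v)]
    fp, f0, fm = decoder(z_plus), decoder(z), decoder(z_minus)
    return [(a - 2.0 * b + c) / h ** 2 for a, b, c in zip(fp, f0, fm)]

def curvature_penalty(decoder, latents, n_dirs=8, h=1e-2, seed=0):
    """Mean squared norm of second directional derivatives over random
    unit directions (an assumed proxy, not the published regularizer)."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for z in latents:
        for _ in range(n_dirs):
            v = [rng.gauss(0.0, 1.0) for _ in z]
            norm = sum(x * x for x in v) ** 0.5
            v = [x / norm for x in v]
            acc = second_difference(decoder, z, v, h)
            total += sum(a * a for a in acc)
            count += 1
    return total / count

def linear_decoder(z):
    # Flat manifold: all second derivatives vanish.
    return [2.0 * z[0] + z[1], z[0] - 3.0 * z[1], 0.5 * z[1]]

def curved_decoder(z):
    # Paraboloid: constant, nonzero second derivative.
    return [z[0], z[1], z[0] ** 2 + z[1] ** 2]

latents = [[0.1 * i, -0.2 * i] for i in range(5)]
flat = curvature_penalty(linear_decoder, latents)
curved = curvature_penalty(curved_decoder, latents)
print(f"flat: {flat:.2e}, curved: {curved:.4f}")
```

In a training loop, a term like this would be scaled by a weight and added to the usual VAE loss; that weight plays the role of the control knob, with larger values pushing the decoder toward flatter manifolds.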
@jason_z_kim
Jason Kim
1 year
Variational autoencoders (VAEs) excel at constructing statistical latent-variable models as generative manifolds through data. But when your data are in clusters, these manifolds are highly curved between clusters, so you don't know where you're going after one data cluster.
Tweet media one
Tweet media two
1
0
1
@jason_z_kim
Jason Kim
1 year
This is the case in human gene expression, where each healthy or cancer tissue has a distinct genetic signature, but they also have global trends that span across many clusters. UMAP excels at clustering the data, but fails to capture this meso-scale organization.
Tweet media one
Tweet media two
1
0
2
@jason_z_kim
Jason Kim
1 year
There are 101 ways to embed data, like PCA, UMAP, and VAEs, and they are all excellent at different things. Something that's hard for every method is multiscale data. What happens when you have highly clustered data, but you want to understand the organization across clusters?
1
0
0
@jason_z_kim
Jason Kim
2 years
RT @apd_flynn: 🚀 Exciting news alert! 🚀 . Christoph Räth (@DLR_de) and I are thrilled to announce that our minisymposium proposal, 'Dynamic….
0
3
0
@jason_z_kim
Jason Kim
2 years
So if you're interested in more interpretable, translatable, and computationally more effective RNN models of how the brain uses dynamics to compute and run algorithms, give it a read!
0
0
9