Leo Dirac

@leopd

Followers
6,088
Following
824
Media
144
Statuses
5,019

CTO and Co-founder of Groundlight AI, training AI to see the world. Ex-physicist, ex-google, ex-amazon. Deep learner since 2012. (he/him)

Seattle, WA
Joined April 2007
@leopd
Leo Dirac
5 years
I love this photo of my grandpa, just listening, like he did.
@phalpern
Paul Halpern
5 years
Titans of quantum electrodynamics: Paul Dirac and Richard Feynman, conversing at a gravitational physics conference held in Warsaw, Aug 1963 #histSTM
Tweet media one
7
94
350
15
137
1K
@leopd
Leo Dirac
4 months
I have mad respect for Karpathy. But RL agents will not find exploits in physics that give us infinite energy, or anything like that. As somebody who knows a thing or two about both AI and physics, I am quite certain of this. The so-called standard model of particle physics…
54
86
848
@leopd
Leo Dirac
2 years
Banning GPU sales to China will slow them down in AI for a few years, but will push them to develop their own GPUs faster. Ironically this political move threatens NVIDIA's global dominance by encouraging a well-funded competitor to invest urgently.
56
83
809
@leopd
Leo Dirac
5 years
Waaat? You can find eigenvectors of a Hermitian matrix from the eigenvalues of it and its submatrices? A new linear algebra fact discovered by physicists researching neutrinos. I didn't quite believe it, so I tried it myself: it works.
Tweet media one
20
226
818
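A quick numpy check of that identity, for anyone who wants to try it too. Matrix size, indices, and names here are illustrative, not from the paper:
import numpy as np

# Eigenvector-from-eigenvalues identity: |v_ij|^2 * prod_{k!=i}(lam_i - lam_k)
# equals prod_k (lam_i - mu_k), where mu are the eigenvalues of A with
# row and column j removed.
n = 4
X = np.random.randn(n, n)
A = (X + X.T) / 2                          # random real symmetric (Hermitian) matrix
lam, V = np.linalg.eigh(A)                 # eigenvalues and eigenvectors of A

i, j = 1, 2                                # recover |v_ij|^2 for this i, j
Mj = np.delete(np.delete(A, j, 0), j, 1)   # submatrix: delete j-th row and column
mu = np.linalg.eigvalsh(Mj)

lhs = V[j, i] ** 2
rhs = np.prod(lam[i] - mu) / np.prod(np.delete(lam[i] - lam, i))
print(lhs, rhs)                            # the two values should agree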
@leopd
Leo Dirac
2 years
Deep Learning replaced linear models because it automated feature engineering by trying thousands of possible nonlinearities and learning which ones worked. Similarly, transformers are replacing task-specific NN architectures by learning how to combine the input signals. 1/
13
107
729
@leopd
Leo Dirac
3 years
Nothing says "startup growth" like a thick stack of laptops and monitors for new hires. Super excited about our awesome new team members!
Tweet media one
23
19
604
@leopd
Leo Dirac
5 years
Named Tensors! Built in to PyTorch 1.3. Now I won’t need a comment on every single line of math code documenting the Tensor dimensions. #ptdc19
Tweet media one
4
88
471
@leopd
Leo Dirac
4 years
This is pretty cool - AWS announces a public quantum computing (QC) platform named after my grandfather's notation for wave functions. Great that even in its infancy, QC is already getting democratized, not centralized in the hands of a few powerful research groups.
@jeffbarr
Jeff Barr ☁️
4 years
Amazon Braket - Get Started with Quantum Computing - #awsreinvent
11
193
411
9
101
439
@leopd
Leo Dirac
2 years
Visiting Cambridge and @lawrennd showed me the portrait of my grandpa hanging in the St John’s College dining hall. Both humbling and inspiring.
Tweet media one
10
14
409
@leopd
Leo Dirac
4 years
Stochastic Weight Averaging (SWA) is totally magic! It takes me days to train my validation score down from 2.40 to 2.30, but just taking the mean of a bunch of checkpoints I have lying around gets me down to 2.07. Thanks @andrewgwils and team for figuring this stuff out.
7
27
344
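The core trick is almost embarrassingly simple. A minimal PyTorch sketch, where the checkpoint paths and the tiny model are placeholders (PyTorch also ships torch.optim.swa_utils for the full procedure):
import torch
import torch.nn as nn

model = nn.Linear(10, 1)    # stand-in for your real model
paths = ["ckpt_epoch8.pt", "ckpt_epoch9.pt", "ckpt_epoch10.pt"]  # placeholder files
states = [torch.load(p, map_location="cpu") for p in paths]

# Average every tensor in the state dicts across checkpoints.
avg = {k: torch.stack([s[k].float() for s in states]).mean(0) for k in states[0]}
model.load_state_dict(avg)
# Caveat: BatchNorm running stats should be re-estimated with a pass over the data.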
@leopd
Leo Dirac
4 years
Nice video interview of my grandpa. Content rambles between symmetry, gravity, time -- the deep physics questions of the late 20th century. Not scientifically important, but I think the best video recording of him I've seen.
@phalpern
Paul Halpern
4 years
Rare interview with extraordinary quantum physicist and Nobel Laureate Paul Dirac, who successfully predicted antimatter: #histSTM
Tweet media one
12
46
159
8
43
282
@leopd
Leo Dirac
5 years
My thoughts on Google’s #QuantumSupremacy claim, which I haven't seen elsewhere. Even though my last name is Dirac, and my grandpa discovered lots of quantum theory, I’m no expert in QC. But I did get A’s in my quantum classes, and understand numerical computing very well. 1/8
8
68
283
@leopd
Leo Dirac
2 years
Most ML hyper-parameters are naturally log-scaled. If you're doing an automated configuration sweep, don't let your tools linearly search over things like learning rate, weight decay, or embedding dimension.
3
28
267
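Concretely, sample in log space rather than linearly. A small sketch; the ranges are made-up examples:
import math
import random

def log_uniform(lo, hi):
    # uniform in log space: as likely to land near 1e-4 as near 1e-2
    return math.exp(random.uniform(math.log(lo), math.log(hi)))

lr = log_uniform(1e-5, 1e-1)           # learning rate
wd = log_uniform(1e-6, 1e-2)           # weight decay
emb = 2 ** random.randint(4, 9)        # embedding dim: search powers of two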
@leopd
Leo Dirac
5 years
Trying to speed up your python programs? Tired of writing "print(time.time() - start_time)"? Check out "timebudget", a new tool I built for very simple profiling in python. Inspired by tqdm's simplicity. Literally just a few lines of code.
5
48
258
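Usage looks roughly like this (from memory of the README, so treat the exact API as approximate):
import time
from timebudget import timebudget      # pip install timebudget

with timebudget("loading data"):       # prints elapsed time for the block
    time.sleep(0.5)                    # stand-in for real work

@timebudget                            # or time every call to a function
def process():
    time.sleep(0.1)

process()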
@leopd
Leo Dirac
4 years
Multi-headed self-attention is the new fully-connected layer.
2
13
190
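In PyTorch the one-liner mindset looks something like this (shapes are illustrative; batch_first needs a reasonably recent PyTorch):
import torch
import torch.nn as nn

# Drop in a multi-headed self-attention layer where you might once have
# reached for a fully-connected one.
mha = nn.MultiheadAttention(embed_dim=64, num_heads=8, batch_first=True)
x = torch.randn(2, 10, 64)             # (batch, sequence, features)
out, _ = mha(x, x, x)                  # self-attention: query = key = value = x
print(out.shape)                       # torch.Size([2, 10, 64])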
@leopd
Leo Dirac
4 months
This is currently my favorite quick introduction to the standard model. Does a great job of separating the things in physics that effect our reality from all the details we understand but have no bearing on anybody’s life except physicists.
6
12
188
@leopd
Leo Dirac
4 years
When building Amazon Machine Learning in 2013, I was the only deep learner on the team. I remember in an early planning meeting sharing that I wanted to be able to check my training jobs from my phone. Never happened. Now that I'm using @weights_biases I finally can.
4
14
173
@leopd
Leo Dirac
4 years
I feel like the fate of the world lies in leaders being able to understand logarithmically scaled charts.
13
27
152
@leopd
Leo Dirac
2 years
When will we see NN chips built specifically for Transformers? (I'm often asked.) With NVIDIA, I'd say we're already there. Look at benchmarks for the new H100 chip - the task it is best at is a Transformer. Seems they're already optimizing their silicon for Transformers. 1/
Tweet media one
5
16
144
@leopd
Leo Dirac
1 year
People still love this talk I gave on how Transformers work and the history of NLP that led to them. Years later people keep finding and watching it. Makes me think I should record more.
@Jeande_d
Jean de Nyandwi
2 years
LSTM is dead. Long Live Transformers. This is one of the best talks explaining the downsides of Recurrent Networks and diving deep into the Transformer architecture.
Tweet media one
Tweet media two
12
156
828
12
17
141
@leopd
Leo Dirac
4 years
Transformer models have dramatically changed NLP in recent years, outperforming previous techniques like LSTM in almost every way. An exception has been that they don’t scale well to large documents because they cost O(N^2) in document length. This paper offers an O(N) solution.
@sinongwang
Sinong Wang
4 years
Thrilled to share our new work! "Linformer: Self-attention with Linear Complexity". We show that self-attention is low rank, and introduce a linear-time transformer that performs on par with traditional transformers. Check it out here:
Tweet media one
Tweet media two
7
86
355
2
20
115
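The gist of the trick, as I read it: project the length-N keys and values down to a fixed k before attention, so the score matrix is N-by-k instead of N-by-N. A sketch with made-up dimensions, not the paper's exact configuration:
import torch
import torch.nn as nn

N, d, k = 1024, 64, 128
x = torch.randn(2, N, d)                        # (batch, tokens, features)
Wq, Wk, Wv = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)
E = nn.Linear(N, k, bias=False)                 # compress keys along the sequence axis
F = nn.Linear(N, k, bias=False)                 # compress values along the sequence axis

q = Wq(x)                                       # (2, N, d)
kk = E(Wk(x).transpose(1, 2)).transpose(1, 2)   # (2, k, d)
v = F(Wv(x).transpose(1, 2)).transpose(1, 2)    # (2, k, d)
attn = torch.softmax(q @ kk.transpose(1, 2) / d ** 0.5, dim=-1)  # (2, N, k)
out = attn @ v                                  # (2, N, d), linear in N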
@leopd
Leo Dirac
4 months
There are so many good reasons to use on-prem GPU workstations for AI instead of putting everything in the cloud. Cloud is great, of course. But local workstation hardware can not only be cheaper but also enable higher productivity than shared cloud resources.
@hardmaru
hardmaru
4 months
First day back into the office in 2024 and setting up more GPU workstations
Tweet media one
24
33
958
8
5
111
@leopd
Leo Dirac
4 years
Cool new AutoML from AWS: SageMaker Autopilot. Unlike other AutoML black boxes which just give you a model, this also gives you working code that you can learn from, adapt, and customize. Great to set baselines and get started. (Disclosure: this was my baby before I left Amazon.)
3
19
101
@leopd
Leo Dirac
2 years
Previously anybody could buy GPUs, so competing was a commercial matter for Chinese companies, which would need to justify the investment against possible returns. Now it's the CCP's problem, and they have a massive firehose of money suddenly motivated to fix it.
5
3
91
@leopd
Leo Dirac
4 years
Remember when quarantine seemed cute?
@gnuman1979
jamie
4 years
Quarantine day 6.
21K
700K
3M
2
13
78
@leopd
Leo Dirac
2 years
It's typically an order of magnitude more work to write code that's generic for lots of purposes than to write one-off code that just does what you're trying to do right now.
10
5
78
@leopd
Leo Dirac
4 years
I was recently asked if I'd prefer to use @PyTorch or @TensorFlow for a small task, with indication they'd prefer TensorFlow. I invoked this tweet to support my choice.
@karpathy
Andrej Karpathy
7 years
I've been using PyTorch a few months now and I've never felt better. I have more energy. My skin is clearer. My eye sight has improved.
36
428
2K
6
8
74
@leopd
Leo Dirac
3 years
Self-supervised learning (SSL) can seem confusingly magical - without labels how could it learn a semantically useful representation? It learns from the input data's distribution, and from the augmentations that change the input but that you insist must not change the representation.
7
9
75
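A toy version of that invariance objective, where encoder and augment are placeholders and the loss is a simplified InfoNCE:
import torch
import torch.nn.functional as F

def ssl_loss(encoder, augment, batch, temp=0.1):
    # Two random augmentations of the same inputs should embed close together,
    # while embeddings of different inputs should not.
    z1 = F.normalize(encoder(augment(batch)), dim=1)
    z2 = F.normalize(encoder(augment(batch)), dim=1)
    logits = z1 @ z2.T / temp               # pairwise similarities
    labels = torch.arange(len(batch))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)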
@leopd
Leo Dirac
5 years
In this sense, every spectrometer in every chemistry lab could be considered a quantum computer that has achieved #QuantumSupremacy for decades. But only if you include very specific computational problems that are intrinsically quantum in nature. 8/8
7
8
75
@leopd
Leo Dirac
5 years
I built this “electronic funhouse mirror” for Halloween to turn people’s faces into animals and monsters using modern deep learning methods in real time. Thanks to great tools from @nvidia and @PyTorch I could do this in a weekend on my laptop!
3
11
71
@leopd
Leo Dirac
4 years
I've said it before and I'll say it again: SWA is like magic. I think for many deep learning practitioners this will be the fastest easiest way to improve your model quality with almost trivial code changes.
@PyTorch
PyTorch
4 years
Stochastic Weight Averaging (SWA) is a simple procedure that improves generalization in deep learning over Stochastic Gradient Descent (SGD). PyTorch 1.6 now includes SWA natively. Learn more from @Pavel_Izmailov , @andrewgwils and Vincent:
5
206
864
2
16
69
@leopd
Leo Dirac
2 years
So it does seem fairly inevitable to me that Transformers will effectively take over deep learning, ML, and AI in coming years. Even if they're not optimal for every task, their generality will lead to a useful standardization of tools, algorithms, and even hardware. 6/
5
1
68
@leopd
Leo Dirac
4 years
"AI" in 2020 := Any algorithm that is so complex it defies explanation or interpretation, and is capable of both impressive results and embarrassing failures.
2
7
66
@leopd
Leo Dirac
3 years
I'm convinced legged robots like this will be commonplace before too long. It's amazing watching how they recover from failures.
5
11
63
@leopd
Leo Dirac
2 years
Instead of manually engineering the inductive bias by carefully picking which neurons to connect, and which weights to re-use, Transformers connect all the neurons in every combination, and learn which connections to actually use through attention mechanisms. 2/
1
1
61
@leopd
Leo Dirac
4 years
Playing with @PyTorch JIT compiler. Very cool stuff that lets you write your code within standard python, and later compile it to TorchScript for production inference or embedded use. I wrote a simple function decorator to make it easier to try:
1
12
54
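The decorator itself isn't in the tweet, but the built-in API it wraps looks like this:
import torch

@torch.jit.script                      # compile standard python+torch to TorchScript
def fused_op(x: torch.Tensor) -> torch.Tensor:
    return 0.5 * x * (1.0 + torch.tanh(0.8 * (x + 0.044715 * x ** 3)))

print(fused_op(torch.randn(3)))        # runs compiled; can also be saved for C++ use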
@leopd
Leo Dirac
1 year
My company Groundlight is coming out of stealth today, offering Computer Vision powered by Natural Language for industrial and commercial applications, integrated from edge to cloud to real-time human monitoring.
8
9
51
@leopd
Leo Dirac
2 years
In this way Transformers perform something a lot like NAS (Neural Architecture Search) but within a single simple SGD process instead of complex inner and outer optimization loops needing RL techniques or surrogate models. 5/
1
2
50
@leopd
Leo Dirac
4 years
My kids' room has glow-in-the-dark stars on the ceiling, accurately showing Orion and Taurus. About a year ago during some horseplay, Betelgeuse got knocked down. Now it seems this accident might just turn out to be prophetic.
3
8
46
@leopd
Leo Dirac
4 years
@hardmaru Open source licenses. Newer versions of bash use GPLv3, which is a pretty business-hostile license. Apple was stuck on a very old version of bash that had GPLv2. Licenses matter.
3
4
47
@leopd
Leo Dirac
3 years
I wish GitHub issues were more like StackOverflow questions. I see at the top it's closed, but I have to sift through pages and pages of comments to figure out why it was closed. Was it ever fixed? Did the team say they'd never fix it? Was it merged? Auto-closed?
4
0
44
@leopd
Leo Dirac
2 years
Me to 11yo daughter: "Tomorrow I'll teach you the quadratic formula." Her: "YAAY!!! Finally..." ❤️❤️❤️
2
1
42
@leopd
Leo Dirac
3 years
Postgres is adding the ability to query for similar vectors with kNN. All those years at AWS lobbying for a kNN service, seeing so many groups with huge databases of embeddings struggling to deploy them in production.
4
3
42
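What a query looks like, assuming this is the pgvector extension (connection string, table, and column names made up for illustration):
import psycopg2

conn = psycopg2.connect("dbname=mydb")   # placeholder connection string
cur = conn.cursor()
cur.execute(
    "SELECT id FROM items ORDER BY embedding <-> %s::vector LIMIT 5;",
    ("[0.1, 0.2, 0.3]",),                # <-> is pgvector's L2-distance operator
)
print(cur.fetchall())                    # ids of the 5 nearest embeddings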
@leopd
Leo Dirac
4 years
Mag-lev self-solving cube. Wow.
@wyrm06
サカン竜一郎@民族楽器愛好家
4 years
At Maker Faire. I couldn't believe my eyes: a Rubik's cube floating above the desk and solving itself. It's kind of cute how it sometimes gets stuck. The guy watching next to me voiced exactly what I was thinking, which cracked me up.
65
27K
85K
2
8
40
@leopd
Leo Dirac
2 years
I love Spring time in Seattle!
Tweet media one
1
1
41
@leopd
Leo Dirac
4 years
How is it that Tensorboard stays so bad while the rest of deep learning tools improve so rapidly?
11
1
40
@leopd
Leo Dirac
4 years
However everybody should be fully aware and honest about the fact that QC has exactly zero practical applications today. Top QC researchers are still trying to find anything that today's QCs can do that's actually useful. You wouldn't know this by reading
1
5
38
@leopd
Leo Dirac
5 years
For me, Google’s #QuantumSupremacy claim is similarly unimpressive. They carefully designed a problem that could be solved much faster on QC than on a classical computer. But it’s not an interesting problem that anybody would want to solve in any other setting. 5/8
1
11
40
@leopd
Leo Dirac
4 years
Many hyperparameter optimization (HPO) runs yield negligible gains. Too often these "gains" are just a lucky sample from the training noise. BUT some big steps can also be seen as HPO. VGG was AlexNet with HPO. EfficientNet largely too. Scaling up often requires careful HPO.
2
8
39
@leopd
Leo Dirac
2 years
Transformers can learn to apply something a lot like a convolution when needed, or like a recurrent connection if only the previous input is useful to process the next. But critically they can learn much more complex relationships between inputs. 3/
1
1
38
@leopd
Leo Dirac
4 years
I'm thinking about how the world changes in coming years if NVIDIA succeeds in buying ARM. CUDA on every mobile device? Sounds pretty cool actually.
6
4
38
@leopd
Leo Dirac
2 years
As with linear models -> NNs, with transformers we again have a step up in computational complexity in exchange for less problem-specific analysis, which is almost certainly sub-optimal from a model quality perspective. 4/
1
1
37
@leopd
Leo Dirac
4 years
You know it’s been a good vacation when you can’t remember your laptop password. 🧐
3
0
35
@leopd
Leo Dirac
4 years
Overheard, my 5yo talking to herself: "Live from NPR news, I'm Jack Speer. President Trump said Blah blah-blah blah BLAH." My little self-supervised learner seems to be overfitting.
3
2
33
@leopd
Leo Dirac
5 years
Last day at Amazon today. Wow what an amazing ride! Six years building world class ML infrastructure.
6
1
32
@leopd
Leo Dirac
5 years
Reading RL papers with amazing results from @OpenAI and @DeepMindAI I'm struck by a stylistic difference. One spends most of the paper proving how awesome their result is compared to other techniques. The other goes into great detail explaining how and why their techniques work.
2
3
32
@leopd
Leo Dirac
5 years
I don't see a fundamental problem using massive compute to advance AI - research means pushing what's possible with today's tech to inform tomorrow. But I fully support the proposal by @etzioni and others to publish compute cost and efficiency in papers.
@karpathy
Andrej Karpathy
5 years
💻🧠+🌍🌳 recent reads: Green AI vs Red AI and "Tackling Climate Change with Machine Learning"
Tweet media one
9
132
488
3
7
30
@leopd
Leo Dirac
4 years
For an engineering perspective on this I offer a talk I gave last year which details how the transformer works and why it is expensive.
3
4
32
@leopd
Leo Dirac
5 years
Impressive AutoML paper from my old team at Amazon: An elegant solution to the critical real-world problem in hyperparameter tuning of picking your search ranges. A simple data-driven approach works surprisingly well. Hope this gets into SageMaker soon!
2
7
31
@leopd
Leo Dirac
3 years
I think it's awesomely hilarious when people assume the lab workspace behind me is a zoom background. Then I walk back into it and play with the robot.
6
0
30
@leopd
Leo Dirac
5 years
Automatic Domain Randomization (ADR) is one of the key innovations enabling @OpenAI 's recent impressive Rubik's cube result. Having absorbed the whole paper today (great use of a day!) I'll summarize the ADR algorithm here #MLTLDR 1/7
2
7
31
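My loose python-flavored pseudocode of the core loop, with made-up thresholds and parameter names (the real algorithm tracks a performance buffer per range boundary):
import random

ranges = {"cube_size": [1.0, 1.0], "friction": [1.0, 1.0]}  # start: no randomization

def adr_update(param, boundary_perf, hi=0.8, lo=0.2, step=0.05):
    low, high = ranges[param]
    if boundary_perf > hi:        # agent handles the boundary value: widen the range
        ranges[param] = [low - step, high + step]
    elif boundary_perf < lo:      # too hard: pull the boundary back in
        ranges[param] = [low + step, high - step]

def sample_env():
    # training environments are drawn uniformly from the current ranges
    return {p: random.uniform(low, high) for p, (low, high) in ranges.items()}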
@leopd
Leo Dirac
4 years
I designed SageMaker Autopilot to be AutoML for all skill levels. New users can build a decent model, and look at the generated code for data prep and training to learn good practices. Hand it to expert coworkers for an easy baseline to build upon.
Tweet media one
2
8
30
@leopd
Leo Dirac
4 years
No better way to learn some material than to teach it. Now that I’ve volunteered to teach it, I gotta really understand it.
0
1
30
@leopd
Leo Dirac
4 years
If I wasn't clear, I'm a huge fan of the @huggingface transformer package. NLP problems that took dozens of engineer/scientist years of effort just a few years back are now straightforward for motivated individuals. Total sea-change.
1
2
30
@leopd
Leo Dirac
4 years
Today I ran between every Dick’s Drive-In in Seattle, eating at each one. #BurgerRally2020 with @rachelbeda and others. 5 stops, 18 miles, three milkshakes, three fries, three deluxe and a special. (Minus beef.) Still kinda hungry.
2
1
29
@leopd
Leo Dirac
4 years
It's very satisfying to finish a careful hyperparameter search and realize that the configuration you'd already hand selected after a bit of experimenting is, in fact, just about optimal.
5
1
29
@leopd
Leo Dirac
2 years
My mind is kinda blown. Did you know you can just run python+numpy code directly in a browser? On my mac it's 15x faster than plain javascript (but still 40x slower than native CPU). I'm becoming convinced WASM is an important trend.
Tweet media one
Tweet media two
Tweet media three
1
5
27
@leopd
Leo Dirac
5 years
In 2014, a bot beat the Turing Test by pretending to be a non-native-English speaking teenager who understandably avoided lots of questions. They won, but who cares? We're still a long way from AI capable of convincing conversation. 4/8
1
2
28
@leopd
Leo Dirac
4 years
@_brohrer_ It really doesn’t help to leave out sensitive features like race or gender. These are highly correlated with many useful behavioral features. And you can’t leave out everything they’re correlated with - there would be nothing left sometimes.
3
2
27
@leopd
Leo Dirac
3 years
That time I trained an LSTM as a generative music model using Daft Punk as training data. It sounded like... Daft Punk. 😂
4
2
26
@leopd
Leo Dirac
3 years
Managed to catch Jupiter and Saturn in the ~15 minutes between getting dark and disappearing behind clouds. That’s Seattle astronomy for ya.
Tweet media one
4
0
26
@leopd
Leo Dirac
4 years
Now that we understand deep learning reasonably well, any process or system that can be differentiated through can be directly optimized with neural networks. Making a differentiable physics simulator is a difficult task, but would enable some amazing things.
@Synced_Global
Synced
4 years
DiffTaichi, a new differentiable programming language based on Taichi and specially tailored for building high-performance differentiable physical simulators. #MachineLearning #ArtificialIntelligence #programming
0
1
7
2
7
26
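A toy illustration of what "optimize through the simulator" means: gradient-descend a launch velocity so a (deliberately trivial) differentiable projectile simulation lands at a target. Numbers are arbitrary:
import torch

v = torch.tensor([5.0, 5.0], requires_grad=True)   # initial (vx, vy)
opt = torch.optim.Adam([v], lr=0.1)

for _ in range(200):
    t = 2 * v[1] / 9.8            # time of flight on flat ground
    landing_x = v[0] * t          # the whole "simulation" is differentiable
    loss = (landing_x - 10.0) ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()

print(v.detach())                 # a velocity that lands near x = 10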
@leopd
Leo Dirac
3 years
Breakfast options for the kindergartner: “Would you like corn spheres, oat toroids, wheat manifolds, or corn matrices?” “Corn matrices please.” #geekparenting
Tweet media one
1
4
25
@leopd
Leo Dirac
5 years
By analogy, it's possible in theory, but computationally intractable, to discover through quantum simulation that water molecules “vibrate” at 2.45 GHz. But this result is very easy to obtain in a physical lab by measuring the interacting wave functions of actual water. 7/8
2
1
26
@leopd
Leo Dirac
5 years
I have a feeling this technique could become standard for all computer vision in coming years. Big claim I know. But this seems to elegantly solve a fundamental problem with CNNs.
@HaohanWang
Haohan Wang
5 years
Neural Networks Are Not Robust Enough: they exploit the association between local patches (e.g., background) and the label. Our #NeurIPS2019 paper w. @zacharylipton fights against this tendency. Camera-ready: It also introduced ImageNet-Sketch dataset.
Tweet media one
3
69
276
1
5
26
@leopd
Leo Dirac
5 years
Google's problem involves simulating interacting quantum wave functions. Computational chemists know these calculations are notoriously hard. But ironically (or obviously) physical systems perform these “calculations” almost instantaneously in the real world. 6/8
1
2
25
@leopd
Leo Dirac
4 years
Classic Google: "Cloud Print needs some engineers to do some boring work to keep it running. Bueller? Let's just cancel it. The world doesn't need our expertise for this anymore." Contrast Amazon's customer obsession against Google's employee obsession.
2
2
25
@leopd
Leo Dirac
3 years
One of my friends that I used to go to this thing in the desert with became fascinated with a little town in the desert we drove through, got to know some people there, wrote a book about them, then that book became a movie, and people voted it the best movie of the year. Wow.
2
0
24
@leopd
Leo Dirac
5 years
Yesterday I ran my first marathon. Surprised at how fast I ran - under 4 hours including a wrong turn (extra 1/3 mile) and a couch break for some cake and to snuggle the Blerch. Thanks @Oatmeal for organizing a super fun event! Finished 10th out of 131.
2
0
24
@leopd
Leo Dirac
5 years
@evainfeld @physicsjackson It goes against the standard narrative for Paul that he enjoyed doing things other than equations, but it was true.
1
0
24
@leopd
Leo Dirac
4 years
I wonder if any countries other than China will be able to contain their covid-19 outbreaks. I wouldn't be surprised if it's only been possible there because of a population that truly believes in collective responsibility, coupled with a very decisive government.
4
2
23
@leopd
Leo Dirac
2 years
I dropped my phone in the ocean today while SUP’ing. I’m about to go see a friend for dessert. Without a phone. Leaving the house without Internet. First time in … years. Slightly terrified. Send me strength. I can do it.
2
0
24
@leopd
Leo Dirac
2 years
So what comes next after feature engineering and then NN architecture engineering? I’m guessing “sequence engineering” where we try to figure out the right way to phrase our tasks as questions that Transformers can answer. 7/
1
0
23
@leopd
Leo Dirac
4 years
What’s wrong with this picture? (It’s a seattle parking kiosk.)
Tweet media one
9
2
21
@leopd
Leo Dirac
3 years
I realized how silly it was to have a space heater under my desk in the garage when I have a rack of GPUs in the other corner. Now the GPUs vent right at my desk. Mmmmm, toasty warm training jobs.
1
1
23
@leopd
Leo Dirac
4 years
I gave a talk this week on methods for quantifying uncertainty in neural nets, and posted the notebooks I used to generate the example data. Fun to compare Deep Ensembles, MC Dropout and GP's on a toy regression problem.
1
3
23
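For example, MC Dropout, one of the three methods compared, fits in a few lines. The model and numbers here are toys:
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 1))
model.train()            # keep dropout ON at inference (normally you'd call .eval())

x = torch.randn(8, 1)
with torch.no_grad():
    preds = torch.stack([model(x) for _ in range(100)])   # 100 stochastic passes
print(preds.mean(0))     # predictive mean
print(preds.std(0))      # spread across passes, used as the uncertainty estimate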
@leopd
Leo Dirac
5 years
I see this as very similar to when the Turing Test was officially beaten in 2014. While the accomplishment met the original criterion, it missed the spirit of the goal. The Turing Test was a (THE?) key goalpost in AI for decades, but the actual victory seems meaningless now. 3/8
1
2
22
@leopd
Leo Dirac
3 years
My cozy garage lab workspace. GPUs to keep me warm. UR5 to lend a hand. e-Stop button in easy reach.
Tweet media one
4
0
22
@leopd
Leo Dirac
4 months
@dhuang26 True! But it's a far cry from exploiting buffer overflows in reality - this is the natural 2024 extension of many decades of fusion research. Dream big. But ground in realism for getting things done. AI won't discover magic, it can help us build things that will seem magical.
2
1
21
@leopd
Leo Dirac
4 months
"Four years ago! Shocking." "This aged like fine wine." "I need that belt." ^^^ three most recent youtube comments on my 2019 video explaining transformers.
0
1
21
@leopd
Leo Dirac
5 years
My hard-working GPU laptop sounds like it grew an angry floppy drive that's constantly seeking. Poor thing wasn't built to optimize neural nets 24/7. And the warranty just expired. So I'm grabbing the precision screwdrivers and going in! Wish me luck...
6
0
21
@leopd
Leo Dirac
3 years
I have a number in [-1,1] range I want to stretch toward 0, so I'll raise it to a power like 2 or 3. I'll have to correct the sign, right? Let's check...
>>> -0.5**2
-0.25
Huh - I guess python has some math magic here. Cool. NOPE!
>>> (-0.5)**2
0.25
Gets me every time. 😂
1
1
18
@leopd
Leo Dirac
3 years
Finished installing solar panels on our VW camper van (named Beethoven). People often ask if it can drive on solar power. It can't. But if it could, it would probably take about 2 hours of full sun to get enough charge to drive a single mile. 😂
Tweet media one
3
2
21
@leopd
Leo Dirac
6 months
HR tip: Don't schedule interviews for the day before a holiday. Four years ago I interviewed at OpenAI, on the Wed before Thanksgiving. I thought all the interviews went well except the last one of the day, where the interviewers seemed just annoyed from the start. Go figure.
2
2
20
@leopd
Leo Dirac
2 years
One-off code is especially important in data science, where you often face questions like "does this technique help or even work for us?" The answer is often "No" and the less code you can write to reach that answer the better.
2
0
20
@leopd
Leo Dirac
3 years
I am super impressed with University of Washington's COVID testing. I developed some symptoms, so yesterday morning I called them at 8am. Got a test appointment for 2 hrs later, and (neg) results posted by 10pm that night. 14hrs from calling for a full PCR test result. So good!
2
0
20
@leopd
Leo Dirac
3 years
Chewing on this paper showing that a single biological neuron needs a temporal convolutional network (TCN) with at least 5 layers to replicate its behavior. RNNs seem like a much better choice than TCNs - harder to train, but much more biologically plausible.
2
5
20
@leopd
Leo Dirac
4 years
I love the sentiment here to try to avoid algorithmic bias. But the scientific advice is actually bad and misleading. Excluding sensitive features like gender from your inputs really doesn’t help. Better to identify them and force algorithmic fairness.
1
4
20
@leopd
Leo Dirac
6 months
LoRA is the fine-tuning algorithm I always wished I had when I was building models in the early days. It's easy to ensure you don't catastrophically forget the original data, and gives you a simple knob to say how important your fine-tuned dataset is vs the original.
3
1
20
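A minimal sketch of the idea: freeze the pretrained weight and learn a low-rank update B@A, with alpha/r as that importance knob. Dimensions and init roughly follow the paper's recipe, but this is illustrative, not a drop-in implementation:
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                    # original weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # B=0: starts as a no-op
        self.scale = alpha / r                         # the fine-tune-importance knob

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(512, 512))
print(layer(torch.randn(2, 512)).shape)                # torch.Size([2, 512])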