This has come up again, so I’m going to repeat it:
If you’re learning ML and want to “reimplement a paper”, you should work from the *github code*, NOT the pdf.
The algorithm that the authors actually ran is often subtly (& unintentionally) different from what the paper says.
I've been working on this for months, and I'm super excited to share it 🤩
It's a tool to quickly turn a scenario like "riding in a lyft" into an estimated probability of getting COVID (screenshot)
We hope this helps people make more informed decisions!
We are delighted to introduce , a tool to numerically estimate the COVID risk of specific ordinary activities.
We hope you’ll use this tool to build your intuition about the comparative risk of different activities and to make safer choices!
Ok, I realize now that
if you didn't do half a PhD studying human vision with fMRI, this quote doesn't make sense; and
"OMG" isn't an explanation;
also
@goodfellow_ian
is messaging me on chat asking great questions;
so let me just broadcast an explanation of why "OMG": 1/n
OMG:
"We scanned with fMRI a unique group of adults who, as children, engaged in extensive experience with a novel stimulus, Pokemon. [...] the experienced retinal eccentricity during childhood predicts the locus of distributed responses to Pokemon in adulthood."
I've given a lightning talk twice now about "why should we care about adversarial examples?"
At popular request, here's a written-up version of it:
My views here align strongly with what
@IAmSamFin
and
@jeremyphoward
said on twitter a few days ago.
Our paper "Skill Rating for Generative Models" is now up!
tl;dr: A new idea & proof-of-concept for evaluating generative models. Train a bunch of GANs. Have the generators "play against" all the discriminator snapshots. Rate them like chess players. 1/n
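For a sense of the rating mechanics: a minimal Elo update looks like this (toy constants and names of my own; the paper's actual tournament setup has more moving parts):

```python
def elo_update(r_a, r_b, score_a, k=32.0):
    """One Elo update. score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.

    In the paper's setting, "player A" might be a generator snapshot and
    "player B" a discriminator snapshot; a "win" is fooling the discriminator.
    """
    # Expected score for A under the Elo model (logistic in rating difference)
    expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    r_a_new = r_a + k * (score_a - expected_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a_new, r_b_new
```

Running every generator-vs-discriminator match through updates like this yields a single comparable number per model, just as in chess.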
Hey twitter - I'm looking for some recommendations for math that is fun and satisfying to learn, and at least a bit relevant to ML/AI. Ideally with a textbook or set of lectures that's clear and engaging. What do you suggest?
(Multi-agent systems / game theory? Control theory?)
When people ask how to get good at ML, it's common advice to "implement papers". It's seldom explained exactly what that involves!
This *fantastic* article details the full journey of one person's side project to implement Deep RL from Human Preferences:
Exciting news - I've accepted an offer to join the Open Philanthropy Project (
@open_phil
)!
This will be one step "meta" for me: instead of direct research/engineering work on ML security, I'll be helping fund others to do similar work (in a broader set of related areas).
1/3
This reminds me of watching folks test a vision system in Patrick Winston's lab at MIT.
Lab members would do actions (jump, lift) in front of a camera, and the system would label them - flawlessly! But guests couldn't make it work, because they hadn't learned to "jump" correctly.
I am starting a twitter circle, for a smaller audience for thoughts/blurtings on How To Be A Person In AI Land; like, What We Should Do Given All This Is Going On
If you feel like an ally to me in this and would like to help me in thinking stuff through, please LMK to add you!
I'm co-organizing an ICML workshop to host debates on the future of AI:
Notably, the focus is *not* on smack-downs and controversy; rather on making space for nuance, discussing falsifiable predictions, and changing your own and others' minds. [1/2]
Hi, I'm an AI engineer with an interest in policy. You may know me from my greatest hits "When you say 'AI', do you mean linear regression or far-future systems?" "When you say 'AI will never', do you mean 'current methods don't'?" and "No, we haven't solved adversarial examples"
Hi, I'm a creative AI user. You may know me from my greatest hits "No, it's not self-aware," "Actually, I'm the creative one, not the algorithm," "Stop generating birds with no feet and two heads," and "Okay yes technically I DID ask for that but that's not what I meant"
A few months ago I joined
@AnthropicAI
!
It has been super delightful working with
@ch402
and the rest of the team 😁
My job is to hang out with neurons in language models (to try to figure out what they're doing), which involves building tools to help us explore and inspect.
Everyone in ML research complains about the existing peer review tools/systems.
If you're not a researcher, but you ARE someone who cares about ML as a field being sane, has a knack for product engineering, and understands communities, you could make a HUGE impact. [...]
Plug-and-play differential privacy for your tensorflow code:
where you would write `tf.train.GradientDescentOptimizer`
instead just swap in the `DPGradientDescentOptimizer`
The tutorial at is quite clear and good!
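Under the hood, what a DP optimizer adds per step is (roughly) per-example gradient clipping plus Gaussian noise. A toy single-step sketch in NumPy (illustrative function and constants of my own, not the library's implementation):

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr=0.1, l2_norm_clip=1.0,
                noise_multiplier=1.1, rng=np.random.default_rng(0)):
    """One DP-SGD step: clip each example's gradient, sum, add noise, average."""
    clipped = []
    for g in per_example_grads:
        # Scale down any gradient whose L2 norm exceeds the clip threshold
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, l2_norm_clip / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the clip norm gives the privacy guarantee
    noise = rng.normal(0.0, noise_multiplier * l2_norm_clip, size=w.shape)
    return w - lr * (total + noise) / len(per_example_grads)
```

The library's optimizer classes bundle exactly these two moves (clipping and noising) into the familiar optimizer interface, which is why the swap is nearly plug-and-play.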
Our team
@GoogleAI
has just released a new library for training machine learning models with (differential) privacy for training data! Very excited to share it with the research community and we look forward to your contributions! Check it out on GitHub:
I just saw someone at
#NIPS2017
drinking their coffee using a spoon. Definitely makes me feel better about my own quirks when I see someone else do something quirky. You go, spoon-coffee dude. Don’t let anyone try to change you.
Why does it take you so long to figure out what’s going on in this image? Your visual system is constantly on the lookout for *faces* and gets stuck on spurious configurations.
(hint: rotate your display!)
h/t
@UofGCSPE
I would argue there's an *even more* underpriced asset: people who are not *yet* on an incredible growth trajectory, because they've never been given the resources and support they need.
To find them, don't just sit back and observe people's trajectories. Step up - support them.
If you can't motivate yourself to do something because you *don't care about it*,
and you don't care about it because it truly and genuinely *doesn't matter*,
=> then there's *nothing wrong with you*.
Your motivation system is working as designed 👍
To summarize:
The answer to
"What determines the physical layout of category-selective visual areas in the brain?"
is likely, at least in part,
"Retinal eccentricity"
that is
"Which part of your eye you use: do you look at this category straight-on, or peripherally?"
/fin
TIL that chickens can be hypnotized by drawing a line on the ground in front of their face. (video here: )
So much for "biological visual systems aren't susceptible to adversarial examples"!
This happened on both papers I submitted this year - approximately "Well-organized and clear evidence that the effect is real, under many conditions. But authors don't explain *why* it happens. Weak reject"
I refuse to fabricate explanations... but I'm being incentivized to :(
Similarly, reviewers often read a submission about a new method that performs well, and vote to reject it because there is no explanation of why it performs well.
"Birds vs bicycles" is easy for ML in the average case, but *totally unsolved* in the worst case. For safety-critical applications, we *need* to fix this.
We're launching the Unrestricted Adversarial Examples Challenge - *any* image of a bird or bike is a valid attack.
In case you missed it-- my favorite part of Activation Atlases () is this novel method of generating unrestricted adversarial examples!
1) Inspect the class activation atlas for the difference between source and target (see image)
...
Q1: do pokemon avatars end up represented in the same part of the brain for everyone?
A1: YES, if you played pokemon for a bajillion hours as a kid. NO, if you didn't. /8
Bad news: Neural nets can be trained with a backdoor (BadNets: )
Good news: want to redistribute your network? install a backdoor to label it as your intellectual property (Watermarking: )
"BadNets: It's not a bug, it's a feature!"
I want to just highlight something important that's mentioned in the latest OpenAI release, but has been said before, and stands out to me as a key motif in human feedback and alignment:
*You can't just freeze a reward model and maximize it*
1/
since apparently twitter is all about WFH takes right now, I just want to say I *love* being able to work "insane" work hours
currently ~1pm-7pm and again from midnight until "whenever I feel done" which is sometimes 2am and sometimes literally 5am
sleep 4am-noon
it's great
I wish I felt more socially "allowed" to be excited about stuff the same way 3-year-old boys are excited about trucks.
Instead, I feel that if I claim to be interested in something, I need to back it up with experience or skill.
Prob a combo of gender and programming culture :/
RIP Patrick Winston.
I'm grateful that you invited me to spend time in your lab, a very special community of curious folks.
That you passed on to me your narrative of the larger arc of AI research over the decades.
And taught me how to think, speak, and teach with clarity.
We're just like the baby monkeys in the other study, forced to look at made-up glyphs for hours & hours per day.
You can't approve a study to force human 8-year-olds to stare at a small set of little symbols daily for years.
But children can voluntarily do it to themselves! /7
I'm very grateful to the Anthropic colleagues who put Claude on our slack a year ago. As a result I've watched the whole company interact with it since then, and have a pretty good feel for its vibe and behavior.
There's no replacement for sheer time spent with actual behaviors!
In research (ML and in general) neither academia nor industry seems to have figured out how to systematically teach talented newcomers how to become productive researchers. Individual mentors yes, but not robust and transferrable best practices. This is a huge missed opportunity.
So: this study.
Some 8-year-olds in my generation spent *HOURS* staring at avatars of Pokemon. Always with the gameboy held right in the center of our vision, at the same position.
Some 8-year-olds didn't.
This is a *perfect* natural experiment for neuroscience. Hence "OMG" /6
Check out our paper! Make your GAN training more stable by keeping the generator well-conditioned during training:
@nottombrown
and I had a lot of fun training deliberately-misbehaving generators for Appendix B, don't miss it!
And you're not an ML engineer unless you can tell what model architecture is training by listening to the noises your GPU makes. ... (note: I believe
@AlecRad
can/did indeed actually do this)
The
@OpenAI
charter released today includes a commitment to join up with other projects (rather than competing) in case of a race to build AGI first.
IMO, this is a big deal - they hadn't promised anything like that publicly before.
An interesting punchline: claimed improvements in previous work were due to implementation mistakes. So, the improvements were real, but appeared only in the code and not in the equations in the papers, and had nothing to do with what the authors believed they were doing.
Delighted to share this with you!🎉😁
For months, I filled our spare cluster capacity with single-GPU tiny-transformer jobs, to bring you this exploration of in-context learning!
If you get a chance, try playing around with induction heads in your own models or public models ->
In our second interpretability paper, we revisit “induction heads”.
In 2+ layer transformers these pattern-completion heads form exactly when in-context learning abruptly improves.
Are they responsible for most in-context learning in large transformers?
@abiylfoyp
some reasons:
- unambitious peers
- got wrapped up in a very ideological group
- trauma, esp. sexual assault, unaddressed with therapy or social support
- academically inclined, then dead-end-ish and unsupportive PhD/postdoc environment
Private training data can easily be extracted from the predictions of a trained model. Your user data (health data, private information) isn't safe by default.
The good news? Adding just a little randomness can fully eliminate the memorization effect.
Turns out it's possible to recreate training data from a NN using only black box api access--no need for params. Upshot for medical researchers and vendors is that if you train on unanonymized patient records, your model is PHI.
Excited to share our first interpretability paper!
I particularly want to highlight the release of PySvelte, without which none of my work would've been possible.
IME, learning to write your own extreeeemely janky javascript visualizations is a hugely powerful research skill!
Our first interpretability paper explores a mathematical framework for trying to reverse engineer transformer language models: A Mathematical Framework for Transformer Circuits:
New preprint by
@smnh_azadi
,
@catherineols
, Trevor Darrell,
@goodfellow_ian
, and me: . We perform rejection sampling on a trained GAN generator using a GAN discriminator. This helps quite a lot for not-much effort.
Fantastic question ("Is there a good reason why many basic laws of physics are linear or quadratic (for example, F=ma), not much more complex?") and fantastic answer!
@LauraDeming
Linear or quadratic laws often come from a Taylor series expansion around an equilibrium point. Usually the first derivative is non-zero, so you get a linear law. If the first derivative vanishes (e.g. due to a symmetry), you get a quadratic law instead. Rare for both to be zero.
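In symbols (generic notation of my own, just restating the argument above): Taylor-expand a force law \(F(x)\) around an equilibrium point \(x_0\),

```latex
F(x) \;=\; \underbrace{F(x_0)}_{=\,0 \text{ at equilibrium}}
      \;+\; F'(x_0)\,(x - x_0)
      \;+\; \tfrac{1}{2}\,F''(x_0)\,(x - x_0)^2
      \;+\; \cdots
```

Near equilibrium \((x - x_0)\) is small, so the first non-vanishing term dominates: normally the linear one (e.g. Hooke's law), or the quadratic one when a symmetry forces \(F'(x_0) = 0\).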
Wow, thanks for the recommendations everyone!
Here's a spreadsheet of everything recommended:
Comments are turned on, feel free to suggest corrections or additions!
The more I wrap my mind around *scale* (Eg orders of magnitude of money - $10k vs $1m vs $100m etc), the more blindingly obvious it is that people earning wages are playing a TOTALLY different (and vastly shittier) game than the one behind so many large shifts in the world
I love that constitutional training doesn't shy away from admitting that there's always principles, and makes them explicit & transparent.
"Who decides?" becomes more tractable this way. I spent a little time using UN documents to write constitutional principles, it was great!
We’ve trained language models to be better at responding to adversarial questions, without becoming obtuse and saying very little. We do this by conditioning them with a simple set of behavioral principles via a technique called Constitutional AI:
I've decided to offer a *mutually counterfactual* donation match on this! 💖
That is: If you donate $ that you would not otherwise have donated anywhere, reply with screenshot and I'll 1:1 match with money I likewise would've kept for personal spending (above my usual 10%/yr) ⭐️
Christmas is a time of peace and gift giving.
@xriskology
and I are putting aside our differences to give to the poorest people in the world, via
@GiveDirectly
. Perhaps you'll join us.
I'm quoted in saying "Today’s algorithms do what you say, not what you meant", which feels delightfully meta:
Despite my fear that today's journalists report *neither* what interviewees said nor meant,
@tsimonite
here seems to have done *both*!
Our paper "In-context Learning and Induction Heads" is now available as a PDF on arxiv!
... that said, it's still typeset like the interactive web version, so it's long. Compact, LaTeX-typeset versions are on our eventual roadmap!
When I quit my PhD, I would tell people it was "lonely" compared to my experiences as a software engineer. Sometimes they'd ask "wait, why? don't you have collaborators in academia?" It was hard to explain the difference, but I would usually try to point at "truly shared goals."
I shouldn’t have to say this, but... if you *must* classify people (which... do you have to?? 😬) at least don’t *train on actors* if you’re gonna use it to classify real people! 😵
(This turns up in “emotion detection”, too. The face I make to “look sad” isn’t real sadness!)
Today I learned that if I see a rectangular grid of multicolored natural images (especially faces), I immediately think I'm looking at GAN samples. ¯\_(ツ)_/¯
... that said, if this were a face GAN, it would get top marks for diversity & quality. Looks like a fun conference!
Announcing our 2018 speaker lineup! Join us May 13-14 in NYC (or on our live stream) to hear about the joy, excitement, and surprise of computing from these cool people.
What have been your favorite *on-the-merits* *pro-release* OpenAI GPT-2 takes (on twitter or elsewhere)?
I'm looking for clear good-faith explanation of the pro-release (or anti-media-attention?) position right now, not clever snark.
If you know what adversarial examples are, and you think they probably seem important... but you're not sure *exactly* why... (or if you think the importance has something to do with crashing cars by putting stickers on stop signs)... then READ THIS.
Motivating the Rules of the Game for Adversarial Example Research:
Fantastic and nuanced position paper by
@jmgilmer
@ryan_p_adams
@goodfellow_ian
on better bridging the gap between research on adversarial examples and realistic ML security challenges.
When I'm thinking about something challenging, and I notice that it's harder than I thought, some part of my mind nags at me to tab over to some other happier task.
I just realized that this is the mental equivalent of an RL agent pausing Tetris to avoid losing the game. T_T
So your visualization method can explain a trained net's decisions? Don't forget the control group!
@julius_adebayo
&al show that many methods *also* give broadly the same "explanation" for the "decisions" of an *untrained, randomly-initialized* net.
Yikes!
If we're going to keep using human preferences & raters as a crucial part of training AI systems (which IMO is necessary, if we're gonna use AI, for it to go OK!), we need to design robust & humane processes for those workers!
"They were all wearing adversarial masks [...] our object detectors told our security system that 'three chairs are running at 15 kilometers per hour down the corridor'"
More delightful fiction from
@jackclarkSF
's Tech Tales
I often advise new Research SWEs that researchers often need a good *implementation*, not a good *framework*.
A clean, readable, tried-and-true, already-debugged-and-tested implementation can be copied, forked, and modified with confidence.
6/ As a researcher who also builds research infra, I think that SWEs underestimate how disposable code is, and spend an inordinate amount of time designing over-generalized abstractions. A common mistake in the AI field is to invest a quarter building infra for algos that don't work.
This short article by Richard Sutton encapsulates an important part of how I currently think about AI:
"We have to learn the bitter lesson that building in how *we think* we think does not work in the long run.” (emphasis added)
The mechanistic interpretability team at Anthropic is hiring! Come work with us to help solve the mystery of how large models do what they do, with the goal of making them safer.
One of my fav results in this field shows that "what gets recognized where" is NOT shaped by the *order* you learn the categories.
They taught baby monkeys 3 types of totally made-up shapes, a different order per monkey. Each type still went to a consistent brain location. /5
Today's pet peeve:
"We will/won't achieve <AI milestone X> by <year Y>"
without any reason whatsoever for the *specific number Y*.
If you see this happening, you can help by just asking "How did you get that number?"
My favorite part of the
@OpenAI
blog post (debate as a framework for human supervision of AI systems that are more expert than us) is their fantastic use of
@distillpub
-style mouse-over visualizations, enabling a deeper understanding of the behavior of their MNIST prototype.
I tweeted earlier about finding people who aren't yet on a steep trajectory, but could be with support.
Evaluating candidates not on raw performance *or* raw trajectory, but on *how well they took advantage of opportunities*, seems like a great way to find them. cc
@sama
We got *really strong* applicants -- 1300 in all (up from 850 last year) for a target class size of 50. This year, we added to our evaluation: "how well did they take advantage of opportunities?"
If you'd like access to GPT-2 in order to work on socially beneficial applications or extensions of it, defenses against generated content, etc., then IMHO you should contact OpenAI and actually make a request. The type of requests they get will shape their policy around sharing.
I couldn't be more excited to be running the AI Fellowship program for the 3rd year - it's my primary priority in my work at
@open_phil
and I'm very passionate about it!
If you have any Qs, please just ask! Many of the current fellows are also on twitter, and very friendly :)
Applications are open for the Open Phil AI Fellowship!
This program extends full support to a community of current & incoming PhD students, in any area of AI/ML, who are interested in making the long-term, large-scale impacts of AI a focus of their work.
I haven't been using this account as much for the past ~year, but I'd like to start again!
What I'm looking for is heartfelt intellectual curiosity, thoughtfulness, and object-level observations - eg
@juliagalef
@michael_nielsen
@albrgr
@kanjun
-
Who else should I follow? 😁
I'm extremely excited about the new DC AI policy center, led by Jason Matheny! (CSET - the Center for Security and Emerging Technology)
This interview gives some flavor of Jason's skilled and humble leadership:
Good luck to all!
What I find most interesting about this:
1. Self-play & randomization. If you can frame your task as adversarial training in a simulated env, and randomize such that the test env is in the distribution, it may be solvable today with no new techniques, "just" boatloads of compute
Super jazzed about this. If you think you can code-switch well enough to convince these readers that this AGI risk thing is wrong/crazy, there's a prize.
(I do think you need to "speak their language", and I am happy to help serious entrants with editing/feedback on that front!)
We're making big bets—a substantial fraction of our capital—on AGI xrisk. But we think it's really possible that we're wrong!
Today we're announcing prizes from $15k-$1.5M to change our minds (or those of superforecasters) on AGI. Enter by Dec 23!
One of my favorite things to do on AI Twitter is disambiguate when people are just talking past each other 😊
I wish this could scale more! But even if I could make a Claude bot do this, I suspect a harder problem is socially judging if piping in to clarify will be welcome 🤔😇
Folks, this is *thirty five* lines of code* to modify a cat image so that style transfer turns it into guacamole. If you're a beginner and have some patience, you can understand this!
… …
*(not counting the pre-trained style transfer network)
Rephrase:
If I show you images of faces, a particular side of a particular fold of your brain will react to those pictures WAY more than other pictures.
It's the *same* side of the *same* brain fold in everyone.
Different areas for houses/places, body parts, text, etc. /3
"Keep in mind nets are lazy and if you can "solve" a task by doing something "basic" you'll only learn "basic" things." <- YES, this.
And the flipside: Unless we (somehow) have a task that can *only* be solved by doing "the right thing", nets will NOT be doing "the right thing".
We *are* as a field developing and training models that *are* using more context, but exactly where we are on that trend-line is a great question.
Keep in mind nets are lazy and if you can "solve" a task by doing something "basic" you'll only learn "basic" things.
Today I'm the illustration from the manual for the Matra & Hachette Ordinateur Alice, a computer from 1983 😍
(This is an obscure callback to a tweet I saw over a year ago, which leveled up my understanding of why representation in tech matters! )
A2-a: There's NO preferential response in the face, body, or animal areas.
A2-b: The "face" spot is more lateral than the "place" area: there's a separate Pokemon area, and it's EVEN MORE lateral than that.
The best predictor of this is looking *directly at* Pokemon:
/10
Cool note from
@sidorszymon
:
Q: How could bots learn that killing Roshan is valuable, from random exploration?
A: Roshan's HP was randomized in training, sometimes suuper easy to kill.
Result? Bots still sometimes go "check out" Roshan, to see if "today" is an "easy-Roshan" day
OAI5's Riki investigated Roshan but didn't go for it. 1% win percentage. Gold advantage is down to 21k - nope, back up to 22k. 43-21. The small momentum that the OAI5 team had seems to be transient. 43-22 after another good gank by the OAI5 team.
This whole statement is so transparently written 💜
Epistemic clarity in public statements like this is a huge boon to the commons, allowing others to learn and make good plans. It's so much more common for the underlying mental models, bets, and new information to be concealed.
Distill will be taking a one year hiatus starting today, which may be extended indefinitely. Our editorial team has written some reflections on what we've learned over the past five years, and the factors that led to this decision.
If I were to estimate, the
@OpenAI
hackathon felt like it was 20-40% female. That's not 50%, so there's still progress needed. But I want to emphasize that, in general, 20-40% makes for a *vastly better* subjective experience (personally) than 0-10%. Hard to overstate the delta.
a very ingenious use of non-robust features (a la Ilyas et al 2019 Features Not Bugs):
deliberately inject *new* synthetic non-robust features into your dataset
check for them later as evidence that your dataset was used
We have developed a new technique to mark the images in a data set so that researchers can determine whether a particular machine learning model has been trained using those images. Learn more about “radioactive data” here.
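A toy NumPy sketch of the idea (a linear-model stand-in with assumed names throughout; the actual "radioactive data" method works in a deep feature space and uses imperceptible marks): nudge some training points along a secret random direction, then check whether a trained model's weights align with that direction more than a clean model's do.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256                                  # toy "image" feature dimension
marker = rng.normal(size=d)
marker /= np.linalg.norm(marker)         # the secret marking direction

# Toy two-class dataset; labels depend only on w_true, not on the marker.
w_true = rng.normal(size=d)
X = rng.normal(size=(2000, d))
y = (X @ w_true > 0).astype(float)

def train_logreg(X, y, steps=200, lr=0.05):
    """Plain logistic regression by full-batch gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def alignment(w, direction):
    """|cosine similarity| between a weight vector and a direction."""
    return abs(w @ direction) / np.linalg.norm(w)

# "Radioactive" marking: push class-1 training points along the secret
# direction (an exaggerated perturbation, for the toy example's sake).
X_marked = X.copy()
X_marked[y == 1] += 1.0 * marker

w_clean = train_logreg(X, y)
w_marked = train_logreg(X_marked, y)
# Detection: the marked-data model's weights lean toward the secret direction;
# the clean-data model's weights don't.
```

The non-robust-features twist is that the mark is genuinely predictive of the label, so any model trained on the marked data is incentivized to pick it up.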
A big reason I haven't been tempted to go back to writing software for a living, is that I realized that I didn't understand what my *job* actually *was* -
yes, yes, I knew what my *tasks* had been: to write and test software
but not what *role* software plays *in society*
The chief products of the tech industry are (in B2C) developing new habits among consumers and (in B2B) taking a business process which exists in many places and markedly decreasing the total cost of people required to implement it.
Q2: Pokemon are *depictions* of animals, which in real-life have faces, bodies, curvy lines, fuzzy. But pokemon *avatars* are pixellated: rectilinear & sharp.
Does the "pokemon area" end up near...
animate?
bodies?
rectilinear?
expertise / faces?
center-of-vision?
other?
/9
Users have no idea what input they're "allowed to" or "supposed to" put through a trained model they're given. They don't have a concept of "in-distribution" vs. "out of distribution". They're just gonna try whatever works.
(via )
@karpathy
When I went to the career fair in 2010 and said I was interested in "AI", absolutely no booths found that intriguing. "ML" found a little bit more traction!
My last day at Google is Jan 25 and I start at
@open_phil
on Jan 28 🎉
There's a lot more I could say about my thoughts on the future of AI and my hopes for the role, but rather than rambling, I encourage you to just ask me! (if I know you, let’s get coffee; else DMs open!)
3/3