Anthropologist @MPI_EVA_Leipzig - telling anyone who will listen that, if we are very careful and try very hard, we might not completely mislead ourselves
Take heed, Statistical Rethinkers - there are many code translations of the book examples and the new lecture examples into your favorite dialect of SnakeScript or CleanSpace or whatever hacker nonsense you like. I try to maintain an updated list:
I used to teach game theory, both undergrad & phd levels. One game I would do at start is version of Keynesian beauty contest: everyone picks a number 0-100, person closest to 2/3 of average wins. Nash is 0. But anyone choosing 0 loses, bc the class aren't (yet) game theorists. >
Ppl are on twitter for many reasons. I'm here to chew bubblegum & teach scientific inference. And I'm all out of gum. Engage yourself with 20+ hours of free lectures on causal inference and Bayesian data analysis. This isn't an ordinary statistics course.
Students who had previous exposure to game theory would get mad at the class. "You have to choose zero!" They were robbed of victory by the dumb majority. But this is how society works. Gotta have the right model of the distribution of strategies. >
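The beauty contest above is easy to check by hand. A minimal Python sketch, assuming a made-up class of six guesses (the numbers and the function name `beauty_contest_winner` are illustrative, not from the course): the winner is whoever is closest to 2/3 of the average, so the Nash guess of 0 loses unless everyone else plays it too.

```python
def beauty_contest_winner(guesses, fraction=2 / 3):
    """Return (target, winning guess) for a Keynesian beauty contest."""
    # The target is a fixed fraction of the average guess.
    target = fraction * sum(guesses) / len(guesses)
    # The winner is the guess closest to that target.
    winner = min(guesses, key=lambda g: abs(g - target))
    return target, winner


# Hypothetical class: mostly naive guesses plus one game theorist playing 0.
class_guesses = [50, 40, 33, 22, 60, 0]
target, winner = beauty_contest_winner(class_guesses)
print(f"target = {target:.2f}, winning guess = {winner}")

# If everyone plays the Nash strategy, 0 does win.
print(beauty_contest_winner([0, 0, 0, 0, 0]))
```

With these guesses the student who picked 0 is the furthest from the target among the low guessers, which is the point about needing the right model of the distribution of strategies.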
The 2023 edition of my long-running anti-establishment art-science-fusion code-therapy smooth-baritone causal inference & Bayesian data analysis course is complete. 20 lectures, from the basics of causal inference & Bayesian updating to mixed models & Gaussian processes. 1/2
Forgive me, for I am about to Bayes. Lesson: Don't trust intuition, for even simple prior+likelihood scenarios defy it. Four examples below, each producing radically different posteriors. Can you guess what each does? Revealed in next tweet >>
Honestly publishing journal articles is the lowest impact activity I could do right now. I know not everyone finishes every video. But the sustained viewings are good this year. I remain at your service.
It has been 951 days since Bill Gates gifted every stats teacher with this finely distilled tweet. It's so good, because Gates is not dumb. There is nothing dumb about not understanding conditional probability. It's only human.
Want to learn a little game theory? Lectures from my course "Very Little Evolutionary Game Theory". Topics:
1. Evolution of Conflict
2. Evol of Cooperation
3. Evol of Relationships
4. Evol of Families
5. Evol of Societies
Watch at x2 speed for full effect:
I am giving an internal talk to the Max Planck IT community next month. They asked for something about the role of software in science. Here's what I'm giving them. I will try to record and upload afterwards...sure to be spicy.
Statistical Rethinking 2nd edition page now lists code conversions for:
* raw Stan+tidyverse
* brms+tidyverse
* PyMC3
* Tensorflow Probability
* Julia & Turing
I know other conversions in the works. If I have missed something, please let me know.
Thinking again this morning that so much scientific writing is bad because it is written for hostile reviewers rather than for interested readers. Not sure there is a way out of this trap.
It is yet again a good time to post this great paper from Xiao-Li Meng on how data quality influences effective sample size. You really musk read it. [pdf: ]
Opening up Statistical Rethinking 2022 to external registration. All the materials will be public, so registration only necessary if you want to join weekly online discussion. Starts in January.
Details:
Register:
I spend a lot of time complaining about statistical practice, so recent followers might appreciate that I also spend a lot of time trying to present reasoned solutions. Like 20+h of free lectures on computational Bayes & causal inference. Vibe to trailer.
72% of corresponding authors (CA) responded positively to a raw data request when the CA was an early career researcher. Same figure for senior researchers was only 11%.
#openscience
#hope
#paywall
10% of your genome is composed of traces of Alu, a mobile element that has jumped around our DNA for millions of years doing nothing important. But one of those jumps may have caused our distant ancestors to lose their tails, about 15 million years ago!
My Stat Rethinking course this winter has filled up. 160 people joining me each week through the magic of Zoom to work the problem sets and address conceptual questions. Everyone else is welcome to the materials, including lectures & problem set solutions:
As summer gets started, here is a reminder that all 20 lectures of my 2019 applied bayesian stats course are online (with all notes and exercises/solutions):
hey @rlmcelreath - just finished your 2019 rethinking course, all with exercises and owls done :)
it was truly amazing and can't stop recommending it to people!
thank you!
Norway’s $1.5 trillion sovereign wealth fund lost $92 million because of an excel spreadsheet error - "the most consequential misdated cell in history"?
I teach the Kalman filter as a special implementation of a class of Gaussian Processes (GPs). Much of the modern world runs on these algorithms so a shame they are not more central to training, if that's the case.
The Kalman Filter was once a core topic in EECS curricula. Given its relevance to ML, RL, Ctrl/Robotics, I'm surprised that most researchers don't know much about it, and many papers just rediscover it. KF seems messy & complicated, but the intuition behind it is invaluable
1/4
Every month, I send someone the "Table 2 Fallacy" paper. Let's stop bike-shedding over p-values and face the fact that most scientists have no idea what a regression coefficient means in the first place. Link:
This debunked finding keeps coming back. Humans are at least as good at this task as the chimpanzees. It's a practice effect. See replies in the thread.
Good television makes for bad theorizing.
One of the hard things about immigrating to Germany is the general lack of positive feedback. I was born here, but I'm still very American, so I give positive feedback & it makes ppl uncomfortable sometimes.
This from the Max Planck Society's guide for foreign scientists:
Open registration begins for the 2023 session of my annual anti-statistics course focusing on causal inference & Bayesian data analysis & fully coded examples. Lectures will be free online, so no need to register unless you want to join discussions.
You may know that I wrote a code-heavy, jokey, entropy-loving applied Bayes stat book.
But you may not know that altruistic colleagues have translated the examples into:
(1) tidyverse + brms
(2) Python
(3) raw Stan
(4) Julia
Everything linked at top:
Causal salad, causal design, causal inference. I did a 3 hour workshop in Leipzig yesterday on causal inference, aka why your regressions are garbage lolsob. Here's a recording of me covering the same content, I promise it's not boring:
I'm taking my steps to becoming a bayesian (but I'm gonna bitch and complain my whole way about it)
I'm gonna say this is the book that literally converted me. It's beautifully written in a way that makes me think of reading a novel almost.
Looking for a distraction? Because reasons.
How about 20 hours of causal inference and Bayesian statistics? Ranging from the foundations of inference to high-dimensional machine learning? Yeah that's the stuff. First sample is free. Okay it's all free.
Likert scores are not integers and they cannot be subdued by pretense. Stop pretending and meet me in the warm 3rd circle of stats hell and learn about ordered categorical models. Lecture:
Things I do not do often enough:
1. inflate my bicycle tires
2. descale my de'longhi
3. call my mom
4. remind you that I made 20 hours of free bayes stats (really anti-stats) lectures because i love you
Trying to improve my upcoming lecture and I know I will spend 15 minutes looking for a cat pic that more closely matches the tiger's pose. But these are almost perfect.
Okay I give up: A p-value really is the probability the null is true. We lost this game, statisticians. Every one of you gave it your best, and I will always be proud of you. But the scientists cannot be defeated by conventional means. GG
For all the people who followed me recently because I posted some weird posterior distributions, I have made more than 20 hours of free lectures on Bayesian stats and causal inference just for you👇
If I am not answering your email, it's because I'm working on new (free) lectures to begin in January 2022. Fewer examples, but more workflow details and lots of new animation. I'll update this repo as the schedule and materials assemble:
Readers of my book will know the globe tossing example in the early chapters that I use to introduce bayesian updating. I have now fully virtualized it.
“We show that published papers in top journals that fail to replicate are cited more than those that replicate. This difference doesn't change after publication of failure to replicate. 12% of postreplication citations acknowledge the replication failure.”
Scientist, post-doc, and PhD positions open in my department in Leipzig:
Things we value in candidates: Open science, scholarship, mad skills
Things we do not value: Number of pubs, journal impact factor, h-index
Lecture recordings, slides, homework sets and solutions are all listed here. Take it at your own pace, as you like it. First half is a solid course in regression and causal inference. Second half turns it up to 11.
I was asked by a journalist how Bayesian stats is relevant to the epidemic. I said some deflationary things. I don't care about 19th century academic debates.
I worry more about narrative that we need to get the models "right". We don't buy insurance bc we know what will happen.
This is the 2 page template my PhD students and I use to draft their project proposals. This came up this morning, as I met with all the PhD students to review a bunch of committee procedures. Students seem to like this template.
Just finished 2nd week of my bayes & causal inference course. Prerecording for internet audience is satisfying. Fewer spontaneous jokes but better content. First 4 lectures as alternative to doomscroll, my gift to you.
Yes I will offer my online science-focused Statistical Rethinking course again starting in January 2024. Registration going up at the end of this month. All the course materials are already online though, so why wait? Update your posterior today
So much Machine Learning snark in my timeline. But I honestly wouldn't be surprised if logistic regression would be a moon shot level of improvement in many industries. In some industries, coin flipping might increase accuracy.
That's my optimistic cynicism for the week.
Golems, Owls & DAGs: Lecture 1 of Statistical Rethinking 2022. No hard work yet in this lecture. Just setting the stage. Lecture 2 dropping soon with Bayesian updating.
Science as Amateur Software Development
50min talk, webcam edition
As frustrating as software engineering can be, it is still more professional than normative scientific research. Lots of shade thrown at academia, some hopefully useful suggestions.
This short comment from Andrew Gelman on designing experiments is pointed and useful. Love the de-emphasis on power analysis but emphasis on simulating and making hard choices. PDF:
Slides + audio: A gentle 2 hour introduction to Bayesian data analysis & causal inference. This is "gentle" because it ignores computation & focuses instead on motivation, basics of Bayesian updating, simple confounds & colliders.
I complain a lot about academia. But I also try to help. eg here are 20h of patient lectures on scientific inference taught at an algorithmic level. No one should have to learn this stuff the way that I did.
Local registration for the 2024 round of my Statistical Rethinking course has begun. I'll open up registration on Sunday 3 December. Registration link will appear on the course github page:
Since I am talking about p-values today (only day this year I promise), the common claim that p-vals are uniform under the null is not in general true. Even in theory. Here is the dist of p-values under null for logistic regression, two groups:
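The plot from this post is not preserved here. As a stand-in, a rough Python sketch of the same idea, under assumptions of my own choosing (20 observations per group, success probability 0.5 in both groups, 5000 simulations). With a single binary group predictor, the logistic regression slope is the sample log odds ratio, so the Wald p-value can be computed directly from the 2x2 table. Simulated tables with an empty cell are skipped, since the estimate does not exist there; that is a simplification, not part of the original claim.

```python
import math
import random


def wald_p_value(a, b, c, d):
    """Two-sided Wald p-value for the log odds ratio of a 2x2 table.

    a, b = successes, failures in group 1; c, d = same for group 2.
    This matches the Wald test on the group coefficient of a logistic
    regression with one binary predictor.
    """
    log_odds_ratio = math.log((a * d) / (b * c))
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = abs(log_odds_ratio / std_error)
    # Two-sided normal tail probability: 2 * (1 - Phi(z)) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


def simulate_null_p_values(n_sims=5000, n_per_group=20, p_success=0.5, seed=1):
    """Simulate p-values when both groups share the same success probability."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_sims):
        a = sum(rng.random() < p_success for _ in range(n_per_group))
        c = sum(rng.random() < p_success for _ in range(n_per_group))
        b, d = n_per_group - a, n_per_group - c
        if min(a, b, c, d) == 0:
            continue  # assumption: drop tables where the MLE is undefined
        p_values.append(wald_p_value(a, b, c, d))
    return p_values


p_values = simulate_null_p_values()
share_at_one = sum(p > 0.999 for p in p_values) / len(p_values)
share_below_05 = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"share of p-values equal to 1: {share_at_one:.3f}")
print(f"share of p-values below 0.05: {share_below_05:.3f}")
```

A uniform distribution would put essentially no mass exactly at 1. Here every simulated data set in which the two groups have the same number of successes gives p = 1, so the null distribution is lumpy rather than flat.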
Okay, I added vertical axes, as is my duty. Trying to put some version of this on cover of 2nd edition of Statistical Rethinking. It's an example in Chapter 4.
My editor is turning up the heat & the 2nd edition of Statistical Rethinking will be out next year. Below is 1st page of last chapter, summarizing my general attitude to stats. More here:
Multiple regression is not an oracle that spits out the total causal effects of each explanatory variable. Adding variables can hurt as much as it can help, whether a study is observational or experimental.
Brand new chapter in 2nd edition of my stats book will be about using real scientific models to build statistical analyses. Need to edit it down now to only 3 distinct examples. Feel like I could do an entire book of examples like these.
In stats consultations lately, common problem across domains has been that scientists want to start with what they have measured (or can download) instead of what they would ideally want to measure. Gotta back them up and get them to science before stats. Hard.
I hope everyone is having a relaxing holiday season. I am still making my gift to you, a bunch of new lectures for January. This is a lot of work! Teaching continues to be the hardest and most impactful part of my job. Below: Drawing the Bayesian owl.
Many performers of music cannot read it. Okay. There are other, often more intuitive, ways to learn music.
Scientists perform stat models. Most scientists cannot read them. This is less OK, but there are other ways to learn models.
Short thread in which I strain this comparison
I remember being taught at UCLA to always model multinomial this way bc of the efficiency. All count distributions are 1+ Poissons in a trench coat. Page 365 of my book:
Language/library translations of my Statistical Rethinking book examples keep growing, now include:
pure Stan, brms + tidyverse, PyMC3, NumPyro, Tensorflow Probability, Julia + Turing, R-INLA
Together with 20h open lectures, there is enough for everyone:
On another site, someone asked me how I feel about "standardized effect sizes". Like many statisticians, I don't much like them, when used to compare "effect sizes". Some sources in next tweets.
It is often said CORRELATION IS NOT CAUSATION. But Karl Pearson of correlation coefficient fame actually argued that it is. Excerpt below from 1911 book "Grammar of Science" p 170. Nowhere does he seem to realize a distinction btw prediction and intervention. It's baffling.
Interdisciplinary means my hard science collabs enter my office and say "wow you have a lot of books" while my humanities collabs enter and say "is this all of your books?"
A common approach to causal inference in biological sciences is to use predictive model selection, e.g. AIC. This does not work! AIC prefers confounded models to un-confounded models.
Here's an example, in base R code. >>
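The R code from the thread is not reproduced in this feed. As a stand-in, here is a Python sketch of one way the claim can show up. It is not the author's example. The setup is an assumption: X has no effect on Y, and Z is a collider caused by both. The tiny `ols` solver and the Gaussian AIC formula (constants dropped) are written out so the block needs only the standard library.

```python
import math
import random


def ols(columns, y):
    """Least squares via the normal equations; returns (coefficients, RSS)."""
    k = len(columns)
    # Build the augmented system [X'X | X'y].
    aug = [
        [sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(u * v for u, v in zip(columns[i], y))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for cc in range(col, k + 1):
                aug[r][cc] -= factor * aug[col][cc]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - tail) / aug[i][i]
    rss = sum(
        (yi - sum(b * col[t] for b, col in zip(beta, columns))) ** 2
        for t, yi in enumerate(y)
    )
    return beta, rss


def gaussian_aic(rss, n, n_coefs):
    """AIC for a Gaussian linear model, dropping additive constants."""
    return n * math.log(rss / n) + 2 * (n_coefs + 1)  # +1 for the variance


rng = random.Random(7)
n = 1000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]  # assumption: true effect of X on Y is 0
z = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]  # collider: X -> Z <- Y
ones = [1.0] * n

beta_plain, rss_plain = ols([ones, x], y)          # Y ~ X
beta_collider, rss_collider = ols([ones, x, z], y)  # Y ~ X + Z
aic_plain = gaussian_aic(rss_plain, n, 2)
aic_collider = gaussian_aic(rss_collider, n, 3)

print(f"Y ~ X:     b_X = {beta_plain[1]:+.3f}, AIC = {aic_plain:.1f}")
print(f"Y ~ X + Z: b_X = {beta_collider[1]:+.3f}, AIC = {aic_collider:.1f}")
```

In this setup the model that conditions on Z predicts Y better and gets the lower AIC, while its coefficient on X sits near -0.5 even though the true effect is zero. Predictive ranking and causal correctness are answering different questions.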