Data Science Fact
@DataSciFact
Followers: 195K
Following: 124
Media: 569
Statuses: 11K
Daily data science tweets from @JohnDCook.
Houston
Joined November 2010
Pre-registration of clinical trials is associated with fewer positive findings
theincidentaleconomist.com
Recently, Austin and I wrote about problems in the reporting of medical research. We asked, How can we get serious about creating an open, valid, and reliable scientific literature? Among the reforms...
“Re-examination of the 3/4-law of metabolism” and “Toward a metabolic theory of ecology” https://t.co/WyVEscTqdK
Brief introduction to fractional factorial designs
johndcook.com
Suppose a black box takes N binary inputs but only M of these inputs matter. A fractional factorial experiment design will require far fewer than 2^N runs.
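As a rough illustration of the idea, here is a minimal sketch of a half-fraction design. It assumes factors coded as -1/+1 and uses the textbook generator in which the last factor is the product of the others; the function name `half_fraction` is hypothetical, and this is not necessarily the construction used in the linked post.

```python
from itertools import product

def half_fraction(n_factors):
    """Build a 2^(n-1) design: a full factorial in the first n-1 factors,
    with the last factor set to the product of the others (coded -1/+1)."""
    runs = []
    for base in product((-1, 1), repeat=n_factors - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs

# Three factors: 4 runs instead of the 2^3 = 8 of a full factorial.
design = half_fraction(3)
for run in design:
    print(run)
```

Each column is balanced and every pair of columns is orthogonal, so main effects can still be estimated, at the cost of confounding them with interactions.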
If you A/B test all of the ingredients of a cake individually, the result will be a terrible cake.
The only families of probability distributions that admit a sufficient statistic whose dimension remains bounded as the sample size increases are exponential families. -- Pitman–Koopman–Darmois theorem
The transform y = arcsine(√x) can make proportion data approximately normally distributed.
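A closely related property of this transform is that it roughly stabilizes variance: for a binomial proportion from n trials, the variance of arcsin(√p̂) is close to 1/(4n) whatever the true p. The sketch below checks this by exact enumeration; the helper names and the choices n = 200, p = 0.2 and p = 0.5 are assumptions made for illustration.

```python
import math

def arcsine_sqrt(x):
    """The transform y = arcsin(sqrt(x)) for a proportion x in [0, 1]."""
    return math.asin(math.sqrt(x))

def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def variance_of(transform, n, p):
    """Exact variance of transform(k/n) when k ~ Binomial(n, p)."""
    weights = [binom_pmf(k, n, p) for k in range(n + 1)]
    values = [transform(k / n) for k in range(n + 1)]
    mean = sum(v * w for v, w in zip(values, weights))
    return sum((v - mean) ** 2 * w for v, w in zip(values, weights))

n = 200  # illustrative sample size
probs = (0.2, 0.5)
raw = [variance_of(lambda x: x, n, p) for p in probs]
transformed = [variance_of(arcsine_sqrt, n, p) for p in probs]

for p, r, t in zip(probs, raw, transformed):
    print(f"p={p}: raw variance {r:.6f}, transformed variance {t:.6f}")
```

The raw variances differ with p, while the transformed variances should nearly coincide.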
Information content of age, birthday, and birth date
johndcook.com
How much information is contained in age, birthday, and birth date? Latanya Sweeney's result that this is enough to identify most people.
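For a back-of-the-envelope version of this calculation, one can assume every outcome is equally likely, so that a choice among n outcomes carries log2(n) bits. The sketch below does that. The 80-year age range is a hypothetical placeholder, age is treated as standing in for birth year, and the uniform assumption overstates the information compared with a real population distribution.

```python
import math

def bits(n_outcomes):
    """Information, in bits, of one outcome among n equally likely ones."""
    return math.log2(n_outcomes)

AGE_RANGE_YEARS = 80  # hypothetical: assumes ages 0..79 are equally likely

birthday_bits = bits(365)  # day of the year, ignoring leap days
age_bits = bits(AGE_RANGE_YEARS)
# Birth date is roughly birthday plus birth year; independent pieces add.
birth_date_bits = birthday_bits + age_bits

print(f"birthday:   {birthday_bits:.2f} bits")
print(f"age:        {age_bits:.2f} bits")
print(f"birth date: {birth_date_bits:.2f} bits")
```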
One person's metadata is another person's data. It's all data, just different perspectives.
'Data inherently has all of the foibles of being human. Data is not a magic force in society; it's an extension of us.' -- Mark Hansen
A small p-value means the data you saw were unlikely, assuming the null hypothesis AND assuming all your implicit assumptions are true.
Best fit to three data points
johndcook.com
Find the best fit to three data points, best in the sense of minimizing the maximum error.
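Assuming the fit in question is a straight line and the three x values are distinct, the minimax line is parallel to the chord through the two outer points and sits halfway between that chord and the middle point, so the three residuals have equal size and alternating sign. A minimal sketch, with a hypothetical function name and made-up sample points:

```python
def minimax_line(points):
    """Line minimizing the maximum absolute error over three points.
    Assumes exactly three points with distinct x values."""
    (x1, y1), (x2, y2), (x3, y3) = sorted(points)
    slope = (y3 - y1) / (x3 - x1)         # parallel to the outer chord
    chord_at_x2 = y1 + slope * (x2 - x1)  # chord height above the middle x
    shift = (y2 - chord_at_x2) / 2        # move halfway toward the middle point
    intercept = y1 - slope * x1 + shift
    return slope, intercept

sample = [(0, 0), (1, 3), (2, 2)]  # illustrative data only
slope, intercept = minimax_line(sample)
residuals = [y - (slope * x + intercept) for x, y in sample]
print(slope, intercept, residuals)
```

For these sample points the line is y = x + 1 and the residuals are -1, 1, -1.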