Ruben C. Arslan
@rubenarslan
Followers
5K
Following
13K
Media
536
Statuses
11K
Bayescurious evidence enthusiast @the100ci & @error_reviews. Topics: evolution, ovulation, mutation, personality, sexuality, R, open science & source tools.
Berlin, Germany
Joined April 2013
New work by @BjoernHommel and me. We fine-tuned a language model to predict correlations between survey items. In our pilot, the out-of-sample accuracy for item correlations was .71, .86 for reliabilities, and .89 for scale correlations. We're planning a preregistered follow-up.
4
29
102
Delighted to be launching this with Stefan today. I think it will fast become a must-read for those of us interested in what's _actually true_ about the news in tech, policy, science and other stuff that matters.
I'm starting a newsletter in the spirit of my tweets: The Update. https://t.co/xHbn4C1Fyv I'm aiming to distil the most important news, analyses, and Twitter debates for you. Tell your friends, and please subscribe!
3
6
114
I'm hiring at Witten/Herdecke University! Position could be for a PhD or postdoc, involves research and teaching.
1
0
1
Made a site comparing the sizes of living things :) The great Julius Csotonyi spent 5 months painting over 60 illustrations for the site, no ai used
334
6K
52K
Big new blogpost! My guide to data visualization, which includes a very long table of contents, tons of charts, and more. --> Why data visualization matters and how to make charts more effective, clear, transparent, and sometimes, beautiful. https://t.co/hDQhDL5rR1
14
204
1K
A new semantic search engine for survey questions! This is based on our SurveyBot3000 model (to embed items) and 31k instruments from the APA PsycTests database.
1
4
6
the sketchy study from the 1970s: 7000 citations, books, podcasts, a parade, folkloric status, you've heard of it the n>10,000 preregistered open data failed replication: crickets
10
43
487
Oh and this is work by Björn Hommel, Annika Külpmann and me, but this team has also fragmented itself across different social media sites.
0
0
1
We are very interested in growing the database of measures and adding more measures that are freely available. Get in touch if you want to contribute to this effort or have other feedback.
1
0
1
You can read the preprint about the search engine here: https://t.co/KLNJVmcfuv And try out the SurveyBot3000 with your own items here:
huggingface.co
1
0
1
The search engine is free to use here: https://t.co/OHv5HRIl6t The APA PsycTests DB is unfortunately copyrighted, so you'll have to click through to see details and we can't share the items. Sometimes they are public though and many unis are subscribed to APA.
huggingface.co
1
0
1
Obviously, empirically checking associations with every one of 31k instruments (and 74k) scales is not possible. We advocate that you do check empirically, not just with synthetic correlations. If you decide to develop something new, the search can help find candidates.
1
0
0
The results are sorted by synthetic correlation, i.e. whether our fine-tuned language model thinks the items are similar. This doesn't perfectly approximate empirical correlations, but well enough to help sort results. See here for the validation study: https://t.co/6kpzmPWZn4
New work by @BjoernHommel and me. We fine-tuned a language model to predict correlations between survey items. In our pilot, the out-of-sample accuracy for item correlations was .71, .86 for reliabilities, and .89 for scale correlations. We're planning a preregistered follow-up.
1
0
0
But partly, we face a real, solvable search problem when trying to connect our fuzzy constructs and flexible measures. SynthNet is our attempt to help. Don't reinvent the wheel; plug your items into the search engine first. Maybe there's a good validated scale already.
1
0
0
Fragmentation has worsened, not decreased, as the field has grown. Partly, this happens because we have too many reverse Columbuses, who, in search of prestige, set out to find a new continent, but just end up renaming India.
1
0
0
What for? We have too many measures and constructs in psychology — no one has an overview and it sometimes seems easier to roll your own than to find existing scales and measures. But that way proliferation worsens; the field fragments.
Farid just posted an update on our preprint about construct and measure proliferation. We changed the discussion a bit to reflect our current thinking. And I updated the treemap plots to better capture the fragmentation in measurement in psychology.
1
0
0
A new semantic search engine for survey questions! This is based on our SurveyBot3000 model (to embed items) and 31k instruments from the APA PsycTests database.
1
4
6
Really happy to see this. Appreciate this a lot
I am working to address an apparent error for a data point I cited in my book about the water footprint of a proposed data center in Chile. I’d like to explain what happened, what I’m doing to remedy it, and provide more recent data on the water footprint of data centers. 1/
3
11
448
This is the single most massive factual error in a major book I've ever personally noticed on my own, and I think I'm the first person to notice it? Empire of AI asserts that a data center is using 1000x as much water as a city. In reality, it's 22% of the city's water.
70
241
3K
@tszzl > Go to the drug's Wikipedia article > "This drug [was] developed with support from the Bill and Melinda Gates foundation via their Medicine for Malaria Venture."
7
21
1K