Noah Dasanaike
@dasanaike
Followers: 359
Following: 899
Media: 13
Statuses: 51
PhD Candidate, Carl J. Friedrich Fellow, @harvard Government
Joined April 2021
socOCRbench has been updated to include more than 40 open-source and proprietary VLMs!
Social scientists are increasingly turning to VLMs to digitize data for comparative research, but conventional benchmarks are poor discriminators on our "messier" domains. Today, I'm releasing the first iteration of socOCRbench, designed as a more difficult task for evaluation.
You can find more details on socOCRbench, including the full results table, at https://t.co/5eNCM7KnzW.
What I've Learned From Digitizing 20 Million Historical Documents https://t.co/KAQOgctYXM
In a new working paper, I propose a state-of-the-art method for record linkage in Python and R that requires neither training labels nor fine-tuning. Correct examples from benchmark applications shown below. Try it out today: https://t.co/qCETLwSXQP
Various updates to the paper: reasoning improves small (but not large) model performance; fine-tuning small (0.6B!) models from just a handful of large model labels beats BISG; and LLMs do not tend to suffer as greatly from racial misclassification bias.
In a new working paper, I show that large language models recover ethnicity from names; outperform existing methods; perform well comparatively, using original voter rolls in countries like Armenia and Nepal; and generalize to any classification group. https://t.co/BJffqUDprx
Dictatorial drift happens when "soft" authoritarian regimes transform into highly repressive dictatorships. This dangerous phenomenon must not be ignored. https://t.co/07ZP5ih2bJ
In honor of the arrest of Duterte, the most detailed election map ever constructed of the Philippines, showing results for the 2022 presidential election at the barangay (barrio) level.
In case you need a reminder of who the real dictator is, here's a map of who "won" each polling station in the 2024 Russian presidential "election."
SAGE brings the granularity of available results for the 2019 Indian Lok Sabha election down from an average of 2 million voters in each of 543 constituencies to about 1,000 voters in each of nearly a million polling stations.
SAGE also enables analyses of previously more democratic elections in several current autocracies. Take, for instance, the 2013 Venezuelan presidential elections, mapped below at the polling station level.
Polling station data from SAGE reveal considerable spatial variation in the 2021 Hong Kong elections, with pro-establishment strongholds spread across the New Territories, mixed support patterns through Kowloon, and pockets of opposition votes concentrated on Hong Kong Island.
If you're interested in seeing any detailed election results from the Small-Area Global Elections (SAGE) archive, let me know in the replies. I'll start with parliamentary elections in Poland in 1991 and 2023.
To be notified when SAGE is released alongside the corresponding paper, fill out this form (just email and affiliation): https://t.co/FESr22U8oG. The full working paper can be found here: https://t.co/w36it5jemc
I propose several possible mechanisms that shape whether conditions of discordant composition arise and, in turn, whether urban-rural polarization follows: economic structure, sociocultural organization, institutional legacies, and the nature of modernization. (7/8)
To partially explain these findings, I introduce a theory of “discordant composition”: urban–rural cleavages arise when politically salient traits cluster geographically, letting parties tailor local appeals. Without such clustering, the divide is muted. (6/8)