Ben Schmidt / @benmschmidt@sigmoid.social

@benmschmidt

Followers
9K
Following
2K
Media
1K
Statuses
8K

VP of Information Design @nomic_ai, building new ways to interpret and shape embedding models. Onetime history/digital humanities prof. @bschmidt.bsky.social

Montclair/Manhattan
Joined December 2010
@benmschmidt
Ben Schmidt / @benmschmidt@sigmoid.social
3 years
Read and explore this rich interactive of 20 *million* research articles from PubMed, a project we're releasing today with @ritagonmar and @hippopedoid. It's a *beautiful* embedding structure, a fascinating, complete corpus. Some highlights (thread) https://t.co/qzcZd2eKnB
13
223
752
@nomic_ai
Nomic
9 days
AI systems excel in domains that have abundant coverage in internet data. Large sectors of the economy are not digital-native. Their data, processes, and workflows are governed by signals that are out of distribution of foundation models. Introducing the new Nomic Platform
1
15
27
@andriy_mulyar
Andriy Mulyar
3 months
Nomic has a new X account. Stay tuned for some exciting updates over the next few months.
@nomic_ai
Nomic
3 months
We're re-branding! This is now the new official Nomic X account! Follow us for updates on new open-source AI models and platform developments!
1
1
10
@nomic_ai
Nomic
3 months
We're re-branding! This is now the new official Nomic X account! Follow us for updates on new open-source AI models and platform developments!
3
3
17
@andriy_mulyar
Andriy Mulyar
5 months
hiring an ml intern to work on vlm postraining for a special project, reports directly to me. must be exceptional. apply via dms.
10
14
226
@benmschmidt
Ben Schmidt / @benmschmidt@sigmoid.social
8 months
In general I try not to post high-quality original content to this account anymore, and I feel pretty confident that the above post doesn't violate that practice.
0
0
5
@benmschmidt
Ben Schmidt / @benmschmidt@sigmoid.social
8 months
it works
1
0
9
@calco_io
CalCo
9 months
Introducing Atlas Analyst: The Data Agent for Data Analytics Ask questions, get answers with references to your data, and immediately take action based on those insights.
1
14
59
@Dorialexander
Alexander Doria
9 months
Announcing the release of Common Corpus 2. The largest fully open corpus for pretraining comes back better than ever: 2 trillion tokens with document-level licensing, provenance and language information. https://t.co/sdN6qNJMHW
6
74
392
@andriy_mulyar
Andriy Mulyar
10 months
Hugging Face is the hub for AI datasets and today we bring every dataset to life with Nomic's first-class Hugging Face data connector. With a few clicks, you can now vector search, curate, and collaborate on any dataset in @huggingface https://t.co/YT8zu4s7fb
0
3
20
@vanstriendaniel
Daniel van Strien
10 months
I created a map for Hub dataset cards using this new connector in less than 5 minutes.
@calco_io
CalCo
10 months
Vector Search Any Hugging Face Dataset 🤗 Introducing the @huggingface Datasets Connector in Nomic Atlas https://t.co/eNVRuqiXO2
1
1
14
@calco_io
CalCo
10 months
Vector Search Any Hugging Face Dataset 🤗 Introducing the @huggingface Datasets Connector in Nomic Atlas https://t.co/eNVRuqiXO2
1
23
96
@calco_io
CalCo
11 months
Introducing Open-Source, On-Device Inference-Time Compute in GPT4All - New: GPT4All Reasoner v1 - Support for Code Interpreter, Tool Calling and Code Sandboxing. Inference-time compute is now available to every laptop in the world.
5
65
363
@EstecioJunior
Wilson Marcílio Jr
11 months
Comparing ModernBERT and BERT embeddings reveals some nice properties. The embeddings from the two base architectures show different features for this dataset in terms of class cohesion. https://t.co/a7C06Ei50n
1
6
21
@calco_io
CalCo
1 year
Ever wondered what the entire @Steam library of 100k games looks like mapped out? In this Atlas data map, you can explore neighborhoods of the landscape of video games and uncover hidden gems with descriptions that are semantically similar to your favorite titles!
4
20
74
@benmschmidt
Ben Schmidt / @benmschmidt@sigmoid.social
1 year
In the last five years there have been a lot of depressing and counterproductive university moves to cut humanities programs. This isn't one. Middle-tier PhD programs in geographic regions that are already saturated with underemployed humanities PhDs are Bad Things.
@rbthisted
Rob Townsend (also @rbtownsend.bsky.social)
1 year
"BU isn't accepting new Ph.D. students for the next academic year in a dozen humanities and social sciences programs, including philosophy, English and history."
5
0
18
@rbthisted
Rob Townsend (also @rbtownsend.bsky.social)
1 year
"BU isn't accepting new Ph.D. students for the next academic year in a dozen humanities and social sciences programs, including philosophy, English and history."
insidehighered.com
The university didn't announce its decision in a news release and hasn't fully explained it, but two deans blamed a new grad workers' union contract for the cutbacks to a dozen programs including...
0
7
7
@maria_antoniak
Maria Antoniak
1 year
I'm recruiting 1-2 PhD students to work with me at the University of Colorado Boulder! Looking for creative students with interests in NLP and Cultural Analytics. Boulder is a lovely college town 30 min from Denver and 1 hr from Rocky Mountain National Park 😎 Apply by Dec 15!
12
217
694
@mellymeldubs
Melanie Walsh
1 year
I'm recruiting a PhD student to join my group @uw_ischool in 2025-26. If you like the mountains and interdisciplinary research that blends data and culture, this could be a good fit! PhD apps due Dec 2: https://t.co/S2dMirSr0d More info about my group:
ischool.uw.edu
Details on how to apply to the Ph.D. in Information Science program.
4
123
340
@calco_io
CalCo
1 year
We're thrilled to welcome three experts as Nomic advisors, each bringing unique expertise that perfectly aligns with our mission to make AI and the data that goes into it more explainable and accessible.
4
8
53
@benmschmidt
Ben Schmidt / @benmschmidt@sigmoid.social
1 year
Really excited to start rolling out this scrollytelling mode for Atlas maps with an analysis of congressional tweets -- reach out if you have a story you want to tell about a big textual dataset!
@calco_io
CalCo
1 year
What do 3.2 million @X posts from Congress show about how US legislators talk? What are they posting about going into the 2024 US presidential election? Learn what we found in 3.2M posts from Congress using Nomic Atlas:
1
2
14