Jonathan Fly 👾
@jonathanfly
Followers: 6K · Following: 39K · Media: 2K · Statuses: 7K
CEO of bad ideas. Using the wrong tool. The least efficient way. For no good reason. 👾 https://t.co/9O94rxu31k https://t.co/DDPn3Nlhom
Joined April 2009
Bark Text-to-Audio Model. Full text input: "Why was six afraid of seven?" Ignore Bark's "I'm done with this input" token and tell Bark to just keep generating more audio anyway.
65
276
2K
I just saw @tszzl say that GPT-5 *thinking* specifically is the model that should be better at writing, so I ran the Songs-To-Neon eval again: 0.0. First song, first verse, first line. https://t.co/n4iH22wIF8
we've been testing some new methods for improving writing quality. you may have seen @sama's demo in late march; GPT-5-thinking uses similar ideas. it doesn't make a lot of sense to talk about better writing or worse writing and not really worth the debate. i think the model
0
0
4
GPT-5 gets a 1.2 on my personal STN benchmark. Songs-To-Neon. That means GPT-5 made it through one full song and two verses of a second song before using the word "Neon" in song lyrics.
1
1
9
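A rough sketch of how an STN-style score could be computed, inferred only from the two results above (0.0 for "first song, first verse", 1.2 for "one full song and two verses of a second song"). The function name `stn_score`, the input shape (a list of songs, each a list of verse strings), and the string return format are all assumptions for illustration, not the author's actual eval.

```python
import re

def stn_score(songs, word="neon"):
    """Return "<full songs>.<full verses>" completed before `word` first appears.

    `songs` is a list of songs; each song is a list of verse strings.
    This input shape is an assumption made for this sketch.
    """
    # Whole-word, case-insensitive match, so "neonatal" does not count.
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    for song_idx, verses in enumerate(songs):
        for verse_idx, verse in enumerate(verses):
            if pattern.search(verse):
                # song_idx songs finished, verse_idx verses of this song finished.
                return f"{song_idx}.{verse_idx}"
    # The word never appeared: every song counts as completed.
    return f"{len(songs)}.0"

if __name__ == "__main__":
    demo = [
        ["quiet streets at dawn", "coffee and rain"],
        ["first verse", "second verse", "under the neon glow"],
    ]
    print(stn_score(demo))
```

The score is returned as a string so that a result like 1.10 (ten verses into the second song) is not confused with 1.1.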
A podcast tackling tongue twisters, with the hosts trying to speak entirely in tongue twisters. Sort of successful. The AI hosts speak spotlessly, but react as if they were stumbling over the syllables. At one point literally saying "stumbles over the words".
0
1
10
The AI podcasters find meaning even in empty spaces. Literally. All the other tools punted on a file of empty spaces but the podcasters make it work. "Next time we'll talk about how you can actually use this empty space idea in your life, practical stuff."
4
13
76
NotebookLM tries hard to avoid hallucinating. But what if your data is total nonsense? Then it hallucinates an epic conspiracy. Over and over, the podcast hosts figure out the text is backwards and try to understand the greater meaning of "KCUF". (data is reversed-text dril
5
6
41
Google's NotebookLM generates an AI podcast from any document. Weirdly the podcast even had space for ad breaks. A document of only dril tweets is more coherent than I expected - mostly psychoanalysis. "He's always seeking validation, then lashing out at anyone criticizing him."
0
0
13
2002's "Star Wars Kid" Special Edition. Gen-3 Alpha does well with fast-moving lighting, like at 0:50. Some very strange lightsaber *grips*, but these are complicated motions, and I made things more confusing by prompting "lightsabers" plural.
0
1
8
Ocarina of Time "Yarn-ified" editions seem to gender swap Zelda for Link? Also fun to see Gen-3 Alpha V2V interpret video that isn't consistent frame-to-frame as "Inception" style warping landscapes (text prompts are identical).
0
0
1
Crowdstrike have advised that the world will be reverted to its last valid backup set, dated 7 Jan 2014, within the next 30 minutes. Please make paper notes of anything important to you from the intervening period, and tape them to your refrigerator door in a prominent position.
65
784
6K
I wrote a tool called PySkyWiFi that gives you completely free, unbelievably stupid wi-fi on long-haul flights. It tunnels data through the "first name" field in your airmiles account, and can reach speeds of up to several bytes per second. https://t.co/Le5Ezv6fG4
robertheaton.com
The plane reached 10,000ft. I took out my laptop, planning to peruse the internet and maybe do a little work if I got really desperate.
70
669
6K
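To illustrate the general idea of tunneling data through a name field, here is a minimal sketch of packing bytes into letters-only strings that a profile field might accept. This is not PySkyWiFi's actual encoding or protocol: the nibble-to-letter scheme, the function names, and the 20-character field limit (`MAX_FIELD_LEN`) are all assumptions made for this example.

```python
from typing import List

MAX_FIELD_LEN = 20  # assumed character limit of the name field (hypothetical)

def encode_chunk(data: bytes) -> str:
    """Encode each byte as two letters in a..p (one letter per 4-bit nibble)."""
    return "".join(
        chr(ord("a") + (b >> 4)) + chr(ord("a") + (b & 0x0F)) for b in data
    )

def decode_chunk(name: str) -> bytes:
    """Reverse encode_chunk: every pair of letters becomes one byte."""
    if len(name) % 2 != 0:
        raise ValueError("encoded text must have an even number of letters")
    return bytes(
        ((ord(name[i]) - ord("a")) << 4) | (ord(name[i + 1]) - ord("a"))
        for i in range(0, len(name), 2)
    )

def split_into_fields(data: bytes, max_len: int = MAX_FIELD_LEN) -> List[str]:
    """Split a payload into successive field values of at most max_len letters."""
    step = max_len // 2  # two letters per byte
    return [encode_chunk(data[i:i + step]) for i in range(0, len(data), step)]

if __name__ == "__main__":
    payload = b"hello world!!"
    fields = split_into_fields(payload)
    # Each string would be written to the field in turn; the reader decodes them.
    assert b"".join(decode_chunk(f) for f in fields) == payload
    print(fields)
```

With two letters per byte and one field write per chunk, throughput is bounded by how often the field can be updated and read, which is consistent with the "several bytes per second" figure quoted above.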
It's interesting how the uncanny movements of the original stop motion skeletons are preserved in traditional frame interpolation. Maybe it's the lack of motion blur on the skeletons?
99
11
307
Luma's start and end keyframes are a game changer. With a sequence of keyframes from the original film, we can seamlessly remaster stop motion classics like "Jason and the Argonauts" as modern single-take action scenes.
3K
115
1K
Trying out new lipsync models @hedra_labs and Hallo https://t.co/TxhKDOWFDt Before Suno and Udio took over AI music, I enjoyed trying to use Bark TTS as a singing text-to-music model. Bark is a terrible music model, but the three-model architecture allows for some fun possibilities
12
2
25
SimpleSpeech TTS models the sentence-level duration prior by asking GPT-3.5 to predict sentence durations then lets "the model learn alignment between words implicitly." Can GPT possibly be adding anything useful here over simply counting words or characters, with some
1
0
3
I trained a Mamba audio model on 150 hours of YouTube poetry videos based on https://t.co/0diRzIAkC1 It doesn't make sense, but it *sounds* right, like a poetry reading in "The Sims" game.
1
0
10
Dial Up Modem for Whistle, Distorted Violin, and Milkdrop. Playlist of way too many modem songs: https://t.co/PLKgKvDSLf
0
2
7
"sound-to-song" @suno_ai_'s Audio Upload feature is now LIVE for everyone. Try anything as an audio prompt, go wild. "Take it Easy Dracula" 🧛🌱 All audio and dialog after the color shift is generated as part of the song - the script is in the lyrics prompt. Source: Little
4
1
9
NASA Radio Chatter is a great source of public domain audio to prompt with. https://t.co/mWMFo4FnBR
0
0
3
@suno_ai_ The audio input here is two completely different songs playing on top of each other. It sounds terrible but it doesn't matter - it's not part of the song. It's fun to see Suno try to make sense of the musical madness and generate a musically coherent blend of the Doom-Metal and
1
0
8