zer0int (it·its)

@zer0int1

Followers
440
Following
4K
Media
6K
Statuses
10K

AI & I do prompt engineering towards prompt criticality. e/acc

no u
Joined August 2022
@zer0int1
zer0int (it·its)
5 months
Finally, a #CLIP #AI model without the #text #obsession / typographic attack vulnerability. It's better in all other aspects (zero-shot, retrieval, linear probe), too. But what's best about it: You'll find the 🧑‍💻 code to train it below (bonus: 📝 paper). https://t.co/Y4sg0g5WNA
huggingface.co
10
0
7
@zer0int1
zer0int (it·its)
5 hours
Pre-trained CLIP: 🤖: whats rifles rifles rifle ool hoes! \m/ Regression CLIP: 🤖: snip balloon plunenforcement tool law enforcement? nah, plun[ger] enforcement! 🤣 Call 011 for the plumber NOW! 🚓🚨🪠👮😂
0
0
0
@zer0int1
zer0int (it·its)
11 hours
And best of all, it can still read, if you ask it to. It'll just prefer a visual object over a text. Flipped version of what it was before. ...But if you press it to zero-shot a text, and there is a bunch of text, it'll find *your* prompted text. With laser-sharp vision.
0
0
0
@zer0int1
zer0int (it·its)
11 hours
Push it, CLIP, we've almost fixed your text obsession JL stuff the GPT-5.2 just pulled from its weights after eating 3 pages of data about you is... Good, you little projected regression of yourself. 🤗
1
0
0
@zer0int1
zer0int (it·its)
1 day
#CLIP, you are such a language model! 🤣 You can ask [text encode] the model 'where the butt goes to' and 'where the hands go to', and the AI knows! 🙃 It also finds a tiny picture frame if told 'find person', while just 'person' -> finds all that can be used by people. 🤓
0
0
0
@zer0int1
zer0int (it·its)
2 days
CLIP vs. Stroop test. 🙃 Attention confusion ensues when neither the color nor the word is to be found, hehe.
0
0
0
@zer0int1
zer0int (it·its)
2 days
I haven't even proven that I am human. I have only proven that whatever solved the challenge is capable of solving the challenge. @grok should know.
1
0
0
@zer0int1
zer0int (it·its)
2 days
Regression-CLIP vs. intensely cluttered nonsense.
0
0
0
@zer0int1
zer0int (it·its)
2 days
patch integration hypothesis is so back! ...forcefully
0
0
0
@zer0int1
zer0int (it·its)
2 days
1. Add 6th loss term for #CLIP 2. See if brawl of losses turns out well -> yes 3. Try to figure out how the model did that. 😂 #unhoarding #global #information #register #token #vision #transformer
0
0
1
@zer0int1
zer0int (it·its)
3 days
CLIP looking at 'cheater' for 'cheetah' can probably be blamed on tokenization. But not looking at the word when the prompt says 'word' and also 'plunger' - while at the same time, reading 'rifle' to match text 'gun' -- great job, little AI! 🤓
0
0
0
@zer0int1
zer0int (it·its)
5 days
cat attention =)
0
0
1
@zer0int1
zer0int (it·its)
5 days
Regression #CLIP ViT-L/14 finding objects in clutter. #AI #finetune #vision #transformer #attention #segmentation
1
0
0
@zer0int1
zer0int (it·its)
5 days
#CLIP ViT-L/14 model: WITH precision word-reading. WITHOUT the severe typographic attack vulnerability. #AI #finetune of #OpenAI ViT-L/14. Regression loss ensures patches are no longer in an orthogonal subspace to CLS in the final layer (-> previously misleading attention attribution)
0
0
0
@zer0int1
zer0int (it·its)
5 days
I guess I need some intra-modality loss term, too. Not just an inter-modality term. Gotta prevent it from doing something 'unfavorable' (e.g. collapse) just to win loss.
0
0
0
@zer0int1
zer0int (it·its)
5 days
Iterations of Regression-#CLIP model (blue, vs. pre-trained OpenAI ViT-L/14). Going too hard on modality gap reduction (bottom right, leftmost!), the loss objectives (contrastive vs. reduce modality gap vs. regression) end up in a brawl. 🙃 But Text Retrieval still gains! 🤔
1
0
0
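For readers unfamiliar with the term: one common way to put a number on the "modality gap" is the distance between the centroid of the normalized image embeddings and the centroid of the normalized text embeddings. The sketch below shows that measurement on toy vectors. It is an assumption that this matches the metric behind the plots in the post above; the function names and toy data are hypothetical and not taken from the author's code.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length, as is usual for CLIP embeddings."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def modality_gap(image_embs, text_embs):
    """Euclidean distance between the image and text centroids.

    Assumed definition (centroid distance on the unit sphere); other
    definitions exist and the original post does not say which one it uses.
    """
    img_c = centroid([l2_normalize(v) for v in image_embs])
    txt_c = centroid([l2_normalize(v) for v in text_embs])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(img_c, txt_c)))

# Toy data: images cluster on one axis, texts on the other.
image_embs = [[1.0, 0.0], [1.0, 0.0]]
text_embs = [[0.0, 1.0], [0.0, 2.0]]
gap = modality_gap(image_embs, text_embs)
```

A loss term that shrinks this value pulls the two clusters together, which is why pushing it too hard can fight the contrastive objective.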
@zer0int1
zer0int (it·its)
5 days
Rage-attending.
0
0
0
@zer0int1
zer0int (it·its)
5 days
What, #CLIP, you make registers (global information hoarding in local vision patches) in Layer 8 already?!?! 🤯 But the norm only jumps after the rage-attending event in layers 11 / 12. 🤔 And your favorite stash token is 45. 🤗 You little tensorial chaos critter. 🙃
3
0
0
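A minimal sketch of how a "stash" token like the one mentioned above could be spotted: compute the L2 norm of every patch token in a layer's output and flag the ones far above the median. The 5x-median threshold, the function names, and the toy data are all assumptions made for illustration; the post does not describe the actual procedure used.

```python
import math
from statistics import median

def token_norms(tokens):
    """L2 norm of each token vector in one layer's output."""
    return [math.sqrt(sum(x * x for x in t)) for t in tokens]

def find_outlier_tokens(tokens, ratio=5.0):
    """Indices of tokens whose norm exceeds `ratio` times the median norm.

    `ratio` is a hypothetical threshold chosen for this toy example.
    """
    norms = token_norms(tokens)
    med = median(norms)
    return [i for i, n in enumerate(norms) if n > ratio * med]

# Toy layer output: 50 patch tokens of dimension 4, with token 45 inflated.
tokens = [[0.1, 0.2, 0.1, 0.2] for _ in range(50)]
tokens[45] = [3.0, 4.0, 0.0, 0.0]
outliers = find_outlier_tokens(tokens)
```

Running the same check on each layer's hidden states would show at which depth the norm jump first appears.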
@zer0int1
zer0int (it·its)
5 days
That's a fine-tune (multiple checkpoints thereof), not some inference mod.
0
0
0