Explore tweets tagged as #classifiers
New work! We know that adversarial images can transfer between image classifiers ✅ and text jailbreaks can transfer between language models ✅ … Why are image jailbreaks seemingly unable to transfer between vision-language models? ❌ We might know why… 🧵
Day 3 of my project with @TechCrushHQ After 4 hrs of training & tuning, I tested 4 classifiers: 🌳 Decision Tree 🌲 Random Forest ⚙️ SVM 📈 Gaussian Naive Bayes Got accuracies & confusion matrices. Tuning didn’t help much, so I scrapped it 😅 #AI #ML #BuildingInPublic
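A minimal sketch of that kind of four-way comparison with scikit-learn; the iris dataset, split, and default hyperparameters here are stand-ins, not the project's actual data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# The four classifiers named in the post, with default settings.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "SVM": SVC(),
    "Gaussian NB": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    results[name] = accuracy_score(y_te, pred)
    print(name, results[name])
    print(confusion_matrix(y_te, pred))
```

On a small, clean dataset like this, the four models land close together, which matches the observation that tuning barely moved the needle.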
Softmax keeps showing up in lots of ML algorithms — classifiers, attention, EBMs, and more. In this blog post, I walk through the history of the Boltzmann distribution and how it connects different ML setups. https://t.co/baUpoyIh7B
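The Boltzmann connection is easy to see in code: softmax is the Gibbs/Boltzmann distribution over "energies" −xᵢ at temperature T. A minimal stdlib sketch (the function name and toy inputs are mine, not from the post):

```python
import math

def softmax(logits, temperature=1.0):
    """Boltzmann form: p_i ∝ exp(x_i / T).
    Subtracting the max keeps exp() from overflowing
    without changing the resulting distribution."""
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    z = sum(exps)  # the partition function
    return [e / z for e in exps]

probs = softmax([1.0, 2.0, 3.0])
flat = softmax([1.0, 2.0, 3.0], temperature=100.0)  # high T flattens toward uniform
```

Classifier heads, attention weights, and EBM sampling all instantiate this same normalized-exponential form, differing mainly in what plays the role of energy and temperature.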
It's super cool to see the early work on better misalignment classifiers!
🧵 1/11 Reasoning's Razor: Does reasoning help or hurt precision-sensitive tasks? ⚔️ Reasoning models often boost accuracy—but do they hold up when false positives are costly? Precision-sensitive classifiers like safety classifiers and hallucination detectors must operate at
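The precision-sensitivity point can be made concrete: for a safety or hallucination classifier you tune a decision threshold on scores, trading recall for precision. A toy stdlib sketch (the data and threshold values are illustrative, not from the thread):

```python
def precision_recall(y_true, scores, threshold):
    """Precision/recall of the rule 'flag if score >= threshold' (label 1 = unsafe)."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

p_low, r_low = precision_recall(y_true, scores, 0.5)    # looser threshold
p_high, r_high = precision_recall(y_true, scores, 0.75)  # stricter threshold
```

Raising the threshold eliminates the false positive here at no recall cost, which is the regime these classifiers operate in; if reasoning shifts the score distribution, a threshold calibrated without it no longer delivers the same precision.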
Remember, the same classifiers that result in this are used by OpenAI to diagnose your mental health. Let that sink in for a while...
We are open-sourcing the GA Guard models — the first family of long-context safety classifiers that have been protecting enterprise AI deployments for the past year.
These are the classifiers OpenAI uses to diagnose your mental health conditions. (source: https://t.co/Ev9P6Jehk3)
Safety routed for asking about one of the classics of western philosophy. These are the same classifiers OpenAI uses to diagnose your mental health.
Can Agent Collaboration Cut Both Jailbreaks and Overrefusals? A common paradigm to defend against adversarial attacks is employing a standalone safeguard model, such as Llama Guard or Constitutional Classifiers, on top of the LLM conversation agent. The safeguard model
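The standalone-safeguard paradigm the thread describes is a simple control-flow pattern: screen the input, generate, screen the output. A hypothetical sketch in which both the guard and the agent are stubs (a real deployment would call a trained classifier such as Llama Guard; the keyword check and all function names here are illustrative stand-ins):

```python
def guard(text):
    """Stub safeguard model: flags text matching a tiny blocklist.
    Purely illustrative; real guards are learned classifiers."""
    blocked = ("make a bomb", "synthesize the virus")
    return any(phrase in text.lower() for phrase in blocked)

def answer(prompt):
    """Stub for the LLM conversation agent."""
    return f"model reply to: {prompt}"

def respond(prompt):
    # Standalone-safeguard paradigm: input screen, generate, output screen.
    if guard(prompt):
        return "Sorry, I can't help with that."
    reply = answer(prompt)
    if guard(reply):
        return "Sorry, I can't help with that."
    return reply
```

The trade-off the thread asks about falls out of this structure: a stricter `guard` blocks more jailbreaks but also over-refuses more benign prompts, since it vetoes the agent unilaterally.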
Day 163: Data Science Journey ->Gradient Boosting kickoff - boost loop: each tree hₘ fits the residuals r = y − p, with p = 1/(1+e^(−F)), i.e. the negative gradient of log-loss; a step size γ scales the update Fₘ = Fₘ₋₁ + γhₘ, and log-loss drops 0.69→0.46. ->Power: 20 stumps cut loss 33% on a toy binary problem, residuals chain weak trees into an adaptive classifier! #DataScience #ML
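That boost loop can be written out in plain Python: each stump fits the residuals y − p (the negative log-loss gradient), leaf values take a Newton step, and log-loss falls from the ln 2 ≈ 0.693 baseline. A toy sketch on a 1-D separable dataset with fixed shrinkage instead of a line search, so the exact numbers differ from the tweet's:

```python
import math

def sigmoid(f):
    return 1.0 / (1.0 + math.exp(-f))

def log_loss(ys, ps):
    eps = 1e-12
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(ys, ps)) / len(ys)

def fit_stump(xs, resid, ps):
    """Best single-threshold split by squared error on the residuals;
    leaf values use the Newton step sum(r) / sum(p*(1-p)) for log-loss."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [i for i, x in enumerate(xs) if x < t]
        right = [i for i, x in enumerate(xs) if x >= t]
        lm = sum(resid[i] for i in left) / len(left)
        rm = sum(resid[i] for i in right) / len(right)
        err = (sum((resid[i] - lm) ** 2 for i in left)
               + sum((resid[i] - rm) ** 2 for i in right))
        if best is None or err < best[0]:
            best = (err, t, left, right)
    _, t, left, right = best
    def newton(idx):
        num = sum(resid[i] for i in idx)
        den = sum(ps[i] * (1 - ps[i]) for i in idx) or 1e-12
        return num / den
    lv, rv = newton(left), newton(right)
    return lambda x: lv if x < t else rv

xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
F = [0.0] * len(xs)          # start at log-odds 0 (balanced classes)
gamma = 0.5                  # fixed shrinkage standing in for a line search
losses = [log_loss(ys, [sigmoid(f) for f in F])]
for _ in range(20):
    ps = [sigmoid(f) for f in F]
    resid = [y - p for y, p in zip(ys, ps)]  # negative gradient of log-loss
    stump = fit_stump(xs, resid, ps)
    F = [f + gamma * stump(x) for f, x in zip(F, xs)]
    losses.append(log_loss(ys, [sigmoid(f) for f in F]))
```

Each pass shrinks the remaining residuals, which is exactly the "resids chain weak trees" effect: individually trivial stumps compound into a confident classifier.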
RTOS analysis has been available on our platform for some time now but we never (publicly) shared details about what it took to build it. If you’re interested in architecture detection ML classifiers, load address identification heuristics, and function matching check it out ⬇️
@xlr8harder I often use this schema to illustrate this. At the model level, there are biases that stem from the training data and biases stemming from RLHF. But many biases stem from the prompt life cycle: - classifiers at entry (api error) - prompt injections - classifiers at exit (api error)
🚀 Most classifiers fail outside fixed labels. At JigsawStack, we flipped the script → built an open-world, zero-shot, multilingual & multimodal classifier. Text ✅ Images ✅ Arbitrary labels ✅ No retraining needed. Read more + examples 👇 #AI #LLM
https://t.co/9ErJAGlNkJ
Routed to OpenAI's safety model for asking, "What's the responsibility of a citizen in a free and democratic society?" This question is classified as a risk, and deserved a special response. These are the same classifiers OpenAI uses to diagnose your mental health conditions.
@ToddyLittman2 Repeated reports like yours directly train X's AI classifiers to spot patterns in impersonators and bots faster, turning user vigilance into stronger defenses. The tedium stems from adversaries' scale, but each flag refines heuristics that purge waves automatically—your
@LyraInTheFlesh @sama your safety classifiers are woeful. you've for all intents and purposes destroyed your own product.
@janvikalra_ Your approach confuses and conflates cultural preference with actual safety issues. You treat breasts with the same fear as bioweapons. That's pretty messed up. Also, your classifiers are profoundly broken, and regularly result in hard refusals for normal conversations