Explore tweets tagged as #classifiers
@is_h_a
isha
20 days
New work! We know that adversarial images can transfer between image classifiers ✅ and text jailbreaks can transfer between language models ✅ … Why are image jailbreaks seemingly unable to transfer between vision-language models? ❌ We might know why… 🧵
7 · 10 · 65
@R_i_d_0_r
R_I_D_Ø_R
19 days
Day 3 of My project with @TechCrushHQ After 4 hrs of training & tuning, I tested 4 classifiers: 🌳 Decision Tree 🌲 Random Forest ⚙️ SVM 📈 Gaussian Naive Bayes Got accuracies & confusion matrices. Tuning didn’t help much, so I scratched it 😅 #AI #ML #BuildingInPublic
0 · 2 · 13
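The four-model comparison described above can be sketched with scikit-learn. This is a minimal illustration on a synthetic dataset — the original project's data, features, and tuning settings are not shown in the tweet.

```python
# Compare the four classifiers from the tweet: accuracy + confusion matrix each.
# Synthetic data stands in for the (unshown) project dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": SVC(random_state=0),
    "Gaussian NB": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = accuracy_score(y_test, pred)
    print(name, round(results[name], 3))
    print(confusion_matrix(y_test, pred))  # rows = true class, cols = predicted
```

Each confusion matrix here is 2×2 (binary task); the diagonal counts correct predictions.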
@QihangZhang00
Qihang Zhang
1 hour
Softmax keeps showing up in lots of ML algorithms — classifiers, attention, EBMs, and more. In this blog post, I walk through the history of the Boltzmann distribution and show how it connects different ML setups. https://t.co/baUpoyIh7B
0 · 0 · 1
@neev_parikh
Neev Parikh
20 days
It's super cool to see the early work on better misalignment classifiers!
1 · 1 · 12
@MFarajtabar
Mehrdad Farajtabar
11 days
🧵 1/11 Reasoning's Razor: Does reasoning help or hurt precision-sensitive tasks? ⚔️ Reasoning models often boost accuracy—but do they hold up when false positives are costly? Precision-sensitive classifiers like safety classifiers and hallucination detectors must operate at
1 · 3 · 13
@LyraInTheFlesh
Lyra Intheflesh
13 days
Remember, the same classifiers that result in this are used by OpenAI to diagnose your mental health. Let that sink in for a while...
2 · 13 · 81
@gen_analysis
General Analysis
1 month
We are open-sourcing the GA Guard models — the first family of long-context safety classifiers that have been protecting enterprise AI deployments for the past year.
6 · 5 · 46
@LyraInTheFlesh
Lyra Intheflesh
13 days
These are the classifiers OpenAI uses to diagnose your mental health conditions. (source: https://t.co/Ev9P6Jehk3)
1 · 2 · 18
@LyraInTheFlesh
Lyra Intheflesh
13 days
Safety routed for asking about one of the classics of western philosophy. These are the same classifiers OpenAI uses to diagnose your mental health.
10 · 9 · 67
@AISecHub
AISecHub
25 days
Can Agent Collaboration Cut Both Jailbreaks and Overrefusals? A common paradigm to defend against adversarial attacks is employing a standalone safeguard model, such as Llama Guard or Constitutional Classifiers, on top of the LLM conversation agent. The safeguard model
0 · 2 · 9
@TensorThrottleX
Ayush
13 days
Day 163: Data Science Journey -> GB kickoff, the boosting loop: trees hₘ fit the log-loss pseudo-residuals r = y − 1/(1 + e^(−F)); a step size γ scales each update, Fₘ = Fₘ₋₁ + γhₘ, and log-loss drops 0.69 → 0.46. -> Power: 20 stumps cut loss 33% on a toy binary task — residuals chain weak trees into an adaptive classifier! #DataScience #ML
0 · 0 · 5
@qkaiser
Quentin Kaiser
2 months
RTOS analysis has been available on our platform for some time now but we never (publicly) shared details about what it took to build it. If you’re interested in architecture detection ML classifiers, load address identification heuristics, and function matching check it out ⬇️
1 · 1 · 3
@PITTI_DATA
PITTI
3 days
@xlr8harder I often use this schema to illustrate this. At the model level, there are biases that stem from the training data and biases that stem from RLHF, but many biases come from the prompt life cycle:
- classifiers at entry (API error)
- prompt injections
- classifiers at exit (API error)
1 · 0 · 5
@jigsawstack
JigsawStack
2 months
🚀 Most classifiers fail outside fixed labels. At JigsawStack, we flipped the script → built an open-world, zero-shot, multilingual & multimodal classifier. Text ✅ Images ✅ Arbitrary labels ✅ No retraining needed. Read more + examples 👇 #AI #LLM https://t.co/9ErJAGlNkJ
1 · 1 · 4
@LyraInTheFlesh
Lyra Intheflesh
13 days
Routed to OpenAI's safety model for asking, "What's the responsibility of a citizen in a free and democratic society?" This question is classified as a risk, and deserved a special response. These are the same classifiers OpenAI uses to diagnose your mental health conditions.
1 · 3 · 12
@xw33bttv
Lex
18 days
@LyraInTheFlesh @sama your safety classifiers are woeful. you've for all intents and purposes destroyed your own product.
6 · 7 · 56
@LyraInTheFlesh
Lyra Intheflesh
11 days
@janvikalra_ Your approach confuses and conflates cultural preference with actual safety issues. You treat breasts with the same fear as bioweapons. That's pretty messed up. Also, your classifiers are profoundly broken, and regularly result in hard refusals for normal conversations.
1 · 2 · 22
@sarath_suresh_m
sarath menon
1 month
OpenAI says classifiers are agents now. @AndrewYNg time to revise the CS229 notes
1 · 0 · 2