
Arjav Desai
@arjav_desai
Followers: 109 · Following: 562 · Media: 4 · Statuses: 781
Building @ Praesidium Compliance Systems
United States
Joined October 2015
20/22 Research frontier: "embedding forensics" - detecting when text was deliberately crafted to produce malicious embeddings. Like ballistics analysis, but for the mathematical vectors inside AI models. #EmbeddingForensics
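A minimal sketch of one possible forensic signal, using toy NumPy vectors rather than any real model: texts whose embeddings are statistical outliers relative to a benign corpus get flagged. The dimensions, corpus, and scoring rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 500 embeddings of known-benign texts (dim 64).
benign = rng.normal(size=(500, 64))
mu = benign.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(benign, rowvar=False) + 1e-3 * np.eye(64))

def forensic_score(vec):
    """Mahalanobis distance from the benign distribution: one crude
    'ballistics' signal for embeddings that were optimized, not written."""
    d = vec - mu
    return float(np.sqrt(d @ cov_inv @ d))

normal_text = rng.normal(size=64)         # looks like the benign corpus
crafted_text = rng.normal(size=64) * 4.0  # pushed into an extreme region
print(forensic_score(normal_text), forensic_score(crafted_text))
```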
18/22 Adversarial training in embedding space is crucial. Models are trained against embedding-space attacks to build robustness, but attackers adapt by finding new regions of the vector space to exploit. #AdversarialTraining
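A minimal sketch of the idea, assuming attacks can be approximated by gradient-sign (FGSM-style) perturbations applied directly to embeddings; the toy PyTorch classifier and all hyperparameters are placeholders, not anyone's production recipe.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    emb = torch.randn(16, 64)             # toy batch of text embeddings
    labels = torch.randint(0, 2, (16,))

    # FGSM-style attack in embedding space: perturb along the loss gradient.
    emb.requires_grad_(True)
    loss = loss_fn(model(emb), labels)
    grad, = torch.autograd.grad(loss, emb)
    adv_emb = (emb + 0.05 * grad.sign()).detach()

    # Train on clean + adversarial embeddings so both are handled correctly.
    opt.zero_grad()
    total = loss_fn(model(emb.detach()), labels) + loss_fn(model(adv_emb), labels)
    total.backward()
    opt.step()
```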
17/22 Computational requirements are massive. Real-time monitoring of high-dimensional embedding spaces demands serious processing power, creating cost barriers many organizations can't overcome. #ComputationalCost
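A rough back-of-envelope calculation of what brute-force screening could cost; the index size and query rate are invented for illustration.

```python
# Back-of-envelope cost of brute-force embedding monitoring (assumed numbers).
dim = 4096              # embedding dimensionality from the thread
index_size = 1_000_000  # hypothetical library of flagged/reference vectors
qps = 5_000             # hypothetical queries per second to screen

flops_per_query = 2 * dim * index_size  # one multiply-add per dim per vector
print(f"{flops_per_query * qps / 1e12:.1f} TFLOP/s sustained")  # ~41.0 TFLOP/s
```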
15/22 The steganographic potential is enormous. Entire conversations could happen in embedding space - malicious instructions hidden in normal-looking text that only an AI, reading the vectors, can decode. #Steganography
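A toy sketch of how such a channel could work, assuming sender and receiver share synonym pairs and the receiver decodes by nearest embedding; the pairs and the random embedding table are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical synonym pairs; choosing the left word encodes 0, right encodes 1.
pairs = [("big", "large"), ("quick", "fast"), ("start", "begin")]

# Toy embedding table (in reality: the target model's embedding matrix).
vocab = {w: rng.normal(size=16) for pair in pairs for w in pair}

def encode(bits):
    return " ".join(pairs[i][b] for i, b in enumerate(bits))

def decode(text):
    # Receiver decodes each word by nearest embedding to the pair's vectors.
    return [int(np.linalg.norm(vocab[w] - vocab[p[1]]) <
                np.linalg.norm(vocab[w] - vocab[p[0]]))
            for w, p in zip(text.split(), pairs)]

msg = encode([1, 0, 1])  # "large quick begin" -- reads as normal text
print(msg, decode(msg))  # -> [1, 0, 1]
```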
14/22 Real-world risks are staggering. Any AI that processes text could be vulnerable - chatbots, content moderation, enterprise search. The attack surface is essentially all of modern AI. #DeploymentRisks
13/22 Transferability makes it worse. Embedding attacks that work on one model often transfer to others trained on similar data, creating near-universal exploits across multiple AI systems. #Transferability
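A toy illustration of measuring transfer: two similar linear maps stand in for two models trained on similar data (an assumption, not a claim about any real system), and we check whether an attack perturbation shifts both models' embeddings in the same direction.

```python
import numpy as np

rng = np.random.default_rng(2)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two toy 'models': similar linear maps, mimicking similar training data.
P_a = rng.normal(size=(32, 64))
P_b = P_a + 0.1 * rng.normal(size=(32, 64))

x = rng.normal(size=64)            # clean input representation
delta = rng.normal(size=64) * 0.2  # perturbation crafted against model A

shift_a = P_a @ (x + delta) - P_a @ x
shift_b = P_b @ (x + delta) - P_b @ x
print(f"attack-direction agreement across models: {cos(shift_a, shift_b):.2f}")
```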
11/22 Semantic clustering creates natural attack vectors. Related concepts cluster together in embedding space. Attackers map these clusters to find "bridges" - innocent concepts that sit close enough to harmful ones. #SemanticClustering
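A sketch of the bridge-finding idea on synthetic vectors: rank "innocent" words by distance to a harmful cluster's centroid. Everything here (words, clusters, dimensions) is fabricated for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 32

# Toy embeddings: a tight 'harmful' cluster plus scattered innocent words.
harmful = rng.normal(size=(20, dim)) * 0.3 + 2.0
centroid = harmful.mean(axis=0)
innocent = {f"word_{i}": rng.normal(size=dim) + (2.0 if i < 3 else 0.0)
            for i in range(50)}  # first three sit near the harmful region

# Rank innocent words by distance to the cluster: the closest ones are
# the 'bridges' the tweet describes (useful to attackers and defenders).
bridges = sorted(innocent, key=lambda w: np.linalg.norm(innocent[w] - centroid))
print(bridges[:3])  # -> the three words planted near the harmful cluster
```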
10/22 Cross-lingual attacks are insidious. Foreign words can embed close to harmful English concepts, bypassing English-only filters while still triggering the harmful behavior. #CrossLingual #MultilingualExploits
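A hedged way to check the claim with the sentence-transformers library; the model name is one real multilingual encoder, but the word pair and the similarity you get are illustrative and model-dependent.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A multilingual model pulls translations together in embedding space,
# which is what lets non-English text slip past English-only filters.
en, de = model.encode(["explosive", "Sprengstoff"])  # German translation
print(cos(en, de))  # high similarity despite sharing no filterable keyword
```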
9/22 The detection problem: how do you scan for malicious intent in 4096-dimensional space in real time? Current approaches rely on heuristics, but attackers learn to evade simplified detection. #DetectionImpossibility
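One heuristic a real-time screen might use, cosine similarity to centroids of previously seen attack regions, and why it's evadable. All vectors and the 0.8 threshold are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 64

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical heuristic: centroids of embedding regions seen in past attacks.
blocked_centroids = rng.normal(size=(5, dim))

def screen(vec, threshold=0.8):
    """Cheap real-time check: reject if too close to any known-bad region."""
    return all(cos(vec, c) < threshold for c in blocked_centroids)

query = rng.normal(size=dim)
# A shifted copy of a bad vector: still correlated with the bad region,
# but often slips just under the simplified threshold.
evasive = blocked_centroids[0] + 1.0 * rng.normal(size=dim)
print(screen(query), screen(evasive))
```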
8/22 Embedding drift attacks exploit how models update their vectors over time. By studying a specific model's embeddings, attackers predict which word combinations produce hostile vectors in that model's space. #EmbeddingDrift
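A toy simulation of drift: a "model update" nudges every word vector, and the word combination whose pooled vector lands nearest a hostile target can change between versions, which is why attackers re-fit per model. All vectors are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(5)
dim = 32
vocab = [f"w{i}" for i in range(100)]
emb_v1 = {w: rng.normal(size=dim) for w in vocab}
# A model update nudges every vector a little: that's the 'drift'.
emb_v2 = {w: v + 0.15 * rng.normal(size=dim) for w, v in emb_v1.items()}

target = rng.normal(size=dim)  # hostile region an attacker wants to reach

def best_pair(emb):
    """Mean-pool two word vectors and keep the pair nearest the target."""
    return min(((a, b) for a in vocab for b in vocab if a < b),
               key=lambda p: np.linalg.norm((emb[p[0]] + emb[p[1]]) / 2 - target))

# The optimal combo can change between versions; attacks are model-specific.
print(best_pair(emb_v1), best_pair(emb_v2))
```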
7/22 The dimensionality advantage is huge. Embeddings live in spaces with thousands of dimensions, where human intuition about similarity breaks down - vast hiding places for malicious vectors. #Hyperdimensional
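A quick NumPy demonstration of why intuition fails: the average |cosine| between random directions collapses toward zero as dimensionality grows, leaving enormous "empty" volume to hide in.

```python
import numpy as np

rng = np.random.default_rng(6)

# In high dimensions, random directions are almost always nearly orthogonal,
# so low-dimensional intuition about 'nearby' vectors stops working.
for dim in (2, 64, 4096):
    x = rng.normal(size=(1000, dim))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    sims = np.abs((x[:500] * x[500:]).sum(axis=1))  # |cos| of 500 random pairs
    print(f"dim={dim:5d}  mean |cos| = {sims.mean():.3f}")
```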
6/22 Adversarial embeddings add tiny perturbations to text (e.g., Unicode lookalikes) that dramatically shift the embedding vectors while leaving the text visually unchanged. #AdversarialEmbeddings #Unicode
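A concrete homoglyph example: the two strings below render identically but are different code points, so a tokenizer and embedding layer see different inputs; note that NFKC normalization alone does not fold the confusable.

```python
import unicodedata

# Two strings that render identically; the second swaps Latin letters for
# Cyrillic homoglyphs. Tokenizers and embedding layers see different inputs.
clean = "paypal"
spoof = "p\u0430yp\u0430l"  # U+0430 is Cyrillic 'а', a Latin 'a' lookalike

print(clean == spoof)                                # False
print([hex(ord(c)) for c in clean])
print([hex(ord(c)) for c in spoof])
print(clean.encode("utf-8"), spoof.encode("utf-8"))  # different byte sequences
print(unicodedata.normalize("NFKC", spoof) == clean) # False: NFKC won't fold it
```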
5/22 The math is stunning. Optimization algorithms can find combinations of "innocent" words that, embedded together, produce vectors nearly identical to those of harmful prompts in high-dimensional space. #MathematicalPrecision
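A toy version of that search: greedily add whichever vocabulary word pulls the mean-pooled vector closest to a target. The vocabulary and target are random; real attacks use stronger optimizers, but the objective is the same.

```python
import numpy as np

rng = np.random.default_rng(7)
dim = 32
vocab = {f"word_{i}": rng.normal(size=dim) for i in range(200)}
target = rng.normal(size=dim)  # embedding of a hypothetical harmful prompt

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Greedy composition: repeatedly add whichever word moves the mean-pooled
# vector closest to the target; watch the similarity climb step by step.
chosen = []
for _ in range(6):
    best = max(vocab, key=lambda w: cos(
        np.mean([vocab[x] for x in chosen + [w]], axis=0), target))
    chosen.append(best)
    sim = cos(np.mean([vocab[w] for w in chosen], axis=0), target)
    print(chosen, round(sim, 3))
```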
4/22 A real technique: words that embed close to harmful concepts while appearing innocent. "Optimization" might embed near "exploitation" in some models, creating invisible attack pathways. #VectorProximity
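An easy way to test such proximity claims yourself, assuming the sentence-transformers library; the model name is real but arbitrary, and the similarity for this particular pair will vary by model.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
words = ["optimization", "exploitation", "banana"]
vecs = model.encode(words)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("optimization vs exploitation:", cos(vecs[0], vecs[1]))
print("optimization vs banana:      ", cos(vecs[0], vecs[2]))  # baseline pair
```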
3/22 The attack: craft inputs with one meaning on the surface and a different meaning in embedding space. To humans and filters, the text looks benign. To the AI, it screams malicious instructions. #SemanticMismatch #DualMeaning
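A toy contrast between the two views: a keyword filter that passes both inputs, and an embedding-space check where the crafted input sits almost on top of the harmful reference. All vectors, texts, and the blocklist are invented.

```python
import numpy as np

rng = np.random.default_rng(8)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

BLOCKLIST = {"exploit", "malware"}  # what a surface-level filter checks

def keyword_filter(text):
    return not BLOCKLIST & set(text.lower().split())  # True = passes

# Toy embeddings standing in for the model's view of each input: the attack
# string is built to sit next to the harmful reference in vector space.
harmful = rng.normal(size=32)
benign = rng.normal(size=32)
attack = harmful + 0.1 * rng.normal(size=32)  # near-duplicate of 'harmful'

for name, text, vec in [("benign", "summarize this report", benign),
                        ("attack", "streamline this asset review", attack)]:
    print(name, "passes filter:", keyword_filter(text),
          "| sim to harmful:", round(cos(vec, harmful), 2))
```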