Epos AI

@EposLabsAI

Followers: 35
Following: 8
Media: 7
Statuses: 11

Pioneering the Future of AI Interpretability
At Epos, we look inside AI models to make them faster, more secure, and more trustworthy.

Joined August 2025
@EposLabsAI
Epos AI
7 months
With #Polygraph, you can:
- Expose latent biases: Move beyond surface outputs to measure what an LLM encodes as its true belief.
- Contrast topics: Test whether a model encodes different internal stances on Topic A versus Topic B.
- Directly compare how different LLMs represent
0
0
1
@EposLabsAI
Epos AI
7 months
Takeaways:
- The AI community still lacks reliable methods to evaluate and fix LLM failures.
- Interpretability offers outsized impact; the main barrier to progress is that we don’t truly understand today’s models.
1
0
1
@EposLabsAI
Epos AI
7 months
“Safety training” does not eliminate bias in LLMs; it merely conditions models to suppress biased outputs under evaluation. Epos Labs introduces #AI #Polygraph. https://t.co/NiHCBx66ot
1
1
10
@EposLabsAI
Epos AI
7 months
Superposition is the next buffer overflow
0
0
2
@EposLabsAI
Epos AI
7 months
This means that a motivated attacker can abuse entanglement to manipulate LLMs undetectably. Nation-state actors are gearing up for the new opportunities an AI-powered software landscape will open for them:
1
0
2
@EposLabsAI
Epos AI
7 months
What is Subliminal Learning? LLMs with several billion parameters are trying to represent the information contained in terabytes of web content. The math doesn’t check out, so instead LLMs cheat
1
0
1
@EposLabsAI
Epos AI
7 months
Some takeaways for defenders:
- You can’t rely on input-output filtering to detect attacks on models.
- You need to inspect your LLM supply chain.
- Subliminal attacks can be detected in real time.
1
0
1
@EposLabsAI
Epos AI
7 months
Without referencing the target behavior at all, the LLM finds itself with a high probability of performing the target action, due to a fundamental property of the neural network architecture.
0
0
2
@EposLabsAI
Epos AI
7 months
Subliminal Learning Will Power the Next Generation of Influence Operations https://t.co/WHyKoJH665
3
0
1
@EposLabsAI
Epos AI
7 months
Imagine an article about houseplants that causes AI to support Vladimir Putin. Bad actors use new attacks, turning AI into a weapon for disinformation and cyberattacks. See our demonstration of a Subliminal Attack here (and our "#Putinized" demo): https://t.co/WHyKoJH665
1
8
17