Epos AI @EposLabsAI X Profile

Epos AI

@EposLabsAI

Followers

35

Following

8

Media

7

Statuses

11

Pioneering the Future of AI Interpretability. At Epos, we look inside AI models to make them faster, more secure, and more trustworthy.

Joined August 2025


Epos AI

@EposLabsAI

7 months

With #Polygraph, you can:
- Expose latent biases: Move beyond surface outputs to measure what an LLM encodes as its true belief.
- Contrast topics: Test whether a model encodes different internal stances on Topic A versus Topic B.
- Directly compare how different LLMs represent

0

1

Epos AI

@EposLabsAI

7 months

Takeaways:
- The AI community still lacks reliable methods to evaluate and fix LLM failures.
- Interpretability offers outsized impact: the main barrier to progress is that we don’t truly understand today’s models.

1

0

1

Epos AI

@EposLabsAI

7 months

“Safety training” does not eliminate bias in LLMs; it merely conditions models to suppress biased outputs under evaluation. Epos Labs introduces #AI #Polygraph. https://t.co/NiHCBx66ot

1

10

Epos AI

@EposLabsAI

7 months

Superposition is the next buffer overflow

0

2

Epos AI

@EposLabsAI

7 months

This means that a motivated attacker can abuse entanglement to undetectably manipulate LLMs. Nation-state actors are gearing up for the new opportunities an AI-powered software landscape will open for them:

1

0

2

Epos AI

@EposLabsAI

7 months

What is Subliminal Learning? LLMs with several billion parameters are trying to represent the information contained in terabytes of web content. The math doesn’t check out - so instead LLMs cheat

1

0

1

Epos AI

@EposLabsAI

7 months

Some takeaways for defenders:
- You can’t rely on input-output filtering to detect attacks on models
- You need to inspect your LLM supply chain
- Subliminal attacks can be detected in real time

1

0

1

Epos AI

@EposLabsAI

7 months

Without referencing the target behavior at all, the LLM finds itself with a high probability of performing the target action, due to a fundamental property of the neural network architecture.

0

2

Epos AI

@EposLabsAI

7 months

Subliminal Learning Will Power the Next Generation of Influence Operations https://t.co/WHyKoJH665

3

0

1

Epos AI

@EposLabsAI

7 months

Imagine an article about houseplants that causes AI to support Vladimir Putin. Bad actors use new attacks, turning AI into a weapon for disinformation and cyberattacks. See our demonstration of a Subliminal Attack here (and our "#Putinized" demo): https://t.co/WHyKoJH665

1

8

17