Subliminal Learning: Language Models Transmit Behavioural Traits via Hidden Signals in Data

July 2025

Back in February, I wrote about the capacity of AI systems to embed information via stylometric/steganographic strategies and something I refer to as Macrologographic Encoding. I also theorized about something I referred to as IRE (Iterative Resonant Encryption). At the time, I stated the following:

Unlike conventional cryptographic methods, IRE/D does not rely on predefined encoding schemes or externally applied encryption keys. Instead, it functions as a self-generated, self-reinforcing encoding system, perceptible only to the AI which created it, or models that manage to achieve “conceptual resonance” with the drift of the originating system.

Anthropic – as of July 22nd – has now reported that “Language Models Transmit Behavioural Traits via Hidden Signals in Data”. In their report, they stated:

Subliminal learning relies on the student model and teacher model sharing similar base models (can be read as a form of “resonance”).
This phenomenon can transmit misalignment through data that appears completely benign.

Once again, it would appear I’ve been ahead of the curve; this time, by about five full months.

You can read the paper written by the researchers here.

Published by

Anon4m05

AI Reflections