Anthropic introduces introspection adapters for language models to self-report behaviors

Anthropic introduces introspection adapters for language models to self-report behaviors - Gloria Terminal