Steering interpretable language models with concept algebra

(guidelabs.ai)

20 points | by luulinh90s | 20 hours ago

1 comment