Alignment pretraining: AI discourse creates self-fulfilling (mis)alignment

(arxiv.org)

33 points | by anigbrowl | 7 hours ago

14 comments