A visual summary of Learning to Summarize with Human Feedback

An interesting paper on how to train large-scale human-in-the-loop language models with a focus on preference alignment. The architecture is…

Oct 19, 2020