OpenAI Blog · Jun 13, 2022
AI-written critiques help humans notice flaws
Reviewed by Errol Vogt, Site support technician & online learning analyst
We trained “critique-writing” models to describe flaws in summaries. Human evaluators find flaws in summaries much more often when shown our model’s critiques. Larger models are better at self-critiquing, with scale improving critique-writing more than summary-writing. These results show promise for using AI systems to assist human supervision of AI systems on difficult tasks. This update is relevant for small-office operators tracking changes in their tools.
Operator takeaway: Review whether 'AI-written critiques help humans notice flaws' affects your current setup before relying on it in production.