How often the AI's work matches or beats a human expert's on real knowledge-work tasks. Higher is better.
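
One way to read "how often" is as a win-or-tie rate: the fraction of graded comparisons in which the AI's output was judged at least as good as the expert's. The sketch below assumes each task yields one label, "win", "tie", or "loss", from the AI's point of view. The label names, the function name `win_or_tie_rate`, and the sample grades are hypothetical and are not taken from the source, which does not say how comparisons are graded or aggregated.

```python
from collections import Counter


def win_or_tie_rate(outcomes):
    """Return the fraction of comparisons the AI won or tied.

    `outcomes` is a list of labels, one per task: "win", "tie", or "loss"
    (assumed label names, from the AI's point of view).
    """
    if not outcomes:
        raise ValueError("need at least one graded comparison")
    counts = Counter(outcomes)
    # A tie counts as "matches", a win counts as "beats".
    return (counts["win"] + counts["tie"]) / len(outcomes)


# Hypothetical grades for five tasks, not real results.
example = ["win", "loss", "tie", "loss", "win"]
print(f"{win_or_tie_rate(example):.0%}")
```

Under this reading, a score of 50% would mean the AI's work is judged at least as good as the expert's on half of the tasks, which is why higher is better.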