Performance comparison of video coding standards: an adaptive streaming perspective
Article Summary
Joel Sole and the Netflix encoding team reveal why codec comparison studies often contradict each other—and how they're fixing the problem. Their findings challenge conventional wisdom about H.264, H.265, and VP9 performance.
Netflix's video algorithms team published a comprehensive analysis of video codec performance from an adaptive streaming perspective. Unlike traditional codec comparisons that use short clips at fixed resolutions, they tested full-length Netflix titles across 10 different resolutions using their Dynamic Optimizer framework and VMAF quality metric.
Key Takeaways
- Traditional codec tests use 1-10 second clips; Netflix tested 8 hours of full titles
- Dynamic Optimizer achieved 25% bitrate savings by optimizing across resolutions and shots
- VP9 outperformed x265 by 12% using HVMAF, opposite of traditional PSNR results
- Harmonic VMAF weights bad frames more heavily than arithmetic mean quality metrics
- Encoding at multiple resolutions creates quality crossover points traditional tests miss
Netflix's adaptive streaming methodology reveals dramatically different codec performance rankings than traditional fixed-resolution tests, with results varying by up to 13 percentage points depending on testing approach.
About This Article
Video codec comparison studies often reach different conclusions because they use different testing methods, encoder settings, and metrics. One study might show codec A is 15% better, while another claims codec B is 10% better.
Netflix's team switched to VMAF measured at display resolution instead of encoding resolution. They also started using harmonic mean temporal averaging (HVMAF) to give more weight to outlier frames than a simple average would. This approach better matches what viewers actually see.
When Netflix tested HVMAF across their full catalog, VP9 encoders showed 12% better bitrate savings in the high-quality range compared to traditional PSNR-based testing. The choice of metric directly changes which codec looks best.