TX2 benchmarks

Image & Video Processing SDK for Tegra X2 GPU

NVIDIA Tegra X2 is a mobile processor with a high-performance Pascal GPU architecture. It brings very high CPU and GPU performance to mobile computing, so PC-class imaging applications that need high performance, low energy consumption and large amounts of memory can now be developed for mobile devices. We have ported our high-performance Image & Video Processing SDK to Tegra and now offer fast imaging solutions on the NVIDIA Tegra X2 GPU for real-time imaging and video applications.

Tegra X2 benchmarks for image and video processing

We measured kernel times for the most important components of the Fastvideo GPU Image & Video Processing SDK, using images at 2K and 4K resolutions. The averaged results are listed below. They cover only a small subset of the features available in the SDK.

Tegra X2 performance for 2K images (1920×1080)

  • HQLI Debayer (8-bit, RGGB) – 0.53 ms
  • HQLI Debayer (16-bit, RGGB) – 0.66 ms
  • DFPD Debayer (8-bit, RGGB) – 2.3 ms
  • DFPD Debayer (16-bit, RGGB) – 1.9 ms
  • MG Debayer (16-bit, RGGB) – 6.1 ms
  • JPEG Encoder (8-bit, quality 90%) – 1.2 ms
  • JPEG Encoder (24-bit, quality 90%, 4:2:0) – 2.0 ms
  • JPEG Encoder (24-bit, quality 90%, 4:4:4) – 3.1 ms
  • JPEG Decoder (8-bit, quality 90%) – 2.5 ms
  • JPEG Decoder (24-bit, quality 90%, 4:2:0) – 5.3 ms
  • JPEG Decoder (24-bit, quality 90%, 4:4:4) – 6.1 ms
  • 24-bit image Resizer (algorithm Lanczos3) from 1920×1080 to 960×540 – 4.9 ms
  • 24-bit image Resizer (algorithm Lanczos3) from 1920×1080 to 1919×1079 – 9.9 ms
  • Denoiser (8-bit, wavelet 9/7, 7 dwt resolutions) – 4.4 ms
  • Denoiser (24-bit, wavelet 9/7, 7 dwt resolutions) – 11.3 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt resolutions, cb 32, cr 12, lossless, single) – 118 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt resolutions, cb 32, cr 12, lossy, single) – 57 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt resolutions, cb 32, cr 12, lossy, batch) – 54 ms
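Kernel times in milliseconds are easier to compare with camera requirements once they are converted to frame rate and pixel throughput. The sketch below does that arithmetic for a few entries from the 1920×1080 list. The function and dictionary names are illustrative and not part of the SDK, and the frame rates are only upper bounds: they assume the kernel is the sole per-frame cost and ignore memory transfers and other pipeline stages.

```python
# Convert per-frame kernel times into frame rate and pixel throughput.
# Timings are copied from the 1920x1080 list; all names here are illustrative.

WIDTH, HEIGHT = 1920, 1080

KERNEL_MS_2K = {
    "HQLI Debayer (8-bit)": 0.53,
    "JPEG Encoder (24-bit, 4:2:0)": 2.0,
    "J2K Encoder (lossy, single)": 57.0,
}


def frames_per_second(kernel_ms):
    """Upper bound on frame rate if this kernel were the only per-frame cost."""
    return 1000.0 / kernel_ms


def megapixels_per_second(width, height, kernel_ms):
    """Millions of pixels processed per second for one kernel."""
    return (width * height) / (kernel_ms * 1000.0)


for name, ms in KERNEL_MS_2K.items():
    fps = frames_per_second(ms)
    mpix = megapixels_per_second(WIDTH, HEIGHT, ms)
    print(f"{name}: {fps:.0f} fps, {mpix:.0f} MPix/s")
```

For a real pipeline, the times of all stages used per frame would have to be added before taking the reciprocal.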

Tegra X2 performance for 4K images (3840×2160)

  • HQLI Debayer (8-bit, RGGB) – 2.15 ms
  • HQLI Debayer (16-bit, RGGB) – 2.6 ms
  • DFPD Debayer (8-bit, RGGB) – 9.6 ms
  • DFPD Debayer (16-bit, RGGB) – 7.5 ms
  • MG Debayer (16-bit, RGGB) – 25 ms
  • JPEG Encoder (8-bit, quality 90%) – 5.0 ms
  • JPEG Encoder (24-bit, quality 90%, 4:2:0) – 7.9 ms
  • JPEG Encoder (24-bit, quality 90%, 4:4:4) – 12.5 ms
  • JPEG Decoder (8-bit, quality 90%) – 9.1 ms
  • JPEG Decoder (24-bit, quality 90%, 4:2:0) – 19.5 ms
  • JPEG Decoder (24-bit, quality 90%, 4:4:4) – 21.4 ms
  • 24-bit image Resizer (algorithm Lanczos3) from 3840×2160 to 1920×1080 – 17.7 ms
  • 24-bit image Resizer (algorithm Lanczos3) from 3840×2160 to 3839×2159 – 35.6 ms
  • Denoiser (8-bit, wavelet 9/7, 7 dwt resolutions) – 16.7 ms
  • Denoiser (24-bit, wavelet 9/7, 7 dwt resolutions) – 43 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt resolutions, cb 32, cr 12, lossless, single) – 442 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt resolutions, cb 32, cr 12, lossy, single) – 184 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt resolutions, cb 32, cr 12, lossy, batch) – 179 ms
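A 3840×2160 frame has exactly four times the pixels of a 1920×1080 frame, so one quick sanity check on these numbers is to see how close each kernel comes to a 4x increase in time. The following is a minimal sketch of that comparison for a few entries taken from the two lists; the names are illustrative, and the code only reports the ratios without explaining why a given kernel scales the way it does.

```python
# Compare 4K and 2K kernel times against the 4x pixel-count ratio.
# (2K ms, 4K ms) pairs are copied from the benchmark lists; names are illustrative.

PIXELS_2K = 1920 * 1080
PIXELS_4K = 3840 * 2160

TIMINGS_MS = {
    "HQLI Debayer (8-bit)": (0.53, 2.15),
    "JPEG Encoder (8-bit)": (1.2, 5.0),
    "Resizer, 2:1 downscale (Lanczos3)": (4.9, 17.7),
    "J2K Encoder (lossy, single)": (57.0, 184.0),
}


def scaling_factor(ms_2k, ms_4k):
    """How many times longer the 4K kernel runs than the 2K kernel."""
    return ms_4k / ms_2k


pixel_ratio = PIXELS_4K / PIXELS_2K

for name, (ms_2k, ms_4k) in TIMINGS_MS.items():
    factor = scaling_factor(ms_2k, ms_4k)
    print(f"{name}: {factor:.2f}x time for {pixel_ratio:.0f}x pixels")
```

Factors close to 4 indicate roughly linear scaling with pixel count; a noticeably smaller factor means the larger frame is processed at a lower cost per pixel.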

Tegra X2 performance for 12-bit per pixel images (4032×2192)

  • JPEG Encoder (gray, 12-bit, quality 90%) – 8.5 ms
  • JPEG Encoder (color, 12-bit, quality 90%, 4:2:0) – 13.4 ms
  • JPEG Encoder (color, 12-bit, quality 90%, 4:4:4) – 22.4 ms
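Because the 12-bit test images (4032×2192) are slightly larger than the 4K images (3840×2160), the raw times are not directly comparable. One way to compare them is to normalize to nanoseconds per pixel, as sketched below. This assumes that the 8-bit and 24-bit JPEG Encoder entries in the 4K list are the matching 8-bit-per-channel cases, which the source does not state explicitly; the helper name is illustrative.

```python
# Normalize JPEG Encoder kernel times to nanoseconds per pixel so that
# results measured at different resolutions can be compared.
# Timings are copied from the lists; the pairing of cases is an assumption.


def ns_per_pixel(width, height, kernel_ms):
    """Kernel time divided by pixel count, in nanoseconds."""
    return kernel_ms * 1e6 / (width * height)


# (label, width, height, kernel time in ms)
CASES = [
    ("gray, 8-bit, 3840x2160", 3840, 2160, 5.0),
    ("gray, 12-bit, 4032x2192", 4032, 2192, 8.5),
    ("color 4:2:0, 24-bit, 3840x2160", 3840, 2160, 7.9),
    ("color 4:2:0, 12-bit, 4032x2192", 4032, 2192, 13.4),
]

for label, width, height, ms in CASES:
    cost = ns_per_pixel(width, height, ms)
    print(f"JPEG Encoder ({label}): {cost:.3f} ns/pixel")
```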

Benchmarks for Fastvideo SDK on Tegra K1, X1, Xavier

