Tegra Xavier benchmarks

Tegra Xavier benchmarks for image and video processing

Tegra Xavier is the latest mobile system on a chip from NVIDIA. Its 8-core 64-bit ARM CPU and Volta GPU with Tensor Cores bring high CPU and GPU performance to mobile computing. PC-class imaging applications that require low latency, high performance, low energy consumption and large amounts of memory can now be developed for mobile devices with Fastvideo SDK for Tegra Xavier, which runs fast image processing on the Xavier GPU for real-time imaging and video applications.

Jetson AGX Xavier Tech Specs

  • GPU: 512-core Volta GPU with Tensor Cores
  • CPU: 8-core ARM v8.2 64-bit CPU, 8 MB L2 + 4 MB L3
  • Memory: 16 GB 256-bit LPDDR4x | 137 GB/s
  • Storage: 32 GB eMMC 5.1
  • DL Accelerator: (2x) NVDLA Engines
  • Vision Accelerator: 7-way VLIW Vision Processor
  • Encoder/Decoder: (2x) 4Kp60 | HEVC / (2x) 4Kp60 | 12-bit support
  • Size: 105 mm x 105 mm
  • Deployment: Module (Jetson AGX Xavier)

We benchmarked the key components of Fastvideo SDK on Tegra Xavier. We tested images at 2K and 4K resolutions and obtained the following averaged timings.

Tegra Xavier benchmarks for 2K images (1920×1080)

  • HQLI Demosaic (8-bit, RGGB) – 0.28 ms
  • HQLI Demosaic (16-bit, RGGB) – 0.51 ms
  • DFPD Demosaic (8-bit, RGGB) – 0.84 ms
  • DFPD Demosaic (16-bit, RGGB) – 1.12 ms
  • MG Demosaic (16-bit, RGGB) – 3.07 ms
  • JPEG encoder (8-bit, quality 90%) – 0.74 ms
  • JPEG encoder (24-bit, quality 90%, 4:2:0) – 1.1 ms
  • JPEG encoder (24-bit, quality 90%, 4:4:4) – 1.45 ms
  • JPEG decoder (8-bit, quality 90%) – 1.4 ms
  • JPEG decoder (24-bit, quality 90%, 4:2:0) – 2.4 ms
  • JPEG decoder (24-bit, quality 90%, 4:4:4) – 2.36 ms
  • 24-bit image resize (algorithm Lanczos3) from 1920×1080 to 960×540 – 2.9 ms
  • 24-bit image resize (algorithm Lanczos3) from 1920×1080 to 1919×1079 – 4.64 ms
  • Denoise (8-bit, wavelet 9/7, 7 dwt levels) – 2.6 ms
  • Denoise (24-bit, wavelet 9/7, 7 dwt levels) – 7.5 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt levels, cb 32, cr 12, lossless, single) – 40 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt levels, cb 32, cr 12, lossy, single) – 18.7 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt levels, cb 32, cr 12, lossy, batch) – 17.6 ms
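
Per-image timings are easier to compare with camera requirements once they are converted to throughput. The following is a minimal sketch and not part of Fastvideo SDK: the helper name `throughput` is hypothetical, and it treats each stage in isolation, so the results are upper bounds that ignore host-device transfers and any other stages in a pipeline.

```python
def throughput(width, height, ms_per_image):
    """Convert a per-image latency in milliseconds to (fps, MPix/s)."""
    fps = 1000.0 / ms_per_image
    mpix_per_s = width * height * fps / 1e6
    return fps, mpix_per_s


# A few 2K (1920x1080) timings copied from the list above, in milliseconds.
timings_2k_ms = {
    "HQLI Demosaic (8-bit, RGGB)": 0.28,
    "JPEG encoder (24-bit, 4:2:0)": 1.1,
    "J2K Encoder (lossy, single)": 18.7,
}

for name, ms in timings_2k_ms.items():
    fps, mpix = throughput(1920, 1080, ms)
    print(f"{name}: {fps:.0f} fps, {mpix:.0f} MPix/s")
```

For example, 1.1 ms per 1920×1080 frame corresponds to roughly 909 frames per second for that stage alone.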

Tegra Xavier benchmarks for 4K images (3840×2160)

  • HQLI Demosaic (8-bit, RGGB) – 1.26 ms
  • HQLI Demosaic (16-bit, RGGB) – 2.0 ms
  • DFPD Demosaic (8-bit, RGGB) – 3.1 ms
  • DFPD Demosaic (16-bit, RGGB) – 4.12 ms
  • MG Demosaic (16-bit, RGGB) – 11.4 ms
  • JPEG encoder (8-bit, quality 90%) – 2.38 ms
  • JPEG encoder (24-bit, quality 90%, 4:2:0) – 3.72 ms
  • JPEG encoder (24-bit, quality 90%, 4:4:4) – 5.64 ms
  • JPEG decoder (8-bit, quality 90%) – 3.84 ms
  • JPEG decoder (24-bit, quality 90%, 4:2:0) – 7.2 ms
  • JPEG decoder (24-bit, quality 90%, 4:4:4) – 7.8 ms
  • 24-bit image resize (algorithm Lanczos3) from 3840×2160 to 1920×1080 – 10.2 ms
  • 24-bit image resize (algorithm Lanczos3) from 3840×2160 to 3839×2159 – 17.5 ms
  • Denoise (8-bit, wavelet 9/7, 7 dwt levels) – 8.3 ms
  • Denoise (24-bit, wavelet 9/7, 7 dwt levels) – 24.3 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt levels, cb 32, cr 12, lossless, single) – 153 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt levels, cb 32, cr 12, lossy, single) – 76 ms
  • J2K Encoder (24-bit, wavelet 9/7, 7 dwt levels, cb 32, cr 12, lossy, batch) – 71 ms
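
A 3840×2160 frame has exactly four times as many pixels as a 1920×1080 frame, so dividing each 4K timing by its 2K counterpart shows how close a stage comes to linear scaling with image size. The sketch below only does that arithmetic on figures copied from the two lists; the names `PIXEL_RATIO` and `scaling_factor` are hypothetical.

```python
PIXEL_RATIO = (3840 * 2160) / (1920 * 1080)  # 4.0

# (2K ms, 4K ms) pairs copied from the two benchmark lists.
timings_ms = {
    "HQLI Demosaic (8-bit, RGGB)": (0.28, 1.26),
    "JPEG encoder (8-bit)": (0.74, 2.38),
    "Lanczos3 resize to half size": (2.9, 10.2),
    "J2K Encoder (lossy, single)": (18.7, 76.0),
}


def scaling_factor(ms_2k, ms_4k):
    """How many times longer the 4K image takes than the 2K image."""
    return ms_4k / ms_2k


for name, (ms_2k, ms_4k) in timings_ms.items():
    factor = scaling_factor(ms_2k, ms_4k)
    print(f"{name}: {factor:.2f}x time for {PIXEL_RATIO:.0f}x pixels")
```

With these figures, the 8-bit HQLI demosaic takes about 4.5 times longer at 4K, while the 8-bit JPEG encoder takes about 3.2 times longer.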

Tegra Xavier benchmarks for 12-bit images (4032×2192)

  • JPEG encoder (gray, 12-bit, quality 90%) – 4.14 ms
  • JPEG encoder (color, 12-bit, quality 90%, 4:2:0) – 6.2 ms
  • JPEG encoder (color, 12-bit, quality 90%, 4:4:4) – 10.1 ms
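
One way to read these timings is against a frame-time budget: at a given frame rate, a stage has 1000 / fps milliseconds per frame. The sketch below checks the three 12-bit encoder timings against 60 and 120 fps. Those target rates are illustrative assumptions, not figures from the benchmarks; the helper `fits_budget` is hypothetical, and only a single stage is considered.

```python
def fits_budget(stage_ms, target_fps):
    """True if a stage latency fits within one frame interval at target_fps."""
    return stage_ms <= 1000.0 / target_fps


# 12-bit JPEG encoder timings for 4032x2192 images, in milliseconds.
encoders_12bit_ms = {
    "gray": 4.14,
    "color 4:2:0": 6.2,
    "color 4:4:4": 10.1,
}

for name, ms in encoders_12bit_ms.items():
    for fps in (60, 120):  # assumed target frame rates
        verdict = "fits" if fits_budget(ms, fps) else "exceeds"
        print(f"{name}: {ms} ms {verdict} the {1000.0 / fps:.2f} ms budget at {fps} fps")
```

Under these assumptions, the 4:4:4 encoder at 10.1 ms fits a 60 fps budget (about 16.67 ms) but not a 120 fps budget (about 8.33 ms).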

Benchmarks for Fastvideo SDK on Tegra K1, X1, X2
