High-performance GPU denoiser for still images and video

Image and video denoising is widely used in camera applications, especially in low-light conditions. We have developed several GPU-accelerated denoising kernels that run on existing NVIDIA hardware under Windows and Linux, including ARM-based Jetson platforms. They deliver high performance for both image and video processing.

GPU Denoiser Library Features

  • Input format: 8/10/12/14/16-bit per channel input data array from CPU or GPU memory
  • Output format: 24/48-bit output data array in CPU or GPU memory
  • Denoising with 16/32-bit accuracy
  • High speed denoising without AI
  • Denoising algorithms
    • Wavelet denoiser (RAW and RGB): CDF 5/3 and CDF 9/7 wavelets with hard, soft, or garrote thresholding
    • Bilateral denoiser
    • Non-local means (NLM) denoiser
  • Compatibility with Fast VCR software for machine vision cameras
  • Timing and performance measurements
  • OS: Windows 10/11, Ubuntu Linux, and L4T (Jetson)
  • Compatible NVIDIA GPUs: Jetson, GeForce, Quadro, Tesla with compute capability ≥ 5.0; CUDA 12.3
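
The wavelet denoiser listed above offers three thresholding rules. As a rough illustration of how they differ, here is a minimal Python sketch that uses the textbook definitions of hard, soft, and (non-negative) garrote thresholding applied to a single wavelet coefficient. It is not the SDK's GPU code; the function names and the sample values are hypothetical and chosen only for illustration.

```python
def hard_threshold(w, t):
    """Keep the coefficient unchanged if its magnitude exceeds t, else zero it."""
    return w if abs(w) > t else 0.0


def soft_threshold(w, t):
    """Zero small coefficients and shrink the remaining ones toward zero by t."""
    if abs(w) <= t:
        return 0.0
    return w - t if w > 0 else w + t


def garrote_threshold(w, t):
    """Non-negative garrote: zero small coefficients, shrink the rest by t*t/w."""
    if abs(w) <= t:
        return 0.0
    return w - (t * t) / w


# Illustrative threshold and coefficients (assumed values, not SDK defaults).
t = 80.0
for w in (-200.0, -50.0, 100.0, 400.0):
    print(w, hard_threshold(w, t), soft_threshold(w, t), garrote_threshold(w, t))
```

Hard thresholding leaves surviving coefficients untouched, soft thresholding biases all of them by the threshold, and the garrote rule sits in between: it behaves like soft thresholding near the threshold and approaches hard thresholding for large coefficients.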

Benchmarks for GPU denoiser

Image resolution: 4112×2176 (8.9 MPix), 16-bit per channel, RGB

Test description: all data in GPU memory, timing includes GPU computations only

2D Wavelet transform: CDF 9/7
Number of DWT resolutions: up to 7
DWT thresholds for YCbCr: 80;150;150

NLM denoiser parameters: windows 3×3 and 5×5, strength 800
Bilateral denoiser parameters: window 3×3, sigmaColor 5, sigmaSpace 500
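
To show what the sigmaColor and sigmaSpace parameters control, here is a minimal single-channel bilateral filter in plain Python. It assumes the conventional Gaussian form of both weights and clamps coordinates at the image border; the SDK's actual kernel, parameter scaling, and border handling are not described here, so treat the function name and these details as assumptions.

```python
import math


def bilateral_filter(img, radius=1, sigma_color=5.0, sigma_space=500.0):
    """Bilateral filter for a 2D list of floats (one channel).

    radius=1 corresponds to a 3x3 window. Border pixels reuse the nearest
    valid pixel (coordinate clamping), which is an assumption of this sketch.
    """
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = img[y][x]
            acc = 0.0
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    v = img[yy][xx]
                    # Spatial weight: falls off with distance from the center.
                    w_space = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
                    # Range weight: falls off with intensity difference.
                    w_color = math.exp(-((v - center) ** 2) / (2.0 * sigma_color ** 2))
                    weight = w_space * w_color
                    acc += weight * v
                    norm += weight
            out[y][x] = acc / norm  # norm >= 1 because the center weight is 1
    return out


# A noisy step edge: small fluctuations are averaged, the edge is preserved.
step = [[0.0, 2.0, 1000.0, 1002.0] for _ in range(3)]
print(bilateral_filter(step)[1])
```

Under this Gaussian assumption, a sigmaSpace much larger than the window makes the spatial weights nearly uniform over 3×3, so the amount of smoothing is governed almost entirely by sigmaColor.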

Software: Windows 10, CUDA 12.3
Hardware: NVIDIA GeForce RTX 4090

  • RAW DWT denoiser: 1.8 ms (4.9 GPix/s)
  • RGB DWT denoiser: 3.05 ms (2.9 GPix/s)
  • NLM denoiser (RGB): 1.92 ms (4.6 GPix/s)
  • Bilateral denoiser (RGB): 1.21 ms (7.3 GPix/s)
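
The throughput figures follow directly from the frame size and the measured times. The short Python sketch below reproduces that arithmetic; the helper names are ours, and the frames-per-second value is only an upper bound for the denoising stage alone, not for a complete pipeline.

```python
WIDTH, HEIGHT = 4112, 2176
PIXELS = WIDTH * HEIGHT  # 8,947,712 pixels, about 8.9 MPix


def gpix_per_second(time_ms, pixels=PIXELS):
    """Throughput in gigapixels per second for one frame processed in time_ms."""
    pixels_per_second = pixels / time_ms * 1000.0
    return pixels_per_second / 1e9


def frames_per_second(time_ms):
    """Frames per second if the denoiser were the only stage running."""
    return 1000.0 / time_ms


# Timings taken from the benchmark list above.
timings_ms = {
    "RAW DWT denoiser": 1.8,
    "RGB DWT denoiser": 3.05,
    "NLM denoiser (RGB)": 1.92,
    "Bilateral denoiser (RGB)": 1.21,
}

for name, t in timings_ms.items():
    print(f"{name}: {gpix_per_second(t):.2f} GPix/s, "
          f"up to {frames_per_second(t):.0f} frames/s")
```

The listed GPix/s values match these results when truncated to one decimal place.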

These timings are comparable to the processing time of our best debayer algorithm, MG, which takes about 1.05 ms (8.5 GPix/s) for the same image on the same GPU. Previously, our denoisers were much slower than our demosaicing algorithms.

This software is part of our GPU Image & Video Processing SDK, so our customers can now use these GPU-accelerated denoisers in their applications as a stage of their image processing pipeline.

Testing

To test our GPU denoiser, please download the Fast VCR software, which works not only with machine vision cameras in real time but also with RAW images stored on an SSD. This is a practical way to evaluate both image quality and performance.

GPU-based denoising roadmap

  • Acceleration of NLM and Bilateral denoisers - in progress
  • Temporal denoiser on the GPU - in progress
  • Denoising algorithm based on a camera noise profile and a variance-stabilizing transform (VST) - in progress
  • Total variation denoising (total variation regularization)
