Realtime image processing for XIMEA CB500 on GPU

High-resolution cameras are becoming increasingly popular, and their resolution keeps growing thanks to advances in image sensor technology. The newest models offer 50 megapixels or more, pushing data bandwidth to the point where it can become a bottleneck.

In the past, high resolution meant slow transfer speed (low fps), which was far from ideal for smooth video on a monitor, where you expect at least 20-30 fps to get low-latency output rather than a sequence of separate frames. Even the simplest task, transferring the full RAW stream from a high-resolution image sensor to a computer at maximum bit depth and full image resolution, can be complicated.

 


Fig.1. High resolution 48 MP camera from XIMEA with active EF-mount

Camera vendor XIMEA offers a portfolio of industrial cameras that includes the xiB series. The xiB line includes the CB500, a model with a 48 MPix global shutter CMOS image sensor that delivers 22 fps (frames per second) at 12-bit readout or 30 fps at 8-bit. The resulting data stream is substantial, which is where the camera's PCI Express interface with 20 Gbit/s throughput comes in handy. The actual data stream of this 8K camera can reach 1550 MBytes/s.
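The data rates above can be cross-checked with a few lines of arithmetic. The sketch below assumes the full-resolution frame size of 7920 × 6004 pixels quoted in the benchmarks, tightly packed pixels, 1 MB = 10^6 bytes, and a nominal link rate with no protocol overhead. The function and variable names are illustrative only.

```python
# Rough data-rate estimate for the CB500 sensor stream.
# Assumptions: tightly packed pixels, 1 MB = 10**6 bytes, no protocol overhead.
WIDTH, HEIGHT = 7920, 6004  # full-resolution frame size from the benchmarks


def stream_rate_mb_s(bits_per_pixel: int, fps: float) -> float:
    """Return the raw stream rate in MB/s for a given bit depth and frame rate."""
    bytes_per_frame = WIDTH * HEIGHT * bits_per_pixel / 8
    return bytes_per_frame * fps / 1e6


rate_12bit = stream_rate_mb_s(12, 22)  # 12-bit readout at 22 fps
rate_8bit = stream_rate_mb_s(8, 30)    # 8-bit readout at 30 fps
link_mb_s = 20e9 / 8 / 1e6             # nominal 20 Gbit/s link in MB/s

print(f"12-bit @ 22 fps: {rate_12bit:.0f} MB/s")
print(f" 8-bit @ 30 fps: {rate_8bit:.0f} MB/s")
print(f"Nominal link:    {link_mb_s:.0f} MB/s")
```

Under these assumptions both modes land between roughly 1.4 and 1.6 GB/s, in line with the quoted 1550 MBytes/s and below the nominal capacity of the link.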

One option to handle that RAW stream is to store the data on a high-end SSD, but there can be a more effective solution. It is possible to process the data on the GPU in real time, show the results on the monitor, and save the compressed color frames to a conventional SSD at a final data rate below 500 MB/s. This approach covers the main tasks of a real-time application:

  • image acquisition
  • RAW image processing
  • image output to the monitor
  • image compression
  • storage of compressed frames on SSD

We can therefore send RAW data from the XIMEA CB500 camera directly to system memory and then copy it to NVIDIA GPU memory, so that all image processing is done on the GPU. Below are two pipelines and the corresponding benchmarks for the CB500 camera on an NVIDIA GeForce RTX 2080 Ti.

 


Fig.2. Application example: PCB inspection of motherboard with high resolution camera

Realtime image processing on GPU for 8k camera XIMEA CB500

This is not a full image processing pipeline, but an example of a common one. It includes camera calibration data (dark frame, flat field, DCP profile, LCP profile) and has an option for JPEG compression to bring the output bandwidth down to around 400-450 MB/s, which a conventional SSD should be able to sustain.

  • Acquisition software gets raw data from the camera and sends raw frames to GPU
  • Unpacking module with transform from 12-bit to 16-bit
  • Dark image subtraction
  • Flat-Field Correction
  • White Balance
  • 1D LUT for RAW data
  • Image demosaicing
  • Base Color Correction
  • Curves and Levels with 1D LUT
  • DCP profile (DNG specification)
  • Remap with LCP profile
  • Gamma
  • Transform from 16-bit to 8-bit per channel
  • Resize and OpenGL output to monitor
  • JPEG compression with quality ~90%
  • Async write jpg images to SSD
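To make a few of these stages concrete, here is a minimal CPU sketch in NumPy of unpacking, dark frame subtraction, flat-field correction, white balance, gamma and the final 8-bit conversion. It is not the GPU implementation benchmarked in this article. The 12-bit packing layout (two pixels in three bytes), the RGGB Bayer pattern, the gain values and all function names are assumptions made for illustration, and demosaicing, color correction and the DCP/LCP steps are left out, so the result is still a single-channel mosaic.

```python
import numpy as np


def unpack12(packed: np.ndarray) -> np.ndarray:
    """Unpack 12-bit pixels to uint16.

    Assumed layout (hypothetical): two pixels stored in three bytes,
    low bits first. The camera's real packing may differ.
    """
    b = packed.reshape(-1, 3).astype(np.uint16)
    p0 = b[:, 0] | ((b[:, 1] & 0x0F) << 8)
    p1 = (b[:, 1] >> 4) | (b[:, 2] << 4)
    return np.stack([p0, p1], axis=1).reshape(-1)


def correct_raw(raw: np.ndarray, dark: np.ndarray, flat: np.ndarray) -> np.ndarray:
    """Subtract the dark frame, then apply flat-field correction."""
    signal = np.clip(raw.astype(np.float32) - dark, 0, None)
    # 'flat' is assumed to be a dark-subtracted flat-field frame.
    gain = flat.mean() / np.maximum(flat, 1e-6)
    return signal * gain


def white_balance(bayer: np.ndarray, r_gain: float, b_gain: float) -> np.ndarray:
    """Scale red and blue sites of an (assumed) RGGB mosaic."""
    out = bayer.copy()
    out[0::2, 0::2] *= r_gain  # red sites
    out[1::2, 1::2] *= b_gain  # blue sites
    return out


def to_8bit(linear: np.ndarray, white: float = 4095.0, gamma: float = 2.2) -> np.ndarray:
    """Apply a simple power-law gamma and quantize to 8 bits."""
    norm = np.clip(linear / white, 0.0, 1.0)
    return np.round(255.0 * norm ** (1.0 / gamma)).astype(np.uint8)


# Tiny 4 x 4 demo frame: 16 pixels, each with the 12-bit value 1100 (0x44C).
packed = np.tile(np.array([0x4C, 0xC4, 0x44], dtype=np.uint8), 8)
raw = unpack12(packed).reshape(4, 4)

dark = np.full((4, 4), 100.0, dtype=np.float32)   # synthetic dark frame
flat = np.full((4, 4), 2000.0, dtype=np.float32)  # synthetic flat field

corrected = correct_raw(raw, dark, flat)
balanced = white_balance(corrected, r_gain=2.0, b_gain=1.5)
frame8 = to_8bit(balanced)
print(frame8)
```

On the GPU each of these functions would typically map to one kernel (or be fused with its neighbours), which is why the per-stage timings below are reported separately.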

Time measurements on NVIDIA GeForce RTX 2080 Ti for XIMEA CB500

These are benchmarks for a camera application at full resolution and 12-bit output. The pipeline can first be tested offline with Fast CinemaDNG Processor software to tune the parameters and check image quality and performance, and then be implemented in real time.

  • Input raw image: 7920 × 6004 pixels, 12 bits per pixel
  • Host-to-device transfer = 7.83 ms
  • Dark frame subtraction and flat-field correction = 0.88 ms
  • Linearization LUT = 0.37 ms
  • White Balance = 0.36 ms
  • MG Debayer = 4.70 ms
  • ProPhoto space transform = 1.28 ms
  • RGB Lut = 1.52 ms
  • Output color space transform = 1.39 ms
  • Geometry transform (undistortion) = 7.24 ms
  • Crop (no crop after undistortion) = 0.00 ms
  • 16 to 8 bit transform = 0.80 ms
  • JPEG encoder time (quality 90%, subsampling 4:2:0) = 2.69 ms
  • Viewport crop = 0.02 ms
  • Viewport resize (no viewport resize) = 0.00 ms
  • Total GPU = 29.07 ms
  • Total GPU + CPU = 33.92 ms
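As a quick sanity check, the per-stage numbers can be summed and compared with the frame period of the camera. The sketch below copies the timings from the list above and assumes the 12-bit mode at 22 fps with frames processed strictly one after another. The dictionary keys are illustrative labels, not names from any SDK.

```python
# Per-stage GPU timings in ms for the full pipeline, copied from the list above.
stages_ms = {
    "host_to_device": 7.83,
    "dark_and_flat_field": 0.88,
    "linearization_lut": 0.37,
    "white_balance": 0.36,
    "mg_debayer": 4.70,
    "prophoto_transform": 1.28,
    "rgb_lut": 1.52,
    "output_color_transform": 1.39,
    "undistortion": 7.24,
    "crop": 0.00,
    "16_to_8_bit": 0.80,
    "jpeg_encode": 2.69,
    "viewport_crop": 0.02,
    "viewport_resize": 0.00,
}
REPORTED_TOTAL_MS = 33.92  # "Total GPU + CPU" from the list
CAMERA_FPS = 22.0          # maximum frame rate in 12-bit mode

gpu_sum_ms = sum(stages_ms.values())
frame_period_ms = 1000.0 / CAMERA_FPS
max_fps = 1000.0 / REPORTED_TOTAL_MS
headroom_ms = frame_period_ms - REPORTED_TOTAL_MS

print(f"Sum of GPU stages: {gpu_sum_ms:.2f} ms")
print(f"Frame period:      {frame_period_ms:.2f} ms")
print(f"Sustainable rate:  {max_fps:.1f} fps")
print(f"Headroom:          {headroom_ms:.2f} ms per frame")
```

The stage sum agrees with the reported GPU total to within rounding, and the reported GPU + CPU total leaves roughly 11 ms of headroom per frame at 22 fps.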

GPU processing time on the NVIDIA GeForce RTX 2080 Ti is around 30-40 ms per frame, which is shorter than the roughly 45 ms frame period at the camera's maximum 12-bit frame rate of 22 fps.

For a more complicated image processing pipeline, which could include bad pixel removal, denoising, intermediate color space transforms, defringe, resize, rotate, crop, sharpening, histogram, parade, image and video compression, etc., a second GPU could help to accomplish these tasks in real time.

With only one GPU installed, the total time to process one frame (GPU + CPU) in such a pipeline could reach 60-70 ms, so it is important to optimize both software and hardware to get maximum performance from the system. Building a fast multithreaded solution for the XIMEA CB500 camera requires high-end software and hardware (CPU, GPU, SSD).

For such a workflow, JPEG storage to a fast SSD or NVMe drive is implemented in a separate CPU thread. This makes it asynchronous, so the time spent storing JPEG files is not added to the total time. Video output is also implemented in a separate thread. This keeps as much work as possible running in parallel for maximum performance. The main idea is to divide the whole task into parts and process them in parallel on both the CPU and the GPU.
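The asynchronous storage idea can be sketched with a bounded queue and a dedicated writer thread. This is a minimal illustration using only the Python standard library, not the actual implementation. The names writer_loop and run_pipeline, the queue size, the file naming scheme and the placeholder bytes standing in for encoder output are all hypothetical.

```python
import queue
import tempfile
import threading
from pathlib import Path


def writer_loop(jobs: queue.Queue, out_dir: Path) -> None:
    """Drain the queue and write each JPEG to disk until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            break
        index, jpeg_bytes = item
        (out_dir / f"frame_{index:06d}.jpg").write_bytes(jpeg_bytes)


def run_pipeline(num_frames: int, out_dir: Path) -> int:
    """Hand encoded frames to the writer thread and return the number of files written."""
    # A bounded queue applies back-pressure if the disk cannot keep up.
    jobs = queue.Queue(maxsize=8)
    writer = threading.Thread(target=writer_loop, args=(jobs, out_dir))
    writer.start()
    for index in range(num_frames):
        # Placeholder for the bytes a real JPEG encoder would return.
        jpeg_bytes = b"\xff\xd8" + bytes(16) + b"\xff\xd9"
        jobs.put((index, jpeg_bytes))  # returns quickly while the queue has room
    jobs.put(None)  # tell the writer to stop
    writer.join()
    return len(list(out_dir.glob("frame_*.jpg")))


with tempfile.TemporaryDirectory() as tmp:
    written = run_pipeline(5, Path(tmp))
print(f"frames written: {written}")
```

The processing loop only pays for putting a frame on the queue, so disk latency stays off the critical path as long as the average write rate keeps up with the frame rate.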

 


Fig.3. Application example: Aerial shot from bird's eye view

Preview mode for XIMEA CB500

Here are the time measurements for preview mode. This mode is useful when there is no need to compress and store the processed frames. In that case the image processing pipeline is much simpler and the performance is higher.

  • Input raw image: 7920 × 6004 pixels, 12 bits per pixel
  • Host-to-device transfer = 8.49 ms
  • Linearization LUT = 0.38 ms
  • White Balance = 0.36 ms
  • MG Debayer = 4.59 ms
  • ProPhoto space transform = 1.29 ms
  • RGB Lut = 1.58 ms
  • Output color space transform = 1.38 ms
  • 16 to 8 bit transform = 0.81 ms
  • Viewport crop = 0.02 ms
  • Viewport resize (no viewport resize) = 0.00 ms
  • Total GPU = 18.90 ms
  • Total GPU + CPU = 21.86 ms

It's possible to get even better results by overlapping host-to-device transfers with computations to exclude transfer time from the benchmarks. The above benchmarks still include this time.
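The potential gain from such overlap can be estimated from the preview-mode numbers. The sketch below is an idealized model: it assumes the copy of the next frame can run fully in parallel with the processing of the current one (on NVIDIA GPUs this is typically done with pinned host memory and asynchronous copies in separate CUDA streams), and it ignores CPU time.

```python
# Idealized estimate of per-frame GPU time with and without transfer overlap,
# using the preview-mode timings listed above.
transfer_ms = 8.49    # host-to-device transfer
total_gpu_ms = 18.90  # reported total GPU time, transfer included

compute_ms = total_gpu_ms - transfer_ms

# Without overlap, each frame pays for transfer and compute in sequence.
serial_period_ms = transfer_ms + compute_ms

# With perfect overlap, the slower of the two stages sets the pace.
overlapped_period_ms = max(transfer_ms, compute_ms)

print(f"Compute only:      {compute_ms:.2f} ms")
print(f"Serial period:     {serial_period_ms:.2f} ms")
print(f"Overlapped period: {overlapped_period_ms:.2f} ms")
```

In this model the throughput is bounded by the compute stages alone, at the cost of one extra frame of latency while the pipeline fills.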

Custom software design for camera applications with GPU-based processing

Many tasks require a 12-bit camera with high image resolution and a high frame rate. For example, the XIMEA CB500 camera is successfully used in applications such as aerial mapping, 3D scanning, flat panel display (FPD) inspection, solar panel analysis, printed circuit board (PCB) examination, wide area surveillance, persistent stadium and border security, cinematography, sports and entertainment, 360 panorama, UAVs and autonomous or unmanned vehicles, etc.

Fastvideo offers high-performance software solutions for such applications. Most of them are based on the GPU image processing pipeline from the PRO version of Fast CinemaDNG Processor software, which is highly optimized and uses a digital cinema workflow to achieve excellent image quality. That solution can help you process images with quality comparable to Adobe Camera Raw and RawTherapee, but with much higher performance.

 


Fig.4. Application example: Aerial architecture - houses

Additionally, the CB500 camera comes with a flat ribbon flex cable (MX500 model), making it a perfect fit for embedded vision systems or multiple camera setups. Software for such complex, integrated solutions is based on the Fastvideo SDK for Jetson, which is available for NVIDIA TK1, TX1, TX2, TX2i and AGX Xavier hardware.

With the help of the Fastvideo SDK it is also possible to implement the desired pipeline for any specific task for the CB500 camera and similar models in a single-camera or multi-camera system. For example, you can take advantage of 12-bit per channel JPEG compression on the GPU at the end of the image processing pipeline to compress and store more information in each frame.
