GPIO Latency Test for Camera Applications

Author: Fyodor Serzhenko
Tags: Machine vision camera, USB3, Robotics, Low latency, Jetson, GPIO, G2G, GPU ISP

In robotics applications, even a small delay can cause problems, and response time targets below 50 milliseconds are often mandatory. This forces developers on Jetson and other platforms to optimize each stage of the workflow rather than focusing solely on throughput.

The G2G (glass-to-glass) test is widely used for latency evaluation. It measures the time between two "glasses": the glass of the monitor showing the source scene and the glass showing the processed result. In this test, we typically capture a frame from a monitor, send the data to a PC for processing, and output the processed image to the monitor again. This method encompasses exposure time, sensor readout, data transfer, ISP, encoding, network, decoding, and display. Therefore, benchmarking must isolate each stage to identify the actual bottleneck in the entire pipeline.

GPIO latency test

What are the most important limitations for low-latency benchmarks?

To identify the main limitations on the road to minimum latency, we conducted the following G2G test with the shortest pipeline:

  • High-resolution timer output to the monitor
  • Image capture with machine vision USB3 camera
  • RAW image transfer from camera to Jetson over USB3
  • RAW image output to the monitor via OpenGL

We used a Jetson Orin NX 8GB with a connected XIMEA MC031CG-SY camera for testing. The camera has a USB3 interface, a frame rate of 150 fps, a ROI resolution of 1920×1080 at 8-bit, and an exposure time of 0.2 ms. The monitor refresh rate was only 60 Hz, because a higher rate is not achievable with that Jetson.

It turned out that the minimum latency for the G2G test was 70 ms or more (to achieve better results, we would need a camera with a higher frame rate and a monitor with a much higher refresh rate). Data transfer is very fast, there is no processing on the GPU, the camera frame rate is high, and the exposure time is low. That leaves two suspects for the poor G2G result: the USB3 transfer and OpenGL.

GPIO latency test without a monitor or OpenGL

For latency measurements, let's try bypassing the monitor and OpenGL. This makes sense because many robotics applications don't require a monitor. They just need to capture an image with a camera, transfer it to a Jetson or other hardware, run ISP and/or AI to make a decision, and send a command to a mechanism.

To estimate the latency, we first switched off both the ISP and OpenGL. Then, we acquired raw Bayer images from the camera and stored each captured image in the Jetson memory to shorten the pipeline. This doesn't affect the main idea because we can easily distinguish black frames from white ones, even if we use unprocessed raw Bayer frames.

We've run that test using a GPIO pin on the Jetson and an LED. We can switch the LED on and off via GPIO, and we can see whether the LED is on or off in the captured raw image. We used the following scenario:

  • We covered the camera with a plastic cap that has a small hole and placed the LED in that hole
  • The LED is connected to the Jetson Orin via a general-purpose input/output (GPIO) pin
  • While the LED is off, the camera captures black frames because there is no illumination
  • We keep capturing these frames and storing them in memory consecutively
  • Right after acquiring a frame, our camera software sends a command via GPIO to switch the LED on
  • After that, we count how many black frames are captured after the LED was switched on
  • Then we find the index of the first white frame
  • The time difference between the last black frame before the GPIO switch and the first white frame gives the latency estimate
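The counting logic above can be sketched in a few lines of Python. This is a minimal simulation of the measurement, not the actual capture code: on the real setup the frames come from the XIMEA camera API and the LED is toggled via the Jetson GPIO, while here the frame stream is faked with NumPy arrays.

```python
import numpy as np

def is_white(frame, threshold=32):
    """Classify a raw Bayer frame by mean brightness; no ISP is needed
    to tell a black frame from a white one."""
    return frame.mean() > threshold

def led_gap_frames(frames, led_on_after):
    """Number of frames between the LED-on command (sent right after
    frame led_on_after was acquired) and the first white frame."""
    for i in range(led_on_after + 1, len(frames)):
        if is_white(frames[i]):
            return i - led_on_after
    return None

# Simulated 150 fps stream: the LED is switched on after frame 4;
# on the Orin NX the very next frame already came white (one-frame gap).
dark = np.zeros((1080, 1920), dtype=np.uint8)
lit = np.full((1080, 1920), 200, dtype=np.uint8)
gap = led_gap_frames([dark] * 5 + [lit] * 3, led_on_after=4)
frame_period_ms = 1000.0 / 150
print(f"gap = {gap} frame(s) -> latency upper bound ~ {gap * frame_period_ms:.1f} ms")
```
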

We've performed numerous tests with the USB3 camera at frame rates ranging from 30 to 200 fps. In all cases, the interval between the final black frame and the initial white frame was consistently one frame: we captured a black frame, sent the command to the LED, and the next acquired frame was white. For 1920×1080 resolution at 150 fps (exposure time 0.2 ms), we got up to 6.6 ms latency without ISP and without OpenGL. It's worth mentioning that this result includes USB3 interface latency, so the USB3 transfer is definitely not the main bottleneck, at least for this XIMEA camera.

Currently, the actual latency for that use case is unclear; we've only determined an upper bound. We also used smaller resolutions, such as 720×576, to achieve 200 fps (exposure time 0.2 ms), yet the difference was still just one frame. We've also verified that no frames were dropped during these tests.
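The dropped-frame check can be sketched as follows, assuming the camera SDK exposes a per-frame hardware counter in the image metadata (an assumption here; the field name varies between SDKs): consecutive counter values mean no frame was lost.

```python
def no_dropped_frames(counters):
    """True if the per-frame hardware counters are strictly consecutive,
    i.e. no frame was dropped between two stored images."""
    return all(b - a == 1 for a, b in zip(counters, counters[1:]))

print(no_dropped_frames([100, 101, 102, 103]))  # True
print(no_dropped_frames([100, 101, 103, 104]))  # False: frame 102 was lost
```
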

ISP Pipeline on CUDA

The case with the shortest pipeline is important, but it's only suitable for evaluation purposes. Now, let's consider a real use case for robotics. This pipeline performs data preprocessing prior to the AI part of the application:

  • RAW frame copy to the memory
  • Black/white points + conversion to 16-bit
  • White Balance
  • Demosaicing
  • Gamma
  • Conversion from 16-bit to 8-bit
  • Frame copy to AI app (device-to-device)

Assuming we are working with 8-bit RAW Bayer images at a resolution of 1920×1080 and 150 fps on a Jetson Orin NX 8GB, the total latency for that pipeline is up to 13 ms. After switching on the LED, the next frame comes black, and the one after that is white. It takes up to 6.6 ms to acquire the RAW frame and transfer it from the image sensor to the Jetson, and the GPU-based ISP time for the above pipeline is around 4–5 ms, so the total result stays within 13 ms. The total load of the Orin NX GPU is around 75% in that case.

We've done the same test with 8-bit RAW Bayer images at a resolution of 720×576 and 200 fps on a Jetson Orin NX 8GB, and in that case the total latency is up to 5 ms, including ISP. This means the first frame after switching on the LED already comes white. The total load of the Orin NX GPU is around 60%.
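These frame-gap observations translate directly into the millisecond bounds quoted above, since the upper bound is simply the gap in frames times the frame period:

```python
def upper_bound_ms(fps, gap_frames):
    """Latency upper bound when the first white frame arrives
    gap_frames frames after the LED-on command."""
    return gap_frames * 1000.0 / fps

print(round(upper_bound_ms(150, 2), 1))  # 13.3 -> the ~13 ms case with ISP
print(round(upper_bound_ms(200, 1), 1))  # 5.0  -> the 5 ms case at 200 fps
```
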

As we can see, the latency is in the range of 1–2 frames both with and without the GPU-based ISP, and this is an upper estimate for the case without OpenGL.

How can OpenGL be tuned to achieve better latency?

Once we've determined that the USB3 interface of the XIMEA camera doesn't significantly impact latency, we can conclude that the main problem in the G2G test is likely OpenGL. Let's check what's going on in the OpenGL workflow.

These are the main stages of data processing, from GPU memory to the monitor:

  • CPU to GPU command submission (~µs–ms). OpenGL commands are enqueued in the driver’s command buffer. The driver flushes the batch (e.g., via implicit flush, glFlush, or glFinish) before actual GPU work begins. Latency is usually less than 1 ms, but it can be higher if the driver batches commands.
  • GPU execution (~µs–ms). Rendering involves vertex processing, rasterization, fragment shading on the GPU. For simple scenes, it takes ~0.1–2 ms, depending on the GPU and workload. For complex scenes or low-end/embedded GPUs (e.g., mobile or Jetson), it could take 5–10 ms or more.
  • Framebuffer swap and VSync (~ms). Most applications use double or triple buffering with VSync to avoid tearing. With VSync enabled and triple buffering, a frame may wait in a queue until the next or next-next vertical blanking interval. At 60 Hz, one refresh interval equals 16.67 ms, so the worst-case additional delay is 33 ms (2 × 16.67 ms). This is often the dominant source of latency in typical rendering loops.
  • Display scan-out and panel latency (~1–10 ms). Once the front buffer is swapped, the monitor begins scan-out. Panel response time (gray-to-gray) adds 1–5 ms for modern gaming monitors and up to 10+ ms for budget or older LCDs. Some monitors also apply post-processing, such as motion blur reduction or color correction, which can add latency.

In summary, to achieve optimal OpenGL performance in the G2G test, we need a high-refresh-rate monitor, a high-FPS camera with a high-bandwidth interface and a low exposure time, VSync disabled, a high-end GPU, and a fast panel. The G2G test is viable, but we need to pay additional attention to the OpenGL implementation and usage.
