Post-Processing

Post-processing is a critical phase of the inference process: it refines the raw model outputs into meaningful, actionable data.

Output Interpretation and Conversion

In the DepthAI library, we support a range of predefined models as well as custom ones. This section guides you through interpreting and parsing the outputs of both.

Predefined Models

Object Detection

We provide decoding support for several predefined models whose outputs can be easily parsed using the Python API. For each detected object, these models return the following data:
  • Bounding Box Coordinates (xmin, ymin, xmax, ymax): These define the rectangular area in which the detected object is located within the image. The coordinates are normalized, i.e. expressed as values between 0 and 1 relative to the dimensions of the image; see the sketch after this list for converting them to pixels.
  • Confidence Scores: Represent the model's certainty about the detection. A higher score indicates greater confidence in the accuracy of the detection.
  • Class Labels: Indicate the category of the detected object, as learned by the model during training.
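Because the coordinates are normalized, they usually need to be scaled back to pixel space before use. Below is a minimal sketch; `detection` and `frame` are placeholder names for a parsed detection and its matching image from your own pipeline:
Python
# Minimal sketch: convert normalized coordinates to pixel positions.
# `detection` and `frame` are assumed to come from your own pipeline.
height, width = frame.shape[:2]
top_left = (int(detection.xmin * width), int(detection.ymin * height))
bottom_right = (int(detection.xmax * width), int(detection.ymax * height))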
Supported predefined models include:
  • MobileNet: can be used through the MobileNetDetectionNetwork node.
  • Yolo: can be used through the YoloDetectionNetwork node. Available for export via Luxonis Tools. We support several versions of Yolo models, each tailored for different detection needs:
    • YoloV5
    • YoloV6 (R1, R2, R3, R4)
    • YoloV7
    • YoloV8
    • GoldYolo
For a more detailed understanding of model outputs and predefined model nodes, refer to the ImgDetections documentation. An example of parsing the data is as follows:
Python
import depthai as dai

# Assuming the pipeline and model setup are already done
# For details on pipeline creation, refer to DepthAI documentation

# Connect to the device and start the pipeline
with dai.Device(pipeline) as device:

    # Retrieve detections from the output queue
    qDet = device.getOutputQueue(name="network_node_name", maxSize=4, blocking=False)
    detections = qDet.get().detections

    # Parse each detection
    for detection in detections:
        # Normalized bounding box coordinates, confidence score and class label
        xmin, ymin, xmax, ymax = detection.xmin, detection.ymin, detection.xmax, detection.ymax
        confidence = detection.confidence
        class_id = detection.label
        # Further processing or utilizing...

Custom Models

For custom models, outputs can still be retrieved using methods like getLayerFp16(layer_name) or getLayerInt8(layer_name), where you provide the name of the specific layer. Since these methods return a flattened output of the selected layer, you need to reshape it before further use. You can find an example in the EfficientDet experiment. The following code snippet briefly illustrates the process:
Python
import depthai as dai
import numpy as np

# Assuming the pipeline and model setup are already done
# For details on pipeline creation, refer to DepthAI documentation
output_shape = (1, 1000)  # replace with your own shape

# Connect to the device and start the pipeline
with dai.Device(pipeline) as device:

    # Define the output queue
    qDet = device.getOutputQueue(name="network_node_name", maxSize=4, blocking=False)
    in_nn = qDet.get()

    # Retrieve the flattened layer output and reshape it to the desired dimensions
    outputs = in_nn.getLayerFp16('network_output_name')
    outputs = np.array(outputs).reshape(output_shape)

    # Further parsing, processing or utilizing...
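If the reshaped output holds raw classification logits (for example, the (1, 1000) shape used above), a common next step is converting them to probabilities and picking the top class. The sketch below assumes logits; your model may already emit probabilities:
Python
# Illustrative only: interpret a (1, 1000) output as classification logits.
# Numerically stable softmax, followed by the most likely class.
logits = outputs - np.max(outputs, axis=1, keepdims=True)
probabilities = np.exp(logits) / np.sum(np.exp(logits), axis=1, keepdims=True)
class_id = int(np.argmax(probabilities, axis=1)[0])
confidence = float(probabilities[0, class_id])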

Practical Applications Showcase

Once the model's outputs are parsed, the resulting data can be employed in various meaningful ways:
  • Visual Feedback: Use the bounding box coordinates to draw rectangles around detected objects in real-time video streams. You can find an example in our YOLO experiment; a minimal drawing sketch also follows this list.
  • Data Analytics: The people tracking experiment illustrates the application of data for tracking, analyzing movement trends, and counting people. It can be useful in environments like retail spaces, public venues, or transport hubs, where understanding foot traffic patterns and density is essential for operational efficiency, safety, and customer experience optimization.
  • Anomaly Detection: This experiment showcases real-time anomaly detection capabilities. This is particularly useful in scenarios like quality control in manufacturing, where detecting deviations from the norm in products or processes is crucial.
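As a hedged illustration of the visual-feedback case, the sketch below reuses the coordinate scaling shown earlier and draws labeled boxes with OpenCV. The queues `qRgb` and `qDet` are placeholder names; wire them up to your own pipeline:
Python
import cv2

# Illustrative only: draw detections on the matching frame.
# `qRgb` and `qDet` are placeholder output queues from your own pipeline.
frame = qRgb.get().getCvFrame()
detections = qDet.get().detections

height, width = frame.shape[:2]
for det in detections:
    # Scale normalized coordinates to pixel space
    pt1 = (int(det.xmin * width), int(det.ymin * height))
    pt2 = (int(det.xmax * width), int(det.ymax * height))
    cv2.rectangle(frame, pt1, pt2, (0, 255, 0), 2)
    cv2.putText(frame, f"{det.label}: {det.confidence:.2f}",
                (pt1[0], pt1[1] - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

cv2.imshow("detections", frame)
cv2.waitKey(1)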