
What data do you need for multi-sensor labeling?

December 5th, 2023 - 2 min

In autonomous driving and robotics, combining camera images with lidar point cloud data can greatly improve the perception capabilities of the system. In the past, images and point clouds were often treated separately and labeled independently.

Today, multi-sensor labeling offers significant advantages. By combining and visualizing the data from multiple sensors in one interface, computer vision and data science teams can keep their labels consistent across sensors.

For instance, object IDs can be maintained consistently across different modalities and over time, resulting in more accurate and reliable machine-learning models.

Our engineers frequently assist computer vision teams in establishing their multi-sensor labeling pipeline. We have compiled their advice and the most commonly asked questions in this post.

Prerequisites for multi-sensor labeling

Point cloud data

Point cloud data plays a crucial role in multi-sensor data fusion and labeling. Lidars, radars, or stereo camera pairs generate a 3D view of the surrounding environment. The data can come from a single sensor or from multiple sensors merged into a single point cloud.

This data includes:

  • Position information: The spatial coordinates (x, y, z) of each point in the cloud
  • Additional values: Per-point intensity values, or RGB values that add color information to the point cloud
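
As a minimal sketch, a single frame could be held in memory as a list of points, each with a position and optional extra channels. The `Point` class, its field names, and the `bounding_box` helper are hypothetical and only illustrate the structure described above; they are not a format required by any particular tool.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Point:
    # Position information: spatial coordinates in the sensor frame (assumed meters).
    x: float
    y: float
    z: float
    # Additional values: both channels are optional.
    intensity: Optional[float] = None
    rgb: Optional[Tuple[int, int, int]] = None


def bounding_box(points: List[Point]):
    """Return the (min corner, max corner) of the axis-aligned box around the points."""
    xs = [p.x for p in points]
    ys = [p.y for p in points]
    zs = [p.z for p in points]
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# A tiny illustrative frame: one lidar-style point and one colored point.
frame = [
    Point(1.0, 2.0, 0.5, intensity=0.8),
    Point(-3.0, 0.0, 1.5, rgb=(255, 0, 0)),
]
print(bounding_box(frame))
```

Checking the extent of a frame like this is a quick sanity test that the coordinates are in the expected unit and frame before uploading.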

Camera data

Camera data provides rich, contextual visual cues complementing the structural depth data from point clouds. An array of camera streams may be used, including the vehicle or robot’s front, back, or side cameras.

To correctly overlay images with point clouds, each camera stream should be accompanied by:

  • Camera extrinsics: The extrinsic matrix, which combines a rotation matrix and a translation vector, describes the camera’s position and orientation relative to a world reference frame. This information is crucial for accurate object placement.
  • Camera intrinsics: These describe the camera’s internal properties, including the intrinsic matrix and lens distortion parameters. Distortion parameters are especially important for wide-angle or other non-standard lenses.
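
To illustrate how extrinsics and intrinsics work together, here is a minimal sketch of a pinhole projection from a 3D point to pixel coordinates. It rests on several assumptions: the function name `project_point` is hypothetical, the rotation and translation map world coordinates into the camera frame (if your calibration stores the camera pose in the world instead, it has to be inverted first), the camera looks along its positive z-axis, and lens distortion is ignored.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D point to pixel coordinates with an undistorted pinhole model.

    rotation (3x3 nested list) and translation (length 3) are assumed to map
    world coordinates into the camera frame: p_cam = R * p_world + t.
    fx, fy, cx, cy are the focal lengths and principal point from the intrinsic matrix.
    """
    p_cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if p_cam[2] <= 0:
        # The point is behind the camera and cannot appear in the image.
        return None
    u = fx * p_cam[0] / p_cam[2] + cx
    v = fy * p_cam[1] / p_cam[2] + cy
    return (u, v)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A point 10 m in front of a camera located at the world origin.
pixel = project_point((1.0, 0.5, 10.0), IDENTITY, (0.0, 0.0, 0.0),
                      fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
print(pixel)  # approximately (740.0, 410.0)
```

If projected points land consistently off their objects in the image, the extrinsics convention (world-to-camera versus camera-to-world) is a common culprit worth checking first.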

Ego poses

Ego poses refer to the position and orientation of the vehicle or robot in the world coordinate frame. The ego pose is essential for:

  • Keeping static objects in place within the 3D environment, so you only need to label them once across the whole sequence. It also enables you to benefit from the Merged Point Cloud interface.
  • Creating realistic trajectories for dynamic objects. This helps smart labeling tools within the Batch Mode interface to accurately label an object across frames automatically.
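
The sketch below shows why ego poses keep static objects in place: the same object, observed from two different vehicle poses, maps to the same world coordinates once each observation is transformed with its frame's ego pose. The function names are hypothetical, and the pose is assumed to be a position plus a unit quaternion in (w, x, y, z) order; check the ordering your own data uses.

```python
import math


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ego_to_world(point_ego, position, quat_wxyz):
    """Transform a point from the ego frame to the world frame.

    quat_wxyz is assumed to be a unit quaternion (w, x, y, z) giving the ego
    orientation in the world frame; position is the ego origin in the world frame.
    """
    w, qx, qy, qz = quat_wxyz
    q = (qx, qy, qz)
    # Rotate with v' = v + w*t + q x t, where t = 2 * (q x v).
    t = tuple(2.0 * c for c in cross(q, point_ego))
    qt = cross(q, t)
    rotated = tuple(point_ego[i] + w * t[i] + qt[i] for i in range(3))
    return tuple(rotated[i] + position[i] for i in range(3))


# Frame A: vehicle at the origin, no rotation. The pole is seen at (10, 6, 0).
no_rotation = (1.0, 0.0, 0.0, 0.0)
pole_a = ego_to_world((10.0, 6.0, 0.0), (0.0, 0.0, 0.0), no_rotation)

# Frame B: vehicle at (10, 5, 0), turned 90 degrees to the left.
# The same pole is now seen 1 m straight ahead.
yaw_90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
pole_b = ego_to_world((1.0, 0.0, 0.0), (10.0, 5.0, 0.0), yaw_90)

print(pole_a, pole_b)  # both approximately (10, 6, 0)
```

Because both observations resolve to one world position, a static object like this only needs a single label for the whole sequence.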

Sensor synchronization

Sensor synchronization ensures that data captured by different sensors at slightly different moments is aligned correctly.

  • Timestamps: Accurate timestamps make interpolation between frames more precise. In the absence of timestamps, linear sampling is assumed by default, which may reduce precision.
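
As an illustration of timestamp-based interpolation, the following sketch estimates the ego position at the moment a camera image was taken, given ego poses recorded at other times. The function name `interpolate_position` is hypothetical, timestamps are assumed to be in seconds and strictly increasing, and only the position is interpolated (orientation would typically need spherical interpolation, which is omitted here).

```python
from bisect import bisect_right


def interpolate_position(timestamps, positions, t):
    """Linearly interpolate a 3D position at time t.

    timestamps must be strictly increasing and aligned with positions.
    """
    if not timestamps[0] <= t <= timestamps[-1]:
        raise ValueError("t lies outside the recorded time range")
    i = bisect_right(timestamps, t)
    if i == len(timestamps):
        # t equals the last timestamp exactly.
        return tuple(positions[-1])
    t0, t1 = timestamps[i - 1], timestamps[i]
    alpha = (t - t0) / (t1 - t0)
    return tuple(a + alpha * (b - a) for a, b in zip(positions[i - 1], positions[i]))


# Ego poses recorded at 10 Hz; the camera image was captured between two of them.
pose_times = [0.0, 0.1, 0.2]
pose_positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.5, 0.0)]
camera_position = interpolate_position(pose_times, pose_positions, 0.15)
print(camera_position)  # approximately (1.5, 0.25, 0.0)
```

Without real timestamps, the only option is to assume the samples are evenly spaced, which is where the loss of precision comes from.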

Additional metadata

Metadata provides contextual insights that are valuable during the labeling process. These may include:

  • The date and city where the data was recorded.
  • Identifier of the robot or vehicle that made the recording.
  • Data collection context that may influence the interpretation of the data, e.g. a note about a faulty sensor.

Data preprocessing

Before labeling can start, it might be necessary to preprocess the data to ensure it is consistent and ready for labeling. Examples include sensor calibration, point cloud subsampling, ground plane estimation and normalization.
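
As one example of such a preprocessing step, here is a minimal sketch of point cloud subsampling with a voxel grid: space is divided into cubes of a fixed size, and all points that fall into the same cube are replaced by their centroid. The function name `voxel_subsample` and the voxel size are assumptions for illustration; dedicated point cloud libraries offer faster implementations of the same idea.

```python
import math


def voxel_subsample(points, voxel_size):
    """Return one centroid per occupied voxel of edge length voxel_size."""
    buckets = {}
    for p in points:
        # Integer voxel index along each axis; floor handles negative coordinates.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets.setdefault(key, []).append(p)
    return [
        tuple(sum(axis) / len(group) for axis in zip(*group))
        for group in buckets.values()
    ]


cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.2, 0.0, 0.0)]
subsampled = voxel_subsample(cloud, voxel_size=1.0)
print(len(cloud), "->", len(subsampled))  # 3 -> 2
```

A larger voxel size removes more points, so it is a trade-off between a lighter point cloud and the detail labelers need to see small objects.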

Decisions you need to make before labeling

Before kicking off the labeling, it’s important to make some key decisions that will guide the labeling process:

  • Type of labeling: Determine which types of labels you need, e.g. cuboid or segmentation labels. This choice depends on the eventual use case and dictates the approach and tools required.
  • Labeling guidelines: Establish clear and comprehensive labeling guidelines to ensure consistency and accuracy in the labeling process.
  • Export format: Choose a suitable export format, such as one of the available defaults, that is compatible with your machine-learning pipeline.
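
To make these decisions more concrete, the sketch below shows what a cuboid label and a simple JSON export could look like. Everything here is hypothetical: the `CuboidLabel` fields, the `export_labels` helper, and the JSON layout are illustrative assumptions, not the export format of any specific labeling tool.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass
class CuboidLabel:
    track_id: str       # stays the same for one object across frames and sensors
    category: str
    frame_index: int
    center: Tuple[float, float, float]      # cuboid center (assumed meters)
    dimensions: Tuple[float, float, float]  # length, width, height (assumed meters)
    yaw: float                              # heading around the vertical axis, radians


def export_labels(labels):
    """Serialize labels to a JSON string (tuples become JSON arrays)."""
    return json.dumps([asdict(label) for label in labels], indent=2)


labels = [
    CuboidLabel("car_001", "car", 0, (12.0, 3.5, 0.9), (4.5, 1.8, 1.5), 0.0),
    CuboidLabel("car_001", "car", 1, (12.8, 3.5, 0.9), (4.5, 1.8, 1.5), 0.0),
]
exported = export_labels(labels)
print(exported)
```

Agreeing on fields like these up front, especially the track ID that ties one object together across frames, makes it easier to check that the exported labels fit your training pipeline.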

Multi-sensor data labeling is a multi-faceted process that relies on several types of data. By understanding and collecting these data types, teams can ensure that their datasets are accurately labeled and ready for use in their machine-learning applications.
