The nuScenes dataset for autonomous driving

The nuScenes dataset, released by Motional in 2019, is a public large-scale dataset for autonomous driving that transformed the landscape of self-driving datasets. Inspired by the KITTI dataset but with 7x more object annotations, it offers 1,000 scenes from Boston and Singapore, captured with a diverse array of cameras, radar, and lidar sensors and labeled with a broad spectrum of object classes.

The nuScenes dataset is highly regarded in the autonomous vehicle industry due to its extensive coverage and multi-modal sensor data. Its primary purpose is to advance research and development in autonomous driving and assistive technologies.

Data collection

nuScenes stands out from other datasets due to its diverse sensor suite, which includes a LiDAR sensor, six cameras, and five radars. This combination provides a 360-degree view around the vehicle, offering a wealth of data for sensor fusion and algorithm development.
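To see what this sensor suite looks like in practice, here is a minimal sketch using the official nuscenes-devkit (installable with pip install nuscenes-devkit). The version string and dataroot path are placeholders for a local copy of the mini split; adjust them to your own setup.

```python
from nuscenes.nuscenes import NuScenes

# Placeholder version/path: assumes the v1.0-mini split was extracted locally.
nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)

# Each keyframe ("sample") bundles one synchronized capture per channel:
# 6 cameras, 5 radars, and 1 lidar, i.e. 12 sample_data records.
sample = nusc.sample[0]
for channel, sd_token in sample['data'].items():
    sd = nusc.get('sample_data', sd_token)
    print(f"{channel:18s} {sd['sensor_modality']:6s} {sd['filename']}")
```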

Dive into the nuScenes dataset and discover a treasure trove of autonomous vehicle data. It’s a comprehensive collection of sensor captures, localization data, and high-definition maps, all conveniently available for download.

Sensors

The full dataset includes approximately 1.4M camera images, 390k lidar sweeps, and 1.4M radar sweeps.

Location

The data in nuScenes is collected from dense urban settings: the complex and dynamic streets of Boston and Singapore. The 20-second scenes are manually selected to show a diverse and interesting set of driving maneuvers, traffic situations, and unexpected behaviors.
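To get a feel for these scenes programmatically, the devkit exposes a scene table with annotator-written descriptions. The snippet below (same placeholder setup as the earlier sketch) prints a few scenes with their approximate durations, derived from the first and last keyframe timestamps, which are given in microseconds:

```python
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)

for scene in nusc.scene[:5]:
    first = nusc.get('sample', scene['first_sample_token'])
    last = nusc.get('sample', scene['last_sample_token'])
    duration_s = (last['timestamp'] - first['timestamp']) / 1e6  # microseconds to seconds
    print(f"{scene['name']}: {scene['nbr_samples']} keyframes, "
          f"~{duration_s:.0f}s, {scene['description']}")
```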

Data labeling details

The dataset is meticulously annotated by human labelers across 23 object classes, with over 1.4 million annotated 3D bounding boxes in 40k keyframes. Additionally, object-level attributes such as visibility, activity, and pose are annotated.

All objects in the nuScenes dataset come with a semantic category, as well as a 3D bounding box and attributes for each frame they occur in. Compared to 2D bounding boxes, this allows us to accurately infer an object's position and orientation in space.
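In devkit terms, each of these boxes is a sample_annotation record. The sketch below (same placeholder setup as before) pulls one annotation apart into its category, box geometry, attributes, and visibility:

```python
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)

ann = nusc.get('sample_annotation', nusc.sample[0]['anns'][0])
print(ann['category_name'])  # semantic class, e.g. 'vehicle.car'
print(ann['translation'])    # box center (x, y, z) in meters, global frame
print(ann['size'])           # box size (width, length, height) in meters
print(ann['rotation'])       # orientation quaternion (w, x, y, z)

# Object-level attributes (e.g. 'vehicle.moving') and visibility level:
for attr_token in ann['attribute_tokens']:
    print(nusc.get('attribute', attr_token)['name'])
print(nusc.get('visibility', ann['visibility_token'])['level'])
```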

Panoptic nuScenes is an extension that provides annotations for every single lidar point in the 40,000 keyframes of the nuScenes dataset, amounting to 1.4 billion labeled lidar points. It includes 32 semantic label classes and is consistent with the rest of the nuScenes dataset. Researchers can use it to study novel problems such as lidar point cloud segmentation, foreground extraction, sensor calibration, and mapping using point-level semantics. Compared to the earlier KITTI dataset, Panoptic nuScenes offers more challenging traffic situations, full 360-degree coverage, and licensing options for commercial entities.
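If the lidarseg and panoptic label archives are extracted into the same dataroot, the per-point labels can be loaded as flat arrays, one label per lidar point. Below is a hedged sketch with the same placeholder paths as before, assuming a recent devkit version that exposes the lidarseg and panoptic tables; panoptic labels pack class and instance ids together as category_index * 1000 + instance_id.

```python
import os

import numpy as np
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)
lidar_token = nusc.sample[0]['data']['LIDAR_TOP']

# nuScenes-lidarseg: one uint8 semantic class index per lidar point.
lidarseg = nusc.get('lidarseg', lidar_token)
sem = np.fromfile(os.path.join(nusc.dataroot, lidarseg['filename']), dtype=np.uint8)

# Panoptic nuScenes: labels packing class and instance together,
# label = category_index * 1000 + instance_id.
panoptic = nusc.get('panoptic', lidar_token)
pan = np.load(os.path.join(nusc.dataroot, panoptic['filename']))['data']
semantic, instance = pan // 1000, pan % 1000
print(sem.shape, semantic.shape)  # one entry per point in the keyframe sweep
```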

Get started with nuScenes

  • Multi-object tracking: 850 labeled sequences, 23 categories
  • Semantic + panoptic segmentation (nuScenes-lidarseg): 850 labeled sequences, 32 categories

You can preview and copy a sample of this dataset on Segments.ai:
https://segments.ai/segments/nuscenes-3d-segmentation/samples

Using the nuScenes dataset

The nuScenes dataset is free to use strictly for non-commercial purposes.

License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International; for commercial use, a commercial license can be acquired.
