

Richly-annotated 3D Reconstructions of Indoor Scenes

Angela Dai    Angel X. Chang    Manolis Savva    Maciej Halber
Thomas Funkhouser    Matthias Nießner

Stanford University            Princeton University            Technical University of Munich


ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowdsourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval. More information can be found in our paper.

If you use the ScanNet data or code, please cite:

    @inproceedings{dai2017scannet,
        title = {ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes},
        author = {Dai, Angela and Chang, Angel X. and Savva, Manolis and Halber, Maciej and Funkhouser, Thomas and Nie{\ss}ner, Matthias},
        booktitle = {Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
        year = {2017}
    }


The ScanNet data is released under the ScanNet Terms of Use, and the code is released under the MIT license.

Code and Data:

Please visit our main project repository for more information and access to code, data, and trained models:

Semantic voxel labeling of 3D scans in ScanNet using our 3D CNN architecture. Voxel colors indicate predicted or ground truth category.

Last updated: 2017-05-02