1 · Introduction
The dataset consists of 5165 image pairs and corresponding disparity maps, where 4156 image pairs are used for training and 1009 image pairs are used for testing. The images are extracted from the Apollo dataset. Ground truth was acquired by accumulating 3D point clouds from LiDAR and fitting 3D CAD models to individually moving cars (obtained from the 3D Car Instance Understanding dataset). The dataset covers varying traffic conditions with heavy occlusion, which makes it very challenging.
2 · Data Download
• Training data
• Testing data
3 · Data Structure
The structure of the dataset is as follows:
• intrinsic.txt: intrinsic parameters
• fg_mask: foreground mask
• bg_mask: background mask
• Camera5: images captured by camera 5
• Camera6: images captured by camera 6
• disparity: the ground truth disparity
Note that, for better visualization, the stored disparity values are 200 times larger than the ground truth. If you upload results, they should likewise be scaled up by a factor of 200.
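As an illustration, here is a minimal sketch of loading a disparity map and undoing the 200× scale. Only the scale factor comes from the note above; the 16-bit PNG encoding and the use of OpenCV are assumptions about the file format.

```python
import cv2
import numpy as np

def load_disparity(path):
    """Load a ground-truth disparity map and undo the 200x visualization scale."""
    # The stored values are 200x the true disparity (see the note above);
    # 16-bit PNG storage read via OpenCV is an assumption about the format.
    raw = cv2.imread(path, cv2.IMREAD_UNCHANGED).astype(np.float32)
    return raw / 200.0
```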
4 · Evaluation
The evaluation code is released on GitHub here.
5 · Metric formula
For each image, given the predicted disparity d_i and the ground truth d_i* at pixel i, the evaluation metric is the D1 error, i.e. the fraction of pixels inside the mask whose predicted disparity is an outlier with respect to the ground truth, averaged over the test set:

D1(mask) = (1/N) · Σ_{n=1}^{N} |{ i ∈ mask_n : d_i is an outlier w.r.t. d_i* }| / |mask_n|

Here the mask can be either the foreground (fg), the background (bg), or the whole region (the union of fg and bg), and N is the number of test images.
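A sketch of this per-mask evaluation is shown below. The averaging over masks and images follows the description above; the outlier criterion (disparity error larger than 3 px and larger than 5% of the ground truth) is an assumed KITTI-style D1 definition, and all function names are illustrative.

```python
import numpy as np

def d1_error(pred, gt, mask):
    """Fraction of bad pixels inside the boolean `mask` for one image.

    A pixel is counted as bad when its disparity error exceeds both
    3 px and 5% of the ground-truth disparity (assumed KITTI-style D1).
    """
    valid = mask & (gt > 0)            # ignore pixels without ground truth
    err = np.abs(pred[valid] - gt[valid])
    bad = (err > 3.0) & (err / gt[valid] > 0.05)
    return bad.mean() if bad.size else 0.0

def d1_over_dataset(preds, gts, masks):
    """Average the per-image D1 error over the N images of the test set."""
    return float(np.mean([d1_error(p, g, m) for p, g, m in zip(preds, gts, masks)]))
```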
6 · Rules of ranking
The result benchmark has the following form:

Rank | Method | D1_all | D1_fg | D1_bg
xxx  | xx     | xx     | xx    | xx
7 · Format of submission file
{split}/{data_type}/{image_name}
data_type:
• disparity: the estimated disparity
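A minimal sketch of writing one prediction in this layout is given below. Only the {split}/{data_type}/{image_name} pattern and the 200× scale come from this page; the 16-bit PNG encoding, the output root directory, and the function name are assumptions.

```python
import os
import cv2
import numpy as np

def write_submission(disparity, split, image_name, out_root="submission"):
    """Save one estimated disparity map under {split}/disparity/{image_name}."""
    out_dir = os.path.join(out_root, split, "disparity")
    os.makedirs(out_dir, exist_ok=True)
    # Results must be scaled by 200, matching the ground-truth encoding;
    # uint16 PNG storage is an assumption about the expected file format.
    scaled = np.clip(disparity * 200.0, 0, 65535).astype(np.uint16)
    cv2.imwrite(os.path.join(out_dir, image_name), scaled)
```

For example, calling write_submission(pred, "test", "example.png") would produce submission/test/disparity/example.png (the split and image name here are placeholders).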
The released dataset consists of desensitized street-view imagery and is for academic use only.