| # Prepare Datasets for OneFormer | |
| - A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog) for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc.). | |
| - This document explains how to set up the builtin datasets so they can be used by the above APIs. [Training OneFormer with Custom Datasets](https://github.com/SHI-Labs/OneFormer/tree/main/datasets/custom_datasets) goes into more detail on training OneFormer with custom datasets. | |
| - Detectron2 has builtin support for a few datasets. The datasets are assumed to exist in a directory specified by the environment variable `DETECTRON2_DATASETS`. Under this directory, detectron2 will look for datasets in the structure described below, if needed. | |
| ```text | |
| $DETECTRON2_DATASETS/ | |
|   ADEChallengeData2016/ | |
|   cityscapes/ | |
|   coco/ | |
|   mapillary_vistas/ | |
| ``` | |
| - You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`. If left unset, the default is `./datasets` relative to your current working directory. | |
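The lookup rule above can be sketched in a few lines of Python. This is a minimal illustration rather than detectron2's own code: the helper names `dataset_root` and `present_datasets` are hypothetical, and the directory names are the four listed in the structure above.

```python
import os
from pathlib import Path

# Top-level directories listed in the structure above.
EXPECTED_DIRS = ("ADEChallengeData2016", "cityscapes", "coco", "mapillary_vistas")


def dataset_root(environ=None):
    """Return $DETECTRON2_DATASETS if it is set, else ./datasets (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    return Path(environ.get("DETECTRON2_DATASETS", "datasets"))


def present_datasets(root):
    """Map each expected directory name to whether it exists under root."""
    root = Path(root)
    return {name: (root / name).is_dir() for name in EXPECTED_DIRS}


if __name__ == "__main__":
    root = dataset_root()
    for name, found in present_datasets(root).items():
        print(f"{root / name}: {'found' if found else 'missing'}")
```

Running the file prints one line per dataset directory, which can help confirm that the environment variable points where you expect before starting a training job.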
| ## Expected dataset structure for [ADE20K](http://sceneparsing.csail.mit.edu/) | |
| ```text | |
| ADEChallengeData2016/ | |
|   images/ | |
|   annotations/ | |
|   objectInfo150.txt | |
|   # download instance annotation | |
|   annotations_instance/ | |
|   # generated by prepare_ade20k_sem_seg.py | |
|   annotations_detectron2/ | |
|   # below are generated by prepare_ade20k_pan_seg.py | |
|   ade20k_panoptic_{train,val}.json | |
|   ade20k_panoptic_{train,val}/ | |
|   # below are generated by prepare_ade20k_ins_seg.py | |
|   ade20k_instance_{train,val}.json | |
| ``` | |
| - Generate `annotations_detectron2`: | |
| ```bash | |
| python datasets/prepare_ade20k_sem_seg.py | |
| ``` | |
| - Install panopticapi by: | |
| ```bash | |
| pip install git+https://github.com/cocodataset/panopticapi.git | |
| ``` | |
| - Download the instance annotation from <http://sceneparsing.csail.mit.edu/>: | |
| ```bash | |
| wget http://sceneparsing.csail.mit.edu/data/ChallengeData2017/annotations_instance.tar | |
| ``` | |
| - Then run `python datasets/prepare_ade20k_pan_seg.py` to combine the semantic and instance annotations into panoptic annotations. | |
| - Run `python datasets/prepare_ade20k_ins_seg.py` to extract instance annotations in COCO format. | |
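After running the steps above, it can be useful to confirm that every entry of the expected ADE20K structure is in place. The following is a minimal sketch under stated assumptions: `expand_braces` and `missing_entries` are hypothetical helpers (not part of OneFormer or detectron2), and the entry list is copied from the tree above, with `{train,val}` expanded into one path per split.

```python
import itertools
import re
from pathlib import Path

# Entries from the ADE20K tree above; {a,b} means one entry per alternative.
ADE20K_ENTRIES = [
    "images/",
    "annotations/",
    "objectInfo150.txt",
    "annotations_instance/",
    "annotations_detectron2/",
    "ade20k_panoptic_{train,val}.json",
    "ade20k_panoptic_{train,val}/",
    "ade20k_instance_{train,val}.json",
]


def expand_braces(pattern):
    """Expand shell-style {a,b} groups into a list of concrete names."""
    parts = re.split(r"\{([^{}]*)\}", pattern)
    # Odd indices hold the comma-separated alternatives; even indices are literal text.
    choices = [p.split(",") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


def missing_entries(ade_root, entries=ADE20K_ENTRIES):
    """Return the expected files and directories that do not exist under ade_root."""
    ade_root = Path(ade_root)
    missing = []
    for entry in entries:
        for name in expand_braces(entry):
            path = ade_root / name
            # A trailing slash marks a directory; everything else is a file.
            exists = path.is_dir() if name.endswith("/") else path.is_file()
            if not exists:
                missing.append(name)
    return missing
```

An empty list means every entry exists. If only the `ade20k_panoptic_*` or `ade20k_instance_*` names are reported, the corresponding preparation script has probably not been run yet.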
| ## Expected dataset structure for [Cityscapes](https://www.cityscapes-dataset.com/downloads/) | |
| ```text | |
| cityscapes/ | |
|   gtFine/ | |
|     train/ | |
|       aachen/ | |
|         color.png, instanceIds.png, labelIds.png, polygons.json, | |
|         labelTrainIds.png | |
|       ... | |
|     val/ | |
|     test/ | |
|     # below are generated Cityscapes panoptic annotation | |
|     cityscapes_panoptic_train.json | |
|     cityscapes_panoptic_train/ | |
|     cityscapes_panoptic_val.json | |
|     cityscapes_panoptic_val/ | |
|     cityscapes_panoptic_test.json | |
|     cityscapes_panoptic_test/ | |
|   leftImg8bit/ | |
|     train/ | |
|     val/ | |
|     test/ | |
| ``` | |
| - Log in and download the dataset: | |
| ```bash | |
| # replace myusername and mypassword with your Cityscapes account credentials | |
| wget --keep-session-cookies --save-cookies=cookies.txt --post-data 'username=myusername&password=mypassword&submit=Login' https://www.cityscapes-dataset.com/login/ | |
| # gtFine (the URLs are quoted so the shell does not treat "?" as a glob character) | |
| wget --load-cookies cookies.txt --content-disposition 'https://www.cityscapes-dataset.com/file-handling/?packageID=1' | |
| # leftImg8bit | |
| wget --load-cookies cookies.txt --content-disposition 'https://www.cityscapes-dataset.com/file-handling/?packageID=3' | |
| ``` | |
| - Install cityscapes scripts by: | |
| ```bash | |
| pip install git+https://github.com/mcordts/cityscapesScripts.git | |
| ``` | |
| - To create `labelTrainIds.png`, first prepare the above structure, then run cityscapesScripts with: | |
| ```bash | |
| git clone https://github.com/mcordts/cityscapesScripts.git | |
| ``` | |
| ```bash | |
| CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesScripts/cityscapesscripts/preparation/createTrainIdLabelImgs.py | |
| ``` | |
| These files are not needed for instance segmentation. | |
| - To generate the Cityscapes panoptic dataset, run cityscapesScripts with: | |
| ```bash | |
| CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesScripts/cityscapesscripts/preparation/createPanopticImgs.py | |
| ``` | |
| These files are not needed for semantic or instance segmentation. | |
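To check whether the `labelTrainIds.png` step has been run for a split, one option is to look for polygon files that have no train-ID label image beside them. The sketch below is not part of cityscapesScripts. The helper name `missing_train_id_labels` is hypothetical, and it assumes the usual Cityscapes file naming, where the tree above abbreviates `<city>_<seq>_<frame>_gtFine_polygons.json` to `polygons.json`.

```python
from pathlib import Path


def missing_train_id_labels(cityscapes_root, split="train"):
    """List polygon files under gtFine/<split>/ that lack a sibling labelTrainIds.png."""
    split_dir = Path(cityscapes_root) / "gtFine" / split
    if not split_dir.is_dir():
        return []
    missing = []
    # Assumed naming: <city>/<city>_<seq>_<frame>_gtFine_polygons.json
    for polygons in sorted(split_dir.glob("*/*_gtFine_polygons.json")):
        stem = polygons.name[: -len("_polygons.json")]
        if not (polygons.parent / f"{stem}_labelTrainIds.png").is_file():
            missing.append(polygons)
    return missing
```

Calling `missing_train_id_labels("/path/to/cityscapes", split="val")` returns an empty list once every annotated image in that split has its train-ID label file.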
| ## Expected dataset structure for [COCO](https://cocodataset.org/#download) | |
| ```text | |
| coco/ | |
|   annotations/ | |
|     instances_{train,val}2017.json | |
|     panoptic_{train,val}2017.json | |
|     caption_{train,val}2017.json | |
|     # evaluate on instance labels derived from panoptic annotations | |
|     panoptic2instances_val2017.json | |
|   {train,val}2017/ | |
|     # image files that are mentioned in the corresponding json | |
|   panoptic_{train,val}2017/ # png annotations | |
|   panoptic_semseg_{train,val}2017/ # generated by the script mentioned below | |
| ``` | |
| - Install panopticapi by: | |
| ```bash | |
| pip install git+https://github.com/cocodataset/panopticapi.git | |
| ``` | |
| - Then run `python datasets/prepare_coco_semantic_annos_from_panoptic_annos.py` to extract semantic annotations from the panoptic annotations (only used for evaluation). | |
| - Then run the following command to convert the panoptic JSON into the instance JSON format (used for evaluation on the instance segmentation task): | |
| ```bash | |
| python datasets/panoptic2detection_coco_format.py --things_only | |
| ``` | |
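To illustrate what a things-only conversion means, the sketch below keeps only the segments whose category is marked as a "thing" in a COCO panoptic JSON. It shows the idea only and does not reproduce `panoptic2detection_coco_format.py`. The function name `things_only_segments` is hypothetical, and the example data is a made-up toy record in the COCO panoptic layout.

```python
def things_only_segments(panoptic):
    """Yield (image_id, segment) pairs for segments whose category is a 'thing'.

    `panoptic` is a dict in the COCO panoptic JSON layout: `categories` entries
    carry an `isthing` flag, and each annotation lists its `segments_info`.
    """
    thing_ids = {c["id"] for c in panoptic["categories"] if c["isthing"] == 1}
    for ann in panoptic["annotations"]:
        for seg in ann["segments_info"]:
            if seg["category_id"] in thing_ids:
                yield ann["image_id"], seg


# Toy data for illustration only; the ids and names are not real COCO entries.
example = {
    "categories": [
        {"id": 1, "name": "person", "isthing": 1},
        {"id": 2, "name": "sky", "isthing": 0},
    ],
    "annotations": [
        {
            "image_id": 42,
            "file_name": "000000000042.png",
            "segments_info": [
                {"id": 101, "category_id": 1, "iscrowd": 0},
                {"id": 102, "category_id": 2, "iscrowd": 0},
            ],
        }
    ],
}

kept = list(things_only_segments(example))
print(kept)
```

Here only the `person` segment is kept, because `sky` is a "stuff" category and has no instance-level counterpart.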
| ## Expected dataset structure for [Mapillary Vistas](https://www.mapillary.com/dataset/vistas) | |
| ```text | |
| mapillary_vistas/ | |
|   training/ | |
|     images/ | |
|     instances/ | |
|     labels/ | |
|     panoptic/ | |
|   validation/ | |
|     images/ | |
|     instances/ | |
|     labels/ | |
|     panoptic/ | |
|   mapillary_vistas_instance_{train,val}.json # generated by the script mentioned below | |
| ``` | |
| No preprocessing is needed for semantic and panoptic segmentation on Mapillary Vistas. | |
| We do not evaluate the instance segmentation task on the Mapillary Vistas dataset. | |