## Generation of crops from the real datasets

The instructions below describe how to generate the crops used for pre-training CroCo v2 from the following real-world datasets: ARKitScenes, MegaDepth, 3DStreetView and IndoorVL.

### Download the metadata of the crops to generate

First, download the crop metadata and put it in `./data/`:
```
mkdir -p data
cd data/
wget https://download.europe.naverlabs.com/ComputerVision/CroCo/data/crop_metadata.zip
unzip crop_metadata.zip
rm crop_metadata.zip
cd ..
```
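Before moving on, it can help to confirm that the metadata was unpacked where the extraction step expects it. The sketch below assumes the archive unpacks to `./data/crop_metadata/<dataset>/crops_release.txt`, which is the path used by the extraction command later in this document. The function name `check_crop_metadata` is a hypothetical helper, not part of the CroCo repository.

```shell
# Hypothetical helper (not part of the CroCo repository): verify that a
# non-empty crops_release.txt exists for each of the four datasets.
# Usage: check_crop_metadata ./data/crop_metadata
check_crop_metadata() {
    metadata_root="$1"
    missing=0
    for dataset in ARKitScenes MegaDepth 3DStreetView IndoorVL; do
        crops_file="${metadata_root}/${dataset}/crops_release.txt"
        if [ -s "${crops_file}" ]; then
            echo "ok:      ${crops_file}"
        else
            echo "missing: ${crops_file}" >&2
            missing=1
        fi
    done
    # Non-zero exit status if at least one metadata file is absent or empty.
    return "${missing}"
}
```

Calling `check_crop_metadata ./data/crop_metadata` from the repository root returns a non-zero status if any of the four files is missing or empty.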
### Prepare the original datasets

Second, download the original datasets into `./data/original_datasets/`.

```
mkdir -p data/original_datasets
```
##### ARKitScenes

Download the `raw` dataset from https://github.com/apple/ARKitScenes/blob/main/DATA.md and put it in `./data/original_datasets/ARKitScenes/`.
The resulting file structure should look like:

```
./data/original_datasets/ARKitScenes/
└───Training
    ├───40753679
    │   │   ultrawide
    │   │   ...
    ├───40753686
    │
    ...
```
##### MegaDepth

Download `MegaDepth v1 Dataset` from https://www.cs.cornell.edu/projects/megadepth/ and put it in `./data/original_datasets/MegaDepth/`.
The resulting file structure should look like:

```
./data/original_datasets/MegaDepth/
├───0000
│   ├───images
│   │   │   1000557903_87fa96b8a4_o.jpg
│   │   │   ...
│   └─── ...
├───0001
│   │
│   │   ...
└─── ...
```
##### 3DStreetView

Download the `3D_Street_View` dataset from https://github.com/amir32002/3D_Street_View and put it in `./data/original_datasets/3DStreetView/`.
The resulting file structure should look like:

```
./data/original_datasets/3DStreetView/
├───dataset_aligned
│   ├───0002
│   │   │   0000002_0000001_0000002_0000001.jpg
│   │   │   ...
│   └─── ...
└───dataset_unaligned
    ├───0003
    │   │   0000003_0000001_0000002_0000001.jpg
    │   │   ...
    └─── ...
```
##### IndoorVL

Download the `IndoorVL` datasets using [Kapture](https://github.com/naver/kapture).

```
pip install kapture
mkdir -p ./data/original_datasets/IndoorVL
cd ./data/original_datasets/IndoorVL
kapture_download_dataset.py update
kapture_download_dataset.py install "HyundaiDepartmentStore_*"
kapture_download_dataset.py install "GangnamStation_*"
cd -
```
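Once the four datasets are downloaded, a quick check of the top-level layout can catch a misplaced folder before the extraction step, which may run for a long time. The sketch below only tests for the directories named in the file structures above. For IndoorVL it checks only that the folder itself exists, because the layout produced by Kapture is not described here. `check_original_datasets` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of the CroCo repository): check that the
# top-level directories described above exist under the datasets root.
# Usage: check_original_datasets ./data/original_datasets
check_original_datasets() {
    datasets_root="$1"
    missing=0
    for subdir in \
        ARKitScenes/Training \
        MegaDepth \
        3DStreetView/dataset_aligned \
        3DStreetView/dataset_unaligned \
        IndoorVL
    do
        if [ -d "${datasets_root}/${subdir}" ]; then
            echo "found:   ${datasets_root}/${subdir}"
        else
            echo "missing: ${datasets_root}/${subdir}" >&2
            missing=1
        fi
    done
    # Non-zero exit status if at least one expected directory is absent.
    return "${missing}"
}
```

This confirms only that the directories exist, not that their contents are complete.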
### Extract the crops

Now, extract the crops for each dataset:

```
for dataset in ARKitScenes MegaDepth 3DStreetView IndoorVL;
do
  python3 datasets/crops/extract_crops_from_images.py \
    --crops ./data/crop_metadata/${dataset}/crops_release.txt \
    --root-dir ./data/original_datasets/${dataset}/ \
    --output-dir ./data/${dataset}_crops/ \
    --imsize 256 --nthread 8 --max-subdir-levels 5 --ideal-number-pairs-in-dir 500;
done
```
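After the loop finishes, counting the files written to each output directory gives a rough indication that extraction ran for every dataset. The sketch below assumes only the `./data/<dataset>_crops/` output paths from the command above. It counts all regular files because the names and nesting of the extracted crops are not described here. `count_crop_files` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of the CroCo repository): report how many
# regular files each <dataset>_crops output directory contains.
# Usage: count_crop_files ./data
count_crop_files() {
    crops_root="$1"
    for dataset in ARKitScenes MegaDepth 3DStreetView IndoorVL; do
        crops_dir="${crops_root}/${dataset}_crops"
        if [ -d "${crops_dir}" ]; then
            # Count files at any depth; tr strips padding some wc versions add.
            n_files=$(find "${crops_dir}" -type f | wc -l | tr -d ' ')
            echo "${dataset}: ${n_files} files"
        else
            echo "${dataset}: no output directory at ${crops_dir}"
        fi
    done
}
```

A count of zero, or a missing directory, suggests that the dataset's `--root-dir` or metadata path should be rechecked.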
##### Note for IndoorVL

For legal reasons, we can only release 144,228 of the 1,593,689 IndoorVL pairs used in the paper.
To compensate for the smaller dataset in terms of the number of pre-training iterations, the pre-training command in this repository uses 125 training epochs, including 12 warm-up epochs, and a learning rate cosine schedule of 250, instead of 100, 10 and 200 respectively.
The impact on performance is negligible.
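To put these numbers in perspective, the short calculation below works out how many IndoorVL pairs are not released and by what factor the schedule is stretched. It uses only the figures quoted in this note. It does not derive the value of 125 epochs, which would require the total number of pairs across all four datasets, a figure not given here.

```shell
# Figures quoted in the note above.
released_pairs=144228
paper_pairs=1593689

# Pairs used in the paper that are not part of the release.
missing_pairs=$((paper_pairs - released_pairs))

# Share of the paper's IndoorVL pairs that is released, in percent.
released_pct=$(awk -v r="${released_pairs}" -v p="${paper_pairs}" \
    'BEGIN { printf "%.2f", 100 * r / p }')

# Ratio between the released schedule (125 epochs) and the paper's (100).
epoch_ratio=$(awk 'BEGIN { printf "%.2f", 125 / 100 }')

echo "IndoorVL pairs not released: ${missing_pairs}"
echo "IndoorVL pairs released:     ${released_pct}% of those used in the paper"
echo "Epoch scaling:               ${epoch_ratio}x (125 vs. 100 epochs)"
```

The cosine schedule is scaled by the same factor (250 vs. 200), while the warm-up grows slightly less (12 vs. 10 epochs).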