# DeepLab: Deep Labelling for Semantic Image Segmentation

DeepLab is a state-of-the-art deep learning model for semantic image segmentation,
where the goal is to assign a semantic label (e.g., person, dog, or cat)
to every pixel in the input image. The current implementation includes the
following features:

1. DeepLabv1 [1]: We use *atrous convolution* to explicitly control the
   resolution at which feature responses are computed within Deep Convolutional
   Neural Networks.

2. DeepLabv2 [2]: We use *atrous spatial pyramid pooling* (ASPP) to robustly
   segment objects at multiple scales with filters at multiple sampling rates
   and effective fields-of-view.

3. DeepLabv3 [3]: We augment the ASPP module with *image-level features* [5, 6]
   to capture longer-range information. We also include *batch normalization*
   [7] parameters to facilitate training. In particular, we apply atrous
   convolution to extract output features at different output strides during
   training and evaluation, which enables efficient training of BN at output
   stride = 16 and attains high performance at output stride = 8 during
   evaluation.

4. DeepLabv3+ [4]: We extend DeepLabv3 with a simple yet effective
   decoder module that refines the segmentation results, especially along object
   boundaries. Furthermore, in this encoder-decoder structure one can
   arbitrarily control the resolution of the extracted encoder features by atrous
   convolution to trade off precision and runtime.

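To make the roles of the atrous rate and the output stride concrete, here is a minimal pure-Python sketch. It is an illustration only, not code from this repository: the helper names `effective_kernel_size`, `atrous_conv1d`, and `feature_map_size` are hypothetical, the convolution is 1-D and unpadded, and the feature-map size assumes "same"-style padding so that the output size is the input size divided by the output stride, rounded up.

```python
def effective_kernel_size(kernel_size, rate):
    """Input span covered by a kernel whose taps sit `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate):
    """Unpadded 1-D atrous convolution (cross-correlation form).

    Hypothetical helper for illustration; rate=1 is an ordinary convolution.
    """
    span = effective_kernel_size(len(kernel), rate)
    return [
        sum(w * signal[start + i * rate] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def feature_map_size(input_size, output_stride):
    """Spatial size of the encoder output, assuming 'same'-style padding."""
    return -(-input_size // output_stride)  # ceiling division


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # [-2, -2, -2, -2, -2]
dilated = atrous_conv1d(signal, kernel, rate=2)  # [-4, -4, -4]

# The same three weights cover a wider input span at rate 2.
print(effective_kernel_size(3, rate=2))          # 5

# A smaller output stride yields a denser feature map.
print(feature_map_size(512, output_stride=16))   # 32
print(feature_map_size(512, output_stride=8))    # 64
```

Raising the rate widens the field of view without adding weights, and halving the output stride from 16 to 8 quadruples the number of spatial positions in a 2-D feature map, which is the extra cost paid for the denser setting at evaluation time.
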
If you find the code useful for your research, please consider citing our latest
works:

* DeepLabv3+:

```
@inproceedings{deeplabv3plus2018,
  title={Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
  author={Liang-Chieh Chen and Yukun Zhu and George Papandreou and Florian Schroff and Hartwig Adam},
  booktitle={ECCV},
  year={2018}
}
```

* MobileNetv2:

```
@inproceedings{mobilenetv22018,
  title={MobileNetV2: Inverted Residuals and Linear Bottlenecks},
  author={Mark Sandler and Andrew Howard and Menglong Zhu and Andrey Zhmoginov and Liang-Chieh Chen},
  booktitle={CVPR},
  year={2018}
}
```

* MobileNetv3:

```
@inproceedings{mobilenetv32019,
  title={Searching for MobileNetV3},
  author={Andrew Howard and Mark Sandler and Grace Chu and Liang-Chieh Chen and Bo Chen and Mingxing Tan and Weijun Wang and Yukun Zhu and Ruoming Pang and Vijay Vasudevan and Quoc V. Le and Hartwig Adam},
  booktitle={ICCV},
  year={2019}
}
```

* Architecture search for dense prediction cell:

```
@inproceedings{dpc2018,
  title={Searching for Efficient Multi-Scale Architectures for Dense Image Prediction},
  author={Liang-Chieh Chen and Maxwell D. Collins and Yukun Zhu and George Papandreou and Barret Zoph and Florian Schroff and Hartwig Adam and Jonathon Shlens},
  booktitle={NIPS},
  year={2018}
}
```

* Auto-DeepLab (also called hnasnet in core/nas_network.py):

```
@inproceedings{autodeeplab2019,
  title={Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation},
  author={Chenxi Liu and Liang-Chieh Chen and Florian Schroff and Hartwig Adam and Wei Hua and Alan Yuille and Li Fei-Fei},
  booktitle={CVPR},
  year={2019}
}
```

The current implementation supports the following network backbones:

1. MobileNetv2 [8] and MobileNetv3 [16]: A fast network structure designed
   for mobile devices.

2. Xception [9, 10]: A powerful network structure intended for server-side
   deployment.

3. ResNet-v1-{50,101} [14]: We provide both the original ResNet-v1 and its
   'beta' variant where the 'stem' is modified for semantic segmentation.

4. PNASNet [15]: A powerful network structure found by neural architecture
   search.

5. Auto-DeepLab (called HNASNet in the code): A segmentation-specific network
   backbone found by neural architecture search.

This directory contains our TensorFlow [11] implementation. We provide code
that allows users to train the model, evaluate results in terms of mIOU (mean
intersection-over-union), and visualize segmentation results. We use the PASCAL
VOC 2012 [12] and Cityscapes [13] semantic segmentation benchmarks as examples
in the code.

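As a reminder of what the mIOU metric measures, the following is a minimal pure-Python sketch rather than the evaluation code shipped in this directory. The function name `mean_iou` is hypothetical, and the `ignore_label=255` default is an assumption for illustration (a common convention for unlabeled pixels). Per class, IoU = TP / (TP + FP + FN), and mIOU averages this over the classes that appear in either the labels or the predictions.

```python
def mean_iou(labels, predictions, num_classes, ignore_label=255):
    """Mean intersection-over-union over flattened per-pixel class ids.

    Hypothetical helper for illustration; `ignore_label` is an assumed
    convention for pixels that should not count toward the metric.
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(labels, predictions):
        if t != ignore_label:
            confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                 # true c, predicted otherwise
        fp = sum(row[c] for row in confusion) - tp  # predicted c, true otherwise
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


labels      = [0, 0, 1, 1, 2, 255]
predictions = [0, 1, 1, 1, 2, 0]

# Per-class IoU: class 0 -> 1/2, class 1 -> 2/3, class 2 -> 1/1.
score = mean_iou(labels, predictions, num_classes=3)
print(round(score, 4))  # 0.7222
```

The last pixel carries the ignore label, so it contributes to neither the intersection nor the union of any class.
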
Some segmentation results on Flickr images:

<p align="center">
  <img src="g3doc/img/vis1.png" width=600></br>
  <img src="g3doc/img/vis2.png" width=600></br>
  <img src="g3doc/img/vis3.png" width=600></br>
</p>

## Contacts (Maintainers)

* Liang-Chieh Chen, github: [aquariusjay](https://github.com/aquariusjay)
* Yukun Zhu, github: [yknzhu](https://github.com/YknZhu)
* George Papandreou, github: [gpapan](https://github.com/gpapan)
* Hui Hui, github: [huihui-personal](https://github.com/huihui-personal)
* Maxwell D. Collins, github: [mcollinswisc](https://github.com/mcollinswisc)
* Ting Liu, github: [tingliu](https://github.com/tingliu)

## Table of Contents

Demo:

* <a href='https://colab.sandbox.google.com/github/tensorflow/models/blob/master/research/deeplab/deeplab_demo.ipynb'>Colab notebook for off-the-shelf inference.</a><br>

Running:

* <a href='g3doc/installation.md'>Installation.</a><br>
* <a href='g3doc/pascal.md'>Running DeepLab on PASCAL VOC 2012 semantic segmentation dataset.</a><br>
* <a href='g3doc/cityscapes.md'>Running DeepLab on Cityscapes semantic segmentation dataset.</a><br>
* <a href='g3doc/ade20k.md'>Running DeepLab on ADE20K semantic segmentation dataset.</a><br>

Models:

* <a href='g3doc/model_zoo.md'>Checkpoints and frozen inference graphs.</a><br>

Misc:

* Please check the <a href='g3doc/faq.md'>FAQ</a> for answers to common questions before reporting an issue.<br>

## Getting Help

To get help with issues you may encounter while using the DeepLab TensorFlow
implementation, create a new question on
[Stack Overflow](https://stackoverflow.com/) with the tag "tensorflow".

Please report bugs (i.e., broken code, not usage questions) to the
tensorflow/models GitHub [issue
tracker](https://github.com/tensorflow/models/issues), prefixing the issue name
with "deeplab".

## License

All code in the deeplab folder is covered by the [LICENSE](https://github.com/tensorflow/models/blob/master/LICENSE)
under tensorflow/models. Please refer to the LICENSE for details.

## Change Logs

### March 26, 2020

* Supported EdgeTPU-DeepLab and EdgeTPU-DeepLab-slim on Cityscapes.
**Contributor**: Yun Long.

### November 20, 2019

* Supported MobileNetV3 large and small model variants on Cityscapes.
**Contributor**: Yukun Zhu.

### March 27, 2019

* Supported using different loss weights on different classes during training.
**Contributor**: Yuwei Yang.

### March 26, 2019

* Supported ResNet-v1-18. **Contributor**: Michalis Raptis.

### March 6, 2019

* Released the evaluation code (under the `evaluation` folder) for image
parsing, a.k.a. panoptic segmentation. In particular, the released code supports
evaluating the parsing results in terms of both the parsing covering and
panoptic quality metrics. **Contributors**: Maxwell Collins and Ting Liu.

### February 6, 2019

* Updated decoder module to exploit multiple low-level features with different
output_strides.

### December 3, 2018

* Released the MobileNet-v2 checkpoint on ADE20K.

### November 19, 2018

* Supported NAS architecture for feature extraction. **Contributor**: Chenxi Liu.
* Supported hard pixel mining during training.

### October 1, 2018

* Released MobileNet-v2 depth-multiplier = 0.5 COCO-pretrained checkpoints on
PASCAL VOC 2012, and Xception-65 COCO-pretrained checkpoint (i.e., no PASCAL
pretraining).

### September 5, 2018

* Released Cityscapes pretrained checkpoints with the best dense prediction cell found.

### May 26, 2018

* Updated ADE20K pretrained checkpoint.

### May 18, 2018

* Added builders for ResNet-v1 and Xception model variants.
* Added ADE20K support, including colormap and pretrained Xception_65 checkpoint.
* Fixed a bug on using non-default depth_multiplier for MobileNet-v2.

### March 22, 2018

* Released checkpoints using MobileNet-V2 as network backbone and pretrained on
PASCAL VOC 2012 and Cityscapes.

### March 5, 2018

* First release of DeepLab in TensorFlow including deeper Xception network
backbone. Included checkpoints that have been pretrained on PASCAL VOC 2012
and Cityscapes.

## References

1. **Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs**<br />
   Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal
   contribution). <br />
   [[link]](https://arxiv.org/abs/1412.7062). In ICLR, 2015.

2. **DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs**<br />
   Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal
   contribution). <br />
   [[link]](http://arxiv.org/abs/1606.00915). TPAMI, 2017.

3. **Rethinking Atrous Convolution for Semantic Image Segmentation**<br />
   Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam.<br />
   [[link]](http://arxiv.org/abs/1706.05587). arXiv:1706.05587, 2017.

4. **Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation**<br />
   Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam.<br />
   [[link]](https://arxiv.org/abs/1802.02611). In ECCV, 2018.

5. **ParseNet: Looking Wider to See Better**<br />
   Wei Liu, Andrew Rabinovich, Alexander C. Berg.<br />
   [[link]](https://arxiv.org/abs/1506.04579). arXiv:1506.04579, 2015.

6. **Pyramid Scene Parsing Network**<br />
   Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia.<br />
   [[link]](https://arxiv.org/abs/1612.01105). In CVPR, 2017.

7. **Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift**<br />
   Sergey Ioffe, Christian Szegedy.<br />
   [[link]](https://arxiv.org/abs/1502.03167). In ICML, 2015.

8. **MobileNetV2: Inverted Residuals and Linear Bottlenecks**<br />
   Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen.<br />
   [[link]](https://arxiv.org/abs/1801.04381). In CVPR, 2018.

9. **Xception: Deep Learning with Depthwise Separable Convolutions**<br />
   François Chollet.<br />
   [[link]](https://arxiv.org/abs/1610.02357). In CVPR, 2017.

10. **Deformable Convolutional Networks -- COCO Detection and Segmentation Challenge 2017 Entry**<br />
    Haozhi Qi, Zheng Zhang, Bin Xiao, Han Hu, Bowen Cheng, Yichen Wei, Jifeng Dai.<br />
    [[link]](http://presentations.cocodataset.org/COCO17-Detect-MSRA.pdf). ICCV COCO Challenge
    Workshop, 2017.

11. **TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems**<br />
    M. Abadi, A. Agarwal, et al.<br />
    [[link]](https://arxiv.org/abs/1603.04467). arXiv:1603.04467, 2016.

12. **The Pascal Visual Object Classes Challenge – A Retrospective**<br />
    Mark Everingham, S. M. Ali Eslami, Luc Van Gool, Christopher K. I. Williams, John
    Winn, Andrew Zisserman.<br />
    [[link]](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/). IJCV, 2014.

13. **The Cityscapes Dataset for Semantic Urban Scene Understanding**<br />
    Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele.<br />
    [[link]](https://www.cityscapes-dataset.com/). In CVPR, 2016.

14. **Deep Residual Learning for Image Recognition**<br />
    Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.<br />
    [[link]](https://arxiv.org/abs/1512.03385). In CVPR, 2016.

15. **Progressive Neural Architecture Search**<br />
    Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy.<br />
    [[link]](https://arxiv.org/abs/1712.00559). In ECCV, 2018.

16. **Searching for MobileNetV3**<br />
    Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam.<br />
    [[link]](https://arxiv.org/abs/1905.02244). In ICCV, 2019.