mirror of
https://github.com/huchenlei/Depth-Anything.git
synced 2026-01-26 15:29:46 +00:00
Depth Anything for Semantic Segmentation
We use our Depth Anything pre-trained ViT-L encoder to fine-tune downstream semantic segmentation models.
Performance
Cityscapes
Note that our results are obtained without Mapillary pre-training.
| Method | Encoder | mIoU (single-scale) | mIoU (multi-scale) |
|---|---|---|---|
| SegFormer | MiT-B5 | 82.4 | 84.0 |
| Mask2Former | Swin-L | 83.3 | 84.3 |
| OneFormer | Swin-L | 83.0 | 84.4 |
| OneFormer | ConvNeXt-XL | 83.6 | 84.6 |
| DDP | ConvNeXt-L | 83.2 | 83.9 |
| Ours | ViT-L | 84.8 | 86.2 |
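For readers new to the metric, mIoU is the per-class intersection-over-union averaged over classes. The following is a minimal pure-Python sketch of that definition, not the evaluation code used to produce the numbers in these tables. The function name `mean_iou` and the ignore label `255` are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over flattened label sequences.

    pred and target are equal-length sequences of integer class ids.
    Pixels whose target equals ignore_index (assumed to be 255 here) are skipped.
    Classes that never occur in pred or target are left out of the average.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel, excluded from the metric
        if p == t:
            # Correct pixel: counts once toward intersection and union of its class.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both the predicted and the true class.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Four pixels, two classes, last pixel unlabeled.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2))
```

For a dataset-level score, the counts would be accumulated over all images before dividing, rather than averaging per-image values.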
ADE20K
| Method | Encoder | mIoU |
|---|---|---|
| SegFormer | MiT-B5 | 51.0 |
| Mask2Former | Swin-L | 56.4 |
| UperNet | BEiT-L | 56.3 |
| ViT-Adapter | BEiT-L | 58.3 |
| OneFormer | Swin-L | 57.4 |
| OneFormer | ConvNeXt-XL | 57.4 |
| Ours | ViT-L | 59.4 |
Pre-trained models
Installation
Please refer to MMSegmentation for instructions. Do not forget to install mmdet to support Mask2Former:
pip install "mmdet>=3.0.0rc4"
After installation:
- move our config/depth_anything to mmseg's config
- move our dinov2.py to mmseg's backbones
- add DINOv2 in mmseg's models/backbones/__init__.py
- download our provided torchhub directory and put it at the root of your working directory
- download the Depth Anything pre-trained model (to initialize the encoder) and put it under the checkpoints folder
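
The steps above are easy to get partly wrong, so a small check can help before launching a job. The sketch below is a hypothetical helper, not part of this repository or of MMSegmentation. It assumes that the MMSegmentation checkout keeps configs under `configs/` and backbones under `mmseg/models/backbones/`, and that `torchhub` and `checkpoints` sit directly in the working directory. Adjust the paths if your layout differs.

```python
from pathlib import Path


def missing_setup_items(mmseg_root, work_dir):
    """Return a list of problems; an empty list means the layout looks complete.

    All paths checked here are assumptions about a typical MMSegmentation checkout.
    """
    mmseg_root, work_dir = Path(mmseg_root), Path(work_dir)
    backbones = mmseg_root / "mmseg" / "models" / "backbones"
    problems = []

    # Config folder copied into mmseg's configs directory.
    if not (mmseg_root / "configs" / "depth_anything").is_dir():
        problems.append("configs/depth_anything not found")

    # Backbone source file placed next to the other backbones.
    if not (backbones / "dinov2.py").is_file():
        problems.append("dinov2.py not found in backbones")

    # DINOv2 mentioned in the backbones package __init__.py (text search only).
    init_file = backbones / "__init__.py"
    if not init_file.is_file() or "DINOv2" not in init_file.read_text():
        problems.append("DINOv2 not registered in backbones/__init__.py")

    # torchhub directory and a non-empty checkpoints folder in the working directory.
    if not (work_dir / "torchhub").is_dir():
        problems.append("torchhub directory not found")
    checkpoints = work_dir / "checkpoints"
    if not checkpoints.is_dir() or not any(checkpoints.iterdir()):
        problems.append("checkpoints folder missing or empty")

    return problems


if __name__ == "__main__":
    # Hypothetical locations: an mmsegmentation checkout beside the current directory.
    for problem in missing_setup_items("mmsegmentation", "."):
        print("setup issue:", problem)
```

The check only inspects file locations and a text match in `__init__.py`. It does not verify that the backbone imports correctly or that the checkpoint matches the config.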
For training or inference with our pre-trained models, please refer to MMSegmentation instructions.