Initial commit

This commit is contained in:
Lihe Yang
2024-01-22 09:14:27 +08:00
committed by GitHub
parent a8a78aedd8
commit 2495a75427
76 changed files with 58425 additions and 2 deletions

README.md

@@ -1,2 +1,148 @@
# Depth-Anything
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
<div align="center">
<h2>Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data</h2>
[**Lihe Yang**](https://liheyoung.github.io/)<sup>1</sup> · [**Bingyi Kang**](https://scholar.google.com/citations?user=NmHgX-wAAAAJ)<sup>2+</sup> · [**Zilong Huang**](http://speedinghzl.github.io/)<sup>2</sup> · [**Xiaogang Xu**](https://xiaogang00.github.io/)<sup>3,4</sup> · [**Jiashi Feng**](https://sites.google.com/site/jshfeng/)<sup>2</sup> · [**Hengshuang Zhao**](https://hszhao.github.io/)<sup>1+</sup>
<sup>1</sup>The University of Hong Kong · <sup>2</sup>TikTok · <sup>3</sup>Zhejiang Lab · <sup>4</sup>Zhejiang University
<sup>+</sup>corresponding authors
<a href=""><img src='https://img.shields.io/badge/arXiv-Depth Anything-red' alt='Paper PDF'></a>
<a href='https://depth-anything.github.io'><img src='https://img.shields.io/badge/Project_Page-Depth Anything-green' alt='Project Page'></a>
<a href='https://huggingface.co/spaces/LiheYoung/Depth-Anything'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a>
</div>
This work presents Depth Anything, a highly practical solution for robust monocular depth estimation by training on a combination of 1.5M labeled images and **62M+ unlabeled images**.
![teaser](assets/teaser.png)
## News
* **2024-01-22:** Paper, project page, code, models, and demo are released.
## Features of Depth Anything
- **Relative depth estimation**:
Our foundation models listed [here](https://huggingface.co/spaces/LiheYoung/Depth-Anything/tree/main/checkpoints) can provide relative depth estimation for any given image robustly. Please refer [here](#running) for details.
- **Metric depth estimation**
We fine-tune our Depth Anything model with metric depth information from NYUv2 or KITTI. It offers strong capabilities of both in-domain and zero-shot metric depth estimation. Please refer [here](./metric_depth) for details.
- **Better depth-conditioned ControlNet**
We re-train **a better depth-conditioned ControlNet** based on Depth Anything. It offers more precise synthesis than the previous MiDaS-based ControlNet. Please refer [here](./controlnet/) for details.
- **Downstream high-level scene understanding**
The Depth Anything encoder can be fine-tuned to downstream high-level perception tasks, *e.g.*, semantic segmentation, 86.2 mIoU on Cityscapes and 59.4 mIoU on ADE20K. Please refer [here](./semseg/) for details.
## Performance
Here we compare our Depth Anything with the previously best MiDaS v3.1 BEiT<sub>L-512</sub> model.
Note that the latest MiDaS is also trained on KITTI and NYUv2, whereas our models are not.
| Method | Params | KITTI || NYUv2 || Sintel || DDAD || ETH3D || DIODE ||
|-|-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| | | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ |
| MiDaS | 345.0M | 0.127 | 0.850 | 0.048 | *0.980* | 0.587 | 0.699 | 0.251 | 0.766 | 0.139 | 0.867 | 0.075 | 0.942 |
| **Ours-S** | 24.8M | 0.080 | 0.936 | 0.053 | 0.972 | 0.464 | 0.739 | 0.247 | 0.768 | 0.127 | **0.885** | 0.076 | 0.939 |
| **Ours-B** | 97.5M | *0.080* | *0.939* | *0.046* | 0.979 | **0.432** | *0.756* | *0.232* | *0.786* | **0.126** | *0.884* | *0.069* | *0.946* |
| **Ours-L** | 335.3M | **0.076** | **0.947** | **0.043** | **0.981** | *0.458* | **0.760** | **0.230** | **0.789** | *0.127* | 0.882 | **0.066** | **0.952** |
We highlight the **best** and *second best* results in **bold** and *italic* respectively (**better results**: AbsRel $\downarrow$ , $\delta_1 \uparrow$).
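For clarity, the two metrics used throughout the table can be computed as follows (a minimal NumPy sketch; the function names are ours, not from this repo):

```python
import numpy as np

def abs_rel(pred, gt):
    """AbsRel: mean absolute relative error (lower is better)."""
    mask = gt > 0  # ignore invalid ground-truth pixels
    return float(np.mean(np.abs(pred[mask] - gt[mask]) / gt[mask]))

def delta1(pred, gt, thresh=1.25):
    """delta_1: fraction of pixels with max(pred/gt, gt/pred) < 1.25 (higher is better)."""
    mask = gt > 0
    ratio = np.maximum(pred[mask] / gt[mask], gt[mask] / pred[mask])
    return float(np.mean(ratio < thresh))
```

Note that for relative depth, predictions are typically aligned to the ground truth in scale and shift before these metrics are computed.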
## Pre-trained models
We provide three models of varying scales for robust relative depth estimation:
- Depth-Anything-ViT-Small (24.8M)
- Depth-Anything-ViT-Base (97.5M)
- Depth-Anything-ViT-Large (335.3M)
Download our pre-trained models [here](https://huggingface.co/spaces/LiheYoung/Depth-Anything/tree/main/checkpoints), and put them under the ``checkpoints`` directory.
## Usage
### Installation
The setup is simple. Just ensure ``torch``, ``torchvision``, and ``cv2`` are available in your environment.
```bash
git clone https://github.com/LiheYoung/Depth-Anything
cd Depth-Anything
pip install -r requirements.txt
```
### Running
```bash
python run.py --encoder <vits | vitb | vitl> --load-from <pretrained-model> --img-path <img-directory | single-img | txt-file> --outdir <outdir> --localhub
```
For ``--img-path``, you can 1) point it to a directory containing all images of interest, 2) point it to a single image, or 3) point it to a text file listing image paths.
For example:
```bash
python run.py --encoder vitl --load-from checkpoints/depth_anything_vitl14.pth --img-path demo_images --outdir depth_visualization --localhub
```
### Import Depth Anything to your project
If you want to use Depth Anything in your own project, you can simply follow [``run.py``](run.py) to load our models and define the data pre-processing.
<details>
<summary>Code snippet (note the difference between our data pre-processing and that of MiDaS)</summary>
```python
from depth_anything.dpt import DPT_DINOv2
from depth_anything.util.transform import Resize, NormalizeImage, PrepareForNet
from torchvision.transforms import Compose
import cv2
import torch
depth_anything = DPT_DINOv2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024], localhub=True)
depth_anything.load_state_dict(torch.load('checkpoints/depth_anything_vitl14.pth'))
transform = Compose([
Resize(
width=518,
height=518,
resize_target=False,
keep_aspect_ratio=True,
ensure_multiple_of=14,
resize_method='lower_bound',
image_interpolation_method=cv2.INTER_CUBIC,
),
NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
PrepareForNet(),
])
image = cv2.cvtColor(cv2.imread('your image path'), cv2.COLOR_BGR2RGB) / 255.0
image = transform({'image': image})['image']
image = torch.from_numpy(image).unsqueeze(0)
# depth shape: 1xHxW
depth = depth_anything(image)
```
</details>
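Since the model outputs relative depth (arbitrary scale), a common way to visualize the result is min-max normalization to an 8-bit image (our sketch, not part of the official API):

```python
import numpy as np

def depth_to_uint8(depth):
    """Min-max normalize a relative depth map to [0, 255] for visualization."""
    d_min, d_max = depth.min(), depth.max()
    norm = (depth - d_min) / max(d_max - d_min, 1e-8)  # guard against flat maps
    return (norm * 255.0).astype(np.uint8)
```

The result can then be saved with ``cv2.imwrite`` or colorized with a colormap of your choice.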
## Citation
If you find this project useful, please consider citing:
```bibtex
@article{depthanything,
title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
journal={arXiv:},
year={2024},
}
```

BIN
assets/controlnet_demo1.png Normal file



BIN
assets/controlnet_demo2.png Normal file


BIN
assets/paper.pdf Normal file


BIN
assets/teaser.png Normal file



controlnet/README.md

@@ -0,0 +1,40 @@
## Depth-Conditioned ControlNet based on Depth Anything
We use [Diffusers](https://github.com/huggingface/diffusers/tree/main) to re-train a better depth-conditioned ControlNet based on our Depth Anything.
Please download our [config file](./config.json) and [pre-trained weights](https://huggingface.co/spaces/LiheYoung/Depth-Anything/tree/main/checkpoints_controlnet), then follow the [instructions](https://github.com/huggingface/diffusers/tree/main/examples/controlnet) in Diffusers for inference.
## Depth-to-Image Synthesis
![demo2](../assets/controlnet_demo1.png)
![demo1](../assets/controlnet_demo2.png)
## Video Editing
The demos below are generated by [MagicEdit](https://github.com/magic-research/magic-edit). The middle video is generated by MiDaS-based ControlNet, while the last video is generated by Depth Anything-based ControlNet.
<div style="display: flex; justify-content: space-around;">
<video width="30%" controls autoplay muted loop>
<source src="../assets/video_edit/demo1_video.mp4" type="video/mp4">
</video>
<video width="30%" controls autoplay muted loop>
<source src="../assets/video_edit/demo1_midas.mp4" type="video/mp4">
</video>
<video width="30%" controls autoplay muted loop>
<source src="../assets/video_edit/demo1_ours.mp4" type="video/mp4">
</video>
</div><br>
<div style="display: flex; justify-content: space-around;">
<video width="30%" controls autoplay muted loop>
<source src="../assets/video_edit/demo2_video.mp4" type="video/mp4">
</video>
<video width="30%" controls autoplay muted loop>
<source src="../assets/video_edit/demo2_midas.mp4" type="video/mp4">
</video>
<video width="30%" controls autoplay muted loop>
<source src="../assets/video_edit/demo2_ours.mp4" type="video/mp4">
</video>
</div>

controlnet/config.json

@@ -0,0 +1,51 @@
{
"_class_name": "ControlNetModel",
"_diffusers_version": "0.26.0.dev0",
"act_fn": "silu",
"addition_embed_type": null,
"addition_embed_type_num_heads": 64,
"addition_time_embed_dim": null,
"attention_head_dim": 8,
"block_out_channels": [
320,
640,
1280,
1280
],
"class_embed_type": null,
"conditioning_channels": 3,
"conditioning_embedding_out_channels": [
16,
32,
96,
256
],
"controlnet_conditioning_channel_order": "rgb",
"cross_attention_dim": 768,
"down_block_types": [
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"CrossAttnDownBlock2D",
"DownBlock2D"
],
"downsample_padding": 1,
"encoder_hid_dim": null,
"encoder_hid_dim_type": null,
"flip_sin_to_cos": true,
"freq_shift": 0,
"global_pool_conditions": false,
"in_channels": 4,
"layers_per_block": 2,
"mid_block_scale_factor": 1,
"mid_block_type": "UNetMidBlock2DCrossAttn",
"norm_eps": 1e-05,
"norm_num_groups": 32,
"num_attention_heads": null,
"num_class_embeds": null,
"only_cross_attention": false,
"projection_class_embeddings_input_dim": null,
"resnet_time_scale_shift": "default",
"transformer_layers_per_block": 1,
"upcast_attention": false,
"use_linear_projection": false
}

depth_anything/blocks.py

@@ -0,0 +1,153 @@
import torch.nn as nn
def _make_scratch(in_shape, out_shape, groups=1, expand=False):
scratch = nn.Module()
out_shape1 = out_shape
out_shape2 = out_shape
out_shape3 = out_shape
if len(in_shape) >= 4:
out_shape4 = out_shape
if expand:
out_shape1 = out_shape
out_shape2 = out_shape*2
out_shape3 = out_shape*4
if len(in_shape) >= 4:
out_shape4 = out_shape*8
scratch.layer1_rn = nn.Conv2d(
in_shape[0], out_shape1, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
scratch.layer2_rn = nn.Conv2d(
in_shape[1], out_shape2, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
scratch.layer3_rn = nn.Conv2d(
in_shape[2], out_shape3, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
if len(in_shape) >= 4:
scratch.layer4_rn = nn.Conv2d(
in_shape[3], out_shape4, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
return scratch
class ResidualConvUnit(nn.Module):
"""Residual convolution module.
"""
def __init__(self, features, activation, bn):
"""Init.
Args:
features (int): number of features
"""
super().__init__()
self.bn = bn
self.groups=1
self.conv1 = nn.Conv2d(
features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups
)
self.conv2 = nn.Conv2d(
features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups
)
if self.bn==True:
self.bn1 = nn.BatchNorm2d(features)
self.bn2 = nn.BatchNorm2d(features)
self.activation = activation
self.skip_add = nn.quantized.FloatFunctional()
def forward(self, x):
"""Forward pass.
Args:
x (tensor): input
Returns:
tensor: output
"""
out = self.activation(x)
out = self.conv1(out)
if self.bn==True:
out = self.bn1(out)
out = self.activation(out)
out = self.conv2(out)
if self.bn==True:
out = self.bn2(out)
if self.groups > 1:
out = self.conv_merge(out)
return self.skip_add.add(out, x)
class FeatureFusionBlock(nn.Module):
"""Feature fusion block.
"""
def __init__(self, features, activation, deconv=False, bn=False, expand=False, align_corners=True, size=None):
"""Init.
Args:
features (int): number of features
"""
super(FeatureFusionBlock, self).__init__()
self.deconv = deconv
self.align_corners = align_corners
self.groups=1
self.expand = expand
out_features = features
if self.expand==True:
out_features = features//2
self.out_conv = nn.Conv2d(features, out_features, kernel_size=1, stride=1, padding=0, bias=True, groups=1)
self.resConfUnit1 = ResidualConvUnit(features, activation, bn)
self.resConfUnit2 = ResidualConvUnit(features, activation, bn)
self.skip_add = nn.quantized.FloatFunctional()
self.size=size
def forward(self, *xs, size=None):
"""Forward pass.
Returns:
tensor: output
"""
output = xs[0]
if len(xs) == 2:
res = self.resConfUnit1(xs[1])
output = self.skip_add.add(output, res)
output = self.resConfUnit2(output)
if (size is None) and (self.size is None):
modifier = {"scale_factor": 2}
elif size is None:
modifier = {"size": self.size}
else:
modifier = {"size": size}
output = nn.functional.interpolate(
output, **modifier, mode="bilinear", align_corners=self.align_corners
)
output = self.out_conv(output)
return output

depth_anything/dpt.py

@@ -0,0 +1,170 @@
import torch
import torch.nn as nn
from .blocks import FeatureFusionBlock, _make_scratch
import torch.nn.functional as F
def _make_fusion_block(features, use_bn, size = None):
return FeatureFusionBlock(
features,
nn.ReLU(False),
deconv=False,
bn=use_bn,
expand=False,
align_corners=True,
size=size,
)
class DPTHead(nn.Module):
def __init__(self, nclass, in_channels, features=256, use_bn=False, out_channels=[256, 512, 1024, 1024], use_clstoken=False):
super(DPTHead, self).__init__()
self.nclass = nclass
self.use_clstoken = use_clstoken
self.projects = nn.ModuleList([
nn.Conv2d(
in_channels=in_channels,
out_channels=out_channel,
kernel_size=1,
stride=1,
padding=0,
) for out_channel in out_channels
])
self.resize_layers = nn.ModuleList([
nn.ConvTranspose2d(
in_channels=out_channels[0],
out_channels=out_channels[0],
kernel_size=4,
stride=4,
padding=0),
nn.ConvTranspose2d(
in_channels=out_channels[1],
out_channels=out_channels[1],
kernel_size=2,
stride=2,
padding=0),
nn.Identity(),
nn.Conv2d(
in_channels=out_channels[3],
out_channels=out_channels[3],
kernel_size=3,
stride=2,
padding=1)
])
if use_clstoken:
self.readout_projects = nn.ModuleList()
for _ in range(len(self.projects)):
self.readout_projects.append(
nn.Sequential(
nn.Linear(2 * in_channels, in_channels),
nn.GELU()))
self.scratch = _make_scratch(
out_channels,
features,
groups=1,
expand=False,
)
self.scratch.stem_transpose = None
self.scratch.refinenet1 = _make_fusion_block(features, use_bn)
self.scratch.refinenet2 = _make_fusion_block(features, use_bn)
self.scratch.refinenet3 = _make_fusion_block(features, use_bn)
self.scratch.refinenet4 = _make_fusion_block(features, use_bn)
head_features_1 = features
head_features_2 = 32
if nclass > 1:
self.scratch.output_conv = nn.Sequential(
nn.Conv2d(head_features_1, head_features_1, kernel_size=3, stride=1, padding=1),
nn.ReLU(True),
nn.Conv2d(head_features_1, nclass, kernel_size=1, stride=1, padding=0),
)
else:
self.scratch.output_conv1 = nn.Conv2d(head_features_1, head_features_1 // 2, kernel_size=3, stride=1, padding=1)
self.scratch.output_conv2 = nn.Sequential(
nn.Conv2d(head_features_1 // 2, head_features_2, kernel_size=3, stride=1, padding=1),
nn.ReLU(True),
nn.Conv2d(head_features_2, 1, kernel_size=1, stride=1, padding=0),
nn.ReLU(True),
nn.Identity(),
)
def forward(self, out_features, patch_h, patch_w):
out = []
for i, x in enumerate(out_features):
if self.use_clstoken:
x, cls_token = x[0], x[1]
readout = cls_token.unsqueeze(1).expand_as(x)
x = self.readout_projects[i](torch.cat((x, readout), -1))
else:
x = x[0]
x = x.permute(0, 2, 1).reshape((x.shape[0], x.shape[-1], patch_h, patch_w))
x = self.projects[i](x)
x = self.resize_layers[i](x)
out.append(x)
layer_1, layer_2, layer_3, layer_4 = out
layer_1_rn = self.scratch.layer1_rn(layer_1)
layer_2_rn = self.scratch.layer2_rn(layer_2)
layer_3_rn = self.scratch.layer3_rn(layer_3)
layer_4_rn = self.scratch.layer4_rn(layer_4)
path_4 = self.scratch.refinenet4(layer_4_rn, size=layer_3_rn.shape[2:])
path_3 = self.scratch.refinenet3(path_4, layer_3_rn, size=layer_2_rn.shape[2:])
path_2 = self.scratch.refinenet2(path_3, layer_2_rn, size=layer_1_rn.shape[2:])
path_1 = self.scratch.refinenet1(path_2, layer_1_rn)
out = self.scratch.output_conv1(path_1)
out = F.interpolate(out, (int(patch_h * 14), int(patch_w * 14)), mode="bilinear", align_corners=True)
out = self.scratch.output_conv2(out)
return out
class DPT_DINOv2(nn.Module):
def __init__(self, encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024], use_bn=False, use_clstoken=False, localhub=True):
super(DPT_DINOv2, self).__init__()
assert encoder in ['vits', 'vitb', 'vitl']
# in case the Internet connection is not stable, please load the DINOv2 locally
if localhub:
self.pretrained = torch.hub.load('torchhub/facebookresearch_dinov2_main', 'dinov2_{:}14'.format(encoder), source='local', pretrained=False)
else:
self.pretrained = torch.hub.load('facebookresearch/dinov2', 'dinov2_{:}14'.format(encoder))
dim = self.pretrained.blocks[0].attn.qkv.in_features
self.depth_head = DPTHead(1, dim, features, use_bn, out_channels=out_channels, use_clstoken=use_clstoken)
def forward(self, x):
h, w = x.shape[-2:]
features = self.pretrained.get_intermediate_layers(x, 4, return_class_token=True)
patch_h, patch_w = h // 14, w // 14
depth = self.depth_head(features, patch_h, patch_w)
depth = F.interpolate(depth, size=(h, w), mode="bilinear", align_corners=True)
depth = F.relu(depth)
return depth.squeeze(1)
if __name__ == '__main__':
depth_anything = DPT_DINOv2()
depth_anything.load_state_dict(torch.load('checkpoints/depth_anything_dinov2_vitl14.pth'))


@@ -0,0 +1,248 @@
import random
from PIL import Image, ImageOps, ImageFilter
import torch
from torchvision import transforms
import torch.nn.functional as F
import numpy as np
import cv2
import math
def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA):
"""Resize the sample to ensure the given size. Keeps aspect ratio.
Args:
sample (dict): sample
size (tuple): image size
Returns:
tuple: new size
"""
shape = list(sample["disparity"].shape)
if shape[0] >= size[0] and shape[1] >= size[1]:
return sample
scale = [0, 0]
scale[0] = size[0] / shape[0]
scale[1] = size[1] / shape[1]
scale = max(scale)
shape[0] = math.ceil(scale * shape[0])
shape[1] = math.ceil(scale * shape[1])
# resize
sample["image"] = cv2.resize(
sample["image"], tuple(shape[::-1]), interpolation=image_interpolation_method
)
sample["disparity"] = cv2.resize(
sample["disparity"], tuple(shape[::-1]), interpolation=cv2.INTER_NEAREST
)
sample["mask"] = cv2.resize(
sample["mask"].astype(np.float32),
tuple(shape[::-1]),
interpolation=cv2.INTER_NEAREST,
)
sample["mask"] = sample["mask"].astype(bool)
return tuple(shape)
class Resize(object):
"""Resize sample to given size (width, height).
"""
def __init__(
self,
width,
height,
resize_target=True,
keep_aspect_ratio=False,
ensure_multiple_of=1,
resize_method="lower_bound",
image_interpolation_method=cv2.INTER_AREA,
):
"""Init.
Args:
width (int): desired output width
height (int): desired output height
resize_target (bool, optional):
True: Resize the full sample (image, mask, target).
False: Resize image only.
Defaults to True.
keep_aspect_ratio (bool, optional):
True: Keep the aspect ratio of the input sample.
Output sample might not have the given width and height, and
resize behaviour depends on the parameter 'resize_method'.
Defaults to False.
ensure_multiple_of (int, optional):
Output width and height is constrained to be multiple of this parameter.
Defaults to 1.
resize_method (str, optional):
"lower_bound": Output will be at least as large as the given size.
"upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
"minimal": Scale as little as possible. (Output size might be smaller than given size.)
Defaults to "lower_bound".
"""
self.__width = width
self.__height = height
self.__resize_target = resize_target
self.__keep_aspect_ratio = keep_aspect_ratio
self.__multiple_of = ensure_multiple_of
self.__resize_method = resize_method
self.__image_interpolation_method = image_interpolation_method
def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
if max_val is not None and y > max_val:
y = (np.floor(x / self.__multiple_of) * self.__multiple_of).astype(int)
if y < min_val:
y = (np.ceil(x / self.__multiple_of) * self.__multiple_of).astype(int)
return y
def get_size(self, width, height):
# determine new height and width
scale_height = self.__height / height
scale_width = self.__width / width
if self.__keep_aspect_ratio:
if self.__resize_method == "lower_bound":
# scale such that output size is lower bound
if scale_width > scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "upper_bound":
# scale such that output size is upper bound
if scale_width < scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "minimal":
# scale as little as possible
if abs(1 - scale_width) < abs(1 - scale_height):
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
else:
raise ValueError(
f"resize_method {self.__resize_method} not implemented"
)
if self.__resize_method == "lower_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, min_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, min_val=self.__width
)
elif self.__resize_method == "upper_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, max_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, max_val=self.__width
)
elif self.__resize_method == "minimal":
new_height = self.constrain_to_multiple_of(scale_height * height)
new_width = self.constrain_to_multiple_of(scale_width * width)
else:
raise ValueError(f"resize_method {self.__resize_method} not implemented")
return (new_width, new_height)
def __call__(self, sample):
width, height = self.get_size(
sample["image"].shape[1], sample["image"].shape[0]
)
# resize sample
sample["image"] = cv2.resize(
sample["image"],
(width, height),
interpolation=self.__image_interpolation_method,
)
if self.__resize_target:
if "disparity" in sample:
sample["disparity"] = cv2.resize(
sample["disparity"],
(width, height),
interpolation=cv2.INTER_NEAREST,
)
if "depth" in sample:
sample["depth"] = cv2.resize(
sample["depth"], (width, height), interpolation=cv2.INTER_NEAREST
)
if "semseg_mask" in sample:
# sample["semseg_mask"] = cv2.resize(
# sample["semseg_mask"], (width, height), interpolation=cv2.INTER_NEAREST
# )
sample["semseg_mask"] = F.interpolate(torch.from_numpy(sample["semseg_mask"]).float()[None, None, ...], (height, width), mode='nearest').numpy()[0, 0]
if "mask" in sample:
sample["mask"] = cv2.resize(
sample["mask"].astype(np.float32),
(width, height),
interpolation=cv2.INTER_NEAREST,
)
# sample["mask"] = sample["mask"].astype(bool)
# print(sample['image'].shape, sample['depth'].shape)
return sample
class NormalizeImage(object):
"""Normalize image by given mean and std.
"""
def __init__(self, mean, std):
self.__mean = mean
self.__std = std
def __call__(self, sample):
sample["image"] = (sample["image"] - self.__mean) / self.__std
return sample
class PrepareForNet(object):
"""Prepare sample for usage as network input.
"""
def __init__(self):
pass
def __call__(self, sample):
image = np.transpose(sample["image"], (2, 0, 1))
sample["image"] = np.ascontiguousarray(image).astype(np.float32)
if "mask" in sample:
sample["mask"] = sample["mask"].astype(np.float32)
sample["mask"] = np.ascontiguousarray(sample["mask"])
if "depth" in sample:
depth = sample["depth"].astype(np.float32)
sample["depth"] = np.ascontiguousarray(depth)
if "semseg_mask" in sample:
sample["semseg_mask"] = sample["semseg_mask"].astype(np.float32)
sample["semseg_mask"] = np.ascontiguousarray(sample["semseg_mask"])
return sample

metric_depth/README.md

@@ -0,0 +1,88 @@
# Depth Anything for Metric Depth Estimation
Our Depth Anything models primarily focus on robust *relative* depth estimation. To achieve *metric* depth estimation, we follow ZoeDepth to fine-tune from our Depth Anything pre-trained encoder with metric depth information from NYUv2 or KITTI.
## Performance
### *In-domain* metric depth estimation
#### NYUv2
| Method | $\delta_1 \uparrow$ | $\delta_2 \uparrow$ | $\delta_3 \uparrow$ | AbsRel $\downarrow$ | RMSE $\downarrow$ | log10 $\downarrow$ |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| ZoeDepth | 0.951 | 0.994 | 0.999 | 0.077 | 0.282 | 0.033 |
| Depth Anything | **0.984** | **0.998** | **1.000** | **0.056** | **0.206** | **0.024** |
#### KITTI
| Method | $\delta_1 \uparrow$ | $\delta_2 \uparrow$ | $\delta_3 \uparrow$ | AbsRel $\downarrow$ | RMSE $\downarrow$ | log10 $\downarrow$ |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| ZoeDepth | 0.971 | 0.996 | 0.999 | 0.054 | 2.281 | 0.082 |
| Depth Anything | **0.982** | **0.998** | **1.000** | **0.046** | **1.896** | **0.069** |
### *Zero-shot* metric depth estimation
Indoor: NYUv2 $\rightarrow$ SUN RGB-D, iBims-1, and HyperSim<br>
Outdoor: KITTI $\rightarrow$ Virtual KITTI 2 and DIODE Outdoor
| Method | SUN || iBims || HyperSim || vKITTI || DIODE Outdoor ||
|-|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ | AbsRel | $\delta_1$ |
| ZoeDepth | 0.520 | 0.545 | 0.169 | 0.656 | 0.407 | 0.302 | 0.106 | 0.844 | 0.814 | 0.237 |
| Depth Anything | **0.500** | **0.660** | **0.150** | **0.714** | **0.363** | **0.361** | **0.085** | **0.913** | **0.794** | **0.288** |
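The remaining metrics reported in the in-domain tables above (RMSE and log10) can be sketched similarly (illustrative NumPy, not the repo's evaluation code):

```python
import numpy as np

def rmse(pred, gt):
    """Root mean squared error in metric units (lower is better)."""
    mask = gt > 0
    return float(np.sqrt(np.mean((pred[mask] - gt[mask]) ** 2)))

def log10_err(pred, gt):
    """Mean absolute error of log10 depth (lower is better)."""
    mask = (gt > 0) & (pred > 0)
    return float(np.mean(np.abs(np.log10(pred[mask]) - np.log10(gt[mask]))))
```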
## Pre-trained models
We provide two pre-trained models ([download](https://huggingface.co/spaces/LiheYoung/Depth-Anything/tree/main/checkpoints_metric_depth)), one for *indoor* metric depth estimation trained on NYUv2, and the other for *outdoor* metric depth estimation trained on KITTI.
## Installation
```bash
conda env create -n depth_anything_metric --file environment.yml
conda activate depth_anything_metric
```
Please follow [ZoeDepth](https://github.com/isl-org/ZoeDepth) to prepare the training and test datasets.
## Evaluation
Make sure you have downloaded our pre-trained models [here](https://huggingface.co/spaces/LiheYoung/Depth-Anything/tree/main/checkpoints_metric_depth) and put them under the ``checkpoints`` directory.
Indoor:
```bash
python evaluate.py -m zoedepth --pretrained_resource="local::./checkpoints/depth_anything_metric_depth_indoor.pt" -d <nyu | sunrgbd | ibims | hypersim_test>
```
Outdoor:
```bash
python evaluate.py -m zoedepth --pretrained_resource="local::./checkpoints/depth_anything_metric_depth_outdoor.pt" -d <kitti | vkitti2 | diode_outdoor>
```
## Training
```bash
python train_mono.py -m zoedepth -d <nyu | kitti> --pretrained_resource=""
```
This will automatically use our Depth Anything pre-trained ViT-L encoder.
## Citation
If you find this project useful, please consider citing:
```bibtex
@article{depthanything,
title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
journal={arXiv:},
year={2024},
}
```


@@ -0,0 +1,26 @@
name: zoe
channels:
- pytorch
- nvidia
- conda-forge
dependencies:
- cuda=11.7.1
- h5py=3.7.0
- hdf5=1.12.2
- matplotlib=3.6.2
- matplotlib-base=3.6.2
- numpy=1.24.1
- opencv=4.6.0
- pip=22.3.1
- python=3.9.7
- pytorch=1.13.1
- pytorch-cuda=11.7
- pytorch-mutex=1.0
- scipy=1.10.0
- torchaudio=0.13.1
- torchvision=0.14.1
- pip:
- huggingface-hub==0.11.1
- timm==0.6.12
- tqdm==4.64.1
- wandb==0.13.9

metric_depth/evaluate.py

@@ -0,0 +1,160 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import argparse
from pprint import pprint
import torch
from zoedepth.utils.easydict import EasyDict as edict
from tqdm import tqdm
from zoedepth.data.data_mono import DepthDataLoader
from zoedepth.models.builder import build_model
from zoedepth.utils.arg_utils import parse_unknown
from zoedepth.utils.config import change_dataset, get_config, ALL_EVAL_DATASETS, ALL_INDOOR, ALL_OUTDOOR
from zoedepth.utils.misc import (RunningAverageDict, colors, compute_metrics,
count_parameters)
@torch.no_grad()
def infer(model, images, **kwargs):
"""Inference with flip augmentation"""
# images.shape = N, C, H, W
def get_depth_from_prediction(pred):
if isinstance(pred, torch.Tensor):
pred = pred # pass
elif isinstance(pred, (list, tuple)):
pred = pred[-1]
elif isinstance(pred, dict):
pred = pred['metric_depth'] if 'metric_depth' in pred else pred['out']
else:
raise NotImplementedError(f"Unknown output type {type(pred)}")
return pred
pred1 = model(images, **kwargs)
pred1 = get_depth_from_prediction(pred1)
pred2 = model(torch.flip(images, [3]), **kwargs)
pred2 = get_depth_from_prediction(pred2)
pred2 = torch.flip(pred2, [3])
mean_pred = 0.5 * (pred1 + pred2)
return mean_pred
@torch.no_grad()
def evaluate(model, test_loader, config, round_vals=True, round_precision=3):
model.eval()
metrics = RunningAverageDict()
for i, sample in tqdm(enumerate(test_loader), total=len(test_loader)):
if 'has_valid_depth' in sample:
if not sample['has_valid_depth']:
continue
image, depth = sample['image'], sample['depth']
image, depth = image.cuda(), depth.cuda()
depth = depth.squeeze().unsqueeze(0).unsqueeze(0)
focal = sample.get('focal', torch.Tensor(
[715.0873]).cuda()) # This magic number (focal) is only used for evaluating BTS model
pred = infer(model, image, dataset=sample['dataset'][0], focal=focal)
# Save image, depth, pred for visualization
if "save_images" in config and config.save_images:
import os
# print("Saving images ...")
from PIL import Image
import torchvision.transforms as transforms
from zoedepth.utils.misc import colorize
os.makedirs(config.save_images, exist_ok=True)
# def save_image(img, path):
d = colorize(depth.squeeze().cpu().numpy(), 0, 10)
p = colorize(pred.squeeze().cpu().numpy(), 0, 10)
im = transforms.ToPILImage()(image.squeeze().cpu())
im.save(os.path.join(config.save_images, f"{i}_img.png"))
Image.fromarray(d).save(os.path.join(config.save_images, f"{i}_depth.png"))
Image.fromarray(p).save(os.path.join(config.save_images, f"{i}_pred.png"))
# print(depth.shape, pred.shape)
metrics.update(compute_metrics(depth, pred, config=config))
if round_vals:
def r(m): return round(m, round_precision)
else:
def r(m): return m
metrics = {k: r(v) for k, v in metrics.get_value().items()}
return metrics
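`evaluate` accumulates per-sample metrics with `RunningAverageDict` from `zoedepth.utils.misc`. A minimal sketch of what such a per-key running mean can look like (the real class may differ in detail):

```python
class RunningAverageDict:
    """Keep a running mean for each metric key; a sketch of the helper
    assumed from zoedepth.utils.misc."""

    def __init__(self):
        self._sums = {}
        self._counts = {}

    def update(self, new_dict):
        # Accumulate each metric independently so samples may report
        # different subsets of keys.
        for k, v in new_dict.items():
            self._sums[k] = self._sums.get(k, 0.0) + v
            self._counts[k] = self._counts.get(k, 0) + 1

    def get_value(self):
        return {k: self._sums[k] / self._counts[k] for k in self._sums}
```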
def main(config):
model = build_model(config)
test_loader = DepthDataLoader(config, 'online_eval').data
model = model.cuda()
metrics = evaluate(model, test_loader, config)
print(f"{colors.fg.green}")
print(metrics)
print(f"{colors.reset}")
metrics['#params'] = f"{round(count_parameters(model, include_all=True)/1e6, 2)}M"
return metrics
def eval_model(model_name, pretrained_resource, dataset='nyu', **kwargs):
# Load default pretrained resource defined in config if not set
overwrite = {**kwargs, "pretrained_resource": pretrained_resource} if pretrained_resource else kwargs
config = get_config(model_name, "eval", dataset, **overwrite)
# config = change_dataset(config, dataset) # change the dataset
pprint(config)
print(f"Evaluating {model_name} on {dataset}...")
metrics = main(config)
return metrics
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model", type=str,
required=True, help="Name of the model to evaluate")
parser.add_argument("-p", "--pretrained_resource", type=str,
required=False, default="", help="Pretrained resource to use for fetching weights. If not set, the default resource from the model config is used. Refer to models.model_io.load_state_from_resource for more details.")
parser.add_argument("-d", "--dataset", type=str, required=False,
default='nyu', help="Dataset to evaluate on")
args, unknown_args = parser.parse_known_args()
overwrite_kwargs = parse_unknown(unknown_args)
if "ALL_INDOOR" in args.dataset:
datasets = ALL_INDOOR
elif "ALL_OUTDOOR" in args.dataset:
datasets = ALL_OUTDOOR
elif "ALL" in args.dataset:
datasets = ALL_EVAL_DATASETS
elif "," in args.dataset:
datasets = args.dataset.split(",")
else:
datasets = [args.dataset]
for dataset in datasets:
eval_model(args.model, pretrained_resource=args.pretrained_resource,
dataset=dataset, **overwrite_kwargs)

# ---- metric_depth/train_mix.py (new file, 182 lines) ----
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
from zoedepth.utils.misc import count_parameters, parallelize
from zoedepth.utils.config import get_config
from zoedepth.utils.arg_utils import parse_unknown
from zoedepth.trainers.builder import get_trainer
from zoedepth.models.builder import build_model
from zoedepth.data.data_mono import MixedNYUKITTI
import torch.utils.data.distributed
import torch.multiprocessing as mp
import torch
import numpy as np
from pprint import pprint
import argparse
import os
os.environ["PYOPENGL_PLATFORM"] = "egl"
os.environ["WANDB_START_METHOD"] = "thread"
def fix_random_seed(seed: int):
"""
Fix random seed for reproducibility
Args:
seed (int): random seed
"""
import random
import numpy
import torch
random.seed(seed)
numpy.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
def load_ckpt(config, model, checkpoint_dir="./checkpoints", ckpt_type="best"):
import glob
import os
from zoedepth.models.model_io import load_wts
if hasattr(config, "checkpoint"):
checkpoint = config.checkpoint
elif hasattr(config, "ckpt_pattern"):
pattern = config.ckpt_pattern
matches = glob.glob(os.path.join(
checkpoint_dir, f"*{pattern}*{ckpt_type}*"))
if not matches:
raise ValueError(f"No matches found for the pattern {pattern}")
checkpoint = matches[0]
else:
return model
model = load_wts(model, checkpoint)
print("Loaded weights from {0}".format(checkpoint))
return model
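`load_ckpt` resolves `*{pattern}*{ckpt_type}*` with `glob.glob` and takes `matches[0]`, whose order is filesystem-dependent. A hedged variant (ours, not the script's) that breaks ties deterministically by modification time:

```python
import glob
import os

def newest_checkpoint(checkpoint_dir, pattern, ckpt_type="best"):
    """Resolve a checkpoint like load_ckpt does, but deterministically:
    glob.glob returns matches in arbitrary order, so instead of taking
    matches[0] pick the most recently modified file."""
    matches = glob.glob(os.path.join(checkpoint_dir, f"*{pattern}*{ckpt_type}*"))
    if not matches:
        raise ValueError(f"No matches found for the pattern {pattern}")
    return max(matches, key=os.path.getmtime)
```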
def main_worker(gpu, ngpus_per_node, config):
try:
fix_random_seed(43)
config.gpu = gpu
model = build_model(config)
model = load_ckpt(config, model)
model = parallelize(config, model)
total_params = f"{round(count_parameters(model)/1e6,2)}M"
config.total_params = total_params
print(f"Total parameters : {total_params}")
train_loader = MixedNYUKITTI(config, "train").data
test_loader = MixedNYUKITTI(config, "online_eval").data
trainer = get_trainer(config)(
config, model, train_loader, test_loader, device=config.gpu)
trainer.train()
finally:
import wandb
wandb.finish()
if __name__ == '__main__':
mp.set_start_method('forkserver')
parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model", type=str, default="synunet")
parser.add_argument("-d", "--dataset", type=str, default='mix')
parser.add_argument("--trainer", type=str, default=None)
args, unknown_args = parser.parse_known_args()
overwrite_kwargs = parse_unknown(unknown_args)
overwrite_kwargs["model"] = args.model
if args.trainer is not None:
overwrite_kwargs["trainer"] = args.trainer
config = get_config(args.model, "train", args.dataset, **overwrite_kwargs)
# git_commit()
if config.use_shared_dict:
shared_dict = mp.Manager().dict()
else:
shared_dict = None
config.shared_dict = shared_dict
config.batch_size = config.bs
config.mode = 'train'
if config.root != "." and not os.path.isdir(config.root):
os.makedirs(config.root)
try:
# Note: this simple replace/split assumes an explicit comma-separated node
# list; bracketed SLURM ranges such as "node[01-04]" are not expanded.
node_str = os.environ['SLURM_JOB_NODELIST'].replace(
'[', '').replace(']', '')
nodes = node_str.split(',')
config.world_size = len(nodes)
config.rank = int(os.environ['SLURM_PROCID'])
# config.save_dir = "/ibex/scratch/bhatsf/videodepth/checkpoints"
except KeyError:
# We are NOT using SLURM
config.world_size = 1
config.rank = 0
nodes = ["127.0.0.1"]
if config.distributed:
print(config.rank)
port = np.random.randint(15000, 15025)
config.dist_url = 'tcp://{}:{}'.format(nodes[0], port)
print(config.dist_url)
config.dist_backend = 'nccl'
config.gpu = None
ngpus_per_node = torch.cuda.device_count()
config.num_workers = config.workers
config.ngpus_per_node = ngpus_per_node
print("Config:")
pprint(config)
if config.distributed:
config.world_size = ngpus_per_node * config.world_size
mp.spawn(main_worker, nprocs=ngpus_per_node,
args=(ngpus_per_node, config))
else:
if ngpus_per_node == 1:
config.gpu = 0
main_worker(config.gpu, ngpus_per_node, config)
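The launcher above draws a random rendezvous port from `[15000, 15025)`, which can collide with a port already in use. One common alternative (a sketch, not what the script does) is to let the OS pick a free port by binding to port 0:

```python
import socket

def find_free_port():
    """Bind to port 0 so the OS assigns an unused TCP port, then release it.

    There is a small race window between closing this socket and the
    trainer rebinding the port, but it avoids picking a known-busy port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))
        return s.getsockname()[1]
```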

# ---- metric_depth/train_mono.py (new file, 176 lines) ----
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
from zoedepth.utils.misc import count_parameters, parallelize
from zoedepth.utils.config import get_config
from zoedepth.utils.arg_utils import parse_unknown
from zoedepth.trainers.builder import get_trainer
from zoedepth.models.builder import build_model
from zoedepth.data.data_mono import DepthDataLoader
import torch.utils.data.distributed
import torch.multiprocessing as mp
import torch
import numpy as np
from pprint import pprint
import argparse
import os
os.environ["PYOPENGL_PLATFORM"] = "egl"
os.environ["WANDB_START_METHOD"] = "thread"
def fix_random_seed(seed: int):
import random
import numpy
import torch
random.seed(seed)
numpy.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)
torch.backends.cudnn.deterministic = True
# Note: benchmark=True lets cuDNN autotune kernels for speed, but it can
# undermine the determinism requested on the previous line.
torch.backends.cudnn.benchmark = True
def load_ckpt(config, model, checkpoint_dir="./checkpoints", ckpt_type="best"):
import glob
import os
from zoedepth.models.model_io import load_wts
if hasattr(config, "checkpoint"):
checkpoint = config.checkpoint
elif hasattr(config, "ckpt_pattern"):
pattern = config.ckpt_pattern
matches = glob.glob(os.path.join(
checkpoint_dir, f"*{pattern}*{ckpt_type}*"))
if not matches:
raise ValueError(f"No matches found for the pattern {pattern}")
checkpoint = matches[0]
else:
return model
model = load_wts(model, checkpoint)
print("Loaded weights from {0}".format(checkpoint))
return model
def main_worker(gpu, ngpus_per_node, config):
try:
seed = config.seed if 'seed' in config and config.seed else 43
fix_random_seed(seed)
config.gpu = gpu
model = build_model(config)
model = load_ckpt(config, model)
model = parallelize(config, model)
total_params = f"{round(count_parameters(model)/1e6,2)}M"
config.total_params = total_params
print(f"Total parameters : {total_params}")
train_loader = DepthDataLoader(config, "train").data
test_loader = DepthDataLoader(config, "online_eval").data
trainer = get_trainer(config)(
config, model, train_loader, test_loader, device=config.gpu)
trainer.train()
finally:
import wandb
wandb.finish()
if __name__ == '__main__':
mp.set_start_method('forkserver')
parser = argparse.ArgumentParser()
parser.add_argument("-m", "--model", type=str, default="synunet")
parser.add_argument("-d", "--dataset", type=str, default='nyu')
parser.add_argument("--trainer", type=str, default=None)
args, unknown_args = parser.parse_known_args()
overwrite_kwargs = parse_unknown(unknown_args)
overwrite_kwargs["model"] = args.model
if args.trainer is not None:
overwrite_kwargs["trainer"] = args.trainer
config = get_config(args.model, "train", args.dataset, **overwrite_kwargs)
# git_commit()
if config.use_shared_dict:
shared_dict = mp.Manager().dict()
else:
shared_dict = None
config.shared_dict = shared_dict
config.batch_size = config.bs
config.mode = 'train'
if config.root != "." and not os.path.isdir(config.root):
os.makedirs(config.root)
try:
# Note: this simple replace/split assumes an explicit comma-separated node
# list; bracketed SLURM ranges such as "node[01-04]" are not expanded.
node_str = os.environ['SLURM_JOB_NODELIST'].replace(
'[', '').replace(']', '')
nodes = node_str.split(',')
config.world_size = len(nodes)
config.rank = int(os.environ['SLURM_PROCID'])
# config.save_dir = "/ibex/scratch/bhatsf/videodepth/checkpoints"
except KeyError:
# We are NOT using SLURM
config.world_size = 1
config.rank = 0
nodes = ["127.0.0.1"]
if config.distributed:
print(config.rank)
port = np.random.randint(15000, 15025)
config.dist_url = 'tcp://{}:{}'.format(nodes[0], port)
print(config.dist_url)
config.dist_backend = 'nccl'
config.gpu = None
ngpus_per_node = torch.cuda.device_count()
config.num_workers = config.workers
config.ngpus_per_node = ngpus_per_node
print("Config:")
pprint(config)
if config.distributed:
config.world_size = ngpus_per_node * config.world_size
mp.spawn(main_worker, nprocs=ngpus_per_node,
args=(ngpus_per_node, config))
else:
if ngpus_per_node == 1:
config.gpu = 0
main_worker(config.gpu, ngpus_per_node, config)

# ---- new file, 697 lines: KITTI eval split (image path, ground-truth depth path, focal length) ----
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000069.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000069.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000054.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000054.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000042.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000042.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000057.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000057.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000030.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000030.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000027.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000027.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000012.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000012.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000075.png None 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000036.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000036.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000033.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000033.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000015.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000015.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000072.png None 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000003.png None 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000039.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000039.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000009.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000009.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000051.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000051.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000060.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000060.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000021.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000021.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000024.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000024.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000045.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000045.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000018.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000018.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000048.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000048.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000006.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000006.png 721.5377
2011_09_26/2011_09_26_drive_0002_sync/image_02/data/0000000063.png 2011_09_26_drive_0002_sync/proj_depth/groundtruth/image_02/0000000063.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000016.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000016.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000032.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000032.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000048.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000048.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000064.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000064.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000080.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000080.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000096.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000096.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000112.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000112.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000128.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000128.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000144.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000144.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000160.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000160.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000176.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000176.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000196.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000196.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000212.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000212.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000228.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000228.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000244.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000244.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000260.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000260.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000276.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000276.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000292.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000292.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000308.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000308.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000324.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000324.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000340.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000340.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000356.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000356.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000372.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000372.png 721.5377
2011_09_26/2011_09_26_drive_0009_sync/image_02/data/0000000388.png 2011_09_26_drive_0009_sync/proj_depth/groundtruth/image_02/0000000388.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000090.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000090.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000050.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000050.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000110.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000110.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000115.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000115.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000060.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000060.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000105.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000105.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000125.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000125.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000020.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000020.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000140.png None 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000085.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000085.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000070.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000070.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000080.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000080.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000065.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000065.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000095.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000095.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000130.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000130.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000100.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000100.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000010.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000010.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000030.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000030.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000135.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000135.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000040.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000040.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000005.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000005.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000120.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000120.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000045.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000045.png 721.5377
2011_09_26/2011_09_26_drive_0013_sync/image_02/data/0000000035.png 2011_09_26_drive_0013_sync/proj_depth/groundtruth/image_02/0000000035.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000003.png None 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000069.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000069.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000057.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000057.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000012.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000012.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000072.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000072.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000018.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000018.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000063.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000063.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000084.png None 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000015.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000015.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000066.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000066.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000006.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000006.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000048.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000048.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000060.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000060.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000009.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000009.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000033.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000033.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000021.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000021.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000075.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000075.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000027.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000027.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000045.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000045.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000078.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000078.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000036.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000036.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000051.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000051.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000054.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000054.png 721.5377
2011_09_26/2011_09_26_drive_0020_sync/image_02/data/0000000042.png 2011_09_26_drive_0020_sync/proj_depth/groundtruth/image_02/0000000042.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000018.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000018.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000090.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000090.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000126.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000126.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000378.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000378.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000036.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000036.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000288.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000288.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000198.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000198.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000450.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000450.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000144.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000144.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000072.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000072.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000252.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000252.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000180.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000180.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000432.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000432.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000396.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000396.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000054.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000054.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000468.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000468.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000306.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000306.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000108.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000108.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000162.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000162.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000342.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000342.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000270.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000270.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000414.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000414.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000216.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000216.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000360.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000360.png 721.5377
2011_09_26/2011_09_26_drive_0023_sync/image_02/data/0000000324.png 2011_09_26_drive_0023_sync/proj_depth/groundtruth/image_02/0000000324.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000077.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000077.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000035.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000035.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000091.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000091.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000112.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000112.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000007.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000007.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000175.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000175.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000042.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000042.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000098.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000098.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000133.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000133.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000161.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000161.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000014.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000014.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000126.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000126.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000168.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000168.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000070.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000070.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000084.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000084.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000140.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000140.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000049.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000049.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000182.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000182.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000147.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000147.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000056.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000056.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000063.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000063.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000021.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000021.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000119.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000119.png 721.5377
2011_09_26/2011_09_26_drive_0027_sync/image_02/data/0000000028.png 2011_09_26_drive_0027_sync/proj_depth/groundtruth/image_02/0000000028.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000380.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000380.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000394.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000394.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000324.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000324.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000268.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000268.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000366.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000366.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000296.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000296.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000014.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000014.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000028.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000028.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000182.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000182.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000168.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000168.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000196.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000196.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000140.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000140.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000084.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000084.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000056.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000056.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000112.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000112.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000352.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000352.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000126.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000126.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000070.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000070.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000310.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000310.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000154.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000154.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000098.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000098.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000408.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000408.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000042.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000042.png 721.5377
2011_09_26/2011_09_26_drive_0029_sync/image_02/data/0000000338.png 2011_09_26_drive_0029_sync/proj_depth/groundtruth/image_02/0000000338.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000128.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000128.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000192.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000192.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000032.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000032.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000352.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000352.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000608.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000608.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000224.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000224.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000576.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000576.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000672.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000672.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000064.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000064.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000448.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000448.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000704.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000704.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000640.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000640.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000512.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000512.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000768.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000768.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000160.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000160.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000416.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000416.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000480.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000480.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000800.png None 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000288.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000288.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000544.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000544.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000096.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000096.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000384.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000384.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000256.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000256.png 721.5377
2011_09_26/2011_09_26_drive_0036_sync/image_02/data/0000000320.png 2011_09_26_drive_0036_sync/proj_depth/groundtruth/image_02/0000000320.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000005.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000005.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000010.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000010.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000015.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000015.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000020.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000020.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000025.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000025.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000030.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000030.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000035.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000035.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000040.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000040.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000045.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000045.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000050.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000050.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000055.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000055.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000060.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000060.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000065.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000065.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000070.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000070.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000075.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000075.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000080.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000080.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000085.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000085.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000090.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000090.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000095.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000095.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000100.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000100.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000105.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000105.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000110.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000110.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000115.png 2011_09_26_drive_0046_sync/proj_depth/groundtruth/image_02/0000000115.png 721.5377
2011_09_26/2011_09_26_drive_0046_sync/image_02/data/0000000120.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000001.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000002.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000003.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000004.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000005.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000005.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000006.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000006.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000007.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000007.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000008.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000008.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000009.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000009.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000010.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000010.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000011.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000011.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000012.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000012.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000013.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000013.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000014.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000014.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000015.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000015.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000016.png 2011_09_26_drive_0048_sync/proj_depth/groundtruth/image_02/0000000016.png 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000017.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000018.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000019.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000020.png None 721.5377
2011_09_26/2011_09_26_drive_0048_sync/image_02/data/0000000021.png None 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000046.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000046.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000014.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000014.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000036.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000036.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000028.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000028.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000026.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000026.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000050.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000050.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000040.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000040.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000008.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000008.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000016.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000016.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000044.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000044.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000018.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000018.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000032.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000032.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000042.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000042.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000010.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000010.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000020.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000020.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000048.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000048.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000052.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000052.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000006.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000006.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000030.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000030.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000012.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000012.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000038.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000038.png 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000002.png None 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000004.png None 721.5377
2011_09_26/2011_09_26_drive_0052_sync/image_02/data/0000000022.png 2011_09_26_drive_0052_sync/proj_depth/groundtruth/image_02/0000000022.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000011.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000011.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000033.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000033.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000242.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000242.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000253.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000253.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000286.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000286.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000154.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000154.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000099.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000099.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000220.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000220.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000022.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000022.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000077.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000077.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000187.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000187.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000143.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000143.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000066.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000066.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000176.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000176.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000110.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000110.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000275.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000275.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000264.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000264.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000198.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000198.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000055.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000055.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000088.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000088.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000121.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000121.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000209.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000209.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000165.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000165.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000231.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000231.png 721.5377
2011_09_26/2011_09_26_drive_0056_sync/image_02/data/0000000044.png 2011_09_26_drive_0056_sync/proj_depth/groundtruth/image_02/0000000044.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000056.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000056.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000344.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000344.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000358.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000358.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000316.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000316.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000238.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000238.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000098.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000098.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000112.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000112.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000028.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000028.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000014.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000014.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000330.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000330.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000154.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000154.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000042.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000042.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000302.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000302.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000182.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000182.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000288.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000288.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000140.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000140.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000274.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000274.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000224.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000224.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000372.png None 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000196.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000196.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000126.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000126.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000084.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000084.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000210.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000210.png 721.5377
2011_09_26/2011_09_26_drive_0059_sync/image_02/data/0000000070.png 2011_09_26_drive_0059_sync/proj_depth/groundtruth/image_02/0000000070.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000528.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000528.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000308.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000308.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000044.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000044.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000352.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000352.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000066.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000066.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000506.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000506.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000176.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000176.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000022.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000022.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000242.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000242.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000462.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000462.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000418.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000418.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000110.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000110.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000440.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000440.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000396.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000396.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000154.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000154.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000374.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000374.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000088.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000088.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000286.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000286.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000550.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000550.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000264.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000264.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000220.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000220.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000330.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000330.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000484.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000484.png 721.5377
2011_09_26/2011_09_26_drive_0064_sync/image_02/data/0000000198.png 2011_09_26_drive_0064_sync/proj_depth/groundtruth/image_02/0000000198.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000283.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000283.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000361.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000361.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000270.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000270.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000127.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000127.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000205.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000205.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000218.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000218.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000153.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000153.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000335.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000335.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000192.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000192.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000348.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000348.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000101.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000101.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000049.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000049.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000179.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000179.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000140.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000140.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000374.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000374.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000322.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000322.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000309.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000309.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000244.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000244.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000062.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000062.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000257.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000257.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000088.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000088.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000114.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000114.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000075.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000075.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000296.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000296.png 721.5377
2011_09_26/2011_09_26_drive_0084_sync/image_02/data/0000000231.png 2011_09_26_drive_0084_sync/proj_depth/groundtruth/image_02/0000000231.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000007.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000007.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000196.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000196.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000439.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000439.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000169.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000169.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000115.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000115.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000034.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000034.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000304.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000304.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000331.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000331.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000277.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000277.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000520.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000520.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000682.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000682.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000628.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000628.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000088.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000088.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000601.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000601.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000574.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000574.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000223.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000223.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000655.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000655.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000358.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000358.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000412.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000412.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000142.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000142.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000385.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000385.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000061.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000061.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000493.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000493.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000466.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000466.png 721.5377
2011_09_26/2011_09_26_drive_0086_sync/image_02/data/0000000250.png 2011_09_26_drive_0086_sync/proj_depth/groundtruth/image_02/0000000250.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000016.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000016.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000032.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000032.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000048.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000048.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000064.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000064.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000080.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000080.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000096.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000096.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000112.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000112.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000128.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000128.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000144.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000144.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000160.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000160.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000176.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000176.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000192.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000192.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000208.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000208.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000224.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000224.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000240.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000240.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000256.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000256.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000305.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000305.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000321.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000321.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000337.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000337.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000353.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000353.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000369.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000369.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000385.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000385.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000401.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000401.png 721.5377
2011_09_26/2011_09_26_drive_0093_sync/image_02/data/0000000417.png 2011_09_26_drive_0093_sync/proj_depth/groundtruth/image_02/0000000417.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000019.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000019.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000038.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000038.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000057.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000057.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000076.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000076.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000095.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000095.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000114.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000114.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000133.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000133.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000152.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000152.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000171.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000171.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000190.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000190.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000209.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000209.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000228.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000228.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000247.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000247.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000266.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000266.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000285.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000285.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000304.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000304.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000323.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000323.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000342.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000342.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000361.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000361.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000380.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000380.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000399.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000399.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000418.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000418.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000437.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000437.png 721.5377
2011_09_26/2011_09_26_drive_0096_sync/image_02/data/0000000456.png 2011_09_26_drive_0096_sync/proj_depth/groundtruth/image_02/0000000456.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000692.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000692.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000930.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000930.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000760.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000760.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000896.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000896.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000284.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000284.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000148.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000148.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000522.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000522.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000794.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000794.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000624.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000624.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000726.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000726.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000216.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000216.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000318.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000318.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000488.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000488.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000590.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000590.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000454.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000454.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000862.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000862.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000386.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000386.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000352.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000352.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000420.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000420.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000658.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000658.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000828.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000828.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000556.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000556.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000114.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000114.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000182.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000182.png 721.5377
2011_09_26/2011_09_26_drive_0101_sync/image_02/data/0000000080.png 2011_09_26_drive_0101_sync/proj_depth/groundtruth/image_02/0000000080.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000015.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000015.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000035.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000035.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000043.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000043.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000051.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000051.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000059.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000059.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000067.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000067.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000075.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000075.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000083.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000083.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000091.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000091.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000099.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000099.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000107.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000107.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000115.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000115.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000123.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000123.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000131.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000131.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000139.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000139.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000147.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000147.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000155.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000155.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000163.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000163.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000171.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000171.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000179.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000179.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000187.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000187.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000195.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000195.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000203.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000203.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000211.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000211.png 721.5377
2011_09_26/2011_09_26_drive_0106_sync/image_02/data/0000000219.png 2011_09_26_drive_0106_sync/proj_depth/groundtruth/image_02/0000000219.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000312.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000312.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000494.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000494.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000104.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000104.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000130.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000130.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000156.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000156.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000182.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000182.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000598.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000598.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000416.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000416.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000364.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000364.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000026.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000026.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000078.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000078.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000572.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000572.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000468.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000468.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000260.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000260.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000624.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000624.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000234.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000234.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000442.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000442.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000390.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000390.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000546.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000546.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000286.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000286.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000000.png None 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000338.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000338.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000208.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000208.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000650.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000650.png 721.5377
2011_09_26/2011_09_26_drive_0117_sync/image_02/data/0000000052.png 2011_09_26_drive_0117_sync/proj_depth/groundtruth/image_02/0000000052.png 721.5377
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000024.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000024.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000021.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000021.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000036.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000036.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000000.png None 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000051.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000051.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000018.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000018.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000033.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000033.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000090.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000090.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000045.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000045.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000054.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000054.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000012.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000012.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000039.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000039.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000009.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000009.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000003.png None 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000030.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000030.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000078.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000078.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000060.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000060.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000048.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000048.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000084.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000084.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000081.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000081.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000006.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000006.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000057.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000057.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000072.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000072.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000087.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000087.png 707.0493
2011_09_28/2011_09_28_drive_0002_sync/image_02/data/0000000063.png 2011_09_28_drive_0002_sync/proj_depth/groundtruth/image_02/0000000063.png 707.0493
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000252.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000252.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000540.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000540.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000001054.png None 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000036.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000036.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000360.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000360.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000807.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000807.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000879.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000879.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000288.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000288.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000771.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000771.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000000.png None 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000216.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000216.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000951.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000951.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000324.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000324.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000432.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000432.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000504.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000504.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000576.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000576.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000108.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000108.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000180.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000180.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000072.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000072.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000612.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000612.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000915.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000915.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000735.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000735.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000144.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000144.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000396.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000396.png 718.3351
2011_09_29/2011_09_29_drive_0071_sync/image_02/data/0000000468.png 2011_09_29_drive_0071_sync/proj_depth/groundtruth/image_02/0000000468.png 718.3351
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000132.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000132.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000011.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000011.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000154.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000154.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000022.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000022.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000242.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000242.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000198.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000198.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000176.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000176.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000231.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000231.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000275.png None 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000220.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000220.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000088.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000088.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000143.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000143.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000055.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000055.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000033.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000033.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000187.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000187.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000110.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000110.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000044.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000044.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000077.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000077.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000066.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000066.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000000.png None 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000165.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000165.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000264.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000264.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000253.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000253.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000209.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000209.png 707.0912
2011_09_30/2011_09_30_drive_0016_sync/image_02/data/0000000121.png 2011_09_30_drive_0016_sync/proj_depth/groundtruth/image_02/0000000121.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000107.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000107.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000002247.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000002247.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001391.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001391.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000535.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000535.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001819.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001819.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001177.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001177.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000428.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000428.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001926.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001926.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000749.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000749.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001284.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001284.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000002140.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000002140.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001605.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001605.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001498.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001498.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000642.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000642.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000002740.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000002740.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000002419.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000002419.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000856.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000856.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000002526.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000002526.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001712.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001712.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000001070.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000001070.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000000.png None 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000002033.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000002033.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000214.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000214.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000000963.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000000963.png 707.0912
2011_09_30/2011_09_30_drive_0018_sync/image_02/data/0000002633.png 2011_09_30_drive_0018_sync/proj_depth/groundtruth/image_02/0000002633.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000533.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000533.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000001040.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000001040.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000082.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000082.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000205.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000205.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000835.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000835.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000451.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000451.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000164.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000164.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000794.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000794.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000328.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000328.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000615.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000615.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000917.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000917.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000369.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000369.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000287.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000287.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000123.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000123.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000876.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000876.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000410.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000410.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000492.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000492.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000958.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000958.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000656.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000656.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000000.png None 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000753.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000753.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000574.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000574.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000001081.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000001081.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000041.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000041.png 707.0912
2011_09_30/2011_09_30_drive_0027_sync/image_02/data/0000000246.png 2011_09_30_drive_0027_sync/proj_depth/groundtruth/image_02/0000000246.png 707.0912
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000002906.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000002906.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000002544.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000002544.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000000362.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000000362.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000004535.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000004535.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000000734.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000000734.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000001096.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000001096.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000004173.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000004173.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000000543.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000000543.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000001277.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000001277.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000004354.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000004354.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000001458.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000001458.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000001820.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000001820.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000003449.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000003449.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000003268.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000003268.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000000915.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000000915.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000002363.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000002363.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000002725.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000002725.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000000181.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000000181.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000001639.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000001639.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000003992.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000003992.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000003087.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000003087.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000002001.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000002001.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000003811.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000003811.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000003630.png 2011_10_03_drive_0027_sync/proj_depth/groundtruth/image_02/0000003630.png 718.856
2011_10_03/2011_10_03_drive_0027_sync/image_02/data/0000000000.png None 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000096.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000096.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000800.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000800.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000320.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000320.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000576.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000576.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000000.png None 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000480.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000480.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000640.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000640.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000032.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000032.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000384.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000384.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000160.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000160.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000704.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000704.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000736.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000736.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000672.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000672.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000064.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000064.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000288.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000288.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000352.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000352.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000512.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000512.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000544.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000544.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000608.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000608.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000128.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000128.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000224.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000224.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000416.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000416.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000192.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000192.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000448.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000448.png 718.856
2011_10_03/2011_10_03_drive_0047_sync/image_02/data/0000000768.png 2011_10_03_drive_0047_sync/proj_depth/groundtruth/image_02/0000000768.png 718.856

bathroom/rgb_00045.jpg bathroom/sync_depth_00045.png 518.8579
bathroom/rgb_00046.jpg bathroom/sync_depth_00046.png 518.8579
bathroom/rgb_00507.jpg bathroom/sync_depth_00507.png 518.8579
bathroom/rgb_00508.jpg bathroom/sync_depth_00508.png 518.8579
bathroom/rgb_00509.jpg bathroom/sync_depth_00509.png 518.8579
bathroom/rgb_00510.jpg bathroom/sync_depth_00510.png 518.8579
bathroom/rgb_00511.jpg bathroom/sync_depth_00511.png 518.8579
bathroom/rgb_00512.jpg bathroom/sync_depth_00512.png 518.8579
bathroom/rgb_00649.jpg bathroom/sync_depth_00649.png 518.8579
bathroom/rgb_00650.jpg bathroom/sync_depth_00650.png 518.8579
bathroom/rgb_00655.jpg bathroom/sync_depth_00655.png 518.8579
bathroom/rgb_00656.jpg bathroom/sync_depth_00656.png 518.8579
bathroom/rgb_00657.jpg bathroom/sync_depth_00657.png 518.8579
bathroom/rgb_00662.jpg bathroom/sync_depth_00662.png 518.8579
bathroom/rgb_00663.jpg bathroom/sync_depth_00663.png 518.8579
bathroom/rgb_00667.jpg bathroom/sync_depth_00667.png 518.8579
bathroom/rgb_00668.jpg bathroom/sync_depth_00668.png 518.8579
bathroom/rgb_00670.jpg bathroom/sync_depth_00670.png 518.8579
bathroom/rgb_00671.jpg bathroom/sync_depth_00671.png 518.8579
bathroom/rgb_00672.jpg bathroom/sync_depth_00672.png 518.8579
bathroom/rgb_00675.jpg bathroom/sync_depth_00675.png 518.8579
bathroom/rgb_00676.jpg bathroom/sync_depth_00676.png 518.8579
bathroom/rgb_00677.jpg bathroom/sync_depth_00677.png 518.8579
bathroom/rgb_00678.jpg bathroom/sync_depth_00678.png 518.8579
bathroom/rgb_00679.jpg bathroom/sync_depth_00679.png 518.8579
bathroom/rgb_00680.jpg bathroom/sync_depth_00680.png 518.8579
bathroom/rgb_00685.jpg bathroom/sync_depth_00685.png 518.8579
bathroom/rgb_00686.jpg bathroom/sync_depth_00686.png 518.8579
bathroom/rgb_00687.jpg bathroom/sync_depth_00687.png 518.8579
bathroom/rgb_00688.jpg bathroom/sync_depth_00688.png 518.8579
bathroom/rgb_00689.jpg bathroom/sync_depth_00689.png 518.8579
bathroom/rgb_00692.jpg bathroom/sync_depth_00692.png 518.8579
bathroom/rgb_00693.jpg bathroom/sync_depth_00693.png 518.8579
bathroom/rgb_00696.jpg bathroom/sync_depth_00696.png 518.8579
bathroom/rgb_00669.jpg bathroom/sync_depth_00669.png 518.8579
bathroom/rgb_00697.jpg bathroom/sync_depth_00697.png 518.8579
bathroom/rgb_00698.jpg bathroom/sync_depth_00698.png 518.8579
bathroom/rgb_00705.jpg bathroom/sync_depth_00705.png 518.8579
bathroom/rgb_00706.jpg bathroom/sync_depth_00706.png 518.8579
bathroom/rgb_00707.jpg bathroom/sync_depth_00707.png 518.8579
bathroom/rgb_00708.jpg bathroom/sync_depth_00708.png 518.8579
bathroom/rgb_00709.jpg bathroom/sync_depth_00709.png 518.8579
bathroom/rgb_00710.jpg bathroom/sync_depth_00710.png 518.8579
bathroom/rgb_00711.jpg bathroom/sync_depth_00711.png 518.8579
bathroom/rgb_00712.jpg bathroom/sync_depth_00712.png 518.8579
bathroom/rgb_00716.jpg bathroom/sync_depth_00716.png 518.8579
bathroom/rgb_00717.jpg bathroom/sync_depth_00717.png 518.8579
bathroom/rgb_00723.jpg bathroom/sync_depth_00723.png 518.8579
bathroom/rgb_00724.jpg bathroom/sync_depth_00724.png 518.8579
bathroom/rgb_00725.jpg bathroom/sync_depth_00725.png 518.8579
bathroom/rgb_00726.jpg bathroom/sync_depth_00726.png 518.8579
bathroom/rgb_00727.jpg bathroom/sync_depth_00727.png 518.8579
bathroom/rgb_00730.jpg bathroom/sync_depth_00730.png 518.8579
bathroom/rgb_00731.jpg bathroom/sync_depth_00731.png 518.8579
bathroom/rgb_00732.jpg bathroom/sync_depth_00732.png 518.8579
bathroom/rgb_00733.jpg bathroom/sync_depth_00733.png 518.8579
bathroom/rgb_00742.jpg bathroom/sync_depth_00742.png 518.8579
bathroom/rgb_00743.jpg bathroom/sync_depth_00743.png 518.8579
bedroom/rgb_00055.jpg bedroom/sync_depth_00055.png 518.8579
bedroom/rgb_00056.jpg bedroom/sync_depth_00056.png 518.8579
bedroom/rgb_00058.jpg bedroom/sync_depth_00058.png 518.8579
bedroom/rgb_00059.jpg bedroom/sync_depth_00059.png 518.8579
bedroom/rgb_00060.jpg bedroom/sync_depth_00060.png 518.8579
bedroom/rgb_00061.jpg bedroom/sync_depth_00061.png 518.8579
bedroom/rgb_00062.jpg bedroom/sync_depth_00062.png 518.8579
bedroom/rgb_00075.jpg bedroom/sync_depth_00075.png 518.8579
bedroom/rgb_00076.jpg bedroom/sync_depth_00076.png 518.8579
bedroom/rgb_00077.jpg bedroom/sync_depth_00077.png 518.8579
bedroom/rgb_00078.jpg bedroom/sync_depth_00078.png 518.8579
bedroom/rgb_00170.jpg bedroom/sync_depth_00170.png 518.8579
bedroom/rgb_00171.jpg bedroom/sync_depth_00171.png 518.8579
bedroom/rgb_00172.jpg bedroom/sync_depth_00172.png 518.8579
bedroom/rgb_00173.jpg bedroom/sync_depth_00173.png 518.8579
bedroom/rgb_00174.jpg bedroom/sync_depth_00174.png 518.8579
bedroom/rgb_00175.jpg bedroom/sync_depth_00175.png 518.8579
bedroom/rgb_00180.jpg bedroom/sync_depth_00180.png 518.8579
bedroom/rgb_00181.jpg bedroom/sync_depth_00181.png 518.8579
bedroom/rgb_00182.jpg bedroom/sync_depth_00182.png 518.8579
bedroom/rgb_00183.jpg bedroom/sync_depth_00183.png 518.8579
bedroom/rgb_00184.jpg bedroom/sync_depth_00184.png 518.8579
bedroom/rgb_00185.jpg bedroom/sync_depth_00185.png 518.8579
bedroom/rgb_00186.jpg bedroom/sync_depth_00186.png 518.8579
bedroom/rgb_00187.jpg bedroom/sync_depth_00187.png 518.8579
bedroom/rgb_00188.jpg bedroom/sync_depth_00188.png 518.8579
bedroom/rgb_00189.jpg bedroom/sync_depth_00189.png 518.8579
bedroom/rgb_00190.jpg bedroom/sync_depth_00190.png 518.8579
bedroom/rgb_00191.jpg bedroom/sync_depth_00191.png 518.8579
bedroom/rgb_00192.jpg bedroom/sync_depth_00192.png 518.8579
bedroom/rgb_00219.jpg bedroom/sync_depth_00219.png 518.8579
bedroom/rgb_00220.jpg bedroom/sync_depth_00220.png 518.8579
bedroom/rgb_00221.jpg bedroom/sync_depth_00221.png 518.8579
bedroom/rgb_00279.jpg bedroom/sync_depth_00279.png 518.8579
bedroom/rgb_00179.jpg bedroom/sync_depth_00179.png 518.8579
bedroom/rgb_00280.jpg bedroom/sync_depth_00280.png 518.8579
bedroom/rgb_00536.jpg bedroom/sync_depth_00536.png 518.8579
bedroom/rgb_00960.jpg bedroom/sync_depth_00960.png 518.8579
bedroom/rgb_01000.jpg bedroom/sync_depth_01000.png 518.8579
bedroom/rgb_01052.jpg bedroom/sync_depth_01052.png 518.8579
bedroom/rgb_01092.jpg bedroom/sync_depth_01092.png 518.8579
bedroom/rgb_01122.jpg bedroom/sync_depth_01122.png 518.8579
bedroom/rgb_01150.jpg bedroom/sync_depth_01150.png 518.8579
bedroom/rgb_00281.jpg bedroom/sync_depth_00281.png 518.8579
bedroom/rgb_00282.jpg bedroom/sync_depth_00282.png 518.8579
bedroom/rgb_00514.jpg bedroom/sync_depth_00514.png 518.8579
bedroom/rgb_00515.jpg bedroom/sync_depth_00515.png 518.8579
bedroom/rgb_00516.jpg bedroom/sync_depth_00516.png 518.8579
bedroom/rgb_00517.jpg bedroom/sync_depth_00517.png 518.8579
bedroom/rgb_00518.jpg bedroom/sync_depth_00518.png 518.8579
bedroom/rgb_00519.jpg bedroom/sync_depth_00519.png 518.8579
bedroom/rgb_00520.jpg bedroom/sync_depth_00520.png 518.8579
bedroom/rgb_00521.jpg bedroom/sync_depth_00521.png 518.8579
bedroom/rgb_00522.jpg bedroom/sync_depth_00522.png 518.8579
bedroom/rgb_00523.jpg bedroom/sync_depth_00523.png 518.8579
bedroom/rgb_00524.jpg bedroom/sync_depth_00524.png 518.8579
bedroom/rgb_00525.jpg bedroom/sync_depth_00525.png 518.8579
bedroom/rgb_00530.jpg bedroom/sync_depth_00530.png 518.8579
bedroom/rgb_00531.jpg bedroom/sync_depth_00531.png 518.8579
bedroom/rgb_00532.jpg bedroom/sync_depth_00532.png 518.8579
bedroom/rgb_00537.jpg bedroom/sync_depth_00537.png 518.8579
bedroom/rgb_00538.jpg bedroom/sync_depth_00538.png 518.8579
bedroom/rgb_00916.jpg bedroom/sync_depth_00916.png 518.8579
bedroom/rgb_00917.jpg bedroom/sync_depth_00917.png 518.8579
bedroom/rgb_00918.jpg bedroom/sync_depth_00918.png 518.8579
bedroom/rgb_00925.jpg bedroom/sync_depth_00925.png 518.8579
bedroom/rgb_00926.jpg bedroom/sync_depth_00926.png 518.8579
bedroom/rgb_00927.jpg bedroom/sync_depth_00927.png 518.8579
bedroom/rgb_00931.jpg bedroom/sync_depth_00931.png 518.8579
bedroom/rgb_00932.jpg bedroom/sync_depth_00932.png 518.8579
bedroom/rgb_00933.jpg bedroom/sync_depth_00933.png 518.8579
bedroom/rgb_00934.jpg bedroom/sync_depth_00934.png 518.8579
bedroom/rgb_00944.jpg bedroom/sync_depth_00944.png 518.8579
bedroom/rgb_00945.jpg bedroom/sync_depth_00945.png 518.8579
bedroom/rgb_00946.jpg bedroom/sync_depth_00946.png 518.8579
bedroom/rgb_00958.jpg bedroom/sync_depth_00958.png 518.8579
bedroom/rgb_00959.jpg bedroom/sync_depth_00959.png 518.8579
bedroom/rgb_00961.jpg bedroom/sync_depth_00961.png 518.8579
bedroom/rgb_00964.jpg bedroom/sync_depth_00964.png 518.8579
bedroom/rgb_00965.jpg bedroom/sync_depth_00965.png 518.8579
bedroom/rgb_00966.jpg bedroom/sync_depth_00966.png 518.8579
bedroom/rgb_00969.jpg bedroom/sync_depth_00969.png 518.8579
bedroom/rgb_00970.jpg bedroom/sync_depth_00970.png 518.8579
bedroom/rgb_00971.jpg bedroom/sync_depth_00971.png 518.8579
bedroom/rgb_00972.jpg bedroom/sync_depth_00972.png 518.8579
bedroom/rgb_00973.jpg bedroom/sync_depth_00973.png 518.8579
bedroom/rgb_00974.jpg bedroom/sync_depth_00974.png 518.8579
bedroom/rgb_00975.jpg bedroom/sync_depth_00975.png 518.8579
bedroom/rgb_00976.jpg bedroom/sync_depth_00976.png 518.8579
bedroom/rgb_00990.jpg bedroom/sync_depth_00990.png 518.8579
bedroom/rgb_00991.jpg bedroom/sync_depth_00991.png 518.8579
bedroom/rgb_00992.jpg bedroom/sync_depth_00992.png 518.8579
bedroom/rgb_00993.jpg bedroom/sync_depth_00993.png 518.8579
bedroom/rgb_00994.jpg bedroom/sync_depth_00994.png 518.8579
bedroom/rgb_01001.jpg bedroom/sync_depth_01001.png 518.8579
bedroom/rgb_01002.jpg bedroom/sync_depth_01002.png 518.8579
bedroom/rgb_01003.jpg bedroom/sync_depth_01003.png 518.8579
bedroom/rgb_01009.jpg bedroom/sync_depth_01009.png 518.8579
bedroom/rgb_01010.jpg bedroom/sync_depth_01010.png 518.8579
bedroom/rgb_01011.jpg bedroom/sync_depth_01011.png 518.8579
bedroom/rgb_01020.jpg bedroom/sync_depth_01020.png 518.8579
bedroom/rgb_01021.jpg bedroom/sync_depth_01021.png 518.8579
bedroom/rgb_01022.jpg bedroom/sync_depth_01022.png 518.8579
bedroom/rgb_01031.jpg bedroom/sync_depth_01031.png 518.8579
bedroom/rgb_01032.jpg bedroom/sync_depth_01032.png 518.8579
bedroom/rgb_01033.jpg bedroom/sync_depth_01033.png 518.8579
bedroom/rgb_01037.jpg bedroom/sync_depth_01037.png 518.8579
bedroom/rgb_01038.jpg bedroom/sync_depth_01038.png 518.8579
bedroom/rgb_01047.jpg bedroom/sync_depth_01047.png 518.8579
bedroom/rgb_01048.jpg bedroom/sync_depth_01048.png 518.8579
bedroom/rgb_01051.jpg bedroom/sync_depth_01051.png 518.8579
bedroom/rgb_01056.jpg bedroom/sync_depth_01056.png 518.8579
bedroom/rgb_01057.jpg bedroom/sync_depth_01057.png 518.8579
bedroom/rgb_01074.jpg bedroom/sync_depth_01074.png 518.8579
bedroom/rgb_01075.jpg bedroom/sync_depth_01075.png 518.8579
bedroom/rgb_01076.jpg bedroom/sync_depth_01076.png 518.8579
bedroom/rgb_01077.jpg bedroom/sync_depth_01077.png 518.8579
bedroom/rgb_01078.jpg bedroom/sync_depth_01078.png 518.8579
bedroom/rgb_01079.jpg bedroom/sync_depth_01079.png 518.8579
bedroom/rgb_01080.jpg bedroom/sync_depth_01080.png 518.8579
bedroom/rgb_01081.jpg bedroom/sync_depth_01081.png 518.8579
bedroom/rgb_01082.jpg bedroom/sync_depth_01082.png 518.8579
bedroom/rgb_01083.jpg bedroom/sync_depth_01083.png 518.8579
bedroom/rgb_01087.jpg bedroom/sync_depth_01087.png 518.8579
bedroom/rgb_01088.jpg bedroom/sync_depth_01088.png 518.8579
bedroom/rgb_01089.jpg bedroom/sync_depth_01089.png 518.8579
bedroom/rgb_01090.jpg bedroom/sync_depth_01090.png 518.8579
bedroom/rgb_01091.jpg bedroom/sync_depth_01091.png 518.8579
bedroom/rgb_01093.jpg bedroom/sync_depth_01093.png 518.8579
bedroom/rgb_01094.jpg bedroom/sync_depth_01094.png 518.8579
bedroom/rgb_01095.jpg bedroom/sync_depth_01095.png 518.8579
bedroom/rgb_01097.jpg bedroom/sync_depth_01097.png 518.8579
bedroom/rgb_01098.jpg bedroom/sync_depth_01098.png 518.8579
bedroom/rgb_01099.jpg bedroom/sync_depth_01099.png 518.8579
bedroom/rgb_01100.jpg bedroom/sync_depth_01100.png 518.8579
bedroom/rgb_01101.jpg bedroom/sync_depth_01101.png 518.8579
bedroom/rgb_01102.jpg bedroom/sync_depth_01102.png 518.8579
bedroom/rgb_01103.jpg bedroom/sync_depth_01103.png 518.8579
bedroom/rgb_01105.jpg bedroom/sync_depth_01105.png 518.8579
bedroom/rgb_01106.jpg bedroom/sync_depth_01106.png 518.8579
bedroom/rgb_01107.jpg bedroom/sync_depth_01107.png 518.8579
bedroom/rgb_01108.jpg bedroom/sync_depth_01108.png 518.8579
bedroom/rgb_01116.jpg bedroom/sync_depth_01116.png 518.8579
bedroom/rgb_01117.jpg bedroom/sync_depth_01117.png 518.8579
bedroom/rgb_01118.jpg bedroom/sync_depth_01118.png 518.8579
bedroom/rgb_01123.jpg bedroom/sync_depth_01123.png 518.8579
bedroom/rgb_01124.jpg bedroom/sync_depth_01124.png 518.8579
bedroom/rgb_01125.jpg bedroom/sync_depth_01125.png 518.8579
bedroom/rgb_01126.jpg bedroom/sync_depth_01126.png 518.8579
bedroom/rgb_01127.jpg bedroom/sync_depth_01127.png 518.8579
bedroom/rgb_01128.jpg bedroom/sync_depth_01128.png 518.8579
bedroom/rgb_01129.jpg bedroom/sync_depth_01129.png 518.8579
bedroom/rgb_01130.jpg bedroom/sync_depth_01130.png 518.8579
bedroom/rgb_01134.jpg bedroom/sync_depth_01134.png 518.8579
bedroom/rgb_01135.jpg bedroom/sync_depth_01135.png 518.8579
bedroom/rgb_01143.jpg bedroom/sync_depth_01143.png 518.8579
bedroom/rgb_01144.jpg bedroom/sync_depth_01144.png 518.8579
bedroom/rgb_01145.jpg bedroom/sync_depth_01145.png 518.8579
bedroom/rgb_01146.jpg bedroom/sync_depth_01146.png 518.8579
bedroom/rgb_01147.jpg bedroom/sync_depth_01147.png 518.8579
bedroom/rgb_01148.jpg bedroom/sync_depth_01148.png 518.8579
bedroom/rgb_01149.jpg bedroom/sync_depth_01149.png 518.8579
bedroom/rgb_01151.jpg bedroom/sync_depth_01151.png 518.8579
bedroom/rgb_01152.jpg bedroom/sync_depth_01152.png 518.8579
bedroom/rgb_01153.jpg bedroom/sync_depth_01153.png 518.8579
bedroom/rgb_01154.jpg bedroom/sync_depth_01154.png 518.8579
bedroom/rgb_01155.jpg bedroom/sync_depth_01155.png 518.8579
bedroom/rgb_01156.jpg bedroom/sync_depth_01156.png 518.8579
bedroom/rgb_01157.jpg bedroom/sync_depth_01157.png 518.8579
bedroom/rgb_01161.jpg bedroom/sync_depth_01161.png 518.8579
bedroom/rgb_01162.jpg bedroom/sync_depth_01162.png 518.8579
bedroom/rgb_01163.jpg bedroom/sync_depth_01163.png 518.8579
bedroom/rgb_01164.jpg bedroom/sync_depth_01164.png 518.8579
bedroom/rgb_01165.jpg bedroom/sync_depth_01165.png 518.8579
bedroom/rgb_01166.jpg bedroom/sync_depth_01166.png 518.8579
bedroom/rgb_01169.jpg bedroom/sync_depth_01169.png 518.8579
bedroom/rgb_01170.jpg bedroom/sync_depth_01170.png 518.8579
bedroom/rgb_01173.jpg bedroom/sync_depth_01173.png 518.8579
bedroom/rgb_01174.jpg bedroom/sync_depth_01174.png 518.8579
bedroom/rgb_01175.jpg bedroom/sync_depth_01175.png 518.8579
bedroom/rgb_01178.jpg bedroom/sync_depth_01178.png 518.8579
bedroom/rgb_01179.jpg bedroom/sync_depth_01179.png 518.8579
bedroom/rgb_01180.jpg bedroom/sync_depth_01180.png 518.8579
bedroom/rgb_01181.jpg bedroom/sync_depth_01181.png 518.8579
bedroom/rgb_01182.jpg bedroom/sync_depth_01182.png 518.8579
bedroom/rgb_01183.jpg bedroom/sync_depth_01183.png 518.8579
bedroom/rgb_01191.jpg bedroom/sync_depth_01191.png 518.8579
bedroom/rgb_01192.jpg bedroom/sync_depth_01192.png 518.8579
bedroom/rgb_01193.jpg bedroom/sync_depth_01193.png 518.8579
bedroom/rgb_01194.jpg bedroom/sync_depth_01194.png 518.8579
bedroom/rgb_01195.jpg bedroom/sync_depth_01195.png 518.8579
bookstore/rgb_00083.jpg bookstore/sync_depth_00083.png 518.8579
bookstore/rgb_00084.jpg bookstore/sync_depth_00084.png 518.8579
bookstore/rgb_00085.jpg bookstore/sync_depth_00085.png 518.8579
bookstore/rgb_00086.jpg bookstore/sync_depth_00086.png 518.8579
bookstore/rgb_00087.jpg bookstore/sync_depth_00087.png 518.8579
bookstore/rgb_00088.jpg bookstore/sync_depth_00088.png 518.8579
bookstore/rgb_00089.jpg bookstore/sync_depth_00089.png 518.8579
bookstore/rgb_00090.jpg bookstore/sync_depth_00090.png 518.8579
bookstore/rgb_00116.jpg bookstore/sync_depth_00116.png 518.8579
bookstore/rgb_00117.jpg bookstore/sync_depth_00117.png 518.8579
bookstore/rgb_00118.jpg bookstore/sync_depth_00118.png 518.8579
classroom/rgb_00283.jpg classroom/sync_depth_00283.png 518.8579
classroom/rgb_00284.jpg classroom/sync_depth_00284.png 518.8579
classroom/rgb_00295.jpg classroom/sync_depth_00295.png 518.8579
classroom/rgb_00296.jpg classroom/sync_depth_00296.png 518.8579
classroom/rgb_00297.jpg classroom/sync_depth_00297.png 518.8579
classroom/rgb_00298.jpg classroom/sync_depth_00298.png 518.8579
classroom/rgb_00299.jpg classroom/sync_depth_00299.png 518.8579
classroom/rgb_00300.jpg classroom/sync_depth_00300.png 518.8579
classroom/rgb_00301.jpg classroom/sync_depth_00301.png 518.8579
classroom/rgb_00309.jpg classroom/sync_depth_00309.png 518.8579
classroom/rgb_00310.jpg classroom/sync_depth_00310.png 518.8579
classroom/rgb_00311.jpg classroom/sync_depth_00311.png 518.8579
classroom/rgb_00314.jpg classroom/sync_depth_00314.png 518.8579
classroom/rgb_00315.jpg classroom/sync_depth_00315.png 518.8579
classroom/rgb_00316.jpg classroom/sync_depth_00316.png 518.8579
classroom/rgb_00324.jpg classroom/sync_depth_00324.png 518.8579
classroom/rgb_00325.jpg classroom/sync_depth_00325.png 518.8579
classroom/rgb_00326.jpg classroom/sync_depth_00326.png 518.8579
classroom/rgb_00327.jpg classroom/sync_depth_00327.png 518.8579
classroom/rgb_00328.jpg classroom/sync_depth_00328.png 518.8579
classroom/rgb_00329.jpg classroom/sync_depth_00329.png 518.8579
classroom/rgb_00330.jpg classroom/sync_depth_00330.png 518.8579
classroom/rgb_00331.jpg classroom/sync_depth_00331.png 518.8579
computer_lab/rgb_00332.jpg computer_lab/sync_depth_00332.png 518.8579
computer_lab/rgb_00333.jpg computer_lab/sync_depth_00333.png 518.8579
computer_lab/rgb_00334.jpg computer_lab/sync_depth_00334.png 518.8579
dining_room/rgb_00548.jpg dining_room/sync_depth_00548.png 518.8579
dining_room/rgb_00549.jpg dining_room/sync_depth_00549.png 518.8579
dining_room/rgb_00550.jpg dining_room/sync_depth_00550.png 518.8579
dining_room/rgb_01346.jpg dining_room/sync_depth_01346.png 518.8579
dining_room/rgb_01347.jpg dining_room/sync_depth_01347.png 518.8579
dining_room/rgb_01348.jpg dining_room/sync_depth_01348.png 518.8579
dining_room/rgb_01352.jpg dining_room/sync_depth_01352.png 518.8579
dining_room/rgb_01353.jpg dining_room/sync_depth_01353.png 518.8579
dining_room/rgb_01354.jpg dining_room/sync_depth_01354.png 518.8579
dining_room/rgb_01355.jpg dining_room/sync_depth_01355.png 518.8579
dining_room/rgb_01363.jpg dining_room/sync_depth_01363.png 518.8579
dining_room/rgb_01364.jpg dining_room/sync_depth_01364.png 518.8579
dining_room/rgb_01367.jpg dining_room/sync_depth_01367.png 518.8579
dining_room/rgb_01368.jpg dining_room/sync_depth_01368.png 518.8579
dining_room/rgb_01383.jpg dining_room/sync_depth_01383.png 518.8579
dining_room/rgb_01384.jpg dining_room/sync_depth_01384.png 518.8579
dining_room/rgb_01385.jpg dining_room/sync_depth_01385.png 518.8579
dining_room/rgb_01387.jpg dining_room/sync_depth_01387.png 518.8579
dining_room/rgb_01388.jpg dining_room/sync_depth_01388.png 518.8579
dining_room/rgb_01389.jpg dining_room/sync_depth_01389.png 518.8579
dining_room/rgb_01390.jpg dining_room/sync_depth_01390.png 518.8579
dining_room/rgb_01393.jpg dining_room/sync_depth_01393.png 518.8579
dining_room/rgb_01394.jpg dining_room/sync_depth_01394.png 518.8579
dining_room/rgb_01395.jpg dining_room/sync_depth_01395.png 518.8579
dining_room/rgb_01396.jpg dining_room/sync_depth_01396.png 518.8579
dining_room/rgb_01397.jpg dining_room/sync_depth_01397.png 518.8579
dining_room/rgb_01398.jpg dining_room/sync_depth_01398.png 518.8579
dining_room/rgb_01399.jpg dining_room/sync_depth_01399.png 518.8579
dining_room/rgb_01400.jpg dining_room/sync_depth_01400.png 518.8579
dining_room/rgb_01406.jpg dining_room/sync_depth_01406.png 518.8579
dining_room/rgb_01407.jpg dining_room/sync_depth_01407.png 518.8579
dining_room/rgb_01408.jpg dining_room/sync_depth_01408.png 518.8579
dining_room/rgb_01409.jpg dining_room/sync_depth_01409.png 518.8579
dining_room/rgb_01410.jpg dining_room/sync_depth_01410.png 518.8579
dining_room/rgb_01386.jpg dining_room/sync_depth_01386.png 518.8579
dining_room/rgb_01411.jpg dining_room/sync_depth_01411.png 518.8579
dining_room/rgb_01412.jpg dining_room/sync_depth_01412.png 518.8579
dining_room/rgb_01413.jpg dining_room/sync_depth_01413.png 518.8579
dining_room/rgb_01420.jpg dining_room/sync_depth_01420.png 518.8579
dining_room/rgb_01421.jpg dining_room/sync_depth_01421.png 518.8579
dining_room/rgb_01422.jpg dining_room/sync_depth_01422.png 518.8579
dining_room/rgb_01423.jpg dining_room/sync_depth_01423.png 518.8579
dining_room/rgb_01429.jpg dining_room/sync_depth_01429.png 518.8579
dining_room/rgb_01430.jpg dining_room/sync_depth_01430.png 518.8579
dining_room/rgb_01431.jpg dining_room/sync_depth_01431.png 518.8579
dining_room/rgb_01432.jpg dining_room/sync_depth_01432.png 518.8579
dining_room/rgb_01440.jpg dining_room/sync_depth_01440.png 518.8579
dining_room/rgb_01441.jpg dining_room/sync_depth_01441.png 518.8579
dining_room/rgb_01442.jpg dining_room/sync_depth_01442.png 518.8579
dining_room/rgb_01443.jpg dining_room/sync_depth_01443.png 518.8579
dining_room/rgb_01444.jpg dining_room/sync_depth_01444.png 518.8579
dining_room/rgb_01445.jpg dining_room/sync_depth_01445.png 518.8579
dining_room/rgb_01446.jpg dining_room/sync_depth_01446.png 518.8579
dining_room/rgb_01447.jpg dining_room/sync_depth_01447.png 518.8579
dining_room/rgb_01448.jpg dining_room/sync_depth_01448.png 518.8579
foyer/rgb_00350.jpg foyer/sync_depth_00350.png 518.8579
foyer/rgb_00351.jpg foyer/sync_depth_00351.png 518.8579
home_office/rgb_00354.jpg home_office/sync_depth_00354.png 518.8579
home_office/rgb_00355.jpg home_office/sync_depth_00355.png 518.8579
home_office/rgb_00356.jpg home_office/sync_depth_00356.png 518.8579
home_office/rgb_00357.jpg home_office/sync_depth_00357.png 518.8579
home_office/rgb_00358.jpg home_office/sync_depth_00358.png 518.8579
home_office/rgb_00359.jpg home_office/sync_depth_00359.png 518.8579
home_office/rgb_00360.jpg home_office/sync_depth_00360.png 518.8579
home_office/rgb_00361.jpg home_office/sync_depth_00361.png 518.8579
home_office/rgb_00362.jpg home_office/sync_depth_00362.png 518.8579
home_office/rgb_00363.jpg home_office/sync_depth_00363.png 518.8579
home_office/rgb_00383.jpg home_office/sync_depth_00383.png 518.8579
home_office/rgb_00384.jpg home_office/sync_depth_00384.png 518.8579
home_office/rgb_00385.jpg home_office/sync_depth_00385.png 518.8579
home_office/rgb_00386.jpg home_office/sync_depth_00386.png 518.8579
home_office/rgb_00387.jpg home_office/sync_depth_00387.png 518.8579
home_office/rgb_00388.jpg home_office/sync_depth_00388.png 518.8579
home_office/rgb_00389.jpg home_office/sync_depth_00389.png 518.8579
home_office/rgb_00394.jpg home_office/sync_depth_00394.png 518.8579
home_office/rgb_00395.jpg home_office/sync_depth_00395.png 518.8579
home_office/rgb_00396.jpg home_office/sync_depth_00396.png 518.8579
home_office/rgb_00554.jpg home_office/sync_depth_00554.png 518.8579
home_office/rgb_00555.jpg home_office/sync_depth_00555.png 518.8579
home_office/rgb_00556.jpg home_office/sync_depth_00556.png 518.8579
home_office/rgb_00557.jpg home_office/sync_depth_00557.png 518.8579
kitchen/rgb_00000.jpg kitchen/sync_depth_00000.png 518.8579
kitchen/rgb_00001.jpg kitchen/sync_depth_00001.png 518.8579
kitchen/rgb_00124.jpg kitchen/sync_depth_00124.png 518.8579
kitchen/rgb_00125.jpg kitchen/sync_depth_00125.png 518.8579
kitchen/rgb_00126.jpg kitchen/sync_depth_00126.png 518.8579
kitchen/rgb_00127.jpg kitchen/sync_depth_00127.png 518.8579
kitchen/rgb_00128.jpg kitchen/sync_depth_00128.png 518.8579
kitchen/rgb_00130.jpg kitchen/sync_depth_00130.png 518.8579
kitchen/rgb_00131.jpg kitchen/sync_depth_00131.png 518.8579
kitchen/rgb_00132.jpg kitchen/sync_depth_00132.png 518.8579
kitchen/rgb_00133.jpg kitchen/sync_depth_00133.png 518.8579
kitchen/rgb_00136.jpg kitchen/sync_depth_00136.png 518.8579
kitchen/rgb_00193.jpg kitchen/sync_depth_00193.png 518.8579
kitchen/rgb_00194.jpg kitchen/sync_depth_00194.png 518.8579
kitchen/rgb_00195.jpg kitchen/sync_depth_00195.png 518.8579
kitchen/rgb_00196.jpg kitchen/sync_depth_00196.png 518.8579
kitchen/rgb_00197.jpg kitchen/sync_depth_00197.png 518.8579
kitchen/rgb_00199.jpg kitchen/sync_depth_00199.png 518.8579
kitchen/rgb_00200.jpg kitchen/sync_depth_00200.png 518.8579
kitchen/rgb_00201.jpg kitchen/sync_depth_00201.png 518.8579
kitchen/rgb_00249.jpg kitchen/sync_depth_00249.png 518.8579
kitchen/rgb_00558.jpg kitchen/sync_depth_00558.png 518.8579
kitchen/rgb_00559.jpg kitchen/sync_depth_00559.png 518.8579
kitchen/rgb_00560.jpg kitchen/sync_depth_00560.png 518.8579
kitchen/rgb_00561.jpg kitchen/sync_depth_00561.png 518.8579
kitchen/rgb_00562.jpg kitchen/sync_depth_00562.png 518.8579
kitchen/rgb_00563.jpg kitchen/sync_depth_00563.png 518.8579
kitchen/rgb_00564.jpg kitchen/sync_depth_00564.png 518.8579
kitchen/rgb_00565.jpg kitchen/sync_depth_00565.png 518.8579
kitchen/rgb_00566.jpg kitchen/sync_depth_00566.png 518.8579
kitchen/rgb_00567.jpg kitchen/sync_depth_00567.png 518.8579
kitchen/rgb_00568.jpg kitchen/sync_depth_00568.png 518.8579
kitchen/rgb_00569.jpg kitchen/sync_depth_00569.png 518.8579
kitchen/rgb_00570.jpg kitchen/sync_depth_00570.png 518.8579
kitchen/rgb_00198.jpg kitchen/sync_depth_00198.png 518.8579
kitchen/rgb_00758.jpg kitchen/sync_depth_00758.png 518.8579
kitchen/rgb_00776.jpg kitchen/sync_depth_00776.png 518.8579
kitchen/rgb_00811.jpg kitchen/sync_depth_00811.png 518.8579
kitchen/rgb_00844.jpg kitchen/sync_depth_00844.png 518.8579
kitchen/rgb_00759.jpg kitchen/sync_depth_00759.png 518.8579
kitchen/rgb_00760.jpg kitchen/sync_depth_00760.png 518.8579
kitchen/rgb_00761.jpg kitchen/sync_depth_00761.png 518.8579
kitchen/rgb_00762.jpg kitchen/sync_depth_00762.png 518.8579
kitchen/rgb_00763.jpg kitchen/sync_depth_00763.png 518.8579
kitchen/rgb_00764.jpg kitchen/sync_depth_00764.png 518.8579
kitchen/rgb_00765.jpg kitchen/sync_depth_00765.png 518.8579
kitchen/rgb_00766.jpg kitchen/sync_depth_00766.png 518.8579
kitchen/rgb_00767.jpg kitchen/sync_depth_00767.png 518.8579
kitchen/rgb_00768.jpg kitchen/sync_depth_00768.png 518.8579
kitchen/rgb_00769.jpg kitchen/sync_depth_00769.png 518.8579
kitchen/rgb_00770.jpg kitchen/sync_depth_00770.png 518.8579
kitchen/rgb_00771.jpg kitchen/sync_depth_00771.png 518.8579
kitchen/rgb_00772.jpg kitchen/sync_depth_00772.png 518.8579
kitchen/rgb_00773.jpg kitchen/sync_depth_00773.png 518.8579
kitchen/rgb_00774.jpg kitchen/sync_depth_00774.png 518.8579
kitchen/rgb_00775.jpg kitchen/sync_depth_00775.png 518.8579
kitchen/rgb_00777.jpg kitchen/sync_depth_00777.png 518.8579
kitchen/rgb_00778.jpg kitchen/sync_depth_00778.png 518.8579
kitchen/rgb_00779.jpg kitchen/sync_depth_00779.png 518.8579
kitchen/rgb_00780.jpg kitchen/sync_depth_00780.png 518.8579
kitchen/rgb_00781.jpg kitchen/sync_depth_00781.png 518.8579
kitchen/rgb_00782.jpg kitchen/sync_depth_00782.png 518.8579
kitchen/rgb_00783.jpg kitchen/sync_depth_00783.png 518.8579
kitchen/rgb_00784.jpg kitchen/sync_depth_00784.png 518.8579
kitchen/rgb_00785.jpg kitchen/sync_depth_00785.png 518.8579
kitchen/rgb_00786.jpg kitchen/sync_depth_00786.png 518.8579
kitchen/rgb_00799.jpg kitchen/sync_depth_00799.png 518.8579
kitchen/rgb_00800.jpg kitchen/sync_depth_00800.png 518.8579
kitchen/rgb_00801.jpg kitchen/sync_depth_00801.png 518.8579
kitchen/rgb_00802.jpg kitchen/sync_depth_00802.png 518.8579
kitchen/rgb_00803.jpg kitchen/sync_depth_00803.png 518.8579
kitchen/rgb_00809.jpg kitchen/sync_depth_00809.png 518.8579
kitchen/rgb_00810.jpg kitchen/sync_depth_00810.png 518.8579
kitchen/rgb_00812.jpg kitchen/sync_depth_00812.png 518.8579
kitchen/rgb_00813.jpg kitchen/sync_depth_00813.png 518.8579
kitchen/rgb_00820.jpg kitchen/sync_depth_00820.png 518.8579
kitchen/rgb_00821.jpg kitchen/sync_depth_00821.png 518.8579
kitchen/rgb_00822.jpg kitchen/sync_depth_00822.png 518.8579
kitchen/rgb_00832.jpg kitchen/sync_depth_00832.png 518.8579
kitchen/rgb_00833.jpg kitchen/sync_depth_00833.png 518.8579
kitchen/rgb_00834.jpg kitchen/sync_depth_00834.png 518.8579
kitchen/rgb_00835.jpg kitchen/sync_depth_00835.png 518.8579
kitchen/rgb_00836.jpg kitchen/sync_depth_00836.png 518.8579
kitchen/rgb_00837.jpg kitchen/sync_depth_00837.png 518.8579
kitchen/rgb_00838.jpg kitchen/sync_depth_00838.png 518.8579
kitchen/rgb_00839.jpg kitchen/sync_depth_00839.png 518.8579
kitchen/rgb_00840.jpg kitchen/sync_depth_00840.png 518.8579
kitchen/rgb_00841.jpg kitchen/sync_depth_00841.png 518.8579
kitchen/rgb_00842.jpg kitchen/sync_depth_00842.png 518.8579
kitchen/rgb_00843.jpg kitchen/sync_depth_00843.png 518.8579
kitchen/rgb_00845.jpg kitchen/sync_depth_00845.png 518.8579
kitchen/rgb_00849.jpg kitchen/sync_depth_00849.png 518.8579
kitchen/rgb_00850.jpg kitchen/sync_depth_00850.png 518.8579
kitchen/rgb_00851.jpg kitchen/sync_depth_00851.png 518.8579
kitchen/rgb_00856.jpg kitchen/sync_depth_00856.png 518.8579
kitchen/rgb_00857.jpg kitchen/sync_depth_00857.png 518.8579
kitchen/rgb_00858.jpg kitchen/sync_depth_00858.png 518.8579
kitchen/rgb_00859.jpg kitchen/sync_depth_00859.png 518.8579
kitchen/rgb_00860.jpg kitchen/sync_depth_00860.png 518.8579
kitchen/rgb_00861.jpg kitchen/sync_depth_00861.png 518.8579
kitchen/rgb_00868.jpg kitchen/sync_depth_00868.png 518.8579
kitchen/rgb_00869.jpg kitchen/sync_depth_00869.png 518.8579
kitchen/rgb_00870.jpg kitchen/sync_depth_00870.png 518.8579
kitchen/rgb_00905.jpg kitchen/sync_depth_00905.png 518.8579
kitchen/rgb_00906.jpg kitchen/sync_depth_00906.png 518.8579
kitchen/rgb_00907.jpg kitchen/sync_depth_00907.png 518.8579
living_room/rgb_00152.jpg living_room/sync_depth_00152.png 518.8579
living_room/rgb_00153.jpg living_room/sync_depth_00153.png 518.8579
living_room/rgb_00154.jpg living_room/sync_depth_00154.png 518.8579
living_room/rgb_00166.jpg living_room/sync_depth_00166.png 518.8579
living_room/rgb_00167.jpg living_room/sync_depth_00167.png 518.8579
living_room/rgb_00168.jpg living_room/sync_depth_00168.png 518.8579
living_room/rgb_00206.jpg living_room/sync_depth_00206.png 518.8579
living_room/rgb_00207.jpg living_room/sync_depth_00207.png 518.8579
living_room/rgb_00208.jpg living_room/sync_depth_00208.png 518.8579
living_room/rgb_00209.jpg living_room/sync_depth_00209.png 518.8579
living_room/rgb_00210.jpg living_room/sync_depth_00210.png 518.8579
living_room/rgb_00211.jpg living_room/sync_depth_00211.png 518.8579
living_room/rgb_00263.jpg living_room/sync_depth_00263.png 518.8579
living_room/rgb_00578.jpg living_room/sync_depth_00578.png 518.8579
living_room/rgb_00579.jpg living_room/sync_depth_00579.png 518.8579
living_room/rgb_00580.jpg living_room/sync_depth_00580.png 518.8579
living_room/rgb_00581.jpg living_room/sync_depth_00581.png 518.8579
living_room/rgb_00590.jpg living_room/sync_depth_00590.png 518.8579
living_room/rgb_00591.jpg living_room/sync_depth_00591.png 518.8579
living_room/rgb_00592.jpg living_room/sync_depth_00592.png 518.8579
living_room/rgb_00593.jpg living_room/sync_depth_00593.png 518.8579
living_room/rgb_00602.jpg living_room/sync_depth_00602.png 518.8579
living_room/rgb_00603.jpg living_room/sync_depth_00603.png 518.8579
living_room/rgb_00604.jpg living_room/sync_depth_00604.png 518.8579
living_room/rgb_00605.jpg living_room/sync_depth_00605.png 518.8579
living_room/rgb_00606.jpg living_room/sync_depth_00606.png 518.8579
living_room/rgb_01200.jpg living_room/sync_depth_01200.png 518.8579
living_room/rgb_01201.jpg living_room/sync_depth_01201.png 518.8579
living_room/rgb_01202.jpg living_room/sync_depth_01202.png 518.8579
living_room/rgb_01203.jpg living_room/sync_depth_01203.png 518.8579
living_room/rgb_01204.jpg living_room/sync_depth_01204.png 518.8579
living_room/rgb_01205.jpg living_room/sync_depth_01205.png 518.8579
living_room/rgb_01206.jpg living_room/sync_depth_01206.png 518.8579
living_room/rgb_01207.jpg living_room/sync_depth_01207.png 518.8579
living_room/rgb_00582.jpg living_room/sync_depth_00582.png 518.8579
living_room/rgb_01208.jpg living_room/sync_depth_01208.png 518.8579
living_room/rgb_01247.jpg living_room/sync_depth_01247.png 518.8579
living_room/rgb_01277.jpg living_room/sync_depth_01277.png 518.8579
living_room/rgb_01302.jpg living_room/sync_depth_01302.png 518.8579
living_room/rgb_01209.jpg living_room/sync_depth_01209.png 518.8579
living_room/rgb_01210.jpg living_room/sync_depth_01210.png 518.8579
living_room/rgb_01211.jpg living_room/sync_depth_01211.png 518.8579
living_room/rgb_01215.jpg living_room/sync_depth_01215.png 518.8579
living_room/rgb_01216.jpg living_room/sync_depth_01216.png 518.8579
living_room/rgb_01217.jpg living_room/sync_depth_01217.png 518.8579
living_room/rgb_01218.jpg living_room/sync_depth_01218.png 518.8579
living_room/rgb_01219.jpg living_room/sync_depth_01219.png 518.8579
living_room/rgb_01225.jpg living_room/sync_depth_01225.png 518.8579
living_room/rgb_01226.jpg living_room/sync_depth_01226.png 518.8579
living_room/rgb_01227.jpg living_room/sync_depth_01227.png 518.8579
living_room/rgb_01228.jpg living_room/sync_depth_01228.png 518.8579
living_room/rgb_01229.jpg living_room/sync_depth_01229.png 518.8579
living_room/rgb_01232.jpg living_room/sync_depth_01232.png 518.8579
living_room/rgb_01233.jpg living_room/sync_depth_01233.png 518.8579
living_room/rgb_01234.jpg living_room/sync_depth_01234.png 518.8579
living_room/rgb_01246.jpg living_room/sync_depth_01246.png 518.8579
living_room/rgb_01248.jpg living_room/sync_depth_01248.png 518.8579
living_room/rgb_01249.jpg living_room/sync_depth_01249.png 518.8579
living_room/rgb_01253.jpg living_room/sync_depth_01253.png 518.8579
living_room/rgb_01254.jpg living_room/sync_depth_01254.png 518.8579
living_room/rgb_01255.jpg living_room/sync_depth_01255.png 518.8579
living_room/rgb_01256.jpg living_room/sync_depth_01256.png 518.8579
living_room/rgb_01257.jpg living_room/sync_depth_01257.png 518.8579
living_room/rgb_01258.jpg living_room/sync_depth_01258.png 518.8579
living_room/rgb_01259.jpg living_room/sync_depth_01259.png 518.8579
living_room/rgb_01260.jpg living_room/sync_depth_01260.png 518.8579
living_room/rgb_01261.jpg living_room/sync_depth_01261.png 518.8579
living_room/rgb_01262.jpg living_room/sync_depth_01262.png 518.8579
living_room/rgb_01263.jpg living_room/sync_depth_01263.png 518.8579
living_room/rgb_01264.jpg living_room/sync_depth_01264.png 518.8579
living_room/rgb_01274.jpg living_room/sync_depth_01274.png 518.8579
living_room/rgb_01275.jpg living_room/sync_depth_01275.png 518.8579
living_room/rgb_01276.jpg living_room/sync_depth_01276.png 518.8579
living_room/rgb_01278.jpg living_room/sync_depth_01278.png 518.8579
living_room/rgb_01279.jpg living_room/sync_depth_01279.png 518.8579
living_room/rgb_01284.jpg living_room/sync_depth_01284.png 518.8579
living_room/rgb_01285.jpg living_room/sync_depth_01285.png 518.8579
living_room/rgb_01286.jpg living_room/sync_depth_01286.png 518.8579
living_room/rgb_01287.jpg living_room/sync_depth_01287.png 518.8579
living_room/rgb_01288.jpg living_room/sync_depth_01288.png 518.8579
living_room/rgb_01289.jpg living_room/sync_depth_01289.png 518.8579
living_room/rgb_01290.jpg living_room/sync_depth_01290.png 518.8579
living_room/rgb_01291.jpg living_room/sync_depth_01291.png 518.8579
living_room/rgb_01292.jpg living_room/sync_depth_01292.png 518.8579
living_room/rgb_01293.jpg living_room/sync_depth_01293.png 518.8579
living_room/rgb_01294.jpg living_room/sync_depth_01294.png 518.8579
living_room/rgb_01296.jpg living_room/sync_depth_01296.png 518.8579
living_room/rgb_01297.jpg living_room/sync_depth_01297.png 518.8579
living_room/rgb_01298.jpg living_room/sync_depth_01298.png 518.8579
living_room/rgb_01301.jpg living_room/sync_depth_01301.png 518.8579
living_room/rgb_01303.jpg living_room/sync_depth_01303.png 518.8579
living_room/rgb_01304.jpg living_room/sync_depth_01304.png 518.8579
living_room/rgb_01305.jpg living_room/sync_depth_01305.png 518.8579
living_room/rgb_01306.jpg living_room/sync_depth_01306.png 518.8579
living_room/rgb_01307.jpg living_room/sync_depth_01307.png 518.8579
living_room/rgb_01313.jpg living_room/sync_depth_01313.png 518.8579
living_room/rgb_01314.jpg living_room/sync_depth_01314.png 518.8579
living_room/rgb_01328.jpg living_room/sync_depth_01328.png 518.8579
living_room/rgb_01329.jpg living_room/sync_depth_01329.png 518.8579
living_room/rgb_01330.jpg living_room/sync_depth_01330.png 518.8579
living_room/rgb_01331.jpg living_room/sync_depth_01331.png 518.8579
living_room/rgb_01334.jpg living_room/sync_depth_01334.png 518.8579
living_room/rgb_01335.jpg living_room/sync_depth_01335.png 518.8579
living_room/rgb_01336.jpg living_room/sync_depth_01336.png 518.8579
living_room/rgb_01337.jpg living_room/sync_depth_01337.png 518.8579
living_room/rgb_01338.jpg living_room/sync_depth_01338.png 518.8579
living_room/rgb_01339.jpg living_room/sync_depth_01339.png 518.8579
office/rgb_00008.jpg office/sync_depth_00008.png 518.8579
office/rgb_00013.jpg office/sync_depth_00013.png 518.8579
office/rgb_00014.jpg office/sync_depth_00014.png 518.8579
office/rgb_00015.jpg office/sync_depth_00015.png 518.8579
office/rgb_00016.jpg office/sync_depth_00016.png 518.8579
office/rgb_00017.jpg office/sync_depth_00017.png 518.8579
office/rgb_00020.jpg office/sync_depth_00020.png 518.8579
office/rgb_00027.jpg office/sync_depth_00027.png 518.8579
office/rgb_00028.jpg office/sync_depth_00028.png 518.8579
office/rgb_00029.jpg office/sync_depth_00029.png 518.8579
office/rgb_00030.jpg office/sync_depth_00030.png 518.8579
office/rgb_00031.jpg office/sync_depth_00031.png 518.8579
office/rgb_00032.jpg office/sync_depth_00032.png 518.8579
office/rgb_00033.jpg office/sync_depth_00033.png 518.8579
office/rgb_00034.jpg office/sync_depth_00034.png 518.8579
office/rgb_00035.jpg office/sync_depth_00035.png 518.8579
office/rgb_00036.jpg office/sync_depth_00036.png 518.8579
office/rgb_00038.jpg office/sync_depth_00038.png 518.8579
office/rgb_00039.jpg office/sync_depth_00039.png 518.8579
office/rgb_00040.jpg office/sync_depth_00040.png 518.8579
office/rgb_00041.jpg office/sync_depth_00041.png 518.8579
office/rgb_00042.jpg office/sync_depth_00042.png 518.8579
office/rgb_00270.jpg office/sync_depth_00270.png 518.8579
office/rgb_00271.jpg office/sync_depth_00271.png 518.8579
office/rgb_00611.jpg office/sync_depth_00611.png 518.8579
office/rgb_00612.jpg office/sync_depth_00612.png 518.8579
office/rgb_00616.jpg office/sync_depth_00616.png 518.8579
office/rgb_00617.jpg office/sync_depth_00617.png 518.8579
office/rgb_00618.jpg office/sync_depth_00618.png 518.8579
office/rgb_00619.jpg office/sync_depth_00619.png 518.8579
office/rgb_00620.jpg office/sync_depth_00620.png 518.8579
office/rgb_00632.jpg office/sync_depth_00632.png 518.8579
office/rgb_00633.jpg office/sync_depth_00633.png 518.8579
office/rgb_00634.jpg office/sync_depth_00634.png 518.8579
office/rgb_00635.jpg office/sync_depth_00635.png 518.8579
office/rgb_00636.jpg office/sync_depth_00636.png 518.8579
office/rgb_00637.jpg office/sync_depth_00637.png 518.8579
office/rgb_00037.jpg office/sync_depth_00037.png 518.8579
office_kitchen/rgb_00410.jpg office_kitchen/sync_depth_00410.png 518.8579
office_kitchen/rgb_00411.jpg office_kitchen/sync_depth_00411.png 518.8579
office_kitchen/rgb_00412.jpg office_kitchen/sync_depth_00412.png 518.8579
office_kitchen/rgb_00413.jpg office_kitchen/sync_depth_00413.png 518.8579
playroom/rgb_00429.jpg playroom/sync_depth_00429.png 518.8579
playroom/rgb_00430.jpg playroom/sync_depth_00430.png 518.8579
playroom/rgb_00431.jpg playroom/sync_depth_00431.png 518.8579
playroom/rgb_00432.jpg playroom/sync_depth_00432.png 518.8579
playroom/rgb_00433.jpg playroom/sync_depth_00433.png 518.8579
playroom/rgb_00434.jpg playroom/sync_depth_00434.png 518.8579
playroom/rgb_00440.jpg playroom/sync_depth_00440.png 518.8579
playroom/rgb_00441.jpg playroom/sync_depth_00441.png 518.8579
playroom/rgb_00442.jpg playroom/sync_depth_00442.png 518.8579
playroom/rgb_00443.jpg playroom/sync_depth_00443.png 518.8579
playroom/rgb_00444.jpg playroom/sync_depth_00444.png 518.8579
playroom/rgb_00445.jpg playroom/sync_depth_00445.png 518.8579
playroom/rgb_00446.jpg playroom/sync_depth_00446.png 518.8579
playroom/rgb_00447.jpg playroom/sync_depth_00447.png 518.8579
reception_room/rgb_00461.jpg reception_room/sync_depth_00461.png 518.8579
reception_room/rgb_00462.jpg reception_room/sync_depth_00462.png 518.8579
reception_room/rgb_00463.jpg reception_room/sync_depth_00463.png 518.8579
reception_room/rgb_00464.jpg reception_room/sync_depth_00464.png 518.8579
reception_room/rgb_00465.jpg reception_room/sync_depth_00465.png 518.8579
study/rgb_00468.jpg study/sync_depth_00468.png 518.8579
study/rgb_00469.jpg study/sync_depth_00469.png 518.8579
study/rgb_00470.jpg study/sync_depth_00470.png 518.8579
study/rgb_00471.jpg study/sync_depth_00471.png 518.8579
study/rgb_00472.jpg study/sync_depth_00472.png 518.8579
study/rgb_00473.jpg study/sync_depth_00473.png 518.8579
study/rgb_00474.jpg study/sync_depth_00474.png 518.8579
study/rgb_00475.jpg study/sync_depth_00475.png 518.8579
study/rgb_00476.jpg study/sync_depth_00476.png 518.8579
study/rgb_00643.jpg study/sync_depth_00643.png 518.8579
study/rgb_00644.jpg study/sync_depth_00644.png 518.8579
study_room/rgb_00272.jpg study_room/sync_depth_00272.png 518.8579
study_room/rgb_00278.jpg study_room/sync_depth_00278.png 518.8579
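Each line of the split file above pairs an RGB image with its synchronized depth map and a focal length (518.8579 px for these NYUv2 frames). A minimal parsing sketch — the function name and returned tuple layout are illustrative assumptions, not project API:

```python
def parse_split_file(path):
    """Parse a whitespace-separated split file: <rgb_path> <depth_path> <focal>."""
    samples = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or malformed lines
            rgb_path, depth_path, focal = parts
            samples.append((rgb_path, depth_path, float(focal)))
    return samples
```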


@@ -0,0 +1,24 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat


@@ -0,0 +1,573 @@
# This file is partly inspired from BTS (https://github.com/cleinc/bts/blob/master/pytorch/bts_dataloader.py); author: Jin Han Lee
import itertools
import os
import random
import numpy as np
import cv2
import torch
import torch.nn as nn
import torch.utils.data.distributed
from zoedepth.utils.easydict import EasyDict as edict
from PIL import Image, ImageOps
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
from zoedepth.utils.config import change_dataset
from .ddad import get_ddad_loader
from .diml_indoor_test import get_diml_indoor_loader
from .diml_outdoor_test import get_diml_outdoor_loader
from .diode import get_diode_loader
from .hypersim import get_hypersim_loader
from .ibims import get_ibims_loader
from .sun_rgbd_loader import get_sunrgbd_loader
from .vkitti import get_vkitti_loader
from .vkitti2 import get_vkitti2_loader
from .preprocess import CropParams, get_white_border, get_black_border
def _is_pil_image(img):
return isinstance(img, Image.Image)
def _is_numpy_image(img):
return isinstance(img, np.ndarray) and (img.ndim in {2, 3})
def preprocessing_transforms(mode, **kwargs):
return transforms.Compose([
ToTensor(mode=mode, **kwargs)
])
class DepthDataLoader(object):
def __init__(self, config, mode, device='cpu', transform=None, **kwargs):
"""
Data loader for depth datasets
Args:
config (dict): Config dictionary. Refer to utils/config.py
mode (str): "train" or "online_eval"
device (str, optional): Device to load the data on. Defaults to 'cpu'.
transform (torchvision.transforms, optional): Transform to apply to the data. Defaults to None.
"""
self.config = config
if config.dataset == 'ibims':
self.data = get_ibims_loader(config, batch_size=1, num_workers=1)
return
if config.dataset == 'sunrgbd':
self.data = get_sunrgbd_loader(
data_dir_root=config.sunrgbd_root, batch_size=1, num_workers=1)
return
if config.dataset == 'diml_indoor':
self.data = get_diml_indoor_loader(
data_dir_root=config.diml_indoor_root, batch_size=1, num_workers=1)
return
if config.dataset == 'diml_outdoor':
self.data = get_diml_outdoor_loader(
data_dir_root=config.diml_outdoor_root, batch_size=1, num_workers=1)
return
if "diode" in config.dataset:
self.data = get_diode_loader(
config[config.dataset+"_root"], batch_size=1, num_workers=1)
return
if config.dataset == 'hypersim_test':
self.data = get_hypersim_loader(
config.hypersim_test_root, batch_size=1, num_workers=1)
return
if config.dataset == 'vkitti':
self.data = get_vkitti_loader(
config.vkitti_root, batch_size=1, num_workers=1)
return
if config.dataset == 'vkitti2':
self.data = get_vkitti2_loader(
config.vkitti2_root, batch_size=1, num_workers=1)
return
if config.dataset == 'ddad':
self.data = get_ddad_loader(config.ddad_root, resize_shape=(
352, 1216), batch_size=1, num_workers=1)
return
img_size = self.config.get("img_size", None)
img_size = img_size if self.config.get(
"do_input_resize", False) else None
if transform is None:
transform = preprocessing_transforms(mode, size=img_size)
if mode == 'train':
Dataset = DataLoadPreprocess
self.training_samples = Dataset(
config, mode, transform=transform, device=device)
if config.distributed:
self.train_sampler = torch.utils.data.distributed.DistributedSampler(
self.training_samples)
else:
self.train_sampler = None
self.data = DataLoader(self.training_samples,
batch_size=config.batch_size,
shuffle=(self.train_sampler is None),
num_workers=config.workers,
pin_memory=True,
persistent_workers=True,
# prefetch_factor=2,
sampler=self.train_sampler)
elif mode == 'online_eval':
self.testing_samples = DataLoadPreprocess(
config, mode, transform=transform)
            if config.distributed:  # redundant; kept only to be explicit
                # Give the whole test set to every process (evaluation is reported from only one)
                self.eval_sampler = None
else:
self.eval_sampler = None
self.data = DataLoader(self.testing_samples, 1,
shuffle=kwargs.get("shuffle_test", False),
num_workers=1,
pin_memory=False,
sampler=self.eval_sampler)
elif mode == 'test':
self.testing_samples = DataLoadPreprocess(
config, mode, transform=transform)
self.data = DataLoader(self.testing_samples,
1, shuffle=False, num_workers=1)
        else:
            raise ValueError(
                "mode should be one of 'train', 'test', 'online_eval'. Got {}".format(mode))
def repetitive_roundrobin(*iterables):
"""
cycles through iterables but sample wise
first yield first sample from first iterable then first sample from second iterable and so on
then second sample from first iterable then second sample from second iterable and so on
If one iterable is shorter than the others, it is repeated until all iterables are exhausted
repetitive_roundrobin('ABC', 'D', 'EF') --> A D E B D F C D E
"""
# Repetitive roundrobin
iterables_ = [iter(it) for it in iterables]
exhausted = [False] * len(iterables)
while not all(exhausted):
for i, it in enumerate(iterables_):
try:
yield next(it)
except StopIteration:
exhausted[i] = True
iterables_[i] = itertools.cycle(iterables[i])
# First elements may get repeated if one iterable is shorter than the others
yield next(iterables_[i])
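A quick sanity check of the interleaving order (a self-contained sketch mirroring the generator above; note the trailing repeats once the shorter iterables are cycled):

```python
import itertools

def repetitive_roundrobin(*iterables):
    # Yield one element from each iterable per pass; when an iterable runs
    # out, restart it with itertools.cycle and keep going until every
    # iterable has been exhausted at least once.
    iterables_ = [iter(it) for it in iterables]
    exhausted = [False] * len(iterables)
    while not all(exhausted):
        for i, it in enumerate(iterables_):
            try:
                yield next(it)
            except StopIteration:
                exhausted[i] = True
                iterables_[i] = itertools.cycle(iterables[i])
                yield next(iterables_[i])

print(''.join(repetitive_roundrobin('ABC', 'D', 'EF')))  # ADEBDFCDEADF
```

The total length is `len(iterables) * (max_len + 1)`, which is exactly what `RepetitiveRoundRobinDataLoader.__len__` reports.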
class RepetitiveRoundRobinDataLoader(object):
def __init__(self, *dataloaders):
self.dataloaders = dataloaders
def __iter__(self):
return repetitive_roundrobin(*self.dataloaders)
    def __len__(self):
        # The first samples of shorter loaders get repeated; that's why the plus one
        return len(self.dataloaders) * (max(len(dl) for dl in self.dataloaders) + 1)
class MixedNYUKITTI(object):
def __init__(self, config, mode, device='cpu', **kwargs):
config = edict(config)
config.workers = config.workers // 2
self.config = config
nyu_conf = change_dataset(edict(config), 'nyu')
kitti_conf = change_dataset(edict(config), 'kitti')
# make nyu default for testing
self.config = config = nyu_conf
img_size = self.config.get("img_size", None)
img_size = img_size if self.config.get(
"do_input_resize", False) else None
if mode == 'train':
nyu_loader = DepthDataLoader(
nyu_conf, mode, device=device, transform=preprocessing_transforms(mode, size=img_size)).data
kitti_loader = DepthDataLoader(
kitti_conf, mode, device=device, transform=preprocessing_transforms(mode, size=img_size)).data
# It has been changed to repetitive roundrobin
self.data = RepetitiveRoundRobinDataLoader(
nyu_loader, kitti_loader)
else:
self.data = DepthDataLoader(nyu_conf, mode, device=device).data
def remove_leading_slash(s):
if s[0] == '/' or s[0] == '\\':
return s[1:]
return s
class CachedReader:
def __init__(self, shared_dict=None):
if shared_dict:
self._cache = shared_dict
else:
self._cache = {}
def open(self, fpath):
im = self._cache.get(fpath, None)
if im is None:
im = self._cache[fpath] = Image.open(fpath)
return im
class ImReader:
def __init__(self):
pass
# @cache
def open(self, fpath):
return Image.open(fpath)
class DataLoadPreprocess(Dataset):
def __init__(self, config, mode, transform=None, is_for_online_eval=False, **kwargs):
self.config = config
if mode == 'online_eval':
with open(config.filenames_file_eval, 'r') as f:
self.filenames = f.readlines()
else:
with open(config.filenames_file, 'r') as f:
self.filenames = f.readlines()
self.mode = mode
self.transform = transform
self.to_tensor = ToTensor(mode)
self.is_for_online_eval = is_for_online_eval
if config.use_shared_dict:
self.reader = CachedReader(config.shared_dict)
else:
self.reader = ImReader()
def postprocess(self, sample):
return sample
def __getitem__(self, idx):
sample_path = self.filenames[idx]
focal = float(sample_path.split()[2])
sample = {}
if self.mode == 'train':
if self.config.dataset == 'kitti' and self.config.use_right and random.random() > 0.5:
image_path = os.path.join(
self.config.data_path, remove_leading_slash(sample_path.split()[3]))
depth_path = os.path.join(
self.config.gt_path, remove_leading_slash(sample_path.split()[4]))
else:
image_path = os.path.join(
self.config.data_path, remove_leading_slash(sample_path.split()[0]))
depth_path = os.path.join(
self.config.gt_path, remove_leading_slash(sample_path.split()[1]))
image = self.reader.open(image_path)
depth_gt = self.reader.open(depth_path)
w, h = image.size
if self.config.do_kb_crop:
height = image.height
width = image.width
top_margin = int(height - 352)
left_margin = int((width - 1216) / 2)
depth_gt = depth_gt.crop(
(left_margin, top_margin, left_margin + 1216, top_margin + 352))
image = image.crop(
(left_margin, top_margin, left_margin + 1216, top_margin + 352))
# Avoid blank boundaries due to pixel registration?
# Train images have white border. Test images have black border.
if self.config.dataset == 'nyu' and self.config.avoid_boundary:
# print("Avoiding Blank Boundaries!")
# We just crop and pad again with reflect padding to original size
# original_size = image.size
crop_params = get_white_border(np.array(image, dtype=np.uint8))
image = image.crop((crop_params.left, crop_params.top, crop_params.right, crop_params.bottom))
depth_gt = depth_gt.crop((crop_params.left, crop_params.top, crop_params.right, crop_params.bottom))
# Use reflect padding to fill the blank
image = np.array(image)
image = np.pad(image, ((crop_params.top, h - crop_params.bottom), (crop_params.left, w - crop_params.right), (0, 0)), mode='reflect')
image = Image.fromarray(image)
depth_gt = np.array(depth_gt)
depth_gt = np.pad(depth_gt, ((crop_params.top, h - crop_params.bottom), (crop_params.left, w - crop_params.right)), 'constant', constant_values=0)
depth_gt = Image.fromarray(depth_gt)
if self.config.do_random_rotate and (self.config.aug):
random_angle = (random.random() - 0.5) * 2 * self.config.degree
image = self.rotate_image(image, random_angle)
depth_gt = self.rotate_image(
depth_gt, random_angle, flag=Image.NEAREST)
image = np.asarray(image, dtype=np.float32) / 255.0
depth_gt = np.asarray(depth_gt, dtype=np.float32)
depth_gt = np.expand_dims(depth_gt, axis=2)
if self.config.dataset == 'nyu':
depth_gt = depth_gt / 1000.0
else:
depth_gt = depth_gt / 256.0
if self.config.aug and (self.config.random_crop):
image, depth_gt = self.random_crop(
image, depth_gt, self.config.input_height, self.config.input_width)
if self.config.aug and self.config.random_translate:
# print("Random Translation!")
image, depth_gt = self.random_translate(image, depth_gt, self.config.max_translation)
image, depth_gt = self.train_preprocess(image, depth_gt)
mask = np.logical_and(depth_gt > self.config.min_depth,
depth_gt < self.config.max_depth).squeeze()[None, ...]
sample = {'image': image, 'depth': depth_gt, 'focal': focal,
'mask': mask, **sample}
else:
if self.mode == 'online_eval':
data_path = self.config.data_path_eval
else:
data_path = self.config.data_path
image_path = os.path.join(
data_path, remove_leading_slash(sample_path.split()[0]))
image = np.asarray(self.reader.open(image_path),
dtype=np.float32) / 255.0
if self.mode == 'online_eval':
gt_path = self.config.gt_path_eval
depth_path = os.path.join(
gt_path, remove_leading_slash(sample_path.split()[1]))
has_valid_depth = False
try:
depth_gt = self.reader.open(depth_path)
has_valid_depth = True
except IOError:
depth_gt = False
# print('Missing gt for {}'.format(image_path))
if has_valid_depth:
depth_gt = np.asarray(depth_gt, dtype=np.float32)
depth_gt = np.expand_dims(depth_gt, axis=2)
if self.config.dataset == 'nyu':
depth_gt = depth_gt / 1000.0
else:
depth_gt = depth_gt / 256.0
mask = np.logical_and(
depth_gt >= self.config.min_depth, depth_gt <= self.config.max_depth).squeeze()[None, ...]
else:
mask = False
if self.config.do_kb_crop:
height = image.shape[0]
width = image.shape[1]
top_margin = int(height - 352)
left_margin = int((width - 1216) / 2)
image = image[top_margin:top_margin + 352,
left_margin:left_margin + 1216, :]
if self.mode == 'online_eval' and has_valid_depth:
depth_gt = depth_gt[top_margin:top_margin +
352, left_margin:left_margin + 1216, :]
if self.mode == 'online_eval':
sample = {'image': image, 'depth': depth_gt, 'focal': focal, 'has_valid_depth': has_valid_depth,
'image_path': sample_path.split()[0], 'depth_path': sample_path.split()[1],
'mask': mask}
else:
sample = {'image': image, 'focal': focal}
if (self.mode == 'train') or ('has_valid_depth' in sample and sample['has_valid_depth']):
mask = np.logical_and(depth_gt > self.config.min_depth,
depth_gt < self.config.max_depth).squeeze()[None, ...]
sample['mask'] = mask
if self.transform:
sample = self.transform(sample)
sample = self.postprocess(sample)
sample['dataset'] = self.config.dataset
sample = {**sample, 'image_path': sample_path.split()[0], 'depth_path': sample_path.split()[1]}
return sample
def rotate_image(self, image, angle, flag=Image.BILINEAR):
result = image.rotate(angle, resample=flag)
return result
def random_crop(self, img, depth, height, width):
assert img.shape[0] >= height
assert img.shape[1] >= width
assert img.shape[0] == depth.shape[0]
assert img.shape[1] == depth.shape[1]
x = random.randint(0, img.shape[1] - width)
y = random.randint(0, img.shape[0] - height)
img = img[y:y + height, x:x + width, :]
depth = depth[y:y + height, x:x + width, :]
return img, depth
def random_translate(self, img, depth, max_t=20):
assert img.shape[0] == depth.shape[0]
assert img.shape[1] == depth.shape[1]
p = self.config.translate_prob
do_translate = random.random()
if do_translate > p:
return img, depth
x = random.randint(-max_t, max_t)
y = random.randint(-max_t, max_t)
M = np.float32([[1, 0, x], [0, 1, y]])
# print(img.shape, depth.shape)
img = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
depth = cv2.warpAffine(depth, M, (depth.shape[1], depth.shape[0]))
depth = depth.squeeze()[..., None] # add channel dim back. Affine warp removes it
# print("after", img.shape, depth.shape)
return img, depth
def train_preprocess(self, image, depth_gt):
if self.config.aug:
# Random flipping
do_flip = random.random()
if do_flip > 0.5:
image = (image[:, ::-1, :]).copy()
depth_gt = (depth_gt[:, ::-1, :]).copy()
# Random gamma, brightness, color augmentation
do_augment = random.random()
if do_augment > 0.5:
image = self.augment_image(image)
return image, depth_gt
def augment_image(self, image):
# gamma augmentation
gamma = random.uniform(0.9, 1.1)
image_aug = image ** gamma
# brightness augmentation
if self.config.dataset == 'nyu':
brightness = random.uniform(0.75, 1.25)
else:
brightness = random.uniform(0.9, 1.1)
image_aug = image_aug * brightness
# color augmentation
colors = np.random.uniform(0.9, 1.1, size=3)
white = np.ones((image.shape[0], image.shape[1]))
color_image = np.stack([white * colors[i] for i in range(3)], axis=2)
image_aug *= color_image
image_aug = np.clip(image_aug, 0, 1)
return image_aug
def __len__(self):
return len(self.filenames)
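The `random_crop` helper above keeps image and depth aligned by sampling a single offset for both arrays; a minimal standalone sketch (shapes are illustrative):

```python
import random
import numpy as np

def random_crop(img, depth, height, width):
    # Sample one (x, y) offset and apply it to both arrays so the
    # image/depth pair stays pixel-aligned after cropping.
    assert img.shape[0] >= height and img.shape[1] >= width
    assert img.shape[:2] == depth.shape[:2]
    x = random.randint(0, img.shape[1] - width)
    y = random.randint(0, img.shape[0] - height)
    return img[y:y + height, x:x + width, :], depth[y:y + height, x:x + width, :]

img = np.zeros((480, 640, 3), dtype=np.float32)
depth = np.zeros((480, 640, 1), dtype=np.float32)
crop_img, crop_depth = random_crop(img, depth, 416, 544)
print(crop_img.shape, crop_depth.shape)  # (416, 544, 3) (416, 544, 1)
```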
class ToTensor(object):
def __init__(self, mode, do_normalize=False, size=None):
self.mode = mode
self.normalize = transforms.Normalize(
mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) if do_normalize else nn.Identity()
self.size = size
if size is not None:
self.resize = transforms.Resize(size=size)
else:
self.resize = nn.Identity()
def __call__(self, sample):
image, focal = sample['image'], sample['focal']
image = self.to_tensor(image)
image = self.normalize(image)
image = self.resize(image)
if self.mode == 'test':
return {'image': image, 'focal': focal}
depth = sample['depth']
if self.mode == 'train':
depth = self.to_tensor(depth)
return {**sample, 'image': image, 'depth': depth, 'focal': focal}
        else:
            has_valid_depth = sample['has_valid_depth']
            return {**sample, 'image': image, 'depth': depth, 'focal': focal, 'has_valid_depth': has_valid_depth,
                    'image_path': sample['image_path'], 'depth_path': sample['depth_path']}
def to_tensor(self, pic):
if not (_is_pil_image(pic) or _is_numpy_image(pic)):
raise TypeError(
'pic should be PIL Image or ndarray. Got {}'.format(type(pic)))
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
# handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img

# ===== DDAD evaluation dataset loader =====
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
class ToTensor(object):
def __init__(self, resize_shape):
# self.normalize = transforms.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x : x
self.resize = transforms.Resize(resize_shape)
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
image = self.resize(image)
return {'image': image, 'depth': depth, 'dataset': "ddad"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
# # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class DDAD(Dataset):
def __init__(self, data_dir_root, resize_shape):
import glob
# image paths are of the form <data_dir_root>/{outleft, depthmap}/*.png
# self.image_files = glob.glob(os.path.join(data_dir_root, '*.png'))
# self.depth_files = [r.replace("_rgb.png", "_depth.npy")
# for r in self.image_files]
self.image_files, self.depth_files = [], []
with open('/mnt/bn/liheyang/MTL-SA-1B/dataset/splits/ddad/val.txt', 'r') as f:
lines = f.read().splitlines()
for line in lines:
self.image_files.append(line.split(' ')[0])
self.depth_files.append(line.split(' ')[1])
self.transform = ToTensor(resize_shape)
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
depth = np.load(depth_path) # meters
# depth[depth > 8] = -1
depth = depth[..., None]
sample = dict(image=image, depth=depth)
sample = self.transform(sample)
if idx == 0:
print(sample["image"].shape)
return sample
def __len__(self):
return len(self.image_files)
def get_ddad_loader(data_dir_root, resize_shape, batch_size=1, **kwargs):
dataset = DDAD(data_dir_root, resize_shape)
return DataLoader(dataset, batch_size, **kwargs)

# ===== DIML indoor evaluation dataset loader =====
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
class ToTensor(object):
def __init__(self):
# self.normalize = transforms.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x : x
self.resize = transforms.Resize((480, 640))
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
image = self.resize(image)
return {'image': image, 'depth': depth, 'dataset': "diml_indoor"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
# # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class DIML_Indoor(Dataset):
def __init__(self, data_dir_root):
import glob
# image paths are of the form <data_dir_root>/{HR, LR}/<scene>/{color, depth_filled}/*.png
self.image_files = glob.glob(os.path.join(
data_dir_root, "LR", '*', 'color', '*.png'))
self.depth_files = [r.replace("color", "depth_filled").replace(
"_c.png", "_depth_filled.png") for r in self.image_files]
self.transform = ToTensor()
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
depth = np.asarray(Image.open(depth_path),
dtype='uint16') / 1000.0 # mm to meters
# print(np.shape(image))
# print(np.shape(depth))
# depth[depth > 8] = -1
depth = depth[..., None]
sample = dict(image=image, depth=depth)
# return sample
sample = self.transform(sample)
if idx == 0:
print(sample["image"].shape)
return sample
def __len__(self):
return len(self.image_files)
def get_diml_indoor_loader(data_dir_root, batch_size=1, **kwargs):
dataset = DIML_Indoor(data_dir_root)
return DataLoader(dataset, batch_size, **kwargs)
# get_diml_indoor_loader(data_dir_root="datasets/diml/indoor/test/HR")
# get_diml_indoor_loader(data_dir_root="datasets/diml/indoor/test/LR")

# ===== DIML outdoor evaluation dataset loader =====
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
class ToTensor(object):
def __init__(self):
# self.normalize = transforms.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x : x
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
return {'image': image, 'depth': depth, 'dataset': "diml_outdoor"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
# # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class DIML_Outdoor(Dataset):
def __init__(self, data_dir_root):
import glob
# image paths are of the form <data_dir_root>/{outleft, depthmap}/*.png
self.image_files = glob.glob(os.path.join(
data_dir_root, 'outleft', '*.png'))
self.depth_files = [r.replace("outleft", "depthmap")
for r in self.image_files]
self.transform = ToTensor()
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
depth = np.asarray(Image.open(depth_path),
dtype='uint16') / 1000.0 # mm to meters
# depth[depth > 8] = -1
depth = depth[..., None]
sample = dict(image=image, depth=depth, dataset="diml_outdoor")
# return sample
return self.transform(sample)
def __len__(self):
return len(self.image_files)
def get_diml_outdoor_loader(data_dir_root, batch_size=1, **kwargs):
dataset = DIML_Outdoor(data_dir_root)
return DataLoader(dataset, batch_size, **kwargs)
# get_diml_outdoor_loader(data_dir_root="datasets/diml/outdoor/test/HR")
# get_diml_outdoor_loader(data_dir_root="datasets/diml/outdoor/test/LR")

# ===== DIODE evaluation dataset loader =====
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
class ToTensor(object):
def __init__(self):
# self.normalize = transforms.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x : x
self.resize = transforms.Resize(480)
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
image = self.resize(image)
return {'image': image, 'depth': depth, 'dataset': "diode"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
# # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class DIODE(Dataset):
def __init__(self, data_dir_root):
import glob
# image paths are of the form <data_dir_root>/scene_#/scan_#/*.png
self.image_files = glob.glob(
os.path.join(data_dir_root, '*', '*', '*.png'))
self.depth_files = [r.replace(".png", "_depth.npy")
for r in self.image_files]
self.depth_mask_files = [
r.replace(".png", "_depth_mask.npy") for r in self.image_files]
self.transform = ToTensor()
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
depth_mask_path = self.depth_mask_files[idx]
image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
depth = np.load(depth_path) # in meters
valid = np.load(depth_mask_path) # binary
# depth[depth > 8] = -1
# depth = depth[..., None]
sample = dict(image=image, depth=depth, valid=valid)
# return sample
sample = self.transform(sample)
if idx == 0:
print(sample["image"].shape)
return sample
def __len__(self):
return len(self.image_files)
def get_diode_loader(data_dir_root, batch_size=1, **kwargs):
dataset = DIODE(data_dir_root)
return DataLoader(dataset, batch_size, **kwargs)
# get_diode_loader(data_dir_root="datasets/diode/val/outdoor")

# ===== HyperSim evaluation dataset loader =====
import glob
import os
import h5py
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
def hypersim_distance_to_depth(npyDistance):
intWidth, intHeight, fltFocal = 1024, 768, 886.81
npyImageplaneX = np.linspace((-0.5 * intWidth) + 0.5, (0.5 * intWidth) - 0.5, intWidth).reshape(
1, intWidth).repeat(intHeight, 0).astype(np.float32)[:, :, None]
npyImageplaneY = np.linspace((-0.5 * intHeight) + 0.5, (0.5 * intHeight) - 0.5,
intHeight).reshape(intHeight, 1).repeat(intWidth, 1).astype(np.float32)[:, :, None]
npyImageplaneZ = np.full([intHeight, intWidth, 1], fltFocal, np.float32)
npyImageplane = np.concatenate(
[npyImageplaneX, npyImageplaneY, npyImageplaneZ], 2)
npyDepth = npyDistance / np.linalg.norm(npyImageplane, 2, 2) * fltFocal
return npyDepth
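HyperSim stores per-pixel Euclidean ray distance rather than planar depth; the conversion above rescales each pixel by `focal / ||ray||`, where the ray passes through an image plane at `z = focal`. A small numeric check (reusing the function's own camera constants; the printed values are illustrative):

```python
import numpy as np

def hypersim_distance_to_depth(npyDistance):
    # depth = distance * focal / ||ray||, with rays through an image
    # plane at z = focal (HyperSim's 1024x768 camera, focal 886.81).
    intWidth, intHeight, fltFocal = 1024, 768, 886.81
    npyImageplaneX = np.linspace(-0.5 * intWidth + 0.5, 0.5 * intWidth - 0.5,
                                 intWidth).reshape(1, intWidth).repeat(intHeight, 0).astype(np.float32)[:, :, None]
    npyImageplaneY = np.linspace(-0.5 * intHeight + 0.5, 0.5 * intHeight - 0.5,
                                 intHeight).reshape(intHeight, 1).repeat(intWidth, 1).astype(np.float32)[:, :, None]
    npyImageplaneZ = np.full([intHeight, intWidth, 1], fltFocal, np.float32)
    npyImageplane = np.concatenate([npyImageplaneX, npyImageplaneY, npyImageplaneZ], 2)
    return npyDistance / np.linalg.norm(npyImageplane, 2, 2) * fltFocal

distance = np.full((768, 1024), 10.0, dtype=np.float32)
depth = hypersim_distance_to_depth(distance)
# Near the optical axis the ray is almost parallel to z, so depth ~ distance;
# toward the corners the ray is longer, so planar depth < ray distance.
print(float(depth[384, 512]), float(depth[0, 0]))
```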
class ToTensor(object):
def __init__(self):
# self.normalize = transforms.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x: x
self.resize = transforms.Resize((480, 640))
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
image = self.resize(image)
return {'image': image, 'depth': depth, 'dataset': "hypersim"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
# # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class HyperSim(Dataset):
def __init__(self, data_dir_root):
# image paths are of the form <data_dir_root>/<scene>/images/scene_cam_#_final_preview/*.tonemap.jpg
# depth paths are of the form <data_dir_root>/<scene>/images/scene_cam_#_final_preview/*.depth_meters.hdf5
self.image_files = glob.glob(os.path.join(
data_dir_root, '*', 'images', 'scene_cam_*_final_preview', '*.tonemap.jpg'))
self.depth_files = [r.replace("_final_preview", "_geometry_hdf5").replace(
".tonemap.jpg", ".depth_meters.hdf5") for r in self.image_files]
self.transform = ToTensor()
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
# depth from hdf5
depth_fd = h5py.File(depth_path, "r")
# in meters (Euclidean distance)
distance_meters = np.array(depth_fd['dataset'])
depth = hypersim_distance_to_depth(
distance_meters) # in meters (planar depth)
# depth[depth > 8] = -1
depth = depth[..., None]
sample = dict(image=image, depth=depth)
sample = self.transform(sample)
if idx == 0:
print(sample["image"].shape)
return sample
def __len__(self):
return len(self.image_files)
def get_hypersim_loader(data_dir_root, batch_size=1, **kwargs):
dataset = HyperSim(data_dir_root)
return DataLoader(dataset, batch_size, **kwargs)
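The RGB-to-depth path pairing in `HyperSim.__init__` can be sanity-checked in isolation; this is a minimal sketch with a hypothetical file name, mirroring the two `replace()` calls above:

```python
# Hypothetical HyperSim-style RGB path; the pairing mirrors HyperSim.__init__.
rgb_path = "ai_001_001/images/scene_cam_00_final_preview/frame.0000.tonemap.jpg"
depth_path = rgb_path.replace("_final_preview", "_geometry_hdf5").replace(
    ".tonemap.jpg", ".depth_meters.hdf5")
# -> "ai_001_001/images/scene_cam_00_geometry_hdf5/frame.0000.depth_meters.hdf5"
```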


@@ -0,0 +1,81 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms as T
class iBims(Dataset):
def __init__(self, config):
root_folder = config.ibims_root
with open(os.path.join(root_folder, "imagelist.txt"), 'r') as f:
imglist = f.read().split()
samples = []
for basename in imglist:
img_path = os.path.join(root_folder, 'rgb', basename + ".png")
depth_path = os.path.join(root_folder, 'depth', basename + ".png")
valid_mask_path = os.path.join(
root_folder, 'mask_invalid', basename+".png")
transp_mask_path = os.path.join(
root_folder, 'mask_transp', basename+".png")
samples.append(
(img_path, depth_path, valid_mask_path, transp_mask_path))
self.samples = samples
# self.normalize = T.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x : x
def __getitem__(self, idx):
img_path, depth_path, valid_mask_path, transp_mask_path = self.samples[idx]
img = np.asarray(Image.open(img_path), dtype=np.float32) / 255.0
depth = np.asarray(Image.open(depth_path),
dtype=np.uint16).astype('float')*50.0/65535
mask_valid = np.asarray(Image.open(valid_mask_path))
mask_transp = np.asarray(Image.open(transp_mask_path))
# depth = depth * mask_valid * mask_transp
depth = np.where(mask_valid * mask_transp, depth, -1)
img = torch.from_numpy(img).permute(2, 0, 1)
img = self.normalize(img)
depth = torch.from_numpy(depth).unsqueeze(0)
return dict(image=img, depth=depth, image_path=img_path, depth_path=depth_path, dataset='ibims')
def __len__(self):
return len(self.samples)
def get_ibims_loader(config, batch_size=1, **kwargs):
dataloader = DataLoader(iBims(config), batch_size=batch_size, **kwargs)
return dataloader
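The iBims-1 depth decoding above maps raw 16-bit PNG values linearly to metres via `* 50.0 / 65535`; a quick sketch of that scaling on synthetic raw values:

```python
import numpy as np

# iBims-1 stores depth as 16-bit PNGs; raw values map linearly to [0, 50] m,
# exactly as in iBims.__getitem__ above.
raw = np.array([0, 32768, 65535], dtype=np.uint16)
depth_m = raw.astype('float') * 50.0 / 65535
# 0 -> 0.0 m, 65535 -> 50.0 m, mid-range -> ~25 m
```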


@@ -0,0 +1,154 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import numpy as np
from dataclasses import dataclass
from typing import Tuple, List
# dataclass to store the crop parameters
@dataclass
class CropParams:
top: int
bottom: int
left: int
right: int
def get_border_params(rgb_image, tolerance=0.1, cut_off=20, value=0, level_diff_threshold=5, channel_axis=-1, min_border=5) -> CropParams:
gray_image = np.mean(rgb_image, axis=channel_axis)
h, w = gray_image.shape
def num_value_pixels(arr):
return np.sum(np.abs(arr - value) < level_diff_threshold)
def is_above_tolerance(arr, total_pixels):
return (num_value_pixels(arr) / total_pixels) > tolerance
    # Crop top border until the fraction of near-`value` pixels drops below tolerance
top = min_border
while is_above_tolerance(gray_image[top, :], w) and top < h-1:
top += 1
if top > cut_off:
break
    # Crop bottom border until the fraction of near-`value` pixels drops below tolerance
bottom = h - min_border
while is_above_tolerance(gray_image[bottom, :], w) and bottom > 0:
bottom -= 1
if h - bottom > cut_off:
break
    # Crop left border until the fraction of near-`value` pixels drops below tolerance
left = min_border
while is_above_tolerance(gray_image[:, left], h) and left < w-1:
left += 1
if left > cut_off:
break
    # Crop right border until the fraction of near-`value` pixels drops below tolerance
right = w - min_border
while is_above_tolerance(gray_image[:, right], h) and right > 0:
right -= 1
if w - right > cut_off:
break
return CropParams(top, bottom, left, right)
def get_white_border(rgb_image, value=255, **kwargs) -> CropParams:
"""Crops the white border of the RGB.
Args:
rgb: RGB image, shape (H, W, 3).
Returns:
Crop parameters.
"""
if value == 255:
# assert range of values in rgb image is [0, 255]
assert np.max(rgb_image) <= 255 and np.min(rgb_image) >= 0, "RGB image values are not in range [0, 255]."
assert rgb_image.max() > 1, "RGB image values are not in range [0, 255]."
elif value == 1:
# assert range of values in rgb image is [0, 1]
assert np.max(rgb_image) <= 1 and np.min(rgb_image) >= 0, "RGB image values are not in range [0, 1]."
return get_border_params(rgb_image, value=value, **kwargs)
def get_black_border(rgb_image, **kwargs) -> CropParams:
"""Crops the black border of the RGB.
Args:
rgb: RGB image, shape (H, W, 3).
Returns:
Crop parameters.
"""
return get_border_params(rgb_image, value=0, **kwargs)
def crop_image(image: np.ndarray, crop_params: CropParams) -> np.ndarray:
"""Crops the image according to the crop parameters.
Args:
image: RGB or depth image, shape (H, W, 3) or (H, W).
crop_params: Crop parameters.
Returns:
Cropped image.
"""
return image[crop_params.top:crop_params.bottom, crop_params.left:crop_params.right]
def crop_images(*images: np.ndarray, crop_params: CropParams) -> Tuple[np.ndarray, ...]:
"""Crops the images according to the crop parameters.
Args:
images: RGB or depth images, shape (H, W, 3) or (H, W).
crop_params: Crop parameters.
Returns:
Cropped images.
"""
return tuple(crop_image(image, crop_params) for image in images)
def crop_black_or_white_border(rgb_image, *other_images: np.ndarray, tolerance=0.1, cut_off=20, level_diff_threshold=5) -> Tuple[np.ndarray, ...]:
"""Crops the white and black border of the RGB and depth images.
Args:
rgb: RGB image, shape (H, W, 3). This image is used to determine the border.
other_images: The other images to crop according to the border of the RGB image.
Returns:
Cropped RGB and other images.
"""
# crop black border
crop_params = get_black_border(rgb_image, tolerance=tolerance, cut_off=cut_off, level_diff_threshold=level_diff_threshold)
cropped_images = crop_images(rgb_image, *other_images, crop_params=crop_params)
# crop white border
crop_params = get_white_border(cropped_images[0], tolerance=tolerance, cut_off=cut_off, level_diff_threshold=level_diff_threshold)
cropped_images = crop_images(*cropped_images, crop_params=crop_params)
return cropped_images
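The per-row/column test inside `get_border_params` can be illustrated on a synthetic image; this sketch (sizes and values chosen for illustration only) reproduces the tolerance check on a black top border:

```python
import numpy as np

# Synthetic 100x100 RGB image in [0, 255] with a 10 px black top border.
img = np.full((100, 100, 3), 128.0, dtype=np.float32)
img[:10] = 0.0

gray = img.mean(axis=-1)
value, level_diff_threshold, tolerance = 0, 5, 0.1

def is_above_tolerance(line):
    # Fraction of pixels within level_diff_threshold of `value`, compared
    # against `tolerance` -- the same test get_border_params applies per line.
    return (np.abs(line - value) < level_diff_threshold).sum() / line.size > tolerance

# Row 5 is inside the black border, row 50 is interior.
```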


@@ -0,0 +1,115 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import os
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
class ToTensor(object):
def __init__(self):
# self.normalize = transforms.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x : x
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
return {'image': image, 'depth': depth, 'dataset': "sunrgbd"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
        # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class SunRGBD(Dataset):
def __init__(self, data_dir_root):
# test_file_dirs = loadmat(train_test_file)['alltest'].squeeze()
# all_test = [t[0].replace("/n/fs/sun3d/data/", "") for t in test_file_dirs]
# self.all_test = [os.path.join(data_dir_root, t) for t in all_test]
import glob
# self.image_files = glob.glob(
# os.path.join(data_dir_root, 'rgb', 'rgb', '*'))
# self.depth_files = [
# r.replace("rgb/rgb", "gt/gt").replace("jpg", "png") for r in self.image_files]
self.image_files, self.depth_files = [], []
filenames = os.listdir(os.path.join(data_dir_root, 'rgb'))
for i, filename in enumerate(filenames):
self.image_files.append(os.path.join(data_dir_root, 'rgb', filename))
base_num = int(filename.replace('.jpg', '').replace('img-', ''))
self.depth_files.append(os.path.join(data_dir_root, 'depth', str(base_num) + '.png'))
self.transform = ToTensor()
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
depth = np.asarray(Image.open(depth_path), dtype='uint16') / 10000.0
# print(depth, depth.min(), depth.max())
depth[depth > 8] = -1
depth = depth[..., None]
return self.transform(dict(image=image, depth=depth))
def __len__(self):
return len(self.image_files)
def get_sunrgbd_loader(data_dir_root, batch_size=1, **kwargs):
dataset = SunRGBD(data_dir_root)
return DataLoader(dataset, batch_size, **kwargs)
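The RGB-to-depth filename pairing in `SunRGBD.__init__` strips the `img-` prefix and zero padding; a minimal sketch with a hypothetical file name:

```python
# Hypothetical SUN RGB-D filename; mirrors the pairing in SunRGBD.__init__:
# 'img-000123.jpg' pairs with 'depth/123.png'.
filename = "img-000123.jpg"
base_num = int(filename.replace('.jpg', '').replace('img-', ''))
depth_name = str(base_num) + '.png'
```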


@@ -0,0 +1,481 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import math
import random
import cv2
import numpy as np
class RandomFliplr(object):
"""Horizontal flip of the sample with given probability.
"""
def __init__(self, probability=0.5):
"""Init.
Args:
probability (float, optional): Flip probability. Defaults to 0.5.
"""
self.__probability = probability
def __call__(self, sample):
prob = random.random()
if prob < self.__probability:
for k, v in sample.items():
if len(v.shape) >= 2:
sample[k] = np.fliplr(v).copy()
return sample
def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA):
"""Rezise the sample to ensure the given size. Keeps aspect ratio.
Args:
sample (dict): sample
size (tuple): image size
Returns:
tuple: new size
"""
shape = list(sample["disparity"].shape)
    if shape[0] >= size[0] and shape[1] >= size[1]:
        # already large enough; return the current size (consistent return type)
        return tuple(shape)
scale = [0, 0]
scale[0] = size[0] / shape[0]
scale[1] = size[1] / shape[1]
scale = max(scale)
shape[0] = math.ceil(scale * shape[0])
shape[1] = math.ceil(scale * shape[1])
# resize
sample["image"] = cv2.resize(
sample["image"], tuple(shape[::-1]), interpolation=image_interpolation_method
)
sample["disparity"] = cv2.resize(
sample["disparity"], tuple(shape[::-1]), interpolation=cv2.INTER_NEAREST
)
sample["mask"] = cv2.resize(
sample["mask"].astype(np.float32),
tuple(shape[::-1]),
interpolation=cv2.INTER_NEAREST,
)
sample["mask"] = sample["mask"].astype(bool)
return tuple(shape)
class RandomCrop(object):
"""Get a random crop of the sample with the given size (width, height).
"""
def __init__(
self,
width,
height,
resize_if_needed=False,
image_interpolation_method=cv2.INTER_AREA,
):
"""Init.
Args:
width (int): output width
height (int): output height
resize_if_needed (bool, optional): If True, sample might be upsampled to ensure
            that a crop of size (width, height) is possible. Defaults to False.
"""
self.__size = (height, width)
self.__resize_if_needed = resize_if_needed
self.__image_interpolation_method = image_interpolation_method
def __call__(self, sample):
shape = sample["disparity"].shape
if self.__size[0] > shape[0] or self.__size[1] > shape[1]:
if self.__resize_if_needed:
shape = apply_min_size(
sample, self.__size, self.__image_interpolation_method
)
else:
raise Exception(
"Output size {} bigger than input size {}.".format(
self.__size, shape
)
)
offset = (
np.random.randint(shape[0] - self.__size[0] + 1),
np.random.randint(shape[1] - self.__size[1] + 1),
)
for k, v in sample.items():
if k == "code" or k == "basis":
continue
if len(sample[k].shape) >= 2:
sample[k] = v[
offset[0]: offset[0] + self.__size[0],
offset[1]: offset[1] + self.__size[1],
]
return sample
class Resize(object):
"""Resize sample to given size (width, height).
"""
def __init__(
self,
width,
height,
resize_target=True,
keep_aspect_ratio=False,
ensure_multiple_of=1,
resize_method="lower_bound",
image_interpolation_method=cv2.INTER_AREA,
letter_box=False,
):
"""Init.
Args:
width (int): desired output width
height (int): desired output height
resize_target (bool, optional):
True: Resize the full sample (image, mask, target).
False: Resize image only.
Defaults to True.
keep_aspect_ratio (bool, optional):
True: Keep the aspect ratio of the input sample.
Output sample might not have the given width and height, and
resize behaviour depends on the parameter 'resize_method'.
Defaults to False.
ensure_multiple_of (int, optional):
Output width and height is constrained to be multiple of this parameter.
Defaults to 1.
resize_method (str, optional):
"lower_bound": Output will be at least as large as the given size.
"upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
"minimal": Scale as least as possible. (Output size might be smaller than given size.)
Defaults to "lower_bound".
"""
self.__width = width
self.__height = height
self.__resize_target = resize_target
self.__keep_aspect_ratio = keep_aspect_ratio
self.__multiple_of = ensure_multiple_of
self.__resize_method = resize_method
self.__image_interpolation_method = image_interpolation_method
self.__letter_box = letter_box
def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
if max_val is not None and y > max_val:
y = (np.floor(x / self.__multiple_of)
* self.__multiple_of).astype(int)
if y < min_val:
y = (np.ceil(x / self.__multiple_of)
* self.__multiple_of).astype(int)
return y
def get_size(self, width, height):
# determine new height and width
scale_height = self.__height / height
scale_width = self.__width / width
if self.__keep_aspect_ratio:
if self.__resize_method == "lower_bound":
# scale such that output size is lower bound
if scale_width > scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "upper_bound":
# scale such that output size is upper bound
if scale_width < scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "minimal":
                # scale as little as possible
if abs(1 - scale_width) < abs(1 - scale_height):
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
else:
raise ValueError(
f"resize_method {self.__resize_method} not implemented"
)
if self.__resize_method == "lower_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, min_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, min_val=self.__width
)
elif self.__resize_method == "upper_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, max_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, max_val=self.__width
)
elif self.__resize_method == "minimal":
new_height = self.constrain_to_multiple_of(scale_height * height)
new_width = self.constrain_to_multiple_of(scale_width * width)
else:
raise ValueError(
f"resize_method {self.__resize_method} not implemented")
return (new_width, new_height)
def make_letter_box(self, sample):
top = bottom = (self.__height - sample.shape[0]) // 2
left = right = (self.__width - sample.shape[1]) // 2
sample = cv2.copyMakeBorder(
sample, top, bottom, left, right, cv2.BORDER_CONSTANT, None, 0)
return sample
def __call__(self, sample):
width, height = self.get_size(
sample["image"].shape[1], sample["image"].shape[0]
)
# resize sample
sample["image"] = cv2.resize(
sample["image"],
(width, height),
interpolation=self.__image_interpolation_method,
)
if self.__letter_box:
sample["image"] = self.make_letter_box(sample["image"])
if self.__resize_target:
if "disparity" in sample:
sample["disparity"] = cv2.resize(
sample["disparity"],
(width, height),
interpolation=cv2.INTER_NEAREST,
)
if self.__letter_box:
sample["disparity"] = self.make_letter_box(
sample["disparity"])
if "depth" in sample:
sample["depth"] = cv2.resize(
sample["depth"], (width,
height), interpolation=cv2.INTER_NEAREST
)
if self.__letter_box:
sample["depth"] = self.make_letter_box(sample["depth"])
sample["mask"] = cv2.resize(
sample["mask"].astype(np.float32),
(width, height),
interpolation=cv2.INTER_NEAREST,
)
if self.__letter_box:
sample["mask"] = self.make_letter_box(sample["mask"])
sample["mask"] = sample["mask"].astype(bool)
return sample
class ResizeFixed(object):
def __init__(self, size):
self.__size = size
def __call__(self, sample):
sample["image"] = cv2.resize(
sample["image"], self.__size[::-1], interpolation=cv2.INTER_LINEAR
)
sample["disparity"] = cv2.resize(
sample["disparity"], self.__size[::-
1], interpolation=cv2.INTER_NEAREST
)
sample["mask"] = cv2.resize(
sample["mask"].astype(np.float32),
self.__size[::-1],
interpolation=cv2.INTER_NEAREST,
)
sample["mask"] = sample["mask"].astype(bool)
return sample
class Rescale(object):
"""Rescale target values to the interval [0, max_val].
If input is constant, values are set to max_val / 2.
"""
def __init__(self, max_val=1.0, use_mask=True):
"""Init.
Args:
max_val (float, optional): Max output value. Defaults to 1.0.
use_mask (bool, optional): Only operate on valid pixels (mask == True). Defaults to True.
"""
self.__max_val = max_val
self.__use_mask = use_mask
def __call__(self, sample):
disp = sample["disparity"]
if self.__use_mask:
mask = sample["mask"]
else:
            mask = np.ones_like(disp, dtype=bool)  # np.bool is removed in NumPy >= 1.24
if np.sum(mask) == 0:
return sample
min_val = np.min(disp[mask])
max_val = np.max(disp[mask])
if max_val > min_val:
sample["disparity"][mask] = (
(disp[mask] - min_val) / (max_val - min_val) * self.__max_val
)
else:
sample["disparity"][mask] = np.ones_like(
disp[mask]) * self.__max_val / 2.0
return sample
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
class NormalizeImage(object):
"""Normlize image by given mean and std.
"""
def __init__(self, mean, std):
self.__mean = mean
self.__std = std
def __call__(self, sample):
sample["image"] = (sample["image"] - self.__mean) / self.__std
return sample
class DepthToDisparity(object):
"""Convert depth to disparity. Removes depth from sample.
"""
def __init__(self, eps=1e-4):
self.__eps = eps
def __call__(self, sample):
assert "depth" in sample
sample["mask"][sample["depth"] < self.__eps] = False
sample["disparity"] = np.zeros_like(sample["depth"])
sample["disparity"][sample["depth"] >= self.__eps] = (
1.0 / sample["depth"][sample["depth"] >= self.__eps]
)
del sample["depth"]
return sample
class DisparityToDepth(object):
"""Convert disparity to depth. Removes disparity from sample.
"""
def __init__(self, eps=1e-4):
self.__eps = eps
def __call__(self, sample):
assert "disparity" in sample
disp = np.abs(sample["disparity"])
sample["mask"][disp < self.__eps] = False
# print(sample["disparity"])
# print(sample["mask"].sum())
# exit()
sample["depth"] = np.zeros_like(disp)
sample["depth"][disp >= self.__eps] = (
1.0 / disp[disp >= self.__eps]
)
del sample["disparity"]
return sample
class PrepareForNet(object):
"""Prepare sample for usage as network input.
"""
def __init__(self):
pass
def __call__(self, sample):
image = np.transpose(sample["image"], (2, 0, 1))
sample["image"] = np.ascontiguousarray(image).astype(np.float32)
if "mask" in sample:
sample["mask"] = sample["mask"].astype(np.float32)
sample["mask"] = np.ascontiguousarray(sample["mask"])
if "disparity" in sample:
disparity = sample["disparity"].astype(np.float32)
sample["disparity"] = np.ascontiguousarray(disparity)
if "depth" in sample:
depth = sample["depth"].astype(np.float32)
sample["depth"] = np.ascontiguousarray(depth)
return sample
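The two transforms most pipelines end with, `NormalizeImage` and `PrepareForNet`, can be traced by hand on a tiny synthetic sample; this sketch (shapes illustrative) applies the same operations the classes above perform:

```python
import numpy as np

# Synthetic HWC sample; NormalizeImage then PrepareForNet, applied by hand.
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

sample = {
    "image": np.random.rand(4, 6, 3).astype(np.float32),
    "mask": np.ones((4, 6), dtype=bool),
}

sample["image"] = (sample["image"] - mean) / std               # NormalizeImage
chw = np.transpose(sample["image"], (2, 0, 1))                 # PrepareForNet: HWC -> CHW
sample["image"] = np.ascontiguousarray(chw).astype(np.float32)
sample["mask"] = np.ascontiguousarray(sample["mask"].astype(np.float32))
```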


@@ -0,0 +1,151 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
import os
from PIL import Image
import numpy as np
import cv2
class ToTensor(object):
def __init__(self):
self.normalize = transforms.Normalize(
mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
# self.resize = transforms.Resize((375, 1242))
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
# image = self.resize(image)
return {'image': image, 'depth': depth, 'dataset': "vkitti"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
        # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class VKITTI(Dataset):
def __init__(self, data_dir_root, do_kb_crop=True):
import glob
        # image paths are of the form <data_dir_root>/test_color/*.png
self.image_files = glob.glob(os.path.join(
data_dir_root, "test_color", '*.png'))
self.depth_files = [r.replace("test_color", "test_depth")
for r in self.image_files]
        self.do_kb_crop = do_kb_crop
self.transform = ToTensor()
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
        image = Image.open(image_path)
        # read the 16-bit depth map with OpenCV
        depth = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR |
                           cv2.IMREAD_ANYDEPTH)
        print("depth min max", depth.min(), depth.max())
        # KB crop is disabled here via "and False"; note `depth` is a NumPy
        # array at this point, so the PIL-style .crop() below would fail if
        # the branch were re-enabled.
        if self.do_kb_crop and False:
height = image.height
width = image.width
top_margin = int(height - 352)
left_margin = int((width - 1216) / 2)
depth = depth.crop(
(left_margin, top_margin, left_margin + 1216, top_margin + 352))
image = image.crop(
(left_margin, top_margin, left_margin + 1216, top_margin + 352))
# uv = uv[:, top_margin:top_margin + 352, left_margin:left_margin + 1216]
image = np.asarray(image, dtype=np.float32) / 255.0
# depth = np.asarray(depth, dtype=np.uint16) /1.
depth = depth[..., None]
sample = dict(image=image, depth=depth)
# return sample
sample = self.transform(sample)
if idx == 0:
print(sample["image"].shape)
return sample
def __len__(self):
return len(self.image_files)
def get_vkitti_loader(data_dir_root, batch_size=1, **kwargs):
dataset = VKITTI(data_dir_root)
return DataLoader(dataset, batch_size, **kwargs)
if __name__ == "__main__":
loader = get_vkitti_loader(
data_dir_root="/home/bhatsf/shortcuts/datasets/vkitti_test")
print("Total files", len(loader.dataset))
for i, sample in enumerate(loader):
print(sample["image"].shape)
print(sample["depth"].shape)
print(sample["dataset"])
print(sample['depth'].min(), sample['depth'].max())
if i > 5:
break


@@ -0,0 +1,187 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import os
import cv2
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms
class ToTensor(object):
def __init__(self):
# self.normalize = transforms.Normalize(
# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.normalize = lambda x: x
# self.resize = transforms.Resize((375, 1242))
def __call__(self, sample):
image, depth = sample['image'], sample['depth']
image = self.to_tensor(image)
image = self.normalize(image)
depth = self.to_tensor(depth)
# image = self.resize(image)
return {'image': image, 'depth': depth, 'dataset': "vkitti"}
def to_tensor(self, pic):
if isinstance(pic, np.ndarray):
img = torch.from_numpy(pic.transpose((2, 0, 1)))
return img
        # handle PIL Image
if pic.mode == 'I':
img = torch.from_numpy(np.array(pic, np.int32, copy=False))
elif pic.mode == 'I;16':
img = torch.from_numpy(np.array(pic, np.int16, copy=False))
else:
img = torch.ByteTensor(
torch.ByteStorage.from_buffer(pic.tobytes()))
# PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
if pic.mode == 'YCbCr':
nchannel = 3
elif pic.mode == 'I;16':
nchannel = 1
else:
nchannel = len(pic.mode)
img = img.view(pic.size[1], pic.size[0], nchannel)
img = img.transpose(0, 1).transpose(0, 2).contiguous()
if isinstance(img, torch.ByteTensor):
return img.float()
else:
return img
class VKITTI2(Dataset):
def __init__(self, data_dir_root, do_kb_crop=True, split="test"):
import glob
# image paths are of the form <data_dir_root>/rgb/<scene>/<variant>/frames/<rgb,depth>/Camera<0,1>/rgb_{}.jpg
self.image_files = glob.glob(os.path.join(
data_dir_root, "**", "frames", "rgb", "Camera_0", '*.jpg'), recursive=True)
self.depth_files = [r.replace("/rgb/", "/depth/").replace(
"rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
        self.do_kb_crop = do_kb_crop
self.transform = ToTensor()
# If train test split is not created, then create one.
# Split is such that 8% of the frames from each scene are used for testing.
if not os.path.exists(os.path.join(data_dir_root, "train.txt")):
import random
scenes = set([os.path.basename(os.path.dirname(
os.path.dirname(os.path.dirname(f)))) for f in self.image_files])
train_files = []
test_files = []
for scene in scenes:
scene_files = [f for f in self.image_files if os.path.basename(
os.path.dirname(os.path.dirname(os.path.dirname(f)))) == scene]
random.shuffle(scene_files)
train_files.extend(scene_files[:int(len(scene_files) * 0.92)])
test_files.extend(scene_files[int(len(scene_files) * 0.92):])
with open(os.path.join(data_dir_root, "train.txt"), "w") as f:
f.write("\n".join(train_files))
with open(os.path.join(data_dir_root, "test.txt"), "w") as f:
f.write("\n".join(test_files))
if split == "train":
with open(os.path.join(data_dir_root, "train.txt"), "r") as f:
self.image_files = f.read().splitlines()
self.depth_files = [r.replace("/rgb/", "/depth/").replace(
"rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
elif split == "test":
with open(os.path.join(data_dir_root, "test.txt"), "r") as f:
self.image_files = f.read().splitlines()
self.depth_files = [r.replace("/rgb/", "/depth/").replace(
"rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
def __getitem__(self, idx):
image_path = self.image_files[idx]
depth_path = self.depth_files[idx]
image = Image.open(image_path)
depth = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR |
cv2.IMREAD_ANYDEPTH) / 100.0 # cm to m
depth = Image.fromarray(depth)
# print("dpeth min max", depth.min(), depth.max())
# print(np.shape(image))
# print(np.shape(depth))
if self.do_kb_crop:
if idx == 0:
print("Using KB input crop")
height = image.height
width = image.width
top_margin = int(height - 352)
left_margin = int((width - 1216) / 2)
depth = depth.crop(
(left_margin, top_margin, left_margin + 1216, top_margin + 352))
image = image.crop(
(left_margin, top_margin, left_margin + 1216, top_margin + 352))
# uv = uv[:, top_margin:top_margin + 352, left_margin:left_margin + 1216]
image = np.asarray(image, dtype=np.float32) / 255.0
        depth = np.asarray(depth, dtype=np.float32)
depth[depth > 80] = -1
depth = depth[..., None]
sample = dict(image=image, depth=depth)
sample = self.transform(sample)
if idx == 0:
print(sample["image"].shape)
return sample
def __len__(self):
return len(self.image_files)
def get_vkitti2_loader(data_dir_root, batch_size=1, **kwargs):
dataset = VKITTI2(data_dir_root)
return DataLoader(dataset, batch_size, **kwargs)
if __name__ == "__main__":
loader = get_vkitti2_loader(
data_dir_root="/home/bhatsf/shortcuts/datasets/vkitti2")
print("Total files", len(loader.dataset))
for i, sample in enumerate(loader):
print(sample["image"].shape)
print(sample["depth"].shape)
print(sample["dataset"])
print(sample['depth'].min(), sample['depth'].max())
if i > 5:
break
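The KB crop applied in `__getitem__` above is pure geometry: a fixed 1216x352 window, bottom-aligned and horizontally centered. A minimal standalone sketch (the helper name `kb_crop_box` is illustrative, not part of this codebase):

```python
def kb_crop_box(width, height, crop_w=1216, crop_h=352):
    """Return the (left, top, right, bottom) box for the KITTI-benchmark crop:
    bottom-aligned vertically, centered horizontally."""
    top_margin = int(height - crop_h)
    left_margin = int((width - crop_w) / 2)
    return (left_margin, top_margin, left_margin + crop_w, top_margin + crop_h)

# A typical KITTI/VKITTI2 frame is 1242x375:
print(kb_crop_box(1242, 375))  # (13, 23, 1229, 375)
```

The returned tuple can be passed directly to `PIL.Image.crop`, as the dataset class does for both image and depth.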

# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat

import torch
import torch.nn as nn
import numpy as np
from torchvision.transforms import Normalize
from zoedepth.models.base_models.dpt_dinov2.dpt import DPT_DINOv2
def denormalize(x):
"""Reverses the imagenet normalization applied to the input.
Args:
x (torch.Tensor - shape(N,3,H,W)): input tensor
Returns:
torch.Tensor - shape(N,3,H,W): Denormalized input
"""
mean = torch.Tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1).to(x.device)
std = torch.Tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1).to(x.device)
return x * std + mean
def get_activation(name, bank):
def hook(model, input, output):
bank[name] = output
return hook
class Resize(object):
"""Resize sample to given size (width, height).
"""
def __init__(
self,
width,
height,
resize_target=True,
keep_aspect_ratio=False,
ensure_multiple_of=1,
resize_method="lower_bound",
):
"""Init.
Args:
width (int): desired output width
height (int): desired output height
resize_target (bool, optional):
True: Resize the full sample (image, mask, target).
False: Resize image only.
Defaults to True.
keep_aspect_ratio (bool, optional):
True: Keep the aspect ratio of the input sample.
Output sample might not have the given width and height, and
resize behaviour depends on the parameter 'resize_method'.
Defaults to False.
ensure_multiple_of (int, optional):
Output width and height is constrained to be multiple of this parameter.
Defaults to 1.
resize_method (str, optional):
"lower_bound": Output will be at least as large as the given size.
"upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
"minimal": Scale as least as possible. (Output size might be smaller than given size.)
Defaults to "lower_bound".
"""
print("Params passed to Resize transform:")
print("\twidth: ", width)
print("\theight: ", height)
print("\tresize_target: ", resize_target)
print("\tkeep_aspect_ratio: ", keep_aspect_ratio)
print("\tensure_multiple_of: ", ensure_multiple_of)
print("\tresize_method: ", resize_method)
self.__width = width
self.__height = height
self.__keep_aspect_ratio = keep_aspect_ratio
self.__multiple_of = ensure_multiple_of
self.__resize_method = resize_method
def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
if max_val is not None and y > max_val:
y = (np.floor(x / self.__multiple_of)
* self.__multiple_of).astype(int)
if y < min_val:
y = (np.ceil(x / self.__multiple_of)
* self.__multiple_of).astype(int)
return y
def get_size(self, width, height):
# determine new height and width
scale_height = self.__height / height
scale_width = self.__width / width
if self.__keep_aspect_ratio:
if self.__resize_method == "lower_bound":
# scale such that output size is lower bound
if scale_width > scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "upper_bound":
# scale such that output size is upper bound
if scale_width < scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "minimal":
            elif self.__resize_method == "minimal":
                # scale as little as possible
if abs(1 - scale_width) < abs(1 - scale_height):
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
else:
raise ValueError(
f"resize_method {self.__resize_method} not implemented"
)
if self.__resize_method == "lower_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, min_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, min_val=self.__width
)
elif self.__resize_method == "upper_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, max_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, max_val=self.__width
)
elif self.__resize_method == "minimal":
new_height = self.constrain_to_multiple_of(scale_height * height)
new_width = self.constrain_to_multiple_of(scale_width * width)
else:
raise ValueError(
f"resize_method {self.__resize_method} not implemented")
return (new_width, new_height)
def __call__(self, x):
width, height = self.get_size(*x.shape[-2:][::-1])
return nn.functional.interpolate(x, (height, width), mode='bilinear', align_corners=True)
class PrepForMidas(object):
def __init__(self, resize_mode="minimal", keep_aspect_ratio=True, img_size=384, do_resize=True):
if isinstance(img_size, int):
img_size = (img_size, img_size)
net_h, net_w = img_size
# self.normalization = Normalize(
# mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
self.normalization = Normalize(
mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
self.resizer = Resize(net_w, net_h, keep_aspect_ratio=keep_aspect_ratio, ensure_multiple_of=14, resize_method=resize_mode) \
if do_resize else nn.Identity()
def __call__(self, x):
return self.normalization(self.resizer(x))
class DepthAnythingCore(nn.Module):
def __init__(self, midas, trainable=False, fetch_features=True, layer_names=('out_conv', 'l4_rn', 'r4', 'r3', 'r2', 'r1'), freeze_bn=False, keep_aspect_ratio=True,
img_size=384, **kwargs):
"""Midas Base model used for multi-scale feature extraction.
Args:
midas (torch.nn.Module): Midas model.
trainable (bool, optional): Train midas model. Defaults to False.
fetch_features (bool, optional): Extract multi-scale features. Defaults to True.
layer_names (tuple, optional): Layers used for feature extraction. Order = (head output features, last layer features, ...decoder features). Defaults to ('out_conv', 'l4_rn', 'r4', 'r3', 'r2', 'r1').
freeze_bn (bool, optional): Freeze BatchNorm. Generally results in better finetuning performance. Defaults to False.
keep_aspect_ratio (bool, optional): Keep the aspect ratio of input images while resizing. Defaults to True.
img_size (int, tuple, optional): Input resolution. Defaults to 384.
"""
super().__init__()
self.core = midas
self.output_channels = None
self.core_out = {}
self.trainable = trainable
self.fetch_features = fetch_features
# midas.scratch.output_conv = nn.Identity()
self.handles = []
# self.layer_names = ['out_conv','l4_rn', 'r4', 'r3', 'r2', 'r1']
self.layer_names = layer_names
self.set_trainable(trainable)
self.set_fetch_features(fetch_features)
self.prep = PrepForMidas(keep_aspect_ratio=keep_aspect_ratio,
img_size=img_size, do_resize=kwargs.get('do_resize', True))
if freeze_bn:
self.freeze_bn()
def set_trainable(self, trainable):
self.trainable = trainable
if trainable:
self.unfreeze()
else:
self.freeze()
return self
def set_fetch_features(self, fetch_features):
self.fetch_features = fetch_features
if fetch_features:
if len(self.handles) == 0:
self.attach_hooks(self.core)
else:
self.remove_hooks()
return self
def freeze(self):
for p in self.parameters():
p.requires_grad = False
self.trainable = False
return self
def unfreeze(self):
for p in self.parameters():
p.requires_grad = True
self.trainable = True
return self
def freeze_bn(self):
for m in self.modules():
if isinstance(m, nn.BatchNorm2d):
m.eval()
return self
def forward(self, x, denorm=False, return_rel_depth=False):
# print('input to midas:', x.shape)
with torch.no_grad():
if denorm:
x = denormalize(x)
x = self.prep(x)
with torch.set_grad_enabled(self.trainable):
rel_depth = self.core(x)
if not self.fetch_features:
return rel_depth
out = [self.core_out[k] for k in self.layer_names]
if return_rel_depth:
return rel_depth, out
return out
def get_rel_pos_params(self):
for name, p in self.core.pretrained.named_parameters():
if "pos_embed" in name:
yield p
def get_enc_params_except_rel_pos(self):
for name, p in self.core.pretrained.named_parameters():
if "pos_embed" not in name:
yield p
def freeze_encoder(self, freeze_rel_pos=False):
if freeze_rel_pos:
for p in self.core.pretrained.parameters():
p.requires_grad = False
else:
for p in self.get_enc_params_except_rel_pos():
p.requires_grad = False
return self
def attach_hooks(self, midas):
if len(self.handles) > 0:
self.remove_hooks()
if "out_conv" in self.layer_names:
self.handles.append(list(midas.depth_head.scratch.output_conv2.children())[
1].register_forward_hook(get_activation("out_conv", self.core_out)))
if "r4" in self.layer_names:
self.handles.append(midas.depth_head.scratch.refinenet4.register_forward_hook(
get_activation("r4", self.core_out)))
if "r3" in self.layer_names:
self.handles.append(midas.depth_head.scratch.refinenet3.register_forward_hook(
get_activation("r3", self.core_out)))
if "r2" in self.layer_names:
self.handles.append(midas.depth_head.scratch.refinenet2.register_forward_hook(
get_activation("r2", self.core_out)))
if "r1" in self.layer_names:
self.handles.append(midas.depth_head.scratch.refinenet1.register_forward_hook(
get_activation("r1", self.core_out)))
if "l4_rn" in self.layer_names:
self.handles.append(midas.depth_head.scratch.layer4_rn.register_forward_hook(
get_activation("l4_rn", self.core_out)))
return self
def remove_hooks(self):
for h in self.handles:
h.remove()
return self
def __del__(self):
self.remove_hooks()
def set_output_channels(self):
self.output_channels = [256, 256, 256, 256, 256]
@staticmethod
def build(midas_model_type="dinov2_large", train_midas=False, use_pretrained_midas=True, fetch_features=False, freeze_bn=True, force_keep_ar=False, force_reload=False, **kwargs):
if "img_size" in kwargs:
kwargs = DepthAnythingCore.parse_img_size(kwargs)
img_size = kwargs.pop("img_size", [384, 384])
depth_anything = DPT_DINOv2(out_channels=[256, 512, 1024, 1024], use_clstoken=False)
state_dict = torch.load('../checkpoints/depth_anything_vitl14.pth', map_location='cpu')
depth_anything.load_state_dict(state_dict)
kwargs.update({'keep_aspect_ratio': force_keep_ar})
depth_anything_core = DepthAnythingCore(depth_anything, trainable=train_midas, fetch_features=fetch_features,
freeze_bn=freeze_bn, img_size=img_size, **kwargs)
depth_anything_core.set_output_channels()
return depth_anything_core
@staticmethod
def parse_img_size(config):
assert 'img_size' in config
if isinstance(config['img_size'], str):
assert "," in config['img_size'], "img_size should be a string with comma separated img_size=H,W"
config['img_size'] = list(map(int, config['img_size'].split(",")))
assert len(
config['img_size']) == 2, "img_size should be a string with comma separated img_size=H,W"
elif isinstance(config['img_size'], int):
config['img_size'] = [config['img_size'], config['img_size']]
else:
assert isinstance(config['img_size'], list) and len(
config['img_size']) == 2, "img_size should be a list of H,W"
return config
nchannels2models = {
tuple([256]*5): ["DPT_BEiT_L_384", "DPT_BEiT_L_512", "DPT_BEiT_B_384", "DPT_SwinV2_L_384", "DPT_SwinV2_B_384", "DPT_SwinV2_T_256", "DPT_Large", "DPT_Hybrid"],
(512, 256, 128, 64, 64): ["MiDaS_small"]
}
# Model name to number of output channels
MIDAS_SETTINGS = {m: k for k, v in nchannels2models.items()
for m in v
}
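The sizing rule that `PrepForMidas` configures above (keep_aspect_ratio=True, resize_method="lower_bound", ensure_multiple_of=14) scales both sides by the larger scale factor, then snaps each side to a multiple of 14 while keeping it at least as large as the target. A self-contained sketch of that arithmetic (names are illustrative; the `max_val` branch of `constrain_to_multiple_of` is omitted since "lower_bound" never uses it):

```python
import math

def lower_bound_size(width, height, target_w=384, target_h=384, multiple=14):
    """Compute the (new_width, new_height) that Resize would produce for
    keep_aspect_ratio=True and resize_method='lower_bound'."""
    # both sides share the larger scale so neither falls below the target
    scale = max(target_w / width, target_h / height)
    def snap(x, min_val):
        # round to the nearest multiple, rounding up if that would undershoot
        y = round(x / multiple) * multiple
        if y < min_val:
            y = math.ceil(x / multiple) * multiple
        return y
    return snap(scale * width, target_w), snap(scale * height, target_h)

print(lower_bound_size(1242, 375))  # (1274, 392)
```

Both outputs are multiples of 14 and no smaller than 384, matching what the DINOv2 patch embedding requires.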

import torch.nn as nn
def _make_scratch(in_shape, out_shape, groups=1, expand=False):
scratch = nn.Module()
out_shape1 = out_shape
out_shape2 = out_shape
out_shape3 = out_shape
if len(in_shape) >= 4:
out_shape4 = out_shape
if expand:
out_shape1 = out_shape
out_shape2 = out_shape*2
out_shape3 = out_shape*4
if len(in_shape) >= 4:
out_shape4 = out_shape*8
scratch.layer1_rn = nn.Conv2d(
in_shape[0], out_shape1, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
scratch.layer2_rn = nn.Conv2d(
in_shape[1], out_shape2, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
scratch.layer3_rn = nn.Conv2d(
in_shape[2], out_shape3, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
if len(in_shape) >= 4:
scratch.layer4_rn = nn.Conv2d(
in_shape[3], out_shape4, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
)
return scratch
class ResidualConvUnit(nn.Module):
"""Residual convolution module.
"""
def __init__(self, features, activation, bn):
"""Init.
Args:
features (int): number of features
"""
super().__init__()
self.bn = bn
        self.groups = 1
self.conv1 = nn.Conv2d(
features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups
)
self.conv2 = nn.Conv2d(
features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups
)
        if self.bn:
self.bn1 = nn.BatchNorm2d(features)
self.bn2 = nn.BatchNorm2d(features)
self.activation = activation
self.skip_add = nn.quantized.FloatFunctional()
def forward(self, x):
"""Forward pass.
Args:
x (tensor): input
Returns:
tensor: output
"""
out = self.activation(x)
out = self.conv1(out)
        if self.bn:
out = self.bn1(out)
out = self.activation(out)
out = self.conv2(out)
        if self.bn:
out = self.bn2(out)
if self.groups > 1:
out = self.conv_merge(out)
return self.skip_add.add(out, x)
class FeatureFusionBlock(nn.Module):
"""Feature fusion block.
"""
def __init__(self, features, activation, deconv=False, bn=False, expand=False, align_corners=True, size=None):
"""Init.
Args:
features (int): number of features
"""
super(FeatureFusionBlock, self).__init__()
self.deconv = deconv
self.align_corners = align_corners
        self.groups = 1
self.expand = expand
out_features = features
        if self.expand:
            out_features = features // 2
self.out_conv = nn.Conv2d(features, out_features, kernel_size=1, stride=1, padding=0, bias=True, groups=1)
self.resConfUnit1 = ResidualConvUnit(features, activation, bn)
self.resConfUnit2 = ResidualConvUnit(features, activation, bn)
self.skip_add = nn.quantized.FloatFunctional()
        self.size = size
def forward(self, *xs, size=None):
"""Forward pass.
Returns:
tensor: output
"""
output = xs[0]
if len(xs) == 2:
res = self.resConfUnit1(xs[1])
output = self.skip_add.add(output, res)
output = self.resConfUnit2(output)
if (size is None) and (self.size is None):
modifier = {"scale_factor": 2}
elif size is None:
modifier = {"size": self.size}
else:
modifier = {"size": size}
output = nn.functional.interpolate(
output, **modifier, mode="bilinear", align_corners=self.align_corners
)
output = self.out_conv(output)
return output
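The channel widths that `_make_scratch` assigns can be summarized standalone: with `expand=False` every pyramid level is projected to the same width, while `expand=True` doubles the width per level. A sketch of just that arithmetic (the helper name is illustrative):

```python
def scratch_out_channels(n_levels, out_shape, expand=False):
    """Per-level output widths of _make_scratch's layerN_rn convolutions."""
    if not expand:
        return [out_shape] * n_levels
    # expand=True: out_shape, 2*out_shape, 4*out_shape, 8*out_shape, ...
    return [out_shape * (2 ** i) for i in range(n_levels)]

print(scratch_out_channels(4, 64, expand=True))   # [64, 128, 256, 512]
print(scratch_out_channels(4, 256))               # [256, 256, 256, 256]
```

The DPT head in this repo calls `_make_scratch` with `expand=False`, so all four refinement levels carry the same `features` width.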

import torch
import torch.nn as nn
from .blocks import FeatureFusionBlock, _make_scratch
import torch.nn.functional as F
def _make_fusion_block(features, use_bn, size = None):
return FeatureFusionBlock(
features,
nn.ReLU(False),
deconv=False,
bn=use_bn,
expand=False,
align_corners=True,
size=size,
)
class DPTHead(nn.Module):
def __init__(self, in_channels, features=256, use_bn=False, out_channels=[256, 512, 1024, 1024], use_clstoken=False):
super(DPTHead, self).__init__()
self.use_clstoken = use_clstoken
# out_channels = [in_channels // 8, in_channels // 4, in_channels // 2, in_channels]
# out_channels = [in_channels // 4, in_channels // 2, in_channels, in_channels]
# out_channels = [in_channels, in_channels, in_channels, in_channels]
self.projects = nn.ModuleList([
nn.Conv2d(
in_channels=in_channels,
out_channels=out_channel,
kernel_size=1,
stride=1,
padding=0,
) for out_channel in out_channels
])
self.resize_layers = nn.ModuleList([
nn.ConvTranspose2d(
in_channels=out_channels[0],
out_channels=out_channels[0],
kernel_size=4,
stride=4,
padding=0),
nn.ConvTranspose2d(
in_channels=out_channels[1],
out_channels=out_channels[1],
kernel_size=2,
stride=2,
padding=0),
nn.Identity(),
nn.Conv2d(
in_channels=out_channels[3],
out_channels=out_channels[3],
kernel_size=3,
stride=2,
padding=1)
])
if use_clstoken:
self.readout_projects = nn.ModuleList()
for _ in range(len(self.projects)):
self.readout_projects.append(
nn.Sequential(
nn.Linear(2 * in_channels, in_channels),
nn.GELU()))
self.scratch = _make_scratch(
out_channels,
features,
groups=1,
expand=False,
)
self.scratch.stem_transpose = None
self.scratch.refinenet1 = _make_fusion_block(features, use_bn)
self.scratch.refinenet2 = _make_fusion_block(features, use_bn)
self.scratch.refinenet3 = _make_fusion_block(features, use_bn)
self.scratch.refinenet4 = _make_fusion_block(features, use_bn)
head_features_1 = features
head_features_2 = 32
self.scratch.output_conv1 = nn.Conv2d(head_features_1, head_features_1 // 2, kernel_size=3, stride=1, padding=1)
self.scratch.output_conv2 = nn.Sequential(
nn.Conv2d(head_features_1 // 2, head_features_2, kernel_size=3, stride=1, padding=1),
nn.ReLU(True),
nn.Conv2d(head_features_2, 1, kernel_size=1, stride=1, padding=0),
nn.ReLU(True),
nn.Identity(),
)
def forward(self, out_features, patch_h, patch_w):
out = []
for i, x in enumerate(out_features):
if self.use_clstoken:
x, cls_token = x[0], x[1]
readout = cls_token.unsqueeze(1).expand_as(x)
x = self.readout_projects[i](torch.cat((x, readout), -1))
else:
x = x[0]
x = x.permute(0, 2, 1).reshape((x.shape[0], x.shape[-1], patch_h, patch_w))
x = self.projects[i](x)
x = self.resize_layers[i](x)
out.append(x)
layer_1, layer_2, layer_3, layer_4 = out
layer_1_rn = self.scratch.layer1_rn(layer_1)
layer_2_rn = self.scratch.layer2_rn(layer_2)
layer_3_rn = self.scratch.layer3_rn(layer_3)
layer_4_rn = self.scratch.layer4_rn(layer_4)
path_4 = self.scratch.refinenet4(layer_4_rn, size=layer_3_rn.shape[2:])
path_3 = self.scratch.refinenet3(path_4, layer_3_rn, size=layer_2_rn.shape[2:])
path_2 = self.scratch.refinenet2(path_3, layer_2_rn, size=layer_1_rn.shape[2:])
path_1 = self.scratch.refinenet1(path_2, layer_1_rn)
out = self.scratch.output_conv1(path_1)
out = F.interpolate(out, (int(patch_h * 14), int(patch_w * 14)), mode="bilinear", align_corners=True)
out = self.scratch.output_conv2(out)
return out
class DPT_DINOv2(nn.Module):
def __init__(self, encoder='vitl', features=256, use_bn=False, out_channels=[256, 512, 1024, 1024], use_clstoken=False):
super(DPT_DINOv2, self).__init__()
torch.manual_seed(1)
self.pretrained = torch.hub.load('../torchhub/facebookresearch_dinov2_main', 'dinov2_{:}14'.format(encoder), source='local', pretrained=False)
dim = self.pretrained.blocks[0].attn.qkv.in_features
self.depth_head = DPTHead(dim, features, use_bn, out_channels=out_channels, use_clstoken=use_clstoken)
def forward(self, x):
h, w = x.shape[-2:]
features = self.pretrained.get_intermediate_layers(x, 4, return_class_token=True)
patch_h, patch_w = h // 14, w // 14
depth = self.depth_head(features, patch_h, patch_w)
depth = F.interpolate(depth, size=(h, w), mode="bilinear", align_corners=True)
depth = F.relu(depth)
return depth.squeeze(1)
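The patch bookkeeping in `DPT_DINOv2.forward` is simple integer arithmetic: DINOv2 uses 14x14 patches, so an HxW input yields an (H//14) x (W//14) token grid, and `DPTHead.forward` interpolates its prediction back to 14x that grid before the final resize to (H, W). A standalone sketch (the helper name is illustrative):

```python
def patch_grid(h, w, patch_size=14):
    """Return (patch_h, patch_w, n_tokens, head_output_hw) for a ViT with
    the given patch size, mirroring the arithmetic in DPT_DINOv2.forward."""
    patch_h, patch_w = h // patch_size, w // patch_size
    n_tokens = patch_h * patch_w
    head_out = (patch_h * patch_size, patch_w * patch_size)
    return patch_h, patch_w, n_tokens, head_out

print(patch_grid(518, 518))  # (37, 37, 1369, (518, 518))
```

This is why the resize transform snaps inputs to multiples of 14: otherwise `h % 14` pixels would be silently dropped by the integer division.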

import torch
import torch.nn as nn
import numpy as np
from torchvision.transforms import Normalize
def denormalize(x):
"""Reverses the imagenet normalization applied to the input.
Args:
x (torch.Tensor - shape(N,3,H,W)): input tensor
Returns:
torch.Tensor - shape(N,3,H,W): Denormalized input
"""
mean = torch.Tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1).to(x.device)
std = torch.Tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1).to(x.device)
return x * std + mean
def get_activation(name, bank):
def hook(model, input, output):
bank[name] = output
return hook
class Resize(object):
"""Resize sample to given size (width, height).
"""
def __init__(
self,
width,
height,
resize_target=True,
keep_aspect_ratio=False,
ensure_multiple_of=1,
resize_method="lower_bound",
):
"""Init.
Args:
width (int): desired output width
height (int): desired output height
resize_target (bool, optional):
True: Resize the full sample (image, mask, target).
False: Resize image only.
Defaults to True.
keep_aspect_ratio (bool, optional):
True: Keep the aspect ratio of the input sample.
Output sample might not have the given width and height, and
resize behaviour depends on the parameter 'resize_method'.
Defaults to False.
ensure_multiple_of (int, optional):
Output width and height is constrained to be multiple of this parameter.
Defaults to 1.
resize_method (str, optional):
"lower_bound": Output will be at least as large as the given size.
"upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
"minimal": Scale as least as possible. (Output size might be smaller than given size.)
Defaults to "lower_bound".
"""
print("Params passed to Resize transform:")
print("\twidth: ", width)
print("\theight: ", height)
print("\tresize_target: ", resize_target)
print("\tkeep_aspect_ratio: ", keep_aspect_ratio)
print("\tensure_multiple_of: ", ensure_multiple_of)
print("\tresize_method: ", resize_method)
self.__width = width
self.__height = height
self.__keep_aspect_ratio = keep_aspect_ratio
self.__multiple_of = ensure_multiple_of
self.__resize_method = resize_method
def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
if max_val is not None and y > max_val:
y = (np.floor(x / self.__multiple_of)
* self.__multiple_of).astype(int)
if y < min_val:
y = (np.ceil(x / self.__multiple_of)
* self.__multiple_of).astype(int)
return y
def get_size(self, width, height):
# determine new height and width
scale_height = self.__height / height
scale_width = self.__width / width
if self.__keep_aspect_ratio:
if self.__resize_method == "lower_bound":
# scale such that output size is lower bound
if scale_width > scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "upper_bound":
# scale such that output size is upper bound
if scale_width < scale_height:
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
elif self.__resize_method == "minimal":
            elif self.__resize_method == "minimal":
                # scale as little as possible
if abs(1 - scale_width) < abs(1 - scale_height):
# fit width
scale_height = scale_width
else:
# fit height
scale_width = scale_height
else:
raise ValueError(
f"resize_method {self.__resize_method} not implemented"
)
if self.__resize_method == "lower_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, min_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, min_val=self.__width
)
elif self.__resize_method == "upper_bound":
new_height = self.constrain_to_multiple_of(
scale_height * height, max_val=self.__height
)
new_width = self.constrain_to_multiple_of(
scale_width * width, max_val=self.__width
)
elif self.__resize_method == "minimal":
new_height = self.constrain_to_multiple_of(scale_height * height)
new_width = self.constrain_to_multiple_of(scale_width * width)
else:
raise ValueError(
f"resize_method {self.__resize_method} not implemented")
return (new_width, new_height)
def __call__(self, x):
width, height = self.get_size(*x.shape[-2:][::-1])
return nn.functional.interpolate(x, (height, width), mode='bilinear', align_corners=True)
class PrepForMidas(object):
def __init__(self, resize_mode="minimal", keep_aspect_ratio=True, img_size=384, do_resize=True):
if isinstance(img_size, int):
img_size = (img_size, img_size)
net_h, net_w = img_size
self.normalization = Normalize(
mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
self.resizer = Resize(net_w, net_h, keep_aspect_ratio=keep_aspect_ratio, ensure_multiple_of=32, resize_method=resize_mode) \
if do_resize else nn.Identity()
def __call__(self, x):
return self.normalization(self.resizer(x))
class MidasCore(nn.Module):
def __init__(self, midas, trainable=False, fetch_features=True, layer_names=('out_conv', 'l4_rn', 'r4', 'r3', 'r2', 'r1'), freeze_bn=False, keep_aspect_ratio=True,
img_size=384, **kwargs):
"""Midas Base model used for multi-scale feature extraction.
Args:
midas (torch.nn.Module): Midas model.
trainable (bool, optional): Train midas model. Defaults to False.
fetch_features (bool, optional): Extract multi-scale features. Defaults to True.
layer_names (tuple, optional): Layers used for feature extraction. Order = (head output features, last layer features, ...decoder features). Defaults to ('out_conv', 'l4_rn', 'r4', 'r3', 'r2', 'r1').
freeze_bn (bool, optional): Freeze BatchNorm. Generally results in better finetuning performance. Defaults to False.
keep_aspect_ratio (bool, optional): Keep the aspect ratio of input images while resizing. Defaults to True.
img_size (int, tuple, optional): Input resolution. Defaults to 384.
"""
super().__init__()
self.core = midas
self.output_channels = None
self.core_out = {}
self.trainable = trainable
self.fetch_features = fetch_features
# midas.scratch.output_conv = nn.Identity()
self.handles = []
# self.layer_names = ['out_conv','l4_rn', 'r4', 'r3', 'r2', 'r1']
self.layer_names = layer_names
self.set_trainable(trainable)
self.set_fetch_features(fetch_features)
self.prep = PrepForMidas(keep_aspect_ratio=keep_aspect_ratio,
img_size=img_size, do_resize=kwargs.get('do_resize', True))
if freeze_bn:
self.freeze_bn()
def set_trainable(self, trainable):
self.trainable = trainable
if trainable:
self.unfreeze()
else:
self.freeze()
return self
def set_fetch_features(self, fetch_features):
self.fetch_features = fetch_features
if fetch_features:
if len(self.handles) == 0:
self.attach_hooks(self.core)
else:
self.remove_hooks()
return self
def freeze(self):
for p in self.parameters():
p.requires_grad = False
self.trainable = False
return self
def unfreeze(self):
for p in self.parameters():
p.requires_grad = True
self.trainable = True
return self
def freeze_bn(self):
for m in self.modules():
if isinstance(m, nn.BatchNorm2d):
m.eval()
return self
def forward(self, x, denorm=False, return_rel_depth=False):
# print('input to midas:', x.shape)
with torch.no_grad():
if denorm:
x = denormalize(x)
x = self.prep(x)
# print("Shape after prep: ", x.shape)
# print('pre-processed:', x.shape)
with torch.set_grad_enabled(self.trainable):
# print("Input size to Midascore", x.shape)
rel_depth = self.core(x)
# print("Output from midas shape", rel_depth.shape)
if not self.fetch_features:
return rel_depth
out = [self.core_out[k] for k in self.layer_names]
if return_rel_depth:
return rel_depth, out
return out
def get_rel_pos_params(self):
for name, p in self.core.pretrained.named_parameters():
if "relative_position" in name:
yield p
def get_enc_params_except_rel_pos(self):
for name, p in self.core.pretrained.named_parameters():
if "relative_position" not in name:
yield p
def freeze_encoder(self, freeze_rel_pos=False):
if freeze_rel_pos:
for p in self.core.pretrained.parameters():
p.requires_grad = False
else:
for p in self.get_enc_params_except_rel_pos():
p.requires_grad = False
return self
def attach_hooks(self, midas):
if len(self.handles) > 0:
self.remove_hooks()
if "out_conv" in self.layer_names:
self.handles.append(list(midas.scratch.output_conv.children())[
3].register_forward_hook(get_activation("out_conv", self.core_out)))
if "r4" in self.layer_names:
self.handles.append(midas.scratch.refinenet4.register_forward_hook(
get_activation("r4", self.core_out)))
if "r3" in self.layer_names:
self.handles.append(midas.scratch.refinenet3.register_forward_hook(
get_activation("r3", self.core_out)))
if "r2" in self.layer_names:
self.handles.append(midas.scratch.refinenet2.register_forward_hook(
get_activation("r2", self.core_out)))
if "r1" in self.layer_names:
self.handles.append(midas.scratch.refinenet1.register_forward_hook(
get_activation("r1", self.core_out)))
if "l4_rn" in self.layer_names:
self.handles.append(midas.scratch.layer4_rn.register_forward_hook(
get_activation("l4_rn", self.core_out)))
return self
def remove_hooks(self):
for h in self.handles:
h.remove()
return self
def __del__(self):
self.remove_hooks()
def set_output_channels(self, model_type):
self.output_channels = MIDAS_SETTINGS[model_type]
@staticmethod
def build(midas_model_type="DPT_BEiT_L_384", train_midas=False, use_pretrained_midas=True, fetch_features=False, freeze_bn=True, force_keep_ar=False, force_reload=False, **kwargs):
if midas_model_type not in MIDAS_SETTINGS:
raise ValueError(
f"Invalid model type: {midas_model_type}. Must be one of {list(MIDAS_SETTINGS.keys())}")
if "img_size" in kwargs:
kwargs = MidasCore.parse_img_size(kwargs)
img_size = kwargs.pop("img_size", [384, 384])
# print("img_size", img_size)
midas = torch.hub.load("intel-isl/MiDaS", midas_model_type,
pretrained=use_pretrained_midas, force_reload=force_reload)
kwargs.update({'keep_aspect_ratio': force_keep_ar})
midas_core = MidasCore(midas, trainable=train_midas, fetch_features=fetch_features,
freeze_bn=freeze_bn, img_size=img_size, **kwargs)
midas_core.set_output_channels(midas_model_type)
return midas_core
@staticmethod
def build_from_config(config):
return MidasCore.build(**config)
@staticmethod
def parse_img_size(config):
assert 'img_size' in config
if isinstance(config['img_size'], str):
assert "," in config['img_size'], "img_size should be a string with comma separated img_size=H,W"
config['img_size'] = list(map(int, config['img_size'].split(",")))
assert len(
config['img_size']) == 2, "img_size should be a string with comma separated img_size=H,W"
elif isinstance(config['img_size'], int):
config['img_size'] = [config['img_size'], config['img_size']]
else:
assert isinstance(config['img_size'], list) and len(
config['img_size']) == 2, "img_size should be a list of H,W"
return config
nchannels2models = {
tuple([256]*5): ["DPT_BEiT_L_384", "DPT_BEiT_L_512", "DPT_BEiT_B_384", "DPT_SwinV2_L_384", "DPT_SwinV2_B_384", "DPT_SwinV2_T_256", "DPT_Large", "DPT_Hybrid"],
(512, 256, 128, 64, 64): ["MiDaS_small"]
}
# Model name to number of output channels
MIDAS_SETTINGS = {m: k for k, v in nchannels2models.items()
for m in v
}
# print('MIDAS_SETTINGS:', MIDAS_SETTINGS)
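For reference, `MidasCore.parse_img_size` above accepts `img_size` as an `"H,W"` string, a single int, or an `[H, W]` list. A minimal standalone sketch of that normalization (hypothetical helper name, mirroring the logic above):

```python
def normalize_img_size(img_size):
    """Normalize img_size into an [H, W] list, mirroring MidasCore.parse_img_size."""
    if isinstance(img_size, str):
        parts = [int(v) for v in img_size.split(",")]
        assert len(parts) == 2, "img_size string must be 'H,W'"
        return parts
    if isinstance(img_size, int):
        return [img_size, img_size]
    assert isinstance(img_size, list) and len(img_size) == 2, "img_size should be a list of H,W"
    return list(img_size)

print(normalize_img_size("384,512"))  # [384, 512]
print(normalize_img_size(384))        # [384, 384]
```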


@@ -0,0 +1,51 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
from importlib import import_module
from zoedepth.models.depth_model import DepthModel
def build_model(config) -> DepthModel:
"""Builds a model from a config. The model is selected by the model name and version in the config, then constructed via the build_from_config function of the model interface.
This function should be used to construct models for training and evaluation.
Args:
config (dict): Config dict. Config is constructed in utils/config.py. Each model has its own config file(s) saved in its root model folder.
Returns:
torch.nn.Module: Model corresponding to name and version as specified in config
"""
module_name = f"zoedepth.models.{config.model}"
try:
module = import_module(module_name)
except ModuleNotFoundError as e:
# print the original error message
print(e)
raise ValueError(
f"Model {config.model} not found. Refer above error for details.") from e
try:
get_version = getattr(module, "get_version")
except AttributeError as e:
raise ValueError(
f"Model {config.model} has no get_version function.") from e
return get_version(config.version_name).build_from_config(config)
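`build_model` resolves the model package lazily with `import_module` and then dispatches through its `get_version` hook. The same late-binding pattern, sketched against a stdlib module so it runs standalone (hypothetical helper name):

```python
from importlib import import_module

def resolve(module_name, attr):
    """Dynamically import a module and fetch an attribute, raising a
    descriptive error on failure (same shape as build_model above)."""
    try:
        module = import_module(module_name)
    except ModuleNotFoundError as e:
        raise ValueError(f"Module {module_name} not found.") from e
    try:
        return getattr(module, attr)
    except AttributeError as e:
        raise ValueError(f"{module_name} has no attribute {attr}.") from e

sqrt = resolve("math", "sqrt")
print(sqrt(16.0))  # 4.0
```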


@@ -0,0 +1,152 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms
import PIL.Image
from PIL import Image
from typing import Union
class DepthModel(nn.Module):
def __init__(self):
super().__init__()
self.device = 'cpu'
def to(self, device) -> nn.Module:
self.device = device
return super().to(device)
def forward(self, x, *args, **kwargs):
raise NotImplementedError
def _infer(self, x: torch.Tensor):
"""
Inference interface for the model
Args:
x (torch.Tensor): input tensor of shape (b, c, h, w)
Returns:
torch.Tensor: output tensor of shape (b, 1, h, w)
"""
return self(x)['metric_depth']
def _infer_with_pad_aug(self, x: torch.Tensor, pad_input: bool=True, fh: float=3, fw: float=3, upsampling_mode: str='bicubic', padding_mode="reflect", **kwargs) -> torch.Tensor:
"""
Inference interface for the model with padding augmentation
Padding augmentation fixes the boundary artifacts in the output depth map.
Boundary artifacts are sometimes caused by the model being trained on the NYU raw dataset, which has a black or white border around the images.
This augmentation pads the input image and crops the prediction back to the original size / view.
Note: This augmentation is not required for the models trained with 'avoid_boundary'=True.
Args:
x (torch.Tensor): input tensor of shape (b, c, h, w)
pad_input (bool, optional): whether to pad the input or not. Defaults to True.
fh (float, optional): height padding factor. The padding is calculated as sqrt(h/2) * fh. Defaults to 3.
fw (float, optional): width padding factor. The padding is calculated as sqrt(w/2) * fw. Defaults to 3.
upsampling_mode (str, optional): upsampling mode. Defaults to 'bicubic'.
padding_mode (str, optional): padding mode. Defaults to "reflect".
Returns:
torch.Tensor: output tensor of shape (b, 1, h, w)
"""
# assert x is nchw and c = 3
assert x.dim() == 4, "x must be 4 dimensional, got {}".format(x.dim())
assert x.shape[1] == 3, "x must have 3 channels, got {}".format(x.shape[1])
if pad_input:
assert fh > 0 or fw > 0, "at least one of fh and fw must be greater than 0"
pad_h = int(np.sqrt(x.shape[2]/2) * fh)
pad_w = int(np.sqrt(x.shape[3]/2) * fw)
padding = [pad_w, pad_w]
if pad_h > 0:
padding += [pad_h, pad_h]
x = F.pad(x, padding, mode=padding_mode, **kwargs)
out = self._infer(x)
if out.shape[-2:] != x.shape[-2:]:
out = F.interpolate(out, size=(x.shape[2], x.shape[3]), mode=upsampling_mode, align_corners=False)
if pad_input:
# crop to the original size, handling the case where pad_h or pad_w is 0
if pad_h > 0:
out = out[:, :, pad_h:-pad_h,:]
if pad_w > 0:
out = out[:, :, :, pad_w:-pad_w]
return out
def infer_with_flip_aug(self, x, pad_input: bool=True, **kwargs) -> torch.Tensor:
"""
Inference interface for the model with horizontal flip augmentation
Horizontal flip augmentation improves the accuracy of the model by averaging the output of the model with and without horizontal flip.
Args:
x (torch.Tensor): input tensor of shape (b, c, h, w)
pad_input (bool, optional): whether to use padding augmentation. Defaults to True.
Returns:
torch.Tensor: output tensor of shape (b, 1, h, w)
"""
# infer with horizontal flip and average
out = self._infer_with_pad_aug(x, pad_input=pad_input, **kwargs)
out_flip = self._infer_with_pad_aug(torch.flip(x, dims=[3]), pad_input=pad_input, **kwargs)
out = (out + torch.flip(out_flip, dims=[3])) / 2
return out
def infer(self, x, pad_input: bool=True, with_flip_aug: bool=True, **kwargs) -> torch.Tensor:
"""
Inference interface for the model
Args:
x (torch.Tensor): input tensor of shape (b, c, h, w)
pad_input (bool, optional): whether to use padding augmentation. Defaults to True.
with_flip_aug (bool, optional): whether to use horizontal flip augmentation. Defaults to True.
Returns:
torch.Tensor: output tensor of shape (b, 1, h, w)
"""
if with_flip_aug:
return self.infer_with_flip_aug(x, pad_input=pad_input, **kwargs)
else:
return self._infer_with_pad_aug(x, pad_input=pad_input, **kwargs)
@torch.no_grad()
def infer_pil(self, pil_img, pad_input: bool=True, with_flip_aug: bool=True, output_type: str="numpy", **kwargs) -> Union[np.ndarray, PIL.Image.Image, torch.Tensor]:
"""
Inference interface for the model for PIL image
Args:
pil_img (PIL.Image.Image): input PIL image
pad_input (bool, optional): whether to use padding augmentation. Defaults to True.
with_flip_aug (bool, optional): whether to use horizontal flip augmentation. Defaults to True.
output_type (str, optional): output type. Supported values are 'numpy', 'pil' and 'tensor'. Defaults to "numpy".
"""
x = transforms.ToTensor()(pil_img).unsqueeze(0).to(self.device)
out_tensor = self.infer(x, pad_input=pad_input, with_flip_aug=with_flip_aug, **kwargs)
if output_type == "numpy":
return out_tensor.squeeze().cpu().numpy()
elif output_type == "pil":
# uint16 is required for depth pil image
out_16bit_numpy = (out_tensor.squeeze().cpu().numpy()*256).astype(np.uint16)
return Image.fromarray(out_16bit_numpy)
elif output_type == "tensor":
return out_tensor.squeeze().cpu()
else:
raise ValueError(f"output_type {output_type} not supported. Supported values are 'numpy', 'pil' and 'tensor'")
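`_infer_with_pad_aug` pads by `int(sqrt(h/2) * fh)` pixels vertically and `int(sqrt(w/2) * fw)` horizontally, runs inference, then crops the prediction back. A pure-Python sketch of just the size arithmetic (no framework dependency; helper names are illustrative):

```python
import math

def pad_amounts(h, w, fh=3.0, fw=3.0):
    """Per-side padding sizes used by the padding augmentation above."""
    return int(math.sqrt(h / 2) * fh), int(math.sqrt(w / 2) * fw)

def crop_back(length, pad):
    """Length after padding both sides by `pad` and cropping back."""
    padded = length + 2 * pad
    return padded - 2 * pad if pad > 0 else padded

pad_h, pad_w = pad_amounts(480, 640)
print(pad_h, pad_w)  # 46 53
```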


@@ -0,0 +1,208 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
import torch.nn as nn
@torch.jit.script
def exp_attractor(dx, alpha: float = 300, gamma: int = 2):
"""Exponential attractor: dc = exp(-alpha*|dx|^gamma) * dx, where dx = a - c, a = attractor point, c = bin center, dc = shift in bin center
Args:
dx (torch.Tensor): The difference tensor dx = Ai - Cj, where Ai is the attractor point and Cj is the bin center.
alpha (float, optional): Proportional Attractor strength. Determines the absolute strength. Lower alpha = greater attraction. Defaults to 300.
gamma (int, optional): Exponential Attractor strength. Determines the "region of influence" and indirectly number of bin centers affected. Lower gamma = farther reach. Defaults to 2.
Returns:
torch.Tensor : Delta shifts - dc; New bin centers = Old bin centers + dc
"""
return torch.exp(-alpha*(torch.abs(dx)**gamma)) * (dx)
@torch.jit.script
def inv_attractor(dx, alpha: float = 300, gamma: int = 2):
"""Inverse attractor: dc = dx / (1 + alpha*dx^gamma), where dx = a - c, a = attractor point, c = bin center, dc = shift in bin center
This is the default one according to the accompanying paper.
Args:
dx (torch.Tensor): The difference tensor dx = Ai - Cj, where Ai is the attractor point and Cj is the bin center.
alpha (float, optional): Proportional Attractor strength. Determines the absolute strength. Lower alpha = greater attraction. Defaults to 300.
gamma (int, optional): Exponential Attractor strength. Determines the "region of influence" and indirectly number of bin centers affected. Lower gamma = farther reach. Defaults to 2.
Returns:
torch.Tensor: Delta shifts - dc; New bin centers = Old bin centers + dc
"""
return dx.div(1+alpha*dx.pow(gamma))
class AttractorLayer(nn.Module):
def __init__(self, in_features, n_bins, n_attractors=16, mlp_dim=128, min_depth=1e-3, max_depth=10,
alpha=300, gamma=2, kind='sum', attractor_type='exp', memory_efficient=False):
"""
Attractor layer for bin centers. Bin centers are bounded on the interval (min_depth, max_depth)
"""
super().__init__()
self.n_attractors = n_attractors
self.n_bins = n_bins
self.min_depth = min_depth
self.max_depth = max_depth
self.alpha = alpha
self.gamma = gamma
self.kind = kind
self.attractor_type = attractor_type
self.memory_efficient = memory_efficient
self._net = nn.Sequential(
nn.Conv2d(in_features, mlp_dim, 1, 1, 0),
nn.ReLU(inplace=True),
nn.Conv2d(mlp_dim, n_attractors*2, 1, 1, 0), # x2 for linear norm
nn.ReLU(inplace=True)
)
def forward(self, x, b_prev, prev_b_embedding=None, interpolate=True, is_for_query=False):
"""
Args:
x (torch.Tensor) : feature block; shape - n, c, h, w
b_prev (torch.Tensor) : previous bin centers normed; shape - n, prev_nbins, h, w
Returns:
tuple(torch.Tensor,torch.Tensor) : new bin centers normed and scaled; shape - n, nbins, h, w
"""
if prev_b_embedding is not None:
if interpolate:
prev_b_embedding = nn.functional.interpolate(
prev_b_embedding, x.shape[-2:], mode='bilinear', align_corners=True)
x = x + prev_b_embedding
A = self._net(x)
eps = 1e-3
A = A + eps
n, c, h, w = A.shape
A = A.view(n, self.n_attractors, 2, h, w)
A_normed = A / A.sum(dim=2, keepdim=True)  # n, a, 2, h, w
# NOTE: the next line overwrites A_normed with the raw first channel, so the
# normalization above is effectively unused; kept as-is to match the upstream
# behavior and released checkpoints.
A_normed = A[:, :, 0, ...]  # n, na, h, w
b_prev = nn.functional.interpolate(
b_prev, (h, w), mode='bilinear', align_corners=True)
b_centers = b_prev
if self.attractor_type == 'exp':
dist = exp_attractor
else:
dist = inv_attractor
if not self.memory_efficient:
func = {'mean': torch.mean, 'sum': torch.sum}[self.kind]
# .shape N, nbins, h, w
delta_c = func(dist(A_normed.unsqueeze(
2) - b_centers.unsqueeze(1)), dim=1)
else:
delta_c = torch.zeros_like(b_centers, device=b_centers.device)
for i in range(self.n_attractors):
# .shape N, nbins, h, w
delta_c += dist(A_normed[:, i, ...].unsqueeze(1) - b_centers)
if self.kind == 'mean':
delta_c = delta_c / self.n_attractors
b_new_centers = b_centers + delta_c
B_centers = (self.max_depth - self.min_depth) * \
b_new_centers + self.min_depth
B_centers, _ = torch.sort(B_centers, dim=1)
B_centers = torch.clip(B_centers, self.min_depth, self.max_depth)
return b_new_centers, B_centers
class AttractorLayerUnnormed(nn.Module):
def __init__(self, in_features, n_bins, n_attractors=16, mlp_dim=128, min_depth=1e-3, max_depth=10,
alpha=300, gamma=2, kind='sum', attractor_type='exp', memory_efficient=False):
"""
Attractor layer for bin centers. Bin centers are unbounded
"""
super().__init__()
self.n_attractors = n_attractors
self.n_bins = n_bins
self.min_depth = min_depth
self.max_depth = max_depth
self.alpha = alpha
self.gamma = gamma
self.kind = kind
self.attractor_type = attractor_type
self.memory_efficient = memory_efficient
self._net = nn.Sequential(
nn.Conv2d(in_features, mlp_dim, 1, 1, 0),
nn.ReLU(inplace=True),
nn.Conv2d(mlp_dim, n_attractors, 1, 1, 0),
nn.Softplus()
)
def forward(self, x, b_prev, prev_b_embedding=None, interpolate=True, is_for_query=False):
"""
Args:
x (torch.Tensor) : feature block; shape - n, c, h, w
b_prev (torch.Tensor) : previous bin centers normed; shape - n, prev_nbins, h, w
Returns:
tuple(torch.Tensor,torch.Tensor) : new bin centers unbounded; shape - n, nbins, h, w. Two outputs just to keep the API consistent with the normed version
"""
if prev_b_embedding is not None:
if interpolate:
prev_b_embedding = nn.functional.interpolate(
prev_b_embedding, x.shape[-2:], mode='bilinear', align_corners=True)
x = x + prev_b_embedding
A = self._net(x)
n, c, h, w = A.shape
b_prev = nn.functional.interpolate(
b_prev, (h, w), mode='bilinear', align_corners=True)
b_centers = b_prev
if self.attractor_type == 'exp':
dist = exp_attractor
else:
dist = inv_attractor
if not self.memory_efficient:
func = {'mean': torch.mean, 'sum': torch.sum}[self.kind]
# .shape N, nbins, h, w
delta_c = func(
dist(A.unsqueeze(2) - b_centers.unsqueeze(1)), dim=1)
else:
delta_c = torch.zeros_like(b_centers, device=b_centers.device)
for i in range(self.n_attractors):
delta_c += dist(A[:, i, ...].unsqueeze(1) -
b_centers) # .shape N, nbins, h, w
if self.kind == 'mean':
delta_c = delta_c / self.n_attractors
b_new_centers = b_centers + delta_c
B_centers = b_new_centers
return b_new_centers, B_centers
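Both attractor variants map a distance `dx = a - c` to a shift `dc` that pulls bin centers toward the attractor without overshooting. Their scalar behavior can be checked in plain Python (same formulas as the torch versions above, with gamma = 2):

```python
import math

def exp_attractor(dx, alpha=300.0, gamma=2):
    """Scalar form of the exponential attractor: dc = exp(-alpha*|dx|^gamma) * dx."""
    return math.exp(-alpha * abs(dx) ** gamma) * dx

def inv_attractor(dx, alpha=300.0, gamma=2):
    """Scalar form of the inverse attractor: dc = dx / (1 + alpha*dx^gamma)."""
    return dx / (1 + alpha * dx ** gamma)

# The shift magnitude never exceeds the distance, so a center cannot
# overshoot its attractor point.
for dx in (0.0, 0.01, 0.1, -0.05):
    assert abs(inv_attractor(dx)) <= abs(dx)
    assert abs(exp_attractor(dx)) <= abs(dx)
print(inv_attractor(0.0))  # 0.0
```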


@@ -0,0 +1,121 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
import torch.nn as nn
def log_binom(n, k, eps=1e-7):
""" log(nCk) using Stirling's approximation """
n = n + eps
k = k + eps
return n * torch.log(n) - k * torch.log(k) - (n-k) * torch.log(n-k+eps)
class LogBinomial(nn.Module):
def __init__(self, n_classes=256, act=torch.softmax):
"""Compute log binomial distribution for n_classes
Args:
n_classes (int, optional): number of output classes. Defaults to 256.
"""
super().__init__()
self.K = n_classes
self.act = act
self.register_buffer('k_idx', torch.arange(
0, n_classes).view(1, -1, 1, 1))
self.register_buffer('K_minus_1', torch.Tensor(
[self.K-1]).view(1, -1, 1, 1))
def forward(self, x, t=1., eps=1e-4):
"""Compute log binomial distribution for x
Args:
x (torch.Tensor - NCHW): probabilities
t (float, torch.Tensor - NCHW, optional): Temperature of distribution. Defaults to 1..
eps (float, optional): Small number for numerical stability. Defaults to 1e-4.
Returns:
torch.Tensor -NCHW: log binomial distribution logbinomial(p;t)
"""
if x.ndim == 3:
x = x.unsqueeze(1) # make it nchw
one_minus_x = torch.clamp(1 - x, eps, 1)
x = torch.clamp(x, eps, 1)
y = log_binom(self.K_minus_1, self.k_idx) + self.k_idx * \
torch.log(x) + (self.K - 1 - self.k_idx) * torch.log(one_minus_x)
return self.act(y/t, dim=1)
class ConditionalLogBinomial(nn.Module):
def __init__(self, in_features, condition_dim, n_classes=256, bottleneck_factor=2, p_eps=1e-4, max_temp=50, min_temp=1e-7, act=torch.softmax):
"""Conditional Log Binomial distribution
Args:
in_features (int): number of input channels in main feature
condition_dim (int): number of input channels in condition feature
n_classes (int, optional): Number of classes. Defaults to 256.
bottleneck_factor (int, optional): Hidden dim factor. Defaults to 2.
p_eps (float, optional): small eps value. Defaults to 1e-4.
max_temp (float, optional): Maximum temperature of output distribution. Defaults to 50.
min_temp (float, optional): Minimum temperature of output distribution. Defaults to 1e-7.
"""
super().__init__()
self.p_eps = p_eps
self.max_temp = max_temp
self.min_temp = min_temp
self.log_binomial_transform = LogBinomial(n_classes, act=act)
bottleneck = (in_features + condition_dim) // bottleneck_factor
self.mlp = nn.Sequential(
nn.Conv2d(in_features + condition_dim, bottleneck,
kernel_size=1, stride=1, padding=0),
nn.GELU(),
# 2 for p linear norm, 2 for t linear norm
nn.Conv2d(bottleneck, 2+2, kernel_size=1, stride=1, padding=0),
nn.Softplus()
)
def forward(self, x, cond):
"""Forward pass
Args:
x (torch.Tensor - NCHW): Main feature
cond (torch.Tensor - NCHW): condition feature
Returns:
torch.Tensor: Output log binomial distribution
"""
pt = self.mlp(torch.concat((x, cond), dim=1))
p, t = pt[:, :2, ...], pt[:, 2:, ...]
p = p + self.p_eps
p = p[:, 0, ...] / (p[:, 0, ...] + p[:, 1, ...])
t = t + self.p_eps
t = t[:, 0, ...] / (t[:, 0, ...] + t[:, 1, ...])
t = t.unsqueeze(1)
t = (self.max_temp - self.min_temp) * t + self.min_temp
return self.log_binomial_transform(p, t)
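`log_binom` above is Stirling's approximation of `log C(n, k)` (without the `0.5*log` correction terms), and like the exact binomial coefficient it is symmetric under `k <-> n-k`. A quick scalar check of that symmetry (pure-Python version of the same formula):

```python
import math

def log_binom(n, k, eps=1e-7):
    """Stirling-style approximation of log(nCk), scalar version."""
    n = n + eps
    k = k + eps
    return n * math.log(n) - k * math.log(k) - (n - k) * math.log(n - k + eps)

# Symmetric in k <-> n-k, like the exact binomial coefficient.
a = log_binom(255, 10)
b = log_binom(255, 245)
print(abs(a - b) < 1e-5)  # True
```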


@@ -0,0 +1,169 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
import torch.nn as nn
class SeedBinRegressor(nn.Module):
def __init__(self, in_features, n_bins=16, mlp_dim=256, min_depth=1e-3, max_depth=10):
"""Bin center regressor network. Bin centers are bounded on (min_depth, max_depth) interval.
Args:
in_features (int): input channels
n_bins (int, optional): Number of bin centers. Defaults to 16.
mlp_dim (int, optional): Hidden dimension. Defaults to 256.
min_depth (float, optional): Min depth value. Defaults to 1e-3.
max_depth (float, optional): Max depth value. Defaults to 10.
"""
super().__init__()
self.version = "1_1"
self.min_depth = min_depth
self.max_depth = max_depth
self._net = nn.Sequential(
nn.Conv2d(in_features, mlp_dim, 1, 1, 0),
nn.ReLU(inplace=True),
nn.Conv2d(mlp_dim, n_bins, 1, 1, 0),
nn.ReLU(inplace=True)
)
def forward(self, x):
"""
Returns tensor of bin_width vectors (centers). One vector b for every pixel
"""
B = self._net(x)
eps = 1e-3
B = B + eps
B_widths_normed = B / B.sum(dim=1, keepdim=True)
B_widths = (self.max_depth - self.min_depth) * \
B_widths_normed # .shape NCHW
# pad has the form (left, right, top, bottom, front, back)
B_widths = nn.functional.pad(
B_widths, (0, 0, 0, 0, 1, 0), mode='constant', value=self.min_depth)
B_edges = torch.cumsum(B_widths, dim=1) # .shape NCHW
B_centers = 0.5 * (B_edges[:, :-1, ...] + B_edges[:, 1:, ...])
return B_widths_normed, B_centers
class SeedBinRegressorUnnormed(nn.Module):
def __init__(self, in_features, n_bins=16, mlp_dim=256, min_depth=1e-3, max_depth=10):
"""Bin center regressor network. Bin centers are unbounded
Args:
in_features (int): input channels
n_bins (int, optional): Number of bin centers. Defaults to 16.
mlp_dim (int, optional): Hidden dimension. Defaults to 256.
min_depth (float, optional): Not used. (for compatibility with SeedBinRegressor)
max_depth (float, optional): Not used. (for compatibility with SeedBinRegressor)
"""
super().__init__()
self.version = "1_1"
self._net = nn.Sequential(
nn.Conv2d(in_features, mlp_dim, 1, 1, 0),
nn.ReLU(inplace=True),
nn.Conv2d(mlp_dim, n_bins, 1, 1, 0),
nn.Softplus()
)
def forward(self, x):
"""
Returns tensor of bin_width vectors (centers). One vector b for every pixel
"""
B_centers = self._net(x)
return B_centers, B_centers
class Projector(nn.Module):
def __init__(self, in_features, out_features, mlp_dim=128):
"""Projector MLP
Args:
in_features (int): input channels
out_features (int): output channels
mlp_dim (int, optional): hidden dimension. Defaults to 128.
"""
super().__init__()
self._net = nn.Sequential(
nn.Conv2d(in_features, mlp_dim, 1, 1, 0),
nn.ReLU(inplace=True),
nn.Conv2d(mlp_dim, out_features, 1, 1, 0),
)
def forward(self, x):
return self._net(x)
class LinearSplitter(nn.Module):
def __init__(self, in_features, prev_nbins, split_factor=2, mlp_dim=128, min_depth=1e-3, max_depth=10):
super().__init__()
self.prev_nbins = prev_nbins
self.split_factor = split_factor
self.min_depth = min_depth
self.max_depth = max_depth
self._net = nn.Sequential(
nn.Conv2d(in_features, mlp_dim, 1, 1, 0),
nn.GELU(),
nn.Conv2d(mlp_dim, prev_nbins * split_factor, 1, 1, 0),
nn.ReLU()
)
def forward(self, x, b_prev, prev_b_embedding=None, interpolate=True, is_for_query=False):
"""
x : feature block; shape - n, c, h, w
b_prev : previous bin widths normed; shape - n, prev_nbins, h, w
"""
if prev_b_embedding is not None:
if interpolate:
prev_b_embedding = nn.functional.interpolate(prev_b_embedding, x.shape[-2:], mode='bilinear', align_corners=True)
x = x + prev_b_embedding
S = self._net(x)
eps = 1e-3
S = S + eps
n, c, h, w = S.shape
S = S.view(n, self.prev_nbins, self.split_factor, h, w)
S_normed = S / S.sum(dim=2, keepdim=True) # fractional splits
b_prev = nn.functional.interpolate(b_prev, (h,w), mode='bilinear', align_corners=True)
b_prev = b_prev / b_prev.sum(dim=1, keepdim=True)  # renormalize for guarantees
# TODO: can this be replaced with a single torch.repeat?
b = b_prev.unsqueeze(2) * S_normed
b = b.flatten(1,2) # .shape n, prev_nbins * split_factor, h, w
# calculate bin centers for loss calculation
B_widths = (self.max_depth - self.min_depth) * b # .shape N, nprev * splitfactor, H, W
# pad has the form (left, right, top, bottom, front, back)
B_widths = nn.functional.pad(B_widths, (0,0,0,0,1,0), mode='constant', value=self.min_depth)
B_edges = torch.cumsum(B_widths, dim=1) # .shape NCHW
B_centers = 0.5 * (B_edges[:, :-1, ...] + B_edges[:,1:,...])
return b, B_centers
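`SeedBinRegressor` and `LinearSplitter` both convert normalized widths to centers the same way: scale to the depth range, prepend `min_depth`, cumulative-sum into edges, then take edge midpoints. A 1-D sketch of that conversion (pure Python, hypothetical helper name):

```python
def widths_to_centers(widths_normed, min_depth=1e-3, max_depth=10.0):
    """Convert normalized bin widths (summing to 1) into bin centers."""
    widths = [(max_depth - min_depth) * w for w in widths_normed]
    edges = [min_depth]              # prepend min_depth, then cumulative sum
    for w in widths:
        edges.append(edges[-1] + w)
    # centers are midpoints of consecutive edges
    return [0.5 * (lo + hi) for lo, hi in zip(edges[:-1], edges[1:])]

centers = widths_to_centers([0.25, 0.25, 0.25, 0.25])
print(centers)
# every center lies strictly inside (min_depth, max_depth)
assert all(1e-3 < c < 10.0 for c in centers)
```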


@@ -0,0 +1,91 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
import torch.nn as nn
class PatchTransformerEncoder(nn.Module):
def __init__(self, in_channels, patch_size=10, embedding_dim=128, num_heads=4, use_class_token=False):
"""ViT-like transformer block
Args:
in_channels (int): Input channels
patch_size (int, optional): patch size. Defaults to 10.
embedding_dim (int, optional): Embedding dimension in transformer model. Defaults to 128.
num_heads (int, optional): number of attention heads. Defaults to 4.
use_class_token (bool, optional): Whether to use an extra token at the start for global accumulation (the "class token"). Defaults to False.
"""
super(PatchTransformerEncoder, self).__init__()
self.use_class_token = use_class_token
encoder_layers = nn.TransformerEncoderLayer(
embedding_dim, num_heads, dim_feedforward=1024)
self.transformer_encoder = nn.TransformerEncoder(
encoder_layers, num_layers=4) # takes shape S,N,E
self.embedding_convPxP = nn.Conv2d(in_channels, embedding_dim,
kernel_size=patch_size, stride=patch_size, padding=0)
def positional_encoding_1d(self, sequence_length, batch_size, embedding_dim, device='cpu'):
"""Generate positional encodings
Args:
sequence_length (int): Sequence length
embedding_dim (int): Embedding dimension
Returns:
torch.Tensor SBE: Positional encodings
"""
position = torch.arange(
0, sequence_length, dtype=torch.float32, device=device).unsqueeze(1)
index = torch.arange(
0, embedding_dim, 2, dtype=torch.float32, device=device).unsqueeze(0)
div_term = torch.exp(index * (-torch.log(torch.tensor(10000.0, device=device)) / embedding_dim))
pos_encoding = position * div_term
pos_encoding = torch.cat([torch.sin(pos_encoding), torch.cos(pos_encoding)], dim=1)
pos_encoding = pos_encoding.unsqueeze(1).repeat(1, batch_size, 1)
return pos_encoding
def forward(self, x):
"""Forward pass
Args:
x (torch.Tensor - NCHW): Input feature tensor
Returns:
torch.Tensor - SNE: Transformer output embeddings. S - sequence length (=HW/patch_size^2), N - batch size, E - embedding dim
"""
embeddings = self.embedding_convPxP(x).flatten(
2) # .shape = n,c,s = n, embedding_dim, s
if self.use_class_token:
# prepend a zero token at the start of the sequence (the "class token")
embeddings = nn.functional.pad(embeddings, (1, 0))
# change to S,N,E format required by transformer
embeddings = embeddings.permute(2, 0, 1)
S, N, E = embeddings.shape
embeddings = embeddings + self.positional_encoding_1d(S, N, E, device=embeddings.device)
x = self.transformer_encoder(embeddings) # .shape = S, N, E
return x
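The patchify convolution above (kernel = stride = `patch_size`) yields one token per non-overlapping patch, so the sequence length is `(H // p) * (W // p)` plus an optional class token, and `positional_encoding_1d` adds standard sinusoidal encodings. A minimal dependency-free sketch of both (names are illustrative, not part of the module):

```python
import math

def patch_seq_len(h, w, patch_size, use_class_token=False):
    # Conv2d with kernel_size == stride tiles the feature map into
    # non-overlapping patches: one token per patch, plus an optional
    # class token prepended to the sequence.
    s = (h // patch_size) * (w // patch_size)
    return s + 1 if use_class_token else s

def sinusoidal_encoding(seq_len, dim):
    # Mirrors positional_encoding_1d: for each position, sines of
    # position * 10000^(-i/dim) for even i, followed by the cosines.
    enc = []
    for pos in range(seq_len):
        factors = [math.exp(-math.log(10000.0) * i / dim)
                   for i in range(0, dim, 2)]
        sines = [math.sin(pos * f) for f in factors]
        coses = [math.cos(pos * f) for f in factors]
        enc.append(sines + coses)
    return enc
```

For example, a 40x60 feature map with `patch_size=10` gives a 24-token sequence (25 with the class token), matching the `S = HW / patch_size^2` shape noted in `forward`.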


@@ -0,0 +1,92 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
def load_state_dict(model, state_dict):
"""Load state_dict into model, handling DataParallel and DistributedDataParallel. Also checks for "model" key in state_dict.
DataParallel prefixes state_dict keys with 'module.' when saving.
If the model is not a DataParallel model but the state_dict is, then prefixes are removed.
If the model is a DataParallel model but the state_dict is not, then prefixes are added.
"""
state_dict = state_dict.get('model', state_dict)
# if model is a DataParallel model, then state_dict keys are prefixed with 'module.'
do_prefix = isinstance(
model, (torch.nn.DataParallel, torch.nn.parallel.DistributedDataParallel))
state = {}
for k, v in state_dict.items():
if k.startswith('module.') and not do_prefix:
k = k[7:]
if not k.startswith('module.') and do_prefix:
k = 'module.' + k
state[k] = v
model.load_state_dict(state)
print("Loaded successfully")
return model
def load_wts(model, checkpoint_path):
ckpt = torch.load(checkpoint_path, map_location='cpu')
return load_state_dict(model, ckpt)
def load_state_dict_from_url(model, url, **kwargs):
state_dict = torch.hub.load_state_dict_from_url(url, map_location='cpu', **kwargs)
return load_state_dict(model, state_dict)
def load_state_from_resource(model, resource: str):
"""Loads weights to the model from a given resource. A resource can be of following types:
1. URL. Prefixed with "url::"
e.g. url::http(s)://url.resource.com/ckpt.pt
2. Local path. Prefixed with "local::"
e.g. local::/path/to/ckpt.pt
Args:
model (torch.nn.Module): Model
resource (str): resource string
Returns:
torch.nn.Module: Model with loaded weights
"""
print(f"Using pretrained resource {resource}")
if resource.startswith('url::'):
url = resource.split('url::')[1]
return load_state_dict_from_url(model, url, progress=True)
elif resource.startswith('local::'):
path = resource.split('local::')[1]
return load_wts(model, path)
else:
raise ValueError("Invalid resource type, only url:: and local:: are supported")
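The resource-string convention used by `load_state_from_resource` is a simple prefix scheme. A small standalone sketch of the parsing step (the `parse_resource` helper is hypothetical, not part of this file):

```python
def parse_resource(resource: str):
    # Hypothetical helper mirroring load_state_from_resource's convention:
    # "url::<http(s) url>" or "local::<filesystem path>".
    if resource.startswith("url::"):
        return "url", resource[len("url::"):]
    if resource.startswith("local::"):
        return "local", resource[len("local::"):]
    raise ValueError(
        "Invalid resource type, only url:: and local:: are supported")
```

For example, `parse_resource("local::/path/to/ckpt.pt")` returns `("local", "/path/to/ckpt.pt")`, after which the loader dispatches to `load_wts` or `load_state_dict_from_url`.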


@@ -0,0 +1,31 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
from .zoedepth_v1 import ZoeDepth
all_versions = {
"v1": ZoeDepth,
}
get_version = lambda v : all_versions[v]


@@ -0,0 +1,58 @@
{
"model": {
"name": "ZoeDepth",
"version_name": "v1",
"n_bins": 64,
"bin_embedding_dim": 128,
"bin_centers_type": "softplus",
"n_attractors":[16, 8, 4, 1],
"attractor_alpha": 1000,
"attractor_gamma": 2,
"attractor_kind" : "mean",
"attractor_type" : "inv",
"midas_model_type" : "DPT_BEiT_L_384",
"min_temp": 0.0212,
"max_temp": 50.0,
"output_distribution": "logbinomial",
"memory_efficient": true,
"inverse_midas": false,
"img_size": [392, 518]
},
"train": {
"train_midas": true,
"use_pretrained_midas": true,
"trainer": "zoedepth",
"epochs": 5,
"bs": 16,
"optim_kwargs": {"lr": 0.000161, "wd": 0.01},
"sched_kwargs": {"div_factor": 1, "final_div_factor": 10000, "pct_start": 0.7, "three_phase":false, "cycle_momentum": true},
"same_lr": false,
"w_si": 1,
"w_domain": 0.2,
"w_reg": 0,
"w_grad": 0,
"avoid_boundary": false,
"random_crop": false,
"input_width": 640,
"input_height": 480,
"midas_lr_factor": 50,
"encoder_lr_factor":50,
"pos_enc_lr_factor":50,
"freeze_midas_bn": true
},
"infer":{
"train_midas": false,
"use_pretrained_midas": false,
"pretrained_resource" : "url::https://github.com/isl-org/ZoeDepth/releases/download/v1.0/ZoeD_M12_N.pt",
"force_keep_ar": true
},
"eval":{
"train_midas": false,
"use_pretrained_midas": false,
"pretrained_resource" : "url::https://github.com/isl-org/ZoeDepth/releases/download/v1.0/ZoeD_M12_N.pt"
}
}


@@ -0,0 +1,22 @@
{
"model": {
"bin_centers_type": "normed",
"img_size": [384, 768]
},
"train": {
},
"infer":{
"train_midas": false,
"use_pretrained_midas": false,
"pretrained_resource" : "url::https://github.com/isl-org/ZoeDepth/releases/download/v1.0/ZoeD_M12_K.pt",
"force_keep_ar": true
},
"eval":{
"train_midas": false,
"use_pretrained_midas": false,
"pretrained_resource" : "url::https://github.com/isl-org/ZoeDepth/releases/download/v1.0/ZoeD_M12_K.pt"
}
}
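This KITTI config only lists the keys it changes (e.g. `bin_centers_type`, `img_size`, the `pretrained_resource` URLs), which suggests it is layered over a base config. A hedged sketch of such an overlay, assuming a recursive dict merge (the actual ZoeDepth config loader may differ):

```python
def deep_update(base: dict, overrides: dict) -> dict:
    # Layer a partial config over a base config, recursing into
    # nested dicts so untouched keys in "model", "train", etc. survive.
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = value
    return merged
```

Merging `{"model": {"bin_centers_type": "normed", "img_size": [384, 768]}}` over a base with `"model": {"bin_centers_type": "softplus", "n_bins": 64}` keeps `n_bins` while switching the bin-center activation.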


@@ -0,0 +1,264 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import itertools
import torch
import torch.nn as nn
from zoedepth.models.depth_model import DepthModel
from zoedepth.models.base_models.midas import MidasCore
from zoedepth.models.base_models.depth_anything import DepthAnythingCore
from zoedepth.models.layers.attractor import AttractorLayer, AttractorLayerUnnormed
from zoedepth.models.layers.dist_layers import ConditionalLogBinomial
from zoedepth.models.layers.localbins_layers import (Projector, SeedBinRegressor,
SeedBinRegressorUnnormed)
from zoedepth.models.model_io import load_state_from_resource
class ZoeDepth(DepthModel):
def __init__(self, core, n_bins=64, bin_centers_type="softplus", bin_embedding_dim=128, min_depth=1e-3, max_depth=10,
n_attractors=[16, 8, 4, 1], attractor_alpha=300, attractor_gamma=2, attractor_kind='sum', attractor_type='exp', min_temp=5, max_temp=50, train_midas=True,
midas_lr_factor=10, encoder_lr_factor=10, pos_enc_lr_factor=10, inverse_midas=False, **kwargs):
"""ZoeDepth model. This is the version of ZoeDepth that has a single metric head
Args:
core (models.base_models.midas.MidasCore): The base midas model that is used for extraction of "relative" features
n_bins (int, optional): Number of bin centers. Defaults to 64.
bin_centers_type (str, optional): "normed" or "softplus". Activation type used for bin centers. For "normed", a linear normalization trick is applied, which yields bounded bin centers.
For "softplus", the softplus activation is used, so the centers are unbounded. Defaults to "softplus".
bin_embedding_dim (int, optional): bin embedding dimension. Defaults to 128.
min_depth (float, optional): Lower bound for normed bin centers. Defaults to 1e-3.
max_depth (float, optional): Upper bound for normed bin centers. Defaults to 10.
n_attractors (List[int], optional): Number of bin attractors at decoder layers. Defaults to [16, 8, 4, 1].
attractor_alpha (int, optional): Proportional attractor strength. Refer to models.layers.attractor for more details. Defaults to 300.
attractor_gamma (int, optional): Exponential attractor strength. Refer to models.layers.attractor for more details. Defaults to 2.
attractor_kind (str, optional): Attraction aggregation "sum" or "mean". Defaults to 'sum'.
attractor_type (str, optional): Type of attractor to use; "inv" (Inverse attractor) or "exp" (Exponential attractor). Defaults to 'exp'.
min_temp (int, optional): Lower bound for temperature of output probability distribution. Defaults to 5.
max_temp (int, optional): Upper bound for temperature of output probability distribution. Defaults to 50.
train_midas (bool, optional): Whether to train "core", the base midas model. Defaults to True.
midas_lr_factor (int, optional): Learning rate reduction factor for base midas model except its encoder and positional encodings. Defaults to 10.
encoder_lr_factor (int, optional): Learning rate reduction factor for the encoder in midas model. Defaults to 10.
pos_enc_lr_factor (int, optional): Learning rate reduction factor for positional encodings in the base midas model. Defaults to 10.
"""
super().__init__()
self.core = core
self.max_depth = max_depth
self.min_depth = min_depth
self.min_temp = min_temp
self.bin_centers_type = bin_centers_type
self.midas_lr_factor = midas_lr_factor
self.encoder_lr_factor = encoder_lr_factor
self.pos_enc_lr_factor = pos_enc_lr_factor
self.train_midas = train_midas
self.inverse_midas = inverse_midas
if self.encoder_lr_factor <= 0:
self.core.freeze_encoder(
freeze_rel_pos=self.pos_enc_lr_factor <= 0)
N_MIDAS_OUT = 32
btlnck_features = self.core.output_channels[0]
num_out_features = self.core.output_channels[1:]
self.conv2 = nn.Conv2d(btlnck_features, btlnck_features,
kernel_size=1, stride=1, padding=0) # btlnck conv
if bin_centers_type == "normed":
SeedBinRegressorLayer = SeedBinRegressor
Attractor = AttractorLayer
elif bin_centers_type == "softplus":
SeedBinRegressorLayer = SeedBinRegressorUnnormed
Attractor = AttractorLayerUnnormed
elif bin_centers_type == "hybrid1":
SeedBinRegressorLayer = SeedBinRegressor
Attractor = AttractorLayerUnnormed
elif bin_centers_type == "hybrid2":
SeedBinRegressorLayer = SeedBinRegressorUnnormed
Attractor = AttractorLayer
else:
raise ValueError(
"bin_centers_type should be one of 'normed', 'softplus', 'hybrid1', 'hybrid2'")
self.seed_bin_regressor = SeedBinRegressorLayer(
btlnck_features, n_bins=n_bins, min_depth=min_depth, max_depth=max_depth)
self.seed_projector = Projector(btlnck_features, bin_embedding_dim)
self.projectors = nn.ModuleList([
Projector(num_out, bin_embedding_dim)
for num_out in num_out_features
])
self.attractors = nn.ModuleList([
Attractor(bin_embedding_dim, n_bins, n_attractors=n_attractors[i], min_depth=min_depth, max_depth=max_depth,
alpha=attractor_alpha, gamma=attractor_gamma, kind=attractor_kind, attractor_type=attractor_type)
for i in range(len(num_out_features))
])
last_in = N_MIDAS_OUT + 1 # +1 for relative depth
# use log binomial instead of softmax
self.conditional_log_binomial = ConditionalLogBinomial(
last_in, bin_embedding_dim, n_classes=n_bins, min_temp=min_temp, max_temp=max_temp)
def forward(self, x, return_final_centers=False, denorm=False, return_probs=False, **kwargs):
"""
Args:
x (torch.Tensor): Input image tensor of shape (B, C, H, W)
return_final_centers (bool, optional): Whether to return the final bin centers. Defaults to False.
denorm (bool, optional): Whether to denormalize the input image. This reverses ImageNet normalization as midas normalization is different. Defaults to False.
return_probs (bool, optional): Whether to return the output probability distribution. Defaults to False.
Returns:
dict: Dictionary containing the following keys:
- rel_depth (torch.Tensor): Relative depth map of shape (B, H, W)
- metric_depth (torch.Tensor): Metric depth map of shape (B, 1, H, W)
- bin_centers (torch.Tensor): Bin centers of shape (B, n_bins). Present only if return_final_centers is True
- probs (torch.Tensor): Output probability distribution of shape (B, n_bins, H, W). Present only if return_probs is True
"""
b, c, h, w = x.shape
self.orig_input_width = w
self.orig_input_height = h
rel_depth, out = self.core(x, denorm=denorm, return_rel_depth=True)
outconv_activation = out[0]
btlnck = out[1]
x_blocks = out[2:]
x_d0 = self.conv2(btlnck)
x = x_d0
_, seed_b_centers = self.seed_bin_regressor(x)
if self.bin_centers_type == 'normed' or self.bin_centers_type == 'hybrid2':
b_prev = (seed_b_centers - self.min_depth) / \
(self.max_depth - self.min_depth)
else:
b_prev = seed_b_centers
prev_b_embedding = self.seed_projector(x)
# unroll this loop for better performance
for projector, attractor, x in zip(self.projectors, self.attractors, x_blocks):
b_embedding = projector(x)
b, b_centers = attractor(
b_embedding, b_prev, prev_b_embedding, interpolate=True)
b_prev = b.clone()
prev_b_embedding = b_embedding.clone()
last = outconv_activation
if self.inverse_midas:
# invert depth followed by normalization
rel_depth = 1.0 / (rel_depth + 1e-6)
rel_depth = (rel_depth - rel_depth.min()) / \
(rel_depth.max() - rel_depth.min())
# concat rel depth with last. First interpolate rel depth to last size
rel_cond = rel_depth.unsqueeze(1)
rel_cond = nn.functional.interpolate(
rel_cond, size=last.shape[2:], mode='bilinear', align_corners=True)
last = torch.cat([last, rel_cond], dim=1)
b_embedding = nn.functional.interpolate(
b_embedding, last.shape[-2:], mode='bilinear', align_corners=True)
x = self.conditional_log_binomial(last, b_embedding)
# Final depth is the expectation over bin centers: sum_k p_k * c_k
b_centers = nn.functional.interpolate(
b_centers, x.shape[-2:], mode='bilinear', align_corners=True)
out = torch.sum(x * b_centers, dim=1, keepdim=True)
# Structure output dict
output = dict(metric_depth=out)
if return_final_centers or return_probs:
output['bin_centers'] = b_centers
if return_probs:
output['probs'] = x
return output
def get_lr_params(self, lr):
"""
Learning rate configuration for different layers of the model
Args:
lr (float) : Base learning rate
Returns:
list : list of parameters to optimize and their learning rates, in the format required by torch optimizers.
"""
param_conf = []
if self.train_midas:
if self.encoder_lr_factor > 0:
param_conf.append({'params': self.core.get_enc_params_except_rel_pos(
), 'lr': lr / self.encoder_lr_factor})
if self.pos_enc_lr_factor > 0:
param_conf.append(
{'params': self.core.get_rel_pos_params(), 'lr': lr / self.pos_enc_lr_factor})
# midas_params = self.core.core.scratch.parameters()
midas_params = self.core.core.depth_head.parameters()
midas_lr_factor = self.midas_lr_factor
param_conf.append(
{'params': midas_params, 'lr': lr / midas_lr_factor})
remaining_modules = []
for name, child in self.named_children():
if name != 'core':
remaining_modules.append(child)
remaining_params = itertools.chain(
*[child.parameters() for child in remaining_modules])
param_conf.append({'params': remaining_params, 'lr': lr})
return param_conf
@staticmethod
def build(midas_model_type="DPT_BEiT_L_384", pretrained_resource=None, use_pretrained_midas=False, train_midas=False, freeze_midas_bn=True, **kwargs):
# core = MidasCore.build(midas_model_type=midas_model_type, use_pretrained_midas=use_pretrained_midas,
# train_midas=train_midas, fetch_features=True, freeze_bn=freeze_midas_bn, **kwargs)
core = DepthAnythingCore.build(midas_model_type=midas_model_type, use_pretrained_midas=use_pretrained_midas,
train_midas=train_midas, fetch_features=True, freeze_bn=freeze_midas_bn, **kwargs)
model = ZoeDepth(core, **kwargs)
if pretrained_resource:
assert isinstance(pretrained_resource, str), "pretrained_resource must be a string"
model = load_state_from_resource(model, pretrained_resource)
return model
@staticmethod
def build_from_config(config):
return ZoeDepth.build(**config)
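The last step of `forward` above, `torch.sum(x * b_centers, dim=1, keepdim=True)`, computes the per-pixel metric depth as the expectation of the bin centers under the predicted log-binomial distribution. A minimal pure-Python sketch for a single pixel:

```python
def expected_depth(probs, centers):
    # Per-pixel metric depth as the expectation over bin centers:
    # out = sum_k p_k * c_k (what torch.sum(x * b_centers, dim=1) does
    # across the bin dimension).
    assert abs(sum(probs) - 1.0) < 1e-6, "probs must sum to 1"
    return sum(p * c for p, c in zip(probs, centers))
```

With uniform probabilities over centers `[1, 2, 3, 4]` the expected depth is 2.5; a one-hot distribution returns the corresponding center exactly.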


@@ -0,0 +1,31 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
from .zoedepth_nk_v1 import ZoeDepthNK
all_versions = {
"v1": ZoeDepthNK,
}
get_version = lambda v : all_versions[v]


@@ -0,0 +1,67 @@
{
"model": {
"name": "ZoeDepthNK",
"version_name": "v1",
"bin_conf" : [
{
"name": "nyu",
"n_bins": 64,
"min_depth": 1e-3,
"max_depth": 10.0
},
{
"name": "kitti",
"n_bins": 64,
"min_depth": 1e-3,
"max_depth": 80.0
}
],
"bin_embedding_dim": 128,
"bin_centers_type": "softplus",
"n_attractors":[16, 8, 4, 1],
"attractor_alpha": 1000,
"attractor_gamma": 2,
"attractor_kind" : "mean",
"attractor_type" : "inv",
"min_temp": 0.0212,
"max_temp": 50.0,
"memory_efficient": true,
"midas_model_type" : "DPT_BEiT_L_384",
"img_size": [392, 518]
},
"train": {
"train_midas": true,
"use_pretrained_midas": true,
"trainer": "zoedepth_nk",
"epochs": 10,
"bs": 16,
"optim_kwargs": {"lr": 0.0002512, "wd": 0.01},
"sched_kwargs": {"div_factor": 1, "final_div_factor": 10000, "pct_start": 0.7, "three_phase":false, "cycle_momentum": true},
"same_lr": false,
"w_si": 1,
"w_domain": 100,
"avoid_boundary": false,
"random_crop": false,
"input_width": 640,
"input_height": 480,
"w_grad": 0,
"w_reg": 0,
"midas_lr_factor": 50,
"encoder_lr_factor": 50,
"pos_enc_lr_factor": 50
},
"infer": {
"train_midas": false,
"pretrained_resource": "url::https://github.com/isl-org/ZoeDepth/releases/download/v1.0/ZoeD_M12_NK.pt",
"use_pretrained_midas": false,
"force_keep_ar": true
},
"eval": {
"train_midas": false,
"pretrained_resource": "url::https://github.com/isl-org/ZoeDepth/releases/download/v1.0/ZoeD_M12_NK.pt",
"use_pretrained_midas": false
}
}
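ZoeDepthNK (below) picks between its NYU and KITTI heads by summing per-image domain logits over the batch, softmaxing the summed vote, and taking the argmax. A dependency-free sketch of that routing step (the `route_domain` helper is illustrative, not part of the model):

```python
import math

def route_domain(domain_logits):
    # domain_logits: list of per-image [nyu_logit, kitti_logit] pairs.
    # Sum over the batch (a single vote per batch), softmax, argmax --
    # mirroring domain_vote in ZoeDepthNK.forward.
    summed = [sum(col) for col in zip(*domain_logits)]
    m = max(summed)  # subtract max for numerical stability
    exps = [math.exp(v - m) for v in summed]
    total = sum(exps)
    probs = [e / total for e in exps]
    names = ["nyu", "kitti"]
    return names[probs.index(max(probs))]
```

Note the whole batch is routed to a single head, which is why `forward` assumes all images in a batch come from the same domain.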


@@ -0,0 +1,341 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import itertools
import torch
import torch.nn as nn
from zoedepth.models.depth_model import DepthModel
from zoedepth.models.base_models.midas import MidasCore
from zoedepth.models.base_models.depth_anything import DepthAnythingCore
from zoedepth.models.layers.attractor import AttractorLayer, AttractorLayerUnnormed
from zoedepth.models.layers.dist_layers import ConditionalLogBinomial
from zoedepth.models.layers.localbins_layers import (Projector, SeedBinRegressor,
SeedBinRegressorUnnormed)
from zoedepth.models.layers.patch_transformer import PatchTransformerEncoder
from zoedepth.models.model_io import load_state_from_resource
class ZoeDepthNK(DepthModel):
def __init__(self, core, bin_conf, bin_centers_type="softplus", bin_embedding_dim=128,
n_attractors=[16, 8, 4, 1], attractor_alpha=300, attractor_gamma=2, attractor_kind='sum', attractor_type='exp',
min_temp=5, max_temp=50,
memory_efficient=False, train_midas=True,
is_midas_pretrained=True, midas_lr_factor=1, encoder_lr_factor=10, pos_enc_lr_factor=10, inverse_midas=False, **kwargs):
"""ZoeDepthNK model. This is the version of ZoeDepth that has two metric heads and uses a learned router to route to experts.
Args:
core (models.base_models.midas.MidasCore): The base midas model that is used for extraction of "relative" features
bin_conf (List[dict]): A list of dictionaries that contain the bin configuration for each metric head. Each dictionary should contain the following keys:
"name" (str, typically same as the dataset name), "n_bins" (int), "min_depth" (float), "max_depth" (float)
The length of this list determines the number of metric heads.
bin_centers_type (str, optional): "normed" or "softplus". Activation type used for bin centers. For "normed", a linear normalization trick is applied, which yields bounded bin centers.
For "softplus", the softplus activation is used, so the centers are unbounded. Defaults to "softplus".
bin_embedding_dim (int, optional): bin embedding dimension. Defaults to 128.
n_attractors (List[int], optional): Number of bin attractors at decoder layers. Defaults to [16, 8, 4, 1].
attractor_alpha (int, optional): Proportional attractor strength. Refer to models.layers.attractor for more details. Defaults to 300.
attractor_gamma (int, optional): Exponential attractor strength. Refer to models.layers.attractor for more details. Defaults to 2.
attractor_kind (str, optional): Attraction aggregation "sum" or "mean". Defaults to 'sum'.
attractor_type (str, optional): Type of attractor to use; "inv" (Inverse attractor) or "exp" (Exponential attractor). Defaults to 'exp'.
min_temp (int, optional): Lower bound for temperature of output probability distribution. Defaults to 5.
max_temp (int, optional): Upper bound for temperature of output probability distribution. Defaults to 50.
memory_efficient (bool, optional): Whether to use the memory-efficient version of the attractor layers. It is slower but recommended when using multiple metric heads, to save GPU memory. Defaults to False.
train_midas (bool, optional): Whether to train "core", the base midas model. Defaults to True.
is_midas_pretrained (bool, optional): Is "core" pretrained? Defaults to True.
midas_lr_factor (int, optional): Learning rate reduction factor for the base midas model, except its encoder and positional encodings. Defaults to 1.
encoder_lr_factor (int, optional): Learning rate reduction factor for the encoder in midas model. Defaults to 10.
pos_enc_lr_factor (int, optional): Learning rate reduction factor for positional encodings in the base midas model. Defaults to 10.
"""
super().__init__()
self.core = core
self.bin_conf = bin_conf
self.min_temp = min_temp
self.max_temp = max_temp
self.memory_efficient = memory_efficient
self.train_midas = train_midas
self.is_midas_pretrained = is_midas_pretrained
self.midas_lr_factor = midas_lr_factor
self.encoder_lr_factor = encoder_lr_factor
self.pos_enc_lr_factor = pos_enc_lr_factor
self.inverse_midas = inverse_midas
N_MIDAS_OUT = 32
btlnck_features = self.core.output_channels[0]
num_out_features = self.core.output_channels[1:]
# self.scales = [16, 8, 4, 2] # spatial scale factors
self.conv2 = nn.Conv2d(
btlnck_features, btlnck_features, kernel_size=1, stride=1, padding=0)
# Transformer classifier on the bottleneck
self.patch_transformer = PatchTransformerEncoder(
btlnck_features, 1, 128, use_class_token=True)
self.mlp_classifier = nn.Sequential(
nn.Linear(128, 128),
nn.ReLU(),
nn.Linear(128, 2)
)
if bin_centers_type == "normed":
SeedBinRegressorLayer = SeedBinRegressor
Attractor = AttractorLayer
elif bin_centers_type == "softplus":
SeedBinRegressorLayer = SeedBinRegressorUnnormed
Attractor = AttractorLayerUnnormed
elif bin_centers_type == "hybrid1":
SeedBinRegressorLayer = SeedBinRegressor
Attractor = AttractorLayerUnnormed
elif bin_centers_type == "hybrid2":
SeedBinRegressorLayer = SeedBinRegressorUnnormed
Attractor = AttractorLayer
else:
raise ValueError(
"bin_centers_type should be one of 'normed', 'softplus', 'hybrid1', 'hybrid2'")
self.bin_centers_type = bin_centers_type
# We have bins for each bin conf.
# Create a map (ModuleDict) of 'name' -> seed_bin_regressor
self.seed_bin_regressors = nn.ModuleDict(
{conf['name']: SeedBinRegressorLayer(btlnck_features, conf["n_bins"], mlp_dim=bin_embedding_dim//2, min_depth=conf["min_depth"], max_depth=conf["max_depth"])
for conf in bin_conf}
)
self.seed_projector = Projector(
btlnck_features, bin_embedding_dim, mlp_dim=bin_embedding_dim//2)
self.projectors = nn.ModuleList([
Projector(num_out, bin_embedding_dim, mlp_dim=bin_embedding_dim//2)
for num_out in num_out_features
])
# Create a map (ModuleDict) of 'name' -> attractors (ModuleList)
self.attractors = nn.ModuleDict(
{conf['name']: nn.ModuleList([
Attractor(bin_embedding_dim, n_attractors[i],
mlp_dim=bin_embedding_dim, alpha=attractor_alpha,
gamma=attractor_gamma, kind=attractor_kind,
attractor_type=attractor_type, memory_efficient=memory_efficient,
min_depth=conf["min_depth"], max_depth=conf["max_depth"])
for i in range(len(n_attractors))
])
for conf in bin_conf}
)
last_in = N_MIDAS_OUT
# conditional log binomial for each bin conf
self.conditional_log_binomial = nn.ModuleDict(
{conf['name']: ConditionalLogBinomial(last_in, bin_embedding_dim, conf['n_bins'], bottleneck_factor=4, min_temp=self.min_temp, max_temp=self.max_temp)
for conf in bin_conf}
)
def forward(self, x, return_final_centers=False, denorm=False, return_probs=False, **kwargs):
"""
Args:
x (torch.Tensor): Input image tensor of shape (B, C, H, W). Assumes all images are from the same domain.
return_final_centers (bool, optional): Whether to return the final centers of the attractors. Defaults to False.
denorm (bool, optional): Whether to denormalize the input image. Defaults to False.
return_probs (bool, optional): Whether to return the probabilities of the bins. Defaults to False.
Returns:
dict: Dictionary of outputs with keys:
- "rel_depth": Relative depth map of shape (B, 1, H, W)
- "metric_depth": Metric depth map of shape (B, 1, H, W)
- "domain_logits": Domain logits of shape (B, 2)
- "bin_centers": Bin centers of shape (B, N, H, W). Present only if return_final_centers is True
- "probs": Bin probabilities of shape (B, N, H, W). Present only if return_probs is True
"""
b, c, h, w = x.shape
self.orig_input_width = w
self.orig_input_height = h
rel_depth, out = self.core(x, denorm=denorm, return_rel_depth=True)
outconv_activation = out[0]
btlnck = out[1]
x_blocks = out[2:]
x_d0 = self.conv2(btlnck)
x = x_d0
# Predict which path to take
embedding = self.patch_transformer(x)[0] # N, E
domain_logits = self.mlp_classifier(embedding) # N, 2
domain_vote = torch.softmax(domain_logits.sum(
dim=0, keepdim=True), dim=-1) # 1, 2
# Get the path
bin_conf_name = ["nyu", "kitti"][torch.argmax(
domain_vote, dim=-1).squeeze().item()]
try:
conf = [c for c in self.bin_conf if c["name"] == bin_conf_name][0]
except IndexError:
raise ValueError(
f"bin_conf_name {bin_conf_name} not found in bin_confs")
min_depth = conf['min_depth']
max_depth = conf['max_depth']
seed_bin_regressor = self.seed_bin_regressors[bin_conf_name]
_, seed_b_centers = seed_bin_regressor(x)
if self.bin_centers_type == 'normed' or self.bin_centers_type == 'hybrid2':
b_prev = (seed_b_centers - min_depth)/(max_depth - min_depth)
else:
b_prev = seed_b_centers
prev_b_embedding = self.seed_projector(x)
attractors = self.attractors[bin_conf_name]
for projector, attractor, x in zip(self.projectors, attractors, x_blocks):
b_embedding = projector(x)
b, b_centers = attractor(
b_embedding, b_prev, prev_b_embedding, interpolate=True)
b_prev = b
prev_b_embedding = b_embedding
last = outconv_activation
b_centers = nn.functional.interpolate(
b_centers, last.shape[-2:], mode='bilinear', align_corners=True)
b_embedding = nn.functional.interpolate(
b_embedding, last.shape[-2:], mode='bilinear', align_corners=True)
clb = self.conditional_log_binomial[bin_conf_name]
x = clb(last, b_embedding)
# Final depth is the expectation over bin centers: sum_k p_k * c_k
out = torch.sum(x * b_centers, dim=1, keepdim=True)
output = dict(domain_logits=domain_logits, metric_depth=out)
if return_final_centers or return_probs:
output['bin_centers'] = b_centers
if return_probs:
output['probs'] = x
return output
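The last step of the forward pass computes metric depth as the expected value of the bin centers under the per-pixel bin probabilities. A minimal sketch of that reduction, with made-up shapes rather than the model's actual tensors:

```python
import torch

# Illustrative shapes only (assumed, not the repo's actual tensor sizes).
B, N, H, W = 2, 4, 3, 3
logits = torch.randn(B, N, H, W)
probs = torch.softmax(logits, dim=1)        # per-pixel distribution over N bins
b_centers = torch.rand(B, N, H, W) * 10.0   # per-pixel bin centers (e.g. metres)
# Expectation over bins, as in `torch.sum(x * b_centers, dim=1, keepdim=True)` above.
depth = torch.sum(probs * b_centers, dim=1, keepdim=True)  # (B, 1, H, W)
print(depth.shape)
```

Because the probabilities sum to one per pixel, the result always lies between the smallest and largest bin center at that pixel.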
def get_lr_params(self, lr):
"""
Learning rate configuration for different layers of the model
Args:
lr (float) : Base learning rate
Returns:
list : list of parameters to optimize and their learning rates, in the format required by torch optimizers.
"""
param_conf = []
if self.train_midas:
def get_rel_pos_params():
for name, p in self.core.core.pretrained.named_parameters():
# if "relative_position" in name:
if "pos_embed" in name:
yield p
def get_enc_params_except_rel_pos():
for name, p in self.core.core.pretrained.named_parameters():
# if "relative_position" not in name:
if "pos_embed" not in name:
yield p
encoder_params = get_enc_params_except_rel_pos()
rel_pos_params = get_rel_pos_params()
# midas_params = self.core.core.scratch.parameters()
midas_params = self.core.core.depth_head.parameters()
midas_lr_factor = self.midas_lr_factor if self.is_midas_pretrained else 1.0
param_conf.extend([
{'params': encoder_params, 'lr': lr / self.encoder_lr_factor},
{'params': rel_pos_params, 'lr': lr / self.pos_enc_lr_factor},
{'params': midas_params, 'lr': lr / midas_lr_factor}
])
remaining_modules = []
for name, child in self.named_children():
if name != 'core':
remaining_modules.append(child)
remaining_params = itertools.chain(
*[child.parameters() for child in remaining_modules])
param_conf.append({'params': remaining_params, 'lr': lr})
return param_conf
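`get_lr_params` returns torch-optimizer-style parameter groups, each carrying its own `'lr'` that overrides the optimizer default. A hedged sketch of how such groups are consumed (the modules and factor here are illustrative, not the repo's):

```python
import torch
import torch.nn as nn

# Two toy parameter groups with different learning rates, mimicking the
# {'params': ..., 'lr': lr / factor} dicts built in get_lr_params.
encoder = nn.Linear(8, 8)
head = nn.Linear(8, 1)
base_lr = 1e-4
param_conf = [
    {'params': encoder.parameters(), 'lr': base_lr / 10},  # e.g. lr / encoder_lr_factor
    {'params': head.parameters(), 'lr': base_lr},
]
opt = torch.optim.AdamW(param_conf, lr=base_lr, weight_decay=0.01)
print([g['lr'] for g in opt.param_groups])
```

The `lr=` passed to `AdamW` only applies to groups that do not specify their own.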
def get_conf_parameters(self, conf_name):
"""
Returns parameters of all the ModuleDicts children that are exclusively used for the given bin configuration
"""
params = []
for name, child in self.named_children():
if isinstance(child, nn.ModuleDict):
for bin_conf_name, module in child.items():
if bin_conf_name == conf_name:
params += list(module.parameters())
return params
def freeze_conf(self, conf_name):
"""
Freezes all the parameters of all the ModuleDicts children that are exclusively used for the given bin configuration
"""
for p in self.get_conf_parameters(conf_name):
p.requires_grad = False
def unfreeze_conf(self, conf_name):
"""
Unfreezes all the parameters of all the ModuleDicts children that are exclusively used for the given bin configuration
"""
for p in self.get_conf_parameters(conf_name):
p.requires_grad = True
def freeze_all_confs(self):
"""
Freezes all the parameters of all the ModuleDicts children
"""
for name, child in self.named_children():
if isinstance(child, nn.ModuleDict):
for bin_conf_name, module in child.items():
for p in module.parameters():
p.requires_grad = False
@staticmethod
def build(midas_model_type="DPT_BEiT_L_384", pretrained_resource=None, use_pretrained_midas=False, train_midas=False, freeze_midas_bn=True, **kwargs):
# core = MidasCore.build(midas_model_type=midas_model_type, use_pretrained_midas=use_pretrained_midas,
# train_midas=train_midas, fetch_features=True, freeze_bn=freeze_midas_bn, **kwargs)
core = DepthAnythingCore.build(midas_model_type='dinov2_large', use_pretrained_midas=use_pretrained_midas,
train_midas=train_midas, fetch_features=True, freeze_bn=freeze_midas_bn, **kwargs)
model = ZoeDepthNK(core, **kwargs)
if pretrained_resource:
assert isinstance(pretrained_resource, str), "pretrained_resource must be a string"
model = load_state_from_resource(model, pretrained_resource)
return model
@staticmethod
def build_from_config(config):
return ZoeDepthNK.build(**config)


@@ -0,0 +1,326 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import os
import uuid
import warnings
from datetime import datetime as dt
from typing import Dict
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.distributed as dist
import torch.nn as nn
import torch.optim as optim
import wandb
from tqdm import tqdm
from zoedepth.utils.config import flatten
from zoedepth.utils.misc import RunningAverageDict, colorize, colors
def is_rank_zero(args):
return args.rank == 0
class BaseTrainer:
def __init__(self, config, model, train_loader, test_loader=None, device=None):
""" Base Trainer class for training a model."""
self.config = config
self.metric_criterion = "abs_rel"
if device is None:
device = torch.device(
'cuda') if torch.cuda.is_available() else torch.device('cpu')
self.device = device
self.model = model
self.train_loader = train_loader
self.test_loader = test_loader
self.optimizer = self.init_optimizer()
self.scheduler = self.init_scheduler()
def resize_to_target(self, prediction, target):
if prediction.shape[2:] != target.shape[-2:]:
prediction = nn.functional.interpolate(
prediction, size=target.shape[-2:], mode="bilinear", align_corners=True
)
return prediction
def load_ckpt(self, checkpoint_dir="./checkpoints", ckpt_type="best"):
import glob
import os
from zoedepth.models.model_io import load_wts
if hasattr(self.config, "checkpoint"):
checkpoint = self.config.checkpoint
elif hasattr(self.config, "ckpt_pattern"):
pattern = self.config.ckpt_pattern
matches = glob.glob(os.path.join(
checkpoint_dir, f"*{pattern}*{ckpt_type}*"))
if not matches:
raise ValueError(f"No matches found for the pattern {pattern}")
checkpoint = matches[0]
else:
return
model = load_wts(self.model, checkpoint)
# TODO : Resuming training is not properly supported in this repo. Implement loading / saving of optimizer and scheduler to support it.
print("Loaded weights from {0}".format(checkpoint))
warnings.warn(
"Resuming training is not properly supported in this repo. Implement loading / saving of optimizer and scheduler to support it.")
self.model = model
def init_optimizer(self):
m = self.model.module if self.config.multigpu else self.model
if self.config.same_lr:
print("Using same LR")
if hasattr(m, 'core'):
m.core.unfreeze()
params = self.model.parameters()
else:
print("Using diff LR")
if not hasattr(m, 'get_lr_params'):
raise NotImplementedError(
f"Model {m.__class__.__name__} does not implement get_lr_params. Please implement it or use the same LR for all parameters.")
params = m.get_lr_params(self.config.lr)
return optim.AdamW(params, lr=self.config.lr, weight_decay=self.config.wd)
def init_scheduler(self):
lrs = [l['lr'] for l in self.optimizer.param_groups]
return optim.lr_scheduler.OneCycleLR(self.optimizer, lrs, epochs=self.config.epochs, steps_per_epoch=len(self.train_loader),
cycle_momentum=self.config.cycle_momentum,
base_momentum=0.85, max_momentum=0.95, div_factor=self.config.div_factor, final_div_factor=self.config.final_div_factor, pct_start=self.config.pct_start, three_phase=self.config.three_phase)
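`init_scheduler` passes the list of per-group learning rates as `max_lr`, which `OneCycleLR` accepts one-per-parameter-group. A small sketch under assumed epoch/step counts:

```python
import torch
import torch.nn as nn

# Two param groups; OneCycleLR takes a matching list of per-group max learning
# rates, mirroring `lrs = [l['lr'] for l in self.optimizer.param_groups]` above.
m1, m2 = nn.Linear(4, 4), nn.Linear(4, 1)
opt = torch.optim.AdamW([
    {'params': m1.parameters(), 'lr': 1e-5},
    {'params': m2.parameters(), 'lr': 1e-4},
])
max_lrs = [g['lr'] for g in opt.param_groups]
sched = torch.optim.lr_scheduler.OneCycleLR(opt, max_lrs, epochs=2, steps_per_epoch=10)
for _ in range(20):          # total_steps = epochs * steps_per_epoch
    opt.step()
    sched.step()             # stepped once per batch, as in the train loop above
print([g['lr'] for g in opt.param_groups])
```

Note the scheduler is stepped per batch, not per epoch, matching `self.scheduler.step()` inside the inner training loop.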
def train_on_batch(self, batch, train_step):
raise NotImplementedError
def validate_on_batch(self, batch, val_step):
raise NotImplementedError
def raise_if_nan(self, losses):
for key, value in losses.items():
if torch.isnan(value):
raise ValueError(f"{key} is NaN, Stopping training")
@property
def iters_per_epoch(self):
return len(self.train_loader)
@property
def total_iters(self):
return self.config.epochs * self.iters_per_epoch
def should_early_stop(self):
return bool(self.config.get('early_stop', False)) and self.step > self.config.early_stop
def train(self):
print(f"Training {self.config.name}")
if self.config.uid is None:
self.config.uid = str(uuid.uuid4()).split('-')[-1]
run_id = f"{dt.now().strftime('%d-%h_%H-%M')}-{self.config.uid}"
self.config.run_id = run_id
self.config.experiment_id = f"{self.config.name}{self.config.version_name}_{run_id}"
self.should_write = ((not self.config.distributed)
or self.config.rank == 0)
self.should_log = self.should_write # and logging
if self.should_log:
tags = self.config.tags.split(
',') if self.config.tags != '' else None
wandb.init(project=self.config.project, name=self.config.experiment_id, config=flatten(self.config), dir=self.config.root,
tags=tags, notes=self.config.notes, settings=wandb.Settings(start_method="fork"))
self.model.train()
self.step = 0
best_loss = np.inf
validate_every = int(self.config.validate_every * self.iters_per_epoch)
if self.config.prefetch:
for i, batch in tqdm(enumerate(self.train_loader), desc=f"Prefetching...",
total=self.iters_per_epoch) if is_rank_zero(self.config) else enumerate(self.train_loader):
pass
losses = {}
def stringify_losses(L): return "; ".join(map(
lambda kv: f"{colors.fg.purple}{kv[0]}{colors.reset}: {round(kv[1].item(),3):.4e}", L.items()))
for epoch in range(self.config.epochs):
if self.should_early_stop():
break
self.epoch = epoch
################################# Train loop ##########################################################
if self.should_log:
wandb.log({"Epoch": epoch}, step=self.step)
pbar = tqdm(enumerate(self.train_loader), desc=f"Epoch: {epoch + 1}/{self.config.epochs}. Loop: Train",
total=self.iters_per_epoch) if is_rank_zero(self.config) else enumerate(self.train_loader)
for i, batch in pbar:
if self.should_early_stop():
print("Early stopping")
break
# print(f"Batch {self.step+1} on rank {self.config.rank}")
losses = self.train_on_batch(batch, i)
# print(f"trained batch {self.step+1} on rank {self.config.rank}")
self.raise_if_nan(losses)
if is_rank_zero(self.config) and self.config.print_losses:
pbar.set_description(
f"Epoch: {epoch + 1}/{self.config.epochs}. Loop: Train. Losses: {stringify_losses(losses)}")
self.scheduler.step()
if self.should_log and self.step % 50 == 0:
wandb.log({f"Train/{name}": loss.item()
for name, loss in losses.items()}, step=self.step)
self.step += 1
########################################################################################################
if self.test_loader:
if (self.step % validate_every) == 0:
self.model.eval()
if self.should_write:
self.save_checkpoint(
f"{self.config.experiment_id}_latest.pt")
################################# Validation loop ##################################################
# Validate on the entire validation set in every process, but save only from rank 0. Inefficient, but it avoids divergence across processes.
metrics, test_losses = self.validate()
# print("Validated: {}".format(metrics))
if self.should_log:
wandb.log(
{f"Test/{name}": tloss for name, tloss in test_losses.items()}, step=self.step)
wandb.log({f"Metrics/{k}": v for k,
v in metrics.items()}, step=self.step)
if (metrics[self.metric_criterion] < best_loss) and self.should_write:
self.save_checkpoint(
f"{self.config.experiment_id}_best.pt")
best_loss = metrics[self.metric_criterion]
self.model.train()
if self.config.distributed:
dist.barrier()
# print(f"Validated: {metrics} on device {self.config.rank}")
# print(f"Finished step {self.step} on device {self.config.rank}")
#################################################################################################
# Save / validate at the end
self.step += 1 # log as final point
self.model.eval()
self.save_checkpoint(f"{self.config.experiment_id}_latest.pt")
if self.test_loader:
################################# Validation loop ##################################################
metrics, test_losses = self.validate()
# print("Validated: {}".format(metrics))
if self.should_log:
wandb.log({f"Test/{name}": tloss for name,
tloss in test_losses.items()}, step=self.step)
wandb.log({f"Metrics/{k}": v for k,
v in metrics.items()}, step=self.step)
if (metrics[self.metric_criterion] < best_loss) and self.should_write:
self.save_checkpoint(
f"{self.config.experiment_id}_best.pt")
best_loss = metrics[self.metric_criterion]
self.model.train()
def validate(self):
with torch.no_grad():
losses_avg = RunningAverageDict()
metrics_avg = RunningAverageDict()
for i, batch in tqdm(enumerate(self.test_loader), desc=f"Epoch: {self.epoch + 1}/{self.config.epochs}. Loop: Validation", total=len(self.test_loader), disable=not is_rank_zero(self.config)):
metrics, losses = self.validate_on_batch(batch, val_step=i)
if losses:
losses_avg.update(losses)
if metrics:
metrics_avg.update(metrics)
return metrics_avg.get_value(), losses_avg.get_value()
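`validate` accumulates per-batch metric and loss dicts through `RunningAverageDict`. A minimal stand-in for that helper, with the interface guessed from its use here (the real implementation lives in `zoedepth.utils.misc` and may differ):

```python
# Hypothetical stand-in for RunningAverageDict: keeps a running mean per key.
class RunningAverageDictSketch:
    def __init__(self):
        self._sums = {}
        self._count = 0

    def update(self, new_dict):
        # Accumulate one batch's metrics.
        self._count += 1
        for k, v in new_dict.items():
            self._sums[k] = self._sums.get(k, 0.0) + v

    def get_value(self):
        # Mean of every metric over all batches seen so far.
        return {k: s / self._count for k, s in self._sums.items()}

avg = RunningAverageDictSketch()
avg.update({"abs_rel": 0.10})
avg.update({"abs_rel": 0.30})
print(avg.get_value())
```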
def save_checkpoint(self, filename):
if not self.should_write:
return
root = self.config.save_dir
if not os.path.isdir(root):
os.makedirs(root)
fpath = os.path.join(root, filename)
m = self.model.module if self.config.multigpu else self.model
torch.save(
{
"model": m.state_dict(),
"optimizer": None, # TODO : Change to self.optimizer.state_dict() if resume support is needed, currently None to reduce file size
"epoch": self.epoch
}, fpath)
def log_images(self, rgb: Dict[str, list] = {}, depth: Dict[str, list] = {}, scalar_field: Dict[str, list] = {}, prefix="", scalar_cmap="jet", min_depth=None, max_depth=None):
if not self.should_log:
return
if min_depth is None:
try:
min_depth = self.config.min_depth
max_depth = self.config.max_depth
except AttributeError:
min_depth = None
max_depth = None
depth = {k: colorize(v, vmin=min_depth, vmax=max_depth)
for k, v in depth.items()}
scalar_field = {k: colorize(
v, vmin=None, vmax=None, cmap=scalar_cmap) for k, v in scalar_field.items()}
images = {**rgb, **depth, **scalar_field}
wimages = {
prefix+"Predictions": [wandb.Image(v, caption=k) for k, v in images.items()]}
wandb.log(wimages, step=self.step)
def log_line_plot(self, data):
if not self.should_log:
return
plt.plot(data)
plt.ylabel("Scale factors")
wandb.log({"Scale factors": wandb.Image(plt)}, step=self.step)
plt.close()
def log_bar_plot(self, title, labels, values):
if not self.should_log:
return
data = [[label, val] for (label, val) in zip(labels, values)]
table = wandb.Table(data=data, columns=["label", "value"])
wandb.log({title: wandb.plot.bar(table, "label",
"value", title=title)}, step=self.step)


@@ -0,0 +1,48 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
from importlib import import_module
def get_trainer(config):
"""Builds and returns a trainer based on the config.
Args:
config (dict): the config dict (typically constructed using utils.config.get_config)
config.trainer (str): the name of the trainer to use. The module named "{config.trainer}_trainer" must exist in trainers root module
Raises:
ValueError: If the specified trainer does not exist under trainers/ folder
Returns:
Trainer (inherited from zoedepth.trainers.BaseTrainer): The Trainer object
"""
assert "trainer" in config and config.trainer is not None and config.trainer != '', "Trainer not specified. Config: {0}".format(
config)
try:
Trainer = getattr(import_module(
f"zoedepth.trainers.{config.trainer}_trainer"), 'Trainer')
except ModuleNotFoundError as e:
raise ValueError(f"Trainer {config.trainer}_trainer not found.") from e
return Trainer


@@ -0,0 +1,316 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.cuda.amp as amp
import numpy as np
KEY_OUTPUT = 'metric_depth'
def extract_key(prediction, key):
if isinstance(prediction, dict):
return prediction[key]
return prediction
# Main loss function used for ZoeDepth. Copy/paste from AdaBins repo (https://github.com/shariqfarooq123/AdaBins/blob/0952d91e9e762be310bb4cd055cbfe2448c0ce20/loss.py#L7)
class SILogLoss(nn.Module):
"""SILog loss (pixel-wise)"""
def __init__(self, beta=0.15):
super(SILogLoss, self).__init__()
self.name = 'SILog'
self.beta = beta
def forward(self, input, target, mask=None, interpolate=True, return_interpolated=False):
input = extract_key(input, KEY_OUTPUT)
if input.shape[-1] != target.shape[-1] and interpolate:
input = nn.functional.interpolate(
input, target.shape[-2:], mode='bilinear', align_corners=True)
intr_input = input
else:
intr_input = input
if target.ndim == 3:
target = target.unsqueeze(1)
if mask is not None:
if mask.ndim == 3:
mask = mask.unsqueeze(1)
input = input[mask]
target = target[mask]
with amp.autocast(enabled=False): # amp causes NaNs in this loss function
alpha = 1e-7
g = torch.log(input + alpha) - torch.log(target + alpha)
# n, c, h, w = g.shape
# norm = 1/(h*w)
# Dg = norm * torch.sum(g**2) - (0.85/(norm**2)) * (torch.sum(g))**2
Dg = torch.var(g) + self.beta * torch.pow(torch.mean(g), 2)
loss = 10 * torch.sqrt(Dg)
if torch.isnan(loss):
print("Nan SILog loss")
print("input:", input.shape)
print("target:", target.shape)
print("G", torch.sum(torch.isnan(g)))
print("Input min max", torch.min(input), torch.max(input))
print("Target min max", torch.min(target), torch.max(target))
print("Dg", torch.isnan(Dg))
print("loss", torch.isnan(loss))
if not return_interpolated:
return loss
return loss, intr_input
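The SILog computation above reduces to `10 * sqrt(Var(g) + beta * mean(g)^2)` with `g = log(pred) - log(gt)`. A standalone sketch of that formula (masking and interpolation omitted), checking two properties: the loss is zero for a perfect prediction, and a global scale on the prediction is penalised only through the mean term since `Var(g)` is shift-invariant in log space:

```python
import torch

def silog(pred, gt, beta=0.15, eps=1e-7):
    # g is the per-pixel log-depth error.
    g = torch.log(pred + eps) - torch.log(gt + eps)
    return 10 * torch.sqrt(torch.var(g) + beta * torch.mean(g) ** 2)

gt = torch.rand(1, 1, 4, 4) * 5 + 0.1
print(silog(gt, gt).item())        # ~0 for a perfect prediction
print(silog(2.0 * gt, gt).item())  # penalised only via the mean term
```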
def grad(x):
# x.shape : n, c, h, w
diff_x = x[..., 1:, 1:] - x[..., 1:, :-1]
diff_y = x[..., 1:, 1:] - x[..., :-1, 1:]
mag = diff_x**2 + diff_y**2
# angle_ratio
angle = torch.atan(diff_y / (diff_x + 1e-10))
return mag, angle
def grad_mask(mask):
return mask[..., 1:, 1:] & mask[..., 1:, :-1] & mask[..., :-1, 1:]
class GradL1Loss(nn.Module):
"""Gradient loss"""
def __init__(self):
super(GradL1Loss, self).__init__()
self.name = 'GradL1'
def forward(self, input, target, mask=None, interpolate=True, return_interpolated=False):
input = extract_key(input, KEY_OUTPUT)
if input.shape[-1] != target.shape[-1] and interpolate:
input = nn.functional.interpolate(
input, target.shape[-2:], mode='bilinear', align_corners=True)
intr_input = input
else:
intr_input = input
grad_gt = grad(target)
grad_pred = grad(input)
mask_g = grad_mask(mask)
loss = nn.functional.l1_loss(grad_pred[0][mask_g], grad_gt[0][mask_g])
loss = loss + \
nn.functional.l1_loss(grad_pred[1][mask_g], grad_gt[1][mask_g])
if not return_interpolated:
return loss
return loss, intr_input
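`grad` takes forward differences, so its outputs are one pixel smaller in each spatial dimension, and `grad_mask` intersects the shifted validity masks so every difference only uses valid pixels. A quick shape sketch of those two helpers:

```python
import torch

def grad(x):
    # Forward differences shrink H and W by one.
    diff_x = x[..., 1:, 1:] - x[..., 1:, :-1]
    diff_y = x[..., 1:, 1:] - x[..., :-1, 1:]
    mag = diff_x ** 2 + diff_y ** 2
    angle = torch.atan(diff_y / (diff_x + 1e-10))
    return mag, angle

def grad_mask(mask):
    # Valid only where the pixel and both neighbours used in the diffs are valid.
    return mask[..., 1:, 1:] & mask[..., 1:, :-1] & mask[..., :-1, 1:]

depth = torch.rand(2, 1, 8, 8)
mask = torch.ones(2, 1, 8, 8, dtype=torch.bool)
mag, angle = grad(depth)
print(mag.shape, grad_mask(mask).shape)  # both (2, 1, 7, 7)
```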
class OrdinalRegressionLoss(object):
def __init__(self, ord_num, beta, discretization="SID"):
self.ord_num = ord_num
self.beta = beta
self.discretization = discretization
def _create_ord_label(self, gt):
N, one, H, W = gt.shape
# print("gt shape:", gt.shape)
ord_c0 = torch.ones(N, self.ord_num, H, W).to(gt.device)
if self.discretization == "SID":
label = self.ord_num * torch.log(gt) / np.log(self.beta)
else:
label = self.ord_num * (gt - 1.0) / (self.beta - 1.0)
label = label.long()
mask = torch.linspace(0, self.ord_num - 1, self.ord_num, requires_grad=False) \
.view(1, self.ord_num, 1, 1).to(gt.device)
mask = mask.repeat(N, 1, H, W).contiguous().long()
mask = (mask > label)
ord_c0[mask] = 0
ord_c1 = 1 - ord_c0
# implementation according to the paper.
# ord_label = torch.ones(N, self.ord_num * 2, H, W).to(gt.device)
# ord_label[:, 0::2, :, :] = ord_c0
# ord_label[:, 1::2, :, :] = ord_c1
# reimplementation for fast speed.
ord_label = torch.cat((ord_c0, ord_c1), dim=1)
return ord_label, mask
def __call__(self, prob, gt):
"""
:param prob: ordinal regression probability, N x 2*ord_num x H x W, torch.Tensor
:param gt: depth ground truth, N x 1 x H x W, torch.Tensor
:return: loss: loss value, torch.float
"""
# N, C, H, W = prob.shape
valid_mask = gt > 0.
ord_label, mask = self._create_ord_label(gt)
# print("prob shape: {}, ord label shape: {}".format(prob.shape, ord_label.shape))
entropy = -prob * ord_label
loss = torch.sum(entropy, dim=1)[valid_mask.squeeze(1)]
return loss.mean()
class DiscreteNLLLoss(nn.Module):
"""Cross entropy loss"""
def __init__(self, min_depth=1e-3, max_depth=10, depth_bins=64):
super(DiscreteNLLLoss, self).__init__()
self.name = 'CrossEntropy'
self.ignore_index = -(depth_bins + 1)
# self._loss_func = nn.NLLLoss(ignore_index=self.ignore_index)
self._loss_func = nn.CrossEntropyLoss(ignore_index=self.ignore_index)
self.min_depth = min_depth
self.max_depth = max_depth
self.depth_bins = depth_bins
self.alpha = 1
self.zeta = 1 - min_depth
self.beta = max_depth + self.zeta
def quantize_depth(self, depth):
# depth : N1HW
# output : NCHW
# Quantize depth log-uniformly on [1, self.beta] into self.depth_bins bins
depth = torch.log(depth / self.alpha) / np.log(self.beta / self.alpha)
depth = depth * (self.depth_bins - 1)
depth = torch.round(depth)
depth = depth.long()
return depth
def _dequantize_depth(self, depth):
"""
Inverse of quantization
depth : NCHW -> N1HW
"""
# Get the center of the bin by inverting the log-uniform mapping in quantize_depth
return self.alpha * (self.beta / self.alpha) ** (depth.float() / (self.depth_bins - 1))
def forward(self, input, target, mask=None, interpolate=True, return_interpolated=False):
input = extract_key(input, KEY_OUTPUT)
# assert torch.all(input <= 0), "Input should be negative"
if input.shape[-1] != target.shape[-1] and interpolate:
input = nn.functional.interpolate(
input, target.shape[-2:], mode='bilinear', align_corners=True)
intr_input = input
else:
intr_input = input
# assert torch.all(input)<=1)
if target.ndim == 3:
target = target.unsqueeze(1)
target = self.quantize_depth(target)
if mask is not None:
if mask.ndim == 3:
mask = mask.unsqueeze(1)
# Set the mask to ignore_index
mask = mask.long()
input = input * mask + (1 - mask) * self.ignore_index
target = target * mask + (1 - mask) * self.ignore_index
input = input.flatten(2) # N, nbins, H*W
target = target.flatten(1) # N, H*W
loss = self._loss_func(input, target)
if not return_interpolated:
return loss
return loss, intr_input
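`quantize_depth` places depth values log-uniformly on `[alpha, beta]` across `depth_bins` bins. A standalone sketch of that mapping together with an assumed inverse (the class's `_dequantize_depth` is left essentially unimplemented in this commit), showing that a roundtrip recovers depth up to the bin width:

```python
import numpy as np
import torch

# Constants mirror DiscreteNLLLoss defaults: min_depth=1e-3, max_depth=10,
# zeta = 1 - min_depth, beta = max_depth + zeta. Values here are illustrative.
alpha, beta, depth_bins = 1.0, 10.0 + (1 - 1e-3), 64

def quantize(d):
    # Log-uniform bin index on [alpha, beta].
    q = torch.log(d / alpha) / np.log(beta / alpha) * (depth_bins - 1)
    return torch.round(q).long()

def dequantize(q):
    # Assumed inverse of the mapping above (bin index back to depth).
    return alpha * (beta / alpha) ** (q.float() / (depth_bins - 1))

d = torch.tensor([1.5, 3.8, 9.0])
print(dequantize(quantize(d)))  # close to d, up to quantization error
```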
def compute_scale_and_shift(prediction, target, mask):
# system matrix: A = [[a_00, a_01], [a_10, a_11]]
a_00 = torch.sum(mask * prediction * prediction, (1, 2))
a_01 = torch.sum(mask * prediction, (1, 2))
a_11 = torch.sum(mask, (1, 2))
# right hand side: b = [b_0, b_1]
b_0 = torch.sum(mask * prediction * target, (1, 2))
b_1 = torch.sum(mask * target, (1, 2))
# solution: x = A^-1 . b = [[a_11, -a_01], [-a_10, a_00]] / (a_00 * a_11 - a_01 * a_10) . b
x_0 = torch.zeros_like(b_0)
x_1 = torch.zeros_like(b_1)
det = a_00 * a_11 - a_01 * a_01
# A needs to be a positive definite matrix.
valid = det > 0
x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid]
x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid]
return x_0, x_1
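`compute_scale_and_shift` solves the 2x2 normal equations of a masked least-squares fit of `scale * prediction + shift` to `target` in closed form. A self-contained check: when the target is an exact affine transform of the prediction, the recovered scale and shift should match the ground-truth values.

```python
import torch

def compute_scale_and_shift(prediction, target, mask):
    # Normal equations A x = b for per-sample scale (x_0) and shift (x_1).
    a_00 = torch.sum(mask * prediction * prediction, (1, 2))
    a_01 = torch.sum(mask * prediction, (1, 2))
    a_11 = torch.sum(mask, (1, 2))
    b_0 = torch.sum(mask * prediction * target, (1, 2))
    b_1 = torch.sum(mask * target, (1, 2))
    det = a_00 * a_11 - a_01 * a_01
    x_0 = torch.zeros_like(b_0)
    x_1 = torch.zeros_like(b_1)
    valid = det > 0  # solvable only when A is positive definite
    x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid]
    x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid]
    return x_0, x_1

pred = torch.rand(1, 6, 6)
target = 2.5 * pred + 0.7   # known scale and shift
mask = torch.ones_like(pred)
scale, shift = compute_scale_and_shift(pred, target, mask)
print(scale.item(), shift.item())  # ≈ 2.5 and 0.7
```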
class ScaleAndShiftInvariantLoss(nn.Module):
def __init__(self):
super().__init__()
self.name = "SSILoss"
def forward(self, prediction, target, mask, interpolate=True, return_interpolated=False):
if prediction.shape[-1] != target.shape[-1] and interpolate:
prediction = nn.functional.interpolate(prediction, target.shape[-2:], mode='bilinear', align_corners=True)
intr_input = prediction
else:
intr_input = prediction
prediction, target, mask = prediction.squeeze(), target.squeeze(), mask.squeeze()
assert prediction.shape == target.shape, f"Shape mismatch: Expected same shape but got {prediction.shape} and {target.shape}."
scale, shift = compute_scale_and_shift(prediction, target, mask)
scaled_prediction = scale.view(-1, 1, 1) * prediction + shift.view(-1, 1, 1)
loss = nn.functional.l1_loss(scaled_prediction[mask], target[mask])
if not return_interpolated:
return loss
return loss, intr_input
if __name__ == '__main__':
# Tests for DiscreteNLLLoss
celoss = DiscreteNLLLoss()
print(celoss(torch.rand(4, 64, 26, 32) * 10, torch.rand(4, 1, 26, 32) * 10))
d = torch.Tensor([6.59, 3.8, 10.0])
print(celoss._dequantize_depth(celoss.quantize_depth(d)))


@@ -0,0 +1,143 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
import torch.cuda.amp as amp
import torch.nn as nn
from zoedepth.trainers.loss import GradL1Loss, SILogLoss
from zoedepth.utils.config import DATASETS_CONFIG
from zoedepth.utils.misc import compute_metrics
from .base_trainer import BaseTrainer
class Trainer(BaseTrainer):
def __init__(self, config, model, train_loader, test_loader=None, device=None):
super().__init__(config, model, train_loader,
test_loader=test_loader, device=device)
self.device = device
self.silog_loss = SILogLoss()
self.grad_loss = GradL1Loss()
self.domain_classifier_loss = nn.CrossEntropyLoss()
self.scaler = amp.GradScaler(enabled=self.config.use_amp)
def train_on_batch(self, batch, train_step):
"""
Expects a batch of images and depth as input
batch["image"].shape : batch_size, c, h, w
batch["depth"].shape : batch_size, 1, h, w
Assumes all images in a batch are from the same dataset
"""
images, depths_gt = batch['image'].to(
self.device), batch['depth'].to(self.device)
# batch['dataset'] is a list of strings, each either 'nyu' or 'kitti'. Labels: nyu -> 0, kitti -> 1
dataset = batch['dataset'][0]
# Convert to 0s or 1s
domain_labels = torch.Tensor([dataset == 'kitti' for _ in range(
images.size(0))]).to(torch.long).to(self.device)
# m = self.model.module if self.config.multigpu else self.model
b, c, h, w = images.size()
mask = batch["mask"].to(self.device).to(torch.bool)
losses = {}
with amp.autocast(enabled=self.config.use_amp):
output = self.model(images)
pred_depths = output['metric_depth']
domain_logits = output['domain_logits']
l_si, pred = self.silog_loss(
pred_depths, depths_gt, mask=mask, interpolate=True, return_interpolated=True)
loss = self.config.w_si * l_si
losses[self.silog_loss.name] = l_si
if self.config.w_grad > 0:
l_grad = self.grad_loss(pred, depths_gt, mask=mask)
loss = loss + self.config.w_grad * l_grad
losses[self.grad_loss.name] = l_grad
else:
l_grad = torch.Tensor([0])
if self.config.w_domain > 0:
l_domain = self.domain_classifier_loss(
domain_logits, domain_labels)
loss = loss + self.config.w_domain * l_domain
losses["DomainLoss"] = l_domain
else:
l_domain = torch.Tensor([0.])
self.scaler.scale(loss).backward()
if self.config.clip_grad > 0:
self.scaler.unscale_(self.optimizer)
nn.utils.clip_grad_norm_(
self.model.parameters(), self.config.clip_grad)
self.scaler.step(self.optimizer)
if self.should_log and self.step > 1 and (self.step % int(self.config.log_images_every * self.iters_per_epoch)) == 0:
depths_gt[torch.logical_not(mask)] = -99
self.log_images(rgb={"Input": images[0, ...]}, depth={"GT": depths_gt[0], "PredictedMono": pred[0]}, prefix="Train",
min_depth=DATASETS_CONFIG[dataset]['min_depth'], max_depth=DATASETS_CONFIG[dataset]['max_depth'])
self.scaler.update()
self.optimizer.zero_grad(set_to_none=True)
return losses
def validate_on_batch(self, batch, val_step):
images = batch['image'].to(self.device)
depths_gt = batch['depth'].to(self.device)
dataset = batch['dataset'][0]
if 'has_valid_depth' in batch:
if not batch['has_valid_depth']:
return None, None
depths_gt = depths_gt.squeeze().unsqueeze(0).unsqueeze(0)
with amp.autocast(enabled=self.config.use_amp):
m = self.model.module if self.config.multigpu else self.model
pred_depths = m(images)["metric_depth"]
pred_depths = pred_depths.squeeze().unsqueeze(0).unsqueeze(0)
mask = torch.logical_and(
depths_gt > self.config.min_depth, depths_gt < self.config.max_depth)
with amp.autocast(enabled=self.config.use_amp):
l_depth = self.silog_loss(
pred_depths, depths_gt, mask=mask.to(torch.bool), interpolate=True)
metrics = compute_metrics(depths_gt, pred_depths, **self.config)
losses = {f"{self.silog_loss.name}": l_depth.item()}
if val_step == 1 and self.should_log:
depths_gt[torch.logical_not(mask)] = -99
self.log_images(rgb={"Input": images[0]}, depth={"GT": depths_gt[0], "PredictedMono": pred_depths[0]}, prefix="Test",
min_depth=DATASETS_CONFIG[dataset]['min_depth'], max_depth=DATASETS_CONFIG[dataset]['max_depth'])
return metrics, losses


@@ -0,0 +1,177 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import torch
import torch.cuda.amp as amp
import torch.nn as nn
from zoedepth.trainers.loss import GradL1Loss, SILogLoss
from zoedepth.utils.config import DATASETS_CONFIG
from zoedepth.utils.misc import compute_metrics
from zoedepth.data.preprocess import get_black_border
from .base_trainer import BaseTrainer
from torchvision import transforms
from PIL import Image
import numpy as np
class Trainer(BaseTrainer):
def __init__(self, config, model, train_loader, test_loader=None, device=None):
super().__init__(config, model, train_loader,
test_loader=test_loader, device=device)
self.device = device
self.silog_loss = SILogLoss()
self.grad_loss = GradL1Loss()
self.scaler = amp.GradScaler(enabled=self.config.use_amp)
def train_on_batch(self, batch, train_step):
"""
Expects a batch of images and depth as input
batch["image"].shape : batch_size, c, h, w
batch["depth"].shape : batch_size, 1, h, w
"""
images, depths_gt = batch['image'].to(
self.device), batch['depth'].to(self.device)
dataset = batch['dataset'][0]
b, c, h, w = images.size()
mask = batch["mask"].to(self.device).to(torch.bool)
losses = {}
with amp.autocast(enabled=self.config.use_amp):
output = self.model(images)
pred_depths = output['metric_depth']
l_si, pred = self.silog_loss(
pred_depths, depths_gt, mask=mask, interpolate=True, return_interpolated=True)
loss = self.config.w_si * l_si
losses[self.silog_loss.name] = l_si
if self.config.w_grad > 0:
l_grad = self.grad_loss(pred, depths_gt, mask=mask)
loss = loss + self.config.w_grad * l_grad
losses[self.grad_loss.name] = l_grad
else:
l_grad = torch.Tensor([0])
self.scaler.scale(loss).backward()
if self.config.clip_grad > 0:
self.scaler.unscale_(self.optimizer)
nn.utils.clip_grad_norm_(
self.model.parameters(), self.config.clip_grad)
self.scaler.step(self.optimizer)
if self.should_log and (self.step % int(self.config.log_images_every * self.iters_per_epoch)) == 0:
# -99 is treated as invalid depth in the log_images function and is colored grey.
depths_gt[torch.logical_not(mask)] = -99
self.log_images(rgb={"Input": images[0, ...]}, depth={"GT": depths_gt[0], "PredictedMono": pred[0]}, prefix="Train",
min_depth=DATASETS_CONFIG[dataset]['min_depth'], max_depth=DATASETS_CONFIG[dataset]['max_depth'])
if self.config.get("log_rel", False):
self.log_images(
scalar_field={"RelPred": output["relative_depth"][0]}, prefix="TrainRel")
self.scaler.update()
self.optimizer.zero_grad()
return losses
@torch.no_grad()
def eval_infer(self, x):
with amp.autocast(enabled=self.config.use_amp):
m = self.model.module if self.config.multigpu else self.model
pred_depths = m(x)['metric_depth']
return pred_depths
@torch.no_grad()
def crop_aware_infer(self, x):
# if we are not avoiding the black border, we can just use the normal inference
if not self.config.get("avoid_boundary", False):
return self.eval_infer(x)
# otherwise, we need to crop the image to avoid the black border
# For now, this may be a bit slow due to converting to numpy and back
# We assume no normalization is done on the input image
# get the black border
assert x.shape[0] == 1, "Only batch size 1 is supported for now"
x_pil = transforms.ToPILImage()(x[0].cpu())
x_np = np.array(x_pil, dtype=np.uint8)
black_border_params = get_black_border(x_np)
top, bottom, left, right = black_border_params.top, black_border_params.bottom, black_border_params.left, black_border_params.right
x_np_cropped = x_np[top:bottom, left:right, :]
x_cropped = transforms.ToTensor()(Image.fromarray(x_np_cropped))
# run inference on the cropped image
pred_depths_cropped = self.eval_infer(x_cropped.unsqueeze(0).to(self.device))
# resize the prediction to x_np_cropped's size
pred_depths_cropped = nn.functional.interpolate(
pred_depths_cropped, size=(x_np_cropped.shape[0], x_np_cropped.shape[1]), mode="bilinear", align_corners=False)
# pad the prediction back to the original size
pred_depths = torch.zeros((1, 1, x_np.shape[0], x_np.shape[1]), device=pred_depths_cropped.device, dtype=pred_depths_cropped.dtype)
pred_depths[:, :, top:bottom, left:right] = pred_depths_cropped
return pred_depths
def validate_on_batch(self, batch, val_step):
images = batch['image'].to(self.device)
depths_gt = batch['depth'].to(self.device)
dataset = batch['dataset'][0]
mask = batch["mask"].to(self.device)
if 'has_valid_depth' in batch:
if not batch['has_valid_depth']:
return None, None
depths_gt = depths_gt.squeeze().unsqueeze(0).unsqueeze(0)
mask = mask.squeeze().unsqueeze(0).unsqueeze(0)
if dataset == 'nyu':
pred_depths = self.crop_aware_infer(images)
else:
pred_depths = self.eval_infer(images)
pred_depths = pred_depths.squeeze().unsqueeze(0).unsqueeze(0)
with amp.autocast(enabled=self.config.use_amp):
l_depth = self.silog_loss(
pred_depths, depths_gt, mask=mask.to(torch.bool), interpolate=True)
metrics = compute_metrics(depths_gt, pred_depths, **self.config)
losses = {f"{self.silog_loss.name}": l_depth.item()}
if val_step == 1 and self.should_log:
depths_gt[torch.logical_not(mask)] = -99
self.log_images(rgb={"Input": images[0]}, depth={"GT": depths_gt[0], "PredictedMono": pred_depths[0]}, prefix="Test",
min_depth=DATASETS_CONFIG[dataset]['min_depth'], max_depth=DATASETS_CONFIG[dataset]['max_depth'])
return metrics, losses


@@ -0,0 +1,24 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat


@@ -0,0 +1,33 @@
def infer_type(x): # hacky way to infer type from string args
if not isinstance(x, str):
return x
try:
x = int(x)
return x
except ValueError:
pass
try:
x = float(x)
return x
except ValueError:
pass
return x
def parse_unknown(unknown_args):
clean = []
for a in unknown_args:
if "=" in a:
            k, v = a.split("=", 1)  # split only on the first '=' so values may contain '='
clean.extend([k, v])
else:
clean.append(a)
keys = clean[::2]
values = clean[1::2]
return {k.replace("--", ""): infer_type(v) for k, v in zip(keys, values)}
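A minimal standalone sketch (not part of the repo) restating the two helpers above, to show how a leftover argv tail is turned into a typed dict: `infer_type` coerces strings to int/float when possible, and `parse_unknown` accepts both `--key=value` and `--key value` forms.

```python
def infer_type(x):
    # coerce string args to int, then float; fall back to the raw string
    if not isinstance(x, str):
        return x
    for cast in (int, float):
        try:
            return cast(x)
        except ValueError:
            pass
    return x

def parse_unknown(unknown_args):
    clean = []
    for a in unknown_args:
        if "=" in a:
            k, v = a.split("=", 1)
            clean.extend([k, v])
        else:
            clean.append(a)
    keys, values = clean[::2], clean[1::2]
    return {k.replace("--", ""): infer_type(v) for k, v in zip(keys, values)}

args = parse_unknown(["--lr=0.001", "--epochs", "10", "--tag", "exp1"])
```

Here `args` mixes float, int, and string values according to what each token parses as.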


@@ -0,0 +1,437 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import json
import os
from zoedepth.utils.easydict import EasyDict as edict
from zoedepth.utils.arg_utils import infer_type
import pathlib
import platform
ROOT = pathlib.Path(__file__).parent.parent.resolve()
HOME_DIR = os.path.expanduser("./data")
COMMON_CONFIG = {
"save_dir": os.path.expanduser("./depth_anything_finetune"),
"project": "ZoeDepth",
"tags": '',
"notes": "",
"gpu": None,
"root": ".",
"uid": None,
"print_losses": False
}
DATASETS_CONFIG = {
"kitti": {
"dataset": "kitti",
"min_depth": 0.001,
"max_depth": 80,
"data_path": os.path.join(HOME_DIR, "Kitti/raw_data"),
"gt_path": os.path.join(HOME_DIR, "Kitti/data_depth_annotated_zoedepth"),
"filenames_file": "./train_test_inputs/kitti_eigen_train_files_with_gt.txt",
"input_height": 352,
"input_width": 1216, # 704
"data_path_eval": os.path.join(HOME_DIR, "Kitti/raw_data"),
"gt_path_eval": os.path.join(HOME_DIR, "Kitti/data_depth_annotated_zoedepth"),
"filenames_file_eval": "./train_test_inputs/kitti_eigen_test_files_with_gt.txt",
"min_depth_eval": 1e-3,
"max_depth_eval": 80,
"do_random_rotate": True,
"degree": 1.0,
"do_kb_crop": True,
"garg_crop": True,
"eigen_crop": False,
"use_right": False
},
"kitti_test": {
"dataset": "kitti",
"min_depth": 0.001,
"max_depth": 80,
"data_path": os.path.join(HOME_DIR, "Kitti/raw_data"),
"gt_path": os.path.join(HOME_DIR, "Kitti/data_depth_annotated_zoedepth"),
"filenames_file": "./train_test_inputs/kitti_eigen_train_files_with_gt.txt",
"input_height": 352,
"input_width": 1216,
"data_path_eval": os.path.join(HOME_DIR, "Kitti/raw_data"),
"gt_path_eval": os.path.join(HOME_DIR, "Kitti/data_depth_annotated_zoedepth"),
"filenames_file_eval": "./train_test_inputs/kitti_eigen_test_files_with_gt.txt",
"min_depth_eval": 1e-3,
"max_depth_eval": 80,
"do_random_rotate": False,
"degree": 1.0,
"do_kb_crop": True,
"garg_crop": True,
"eigen_crop": False,
"use_right": False
},
"nyu": {
"dataset": "nyu",
"avoid_boundary": False,
"min_depth": 1e-3, # originally 0.1
"max_depth": 10,
"data_path": os.path.join(HOME_DIR, "nyu"),
"gt_path": os.path.join(HOME_DIR, "nyu"),
"filenames_file": "./train_test_inputs/nyudepthv2_train_files_with_gt.txt",
"input_height": 480,
"input_width": 640,
"data_path_eval": os.path.join(HOME_DIR, "nyu"),
"gt_path_eval": os.path.join(HOME_DIR, "nyu"),
"filenames_file_eval": "./train_test_inputs/nyudepthv2_test_files_with_gt.txt",
"min_depth_eval": 1e-3,
"max_depth_eval": 10,
"min_depth_diff": -10,
"max_depth_diff": 10,
"do_random_rotate": True,
"degree": 1.0,
"do_kb_crop": False,
"garg_crop": False,
"eigen_crop": True
},
"ibims": {
"dataset": "ibims",
"ibims_root": os.path.join(HOME_DIR, "iBims1/m1455541/ibims1_core_raw/"),
"eigen_crop": True,
"garg_crop": False,
"do_kb_crop": False,
"min_depth_eval": 0,
"max_depth_eval": 10,
"min_depth": 1e-3,
"max_depth": 10
},
"sunrgbd": {
"dataset": "sunrgbd",
"sunrgbd_root": os.path.join(HOME_DIR, "SUNRGB-D"),
"eigen_crop": True,
"garg_crop": False,
"do_kb_crop": False,
"min_depth_eval": 0,
"max_depth_eval": 8,
"min_depth": 1e-3,
"max_depth": 10
},
"diml_indoor": {
"dataset": "diml_indoor",
"diml_indoor_root": os.path.join(HOME_DIR, "DIML/indoor/sample/testset/"),
"eigen_crop": True,
"garg_crop": False,
"do_kb_crop": False,
"min_depth_eval": 0,
"max_depth_eval": 10,
"min_depth": 1e-3,
"max_depth": 10
},
"diml_outdoor": {
"dataset": "diml_outdoor",
"diml_outdoor_root": os.path.join(HOME_DIR, "DIML/outdoor/test/LR"),
"eigen_crop": False,
"garg_crop": True,
"do_kb_crop": False,
"min_depth_eval": 2,
"max_depth_eval": 80,
"min_depth": 1e-3,
"max_depth": 80
},
"diode_indoor": {
"dataset": "diode_indoor",
"diode_indoor_root": os.path.join(HOME_DIR, "DIODE/val/indoors/"),
"eigen_crop": True,
"garg_crop": False,
"do_kb_crop": False,
"min_depth_eval": 1e-3,
"max_depth_eval": 10,
"min_depth": 1e-3,
"max_depth": 10
},
"diode_outdoor": {
"dataset": "diode_outdoor",
"diode_outdoor_root": os.path.join(HOME_DIR, "DIODE/val/outdoor/"),
"eigen_crop": False,
"garg_crop": True,
"do_kb_crop": False,
"min_depth_eval": 1e-3,
"max_depth_eval": 80,
"min_depth": 1e-3,
"max_depth": 80
},
"hypersim_test": {
"dataset": "hypersim_test",
"hypersim_test_root": os.path.join(HOME_DIR, "HyperSim/"),
"eigen_crop": True,
"garg_crop": False,
"do_kb_crop": False,
"min_depth_eval": 1e-3,
"max_depth_eval": 80,
"min_depth": 1e-3,
"max_depth": 10
},
"vkitti": {
"dataset": "vkitti",
"vkitti_root": os.path.join(HOME_DIR, "shortcuts/datasets/vkitti_test/"),
"eigen_crop": False,
"garg_crop": True,
"do_kb_crop": True,
"min_depth_eval": 1e-3,
"max_depth_eval": 80,
"min_depth": 1e-3,
"max_depth": 80
},
"vkitti2": {
"dataset": "vkitti2",
"vkitti2_root": os.path.join(HOME_DIR, "vKitti2/"),
"eigen_crop": False,
"garg_crop": True,
"do_kb_crop": True,
"min_depth_eval": 1e-3,
"max_depth_eval": 80,
"min_depth": 1e-3,
"max_depth": 80,
},
"ddad": {
"dataset": "ddad",
"ddad_root": os.path.join(HOME_DIR, "shortcuts/datasets/ddad/ddad_val/"),
"eigen_crop": False,
"garg_crop": True,
"do_kb_crop": True,
"min_depth_eval": 1e-3,
"max_depth_eval": 80,
"min_depth": 1e-3,
"max_depth": 80,
},
}
ALL_INDOOR = ["nyu", "ibims", "sunrgbd", "diode_indoor", "hypersim_test"]
ALL_OUTDOOR = ["kitti", "diml_outdoor", "diode_outdoor", "vkitti2", "ddad"]
ALL_EVAL_DATASETS = ALL_INDOOR + ALL_OUTDOOR
COMMON_TRAINING_CONFIG = {
"dataset": "nyu",
"distributed": True,
"workers": 16,
"clip_grad": 0.1,
"use_shared_dict": False,
"shared_dict": None,
"use_amp": False,
"aug": True,
"random_crop": False,
"random_translate": False,
"translate_prob": 0.2,
"max_translation": 100,
"validate_every": 0.25,
"log_images_every": 0.1,
"prefetch": False,
}
def flatten(config, except_keys=('bin_conf',)):  # note: must be a tuple; a bare string would do substring matching
def recurse(inp):
if isinstance(inp, dict):
for key, value in inp.items():
if key in except_keys:
yield (key, value)
if isinstance(value, dict):
yield from recurse(value)
else:
yield (key, value)
return dict(list(recurse(config)))
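A quick standalone check (illustrative, not repo code) of the flattening behavior above: nested dicts collapse into one level, while keys listed in `except_keys` (written here as a proper tuple) keep their nested value intact.

```python
def flatten(config, except_keys=("bin_conf",)):
    def recurse(inp):
        if isinstance(inp, dict):
            for key, value in inp.items():
                if key in except_keys:
                    # keep excepted keys as-is (e.g. a list of per-dataset bin configs)
                    yield (key, value)
                if isinstance(value, dict):
                    yield from recurse(value)
                else:
                    yield (key, value)
    return dict(recurse(config))

cfg = {"lr": 0.001, "model": {"n_bins": 64}, "bin_conf": [{"name": "nyu"}]}
flat = flatten(cfg)
```

The nested `"model"` dict disappears and its leaf `"n_bins"` is promoted to the top level, while `"bin_conf"` survives untouched.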
def split_combined_args(kwargs):
"""Splits the arguments that are combined with '__' into multiple arguments.
Combined arguments should have equal number of keys and values.
Keys are separated by '__' and Values are separated with ';'.
For example, '__n_bins__lr=256;0.001'
Args:
kwargs (dict): key-value pairs of arguments where key-value is optionally combined according to the above format.
Returns:
dict: Parsed dict with the combined arguments split into individual key-value pairs.
"""
new_kwargs = dict(kwargs)
for key, value in kwargs.items():
if key.startswith("__"):
keys = key.split("__")[1:]
values = value.split(";")
assert len(keys) == len(
            values), f"Combined arguments should have equal number of keys and values. Keys are separated by '__' and Values are separated with ';'. For example, '__n_bins__lr=256;0.001'. Given (keys,values) is ({keys}, {values})"
for k, v in zip(keys, values):
new_kwargs[k] = v
return new_kwargs
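A standalone sketch (not repo code) of the combined-argument format documented above: keys joined with `__` pair positionally with `;`-separated values, which is handy for sweeping several hyperparameters through one CLI flag.

```python
def split_combined_args(kwargs):
    # split '__a__b=x;y' style args into a=x, b=y (values stay strings here;
    # type coercion happens later, e.g. via infer_type in get_config)
    new_kwargs = dict(kwargs)
    for key, value in kwargs.items():
        if key.startswith("__"):
            keys = key.split("__")[1:]
            values = value.split(";")
            assert len(keys) == len(values), "combined args need equal keys and values"
            for k, v in zip(keys, values):
                new_kwargs[k] = v
    return new_kwargs

out = split_combined_args({"__n_bins__lr": "256;0.001"})
```

Note the original combined key is kept alongside the split pairs, matching the function above.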
def parse_list(config, key, dtype=int):
"""Parse a list of values for the key if the value is a string. The values are separated by a comma.
Modifies the config in place.
"""
if key in config:
if isinstance(config[key], str):
config[key] = list(map(dtype, config[key].split(',')))
assert isinstance(config[key], list) and all([isinstance(e, dtype) for e in config[key]]
), f"{key} should be a list of values dtype {dtype}. Given {config[key]} of type {type(config[key])} with values of type {[type(e) for e in config[key]]}."
def get_model_config(model_name, model_version=None):
"""Find and parse the .json config file for the model.
Args:
model_name (str): name of the model. The config file should be named config_{model_name}[_{model_version}].json under the models/{model_name} directory.
model_version (str, optional): Specific config version. If specified config_{model_name}_{model_version}.json is searched for and used. Otherwise config_{model_name}.json is used. Defaults to None.
Returns:
easydict: the config dictionary for the model.
"""
config_fname = f"config_{model_name}_{model_version}.json" if model_version is not None else f"config_{model_name}.json"
config_file = os.path.join(ROOT, "models", model_name, config_fname)
if not os.path.exists(config_file):
return None
with open(config_file, "r") as f:
config = edict(json.load(f))
# handle dictionary inheritance
# only training config is supported for inheritance
if "inherit" in config.train and config.train.inherit is not None:
inherit_config = get_model_config(config.train["inherit"]).train
for key, value in inherit_config.items():
if key not in config.train:
config.train[key] = value
return edict(config)
def update_model_config(config, mode, model_name, model_version=None, strict=False):
model_config = get_model_config(model_name, model_version)
if model_config is not None:
config = {**config, **
flatten({**model_config.model, **model_config[mode]})}
elif strict:
raise ValueError(f"Config file for model {model_name} not found.")
return config
def check_choices(name, value, choices):
# return # No checks in dev branch
if value not in choices:
raise ValueError(f"{name} {value} not in supported choices {choices}")
KEYS_TYPE_BOOL = ["use_amp", "distributed", "use_shared_dict", "same_lr", "aug", "three_phase",
"prefetch", "cycle_momentum"] # Casting is not necessary as their int casted values in config are 0 or 1
def get_config(model_name, mode='train', dataset=None, **overwrite_kwargs):
"""Main entry point to get the config for the model.
Args:
model_name (str): name of the desired model.
mode (str, optional): "train" or "infer". Defaults to 'train'.
dataset (str, optional): If specified, the corresponding dataset configuration is loaded as well. Defaults to None.
Keyword Args: key-value pairs of arguments to overwrite the default config.
The order of precedence for overwriting the config is (Higher precedence first):
# 1. overwrite_kwargs
# 2. "config_version": Config file version if specified in overwrite_kwargs. The corresponding config loaded is config_{model_name}_{config_version}.json
# 3. "version_name": Default Model version specific config specified in overwrite_kwargs. The corresponding config loaded is config_{model_name}_{version_name}.json
# 4. common_config: Default config for all models specified in COMMON_CONFIG
Returns:
easydict: The config dictionary for the model.
"""
check_choices("Model", model_name, ["zoedepth", "zoedepth_nk"])
check_choices("Mode", mode, ["train", "infer", "eval"])
if mode == "train":
check_choices("Dataset", dataset, ["nyu", "kitti", "mix", None])
config = flatten({**COMMON_CONFIG, **COMMON_TRAINING_CONFIG})
config = update_model_config(config, mode, model_name)
# update with model version specific config
version_name = overwrite_kwargs.get("version_name", config["version_name"])
config = update_model_config(config, mode, model_name, version_name)
# update with config version if specified
config_version = overwrite_kwargs.get("config_version", None)
if config_version is not None:
print("Overwriting config with config_version", config_version)
config = update_model_config(config, mode, model_name, config_version)
# update with overwrite_kwargs
# Combined args are useful for hyperparameter search
overwrite_kwargs = split_combined_args(overwrite_kwargs)
config = {**config, **overwrite_kwargs}
# Casting to bool # TODO: Not necessary. Remove and test
for key in KEYS_TYPE_BOOL:
if key in config:
config[key] = bool(config[key])
# Model specific post processing of config
parse_list(config, "n_attractors")
# adjust n_bins for each bin configuration if bin_conf is given and n_bins is passed in overwrite_kwargs
if 'bin_conf' in config and 'n_bins' in overwrite_kwargs:
bin_conf = config['bin_conf'] # list of dicts
n_bins = overwrite_kwargs['n_bins']
new_bin_conf = []
for conf in bin_conf:
conf['n_bins'] = n_bins
new_bin_conf.append(conf)
config['bin_conf'] = new_bin_conf
if mode == "train":
orig_dataset = dataset
if dataset == "mix":
dataset = 'nyu' # Use nyu as default for mix. Dataset config is changed accordingly while loading the dataloader
if dataset is not None:
config['project'] = f"MonoDepth3-{orig_dataset}" # Set project for wandb
if dataset is not None:
config['dataset'] = dataset
config = {**DATASETS_CONFIG[dataset], **config}
config['model'] = model_name
typed_config = {k: infer_type(v) for k, v in config.items()}
# add hostname to config
config['hostname'] = platform.node()
return edict(typed_config)
def change_dataset(config, new_dataset):
config.update(DATASETS_CONFIG[new_dataset])
return config


@@ -0,0 +1,158 @@
"""
EasyDict
Copy/pasted from https://github.com/makinacorpus/easydict
Original author: Mathieu Leplatre <mathieu.leplatre@makina-corpus.com>
"""
class EasyDict(dict):
"""
Get attributes
>>> d = EasyDict({'foo':3})
>>> d['foo']
3
>>> d.foo
3
>>> d.bar
Traceback (most recent call last):
...
AttributeError: 'EasyDict' object has no attribute 'bar'
Works recursively
>>> d = EasyDict({'foo':3, 'bar':{'x':1, 'y':2}})
>>> isinstance(d.bar, dict)
True
>>> d.bar.x
1
Bullet-proof
>>> EasyDict({})
{}
>>> EasyDict(d={})
{}
>>> EasyDict(None)
{}
>>> d = {'a': 1}
>>> EasyDict(**d)
{'a': 1}
>>> EasyDict((('a', 1), ('b', 2)))
{'a': 1, 'b': 2}
Set attributes
>>> d = EasyDict()
>>> d.foo = 3
>>> d.foo
3
>>> d.bar = {'prop': 'value'}
>>> d.bar.prop
'value'
>>> d
{'foo': 3, 'bar': {'prop': 'value'}}
>>> d.bar.prop = 'newer'
>>> d.bar.prop
'newer'
Values extraction
>>> d = EasyDict({'foo':0, 'bar':[{'x':1, 'y':2}, {'x':3, 'y':4}]})
>>> isinstance(d.bar, list)
True
>>> from operator import attrgetter
>>> list(map(attrgetter('x'), d.bar))
[1, 3]
>>> list(map(attrgetter('y'), d.bar))
[2, 4]
>>> d = EasyDict()
>>> list(d.keys())
[]
>>> d = EasyDict(foo=3, bar=dict(x=1, y=2))
>>> d.foo
3
>>> d.bar.x
1
Still like a dict though
>>> o = EasyDict({'clean':True})
>>> list(o.items())
[('clean', True)]
And like a class
>>> class Flower(EasyDict):
... power = 1
...
>>> f = Flower()
>>> f.power
1
>>> f = Flower({'height': 12})
>>> f.height
12
>>> f['power']
1
>>> sorted(f.keys())
['height', 'power']
update and pop items
>>> d = EasyDict(a=1, b='2')
>>> e = EasyDict(c=3.0, a=9.0)
>>> d.update(e)
>>> d.c
3.0
>>> d['c']
3.0
>>> d.get('c')
3.0
>>> d.update(a=4, b=4)
>>> d.b
4
>>> d.pop('a')
4
>>> d.a
Traceback (most recent call last):
...
AttributeError: 'EasyDict' object has no attribute 'a'
"""
def __init__(self, d=None, **kwargs):
if d is None:
d = {}
else:
d = dict(d)
if kwargs:
d.update(**kwargs)
for k, v in d.items():
setattr(self, k, v)
# Class attributes
for k in self.__class__.__dict__.keys():
if not (k.startswith('__') and k.endswith('__')) and not k in ('update', 'pop'):
setattr(self, k, getattr(self, k))
def __setattr__(self, name, value):
if isinstance(value, (list, tuple)):
value = [self.__class__(x)
if isinstance(x, dict) else x for x in value]
elif isinstance(value, dict) and not isinstance(value, self.__class__):
value = self.__class__(value)
super(EasyDict, self).__setattr__(name, value)
super(EasyDict, self).__setitem__(name, value)
__setitem__ = __setattr__
def update(self, e=None, **f):
d = e or dict()
d.update(f)
for k in d:
setattr(self, k, d[k])
def pop(self, k, d=None):
delattr(self, k)
return super(EasyDict, self).pop(k, d)
if __name__ == "__main__":
import doctest
doctest.testmod()


@@ -0,0 +1,98 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
import numpy as np
def get_intrinsics(H, W):
"""
Intrinsics for a pinhole camera model.
Assume fov of 55 degrees and central principal point.
"""
f = 0.5 * W / np.tan(0.5 * 55 * np.pi / 180.0)
cx = 0.5 * W
cy = 0.5 * H
return np.array([[f, 0, cx],
[0, f, cy],
[0, 0, 1]])
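A small standalone check (illustrative; the field of view is exposed as a parameter here only for clarity) of the pinhole intrinsics above: the focal length follows from `f = (W/2) / tan(fov/2)` and the principal point sits at the image center.

```python
import numpy as np

def get_intrinsics(H, W, fov_deg=55.0):
    # horizontal fov -> focal length in pixels
    f = 0.5 * W / np.tan(0.5 * fov_deg * np.pi / 180.0)
    return np.array([[f, 0, 0.5 * W],
                     [0, f, 0.5 * H],
                     [0, 0, 1]])

K = get_intrinsics(480, 640)  # f is about 614.7 px for a 640-wide image at 55 degrees
```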
def depth_to_points(depth, R=None, t=None):
K = get_intrinsics(depth.shape[1], depth.shape[2])
Kinv = np.linalg.inv(K)
if R is None:
R = np.eye(3)
if t is None:
t = np.zeros(3)
# M converts from your coordinate to PyTorch3D's coordinate system
M = np.eye(3)
M[0, 0] = -1.0
M[1, 1] = -1.0
height, width = depth.shape[1:3]
x = np.arange(width)
y = np.arange(height)
coord = np.stack(np.meshgrid(x, y), -1)
coord = np.concatenate((coord, np.ones_like(coord)[:, :, [0]]), -1) # z=1
coord = coord.astype(np.float32)
# coord = torch.as_tensor(coord, dtype=torch.float32, device=device)
coord = coord[None] # bs, h, w, 3
D = depth[:, :, :, None, None]
# print(D.shape, Kinv[None, None, None, ...].shape, coord[:, :, :, :, None].shape )
pts3D_1 = D * Kinv[None, None, None, ...] @ coord[:, :, :, :, None]
# pts3D_1 live in your coordinate system. Convert them to Py3D's
pts3D_1 = M[None, None, None, ...] @ pts3D_1
    # from reference to target viewpoint
pts3D_2 = R[None, None, None, ...] @ pts3D_1 + t[None, None, None, :, None]
# pts3D_2 = pts3D_1
# depth_2 = pts3D_2[:, :, :, 2, :] # b,1,h,w
return pts3D_2[:, :, :, :3, 0][0]
def create_triangles(h, w, mask=None):
"""
Reference: https://github.com/google-research/google-research/blob/e96197de06613f1b027d20328e06d69829fa5a89/infinite_nature/render_utils.py#L68
Creates mesh triangle indices from a given pixel grid size.
This function is not and need not be differentiable as triangle indices are
fixed.
Args:
h: (int) denoting the height of the image.
        w: (int) denoting the width of the image.
        mask: (numpy.ndarray of bool, optional) (h, w) validity mask; triangles touching an invalid pixel are dropped.
Returns:
triangles: 2D numpy array of indices (int) with shape (2(W-1)(H-1) x 3)
"""
x, y = np.meshgrid(range(w - 1), range(h - 1))
tl = y * w + x
tr = y * w + x + 1
bl = (y + 1) * w + x
br = (y + 1) * w + x + 1
triangles = np.array([tl, bl, tr, br, tr, bl])
triangles = np.transpose(triangles, (1, 2, 0)).reshape(
((w - 1) * (h - 1) * 2, 3))
if mask is not None:
mask = mask.reshape(-1)
triangles = triangles[mask[triangles].all(1)]
return triangles
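A standalone sketch (restating the function above, not repo code) on the smallest possible grid: a 2x2 pixel image has one quad, which the indexing splits into two triangles with a shared diagonal.

```python
import numpy as np

def create_triangles(h, w, mask=None):
    # each (w-1)x(h-1) cell becomes two triangles over row-major pixel indices
    x, y = np.meshgrid(range(w - 1), range(h - 1))
    tl = y * w + x
    tr = y * w + x + 1
    bl = (y + 1) * w + x
    br = (y + 1) * w + x + 1
    triangles = np.array([tl, bl, tr, br, tr, bl])
    triangles = np.transpose(triangles, (1, 2, 0)).reshape(
        ((w - 1) * (h - 1) * 2, 3))
    if mask is not None:
        mask = mask.reshape(-1)
        triangles = triangles[mask[triangles].all(1)]
    return triangles

tris = create_triangles(2, 2)  # pixels 0..3, one quad, two triangles
```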


@@ -0,0 +1,368 @@
# MIT License
# Copyright (c) 2022 Intelligent Systems Lab Org
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# File author: Shariq Farooq Bhat
"""Miscellaneous utility functions."""
from scipy import ndimage
import base64
import math
import re
from io import BytesIO
import matplotlib
import matplotlib.cm
import numpy as np
import requests
import torch
import torch.distributed as dist
import torch.nn
import torch.nn as nn
import torch.utils.data.distributed
from PIL import Image
from torchvision.transforms import ToTensor
class RunningAverage:
def __init__(self):
self.avg = 0
self.count = 0
def append(self, value):
self.avg = (value + self.count * self.avg) / (self.count + 1)
self.count += 1
def get_value(self):
return self.avg
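A quick standalone check (not repo code) of the incremental-mean update above: each `append` folds a new value into the running average without storing the history.

```python
class RunningAverage:
    def __init__(self):
        self.avg = 0
        self.count = 0

    def append(self, value):
        # incremental mean: new_avg = (value + n * old_avg) / (n + 1)
        self.avg = (value + self.count * self.avg) / (self.count + 1)
        self.count += 1

    def get_value(self):
        return self.avg

ra = RunningAverage()
for v in (1.0, 2.0, 3.0):
    ra.append(v)
```

After the three appends, `ra.get_value()` equals the plain mean 2.0.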
def denormalize(x):
"""Reverses the imagenet normalization applied to the input.
Args:
x (torch.Tensor - shape(N,3,H,W)): input tensor
Returns:
torch.Tensor - shape(N,3,H,W): Denormalized input
"""
mean = torch.Tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1).to(x.device)
std = torch.Tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1).to(x.device)
return x * std + mean
class RunningAverageDict:
"""A dictionary of running averages."""
def __init__(self):
self._dict = None
def update(self, new_dict):
if new_dict is None:
return
if self._dict is None:
self._dict = dict()
for key, value in new_dict.items():
self._dict[key] = RunningAverage()
for key, value in new_dict.items():
self._dict[key].append(value)
def get_value(self):
if self._dict is None:
return None
return {key: value.get_value() for key, value in self._dict.items()}
def colorize(value, vmin=None, vmax=None, cmap='gray_r', invalid_val=-99, invalid_mask=None, background_color=(128, 128, 128, 255), gamma_corrected=False, value_transform=None):
"""Converts a depth map to a color image.
Args:
        value (torch.Tensor, numpy.ndarray): Input depth map. Shape: (H, W) or (1, H, W) or (1, 1, H, W). All singular dimensions are squeezed
vmin (float, optional): vmin-valued entries are mapped to start color of cmap. If None, value.min() is used. Defaults to None.
vmax (float, optional): vmax-valued entries are mapped to end color of cmap. If None, value.max() is used. Defaults to None.
        cmap (str, optional): matplotlib colormap to use. Defaults to 'gray_r'.
invalid_val (int, optional): Specifies value of invalid pixels that should be colored as 'background_color'. Defaults to -99.
invalid_mask (numpy.ndarray, optional): Boolean mask for invalid regions. Defaults to None.
background_color (tuple[int], optional): 4-tuple RGB color to give to invalid pixels. Defaults to (128, 128, 128, 255).
gamma_corrected (bool, optional): Apply gamma correction to colored image. Defaults to False.
value_transform (Callable, optional): Apply transform function to valid pixels before coloring. Defaults to None.
Returns:
numpy.ndarray, dtype - uint8: Colored depth map. Shape: (H, W, 4)
"""
if isinstance(value, torch.Tensor):
value = value.detach().cpu().numpy()
value = value.squeeze()
if invalid_mask is None:
invalid_mask = value == invalid_val
mask = np.logical_not(invalid_mask)
# normalize
vmin = np.percentile(value[mask],2) if vmin is None else vmin
vmax = np.percentile(value[mask],85) if vmax is None else vmax
if vmin != vmax:
value = (value - vmin) / (vmax - vmin) # vmin..vmax
else:
# Avoid 0-division
value = value * 0.
# squeeze last dim if it exists
# grey out the invalid values
value[invalid_mask] = np.nan
cmapper = matplotlib.cm.get_cmap(cmap)
if value_transform:
value = value_transform(value)
# value = value / value.max()
value = cmapper(value, bytes=True) # (nxmx4)
# img = value[:, :, :]
img = value[...]
img[invalid_mask] = background_color
# return img.transpose((2, 0, 1))
if gamma_corrected:
# gamma correction
img = img / 255
img = np.power(img, 2.2)
img = img * 255
img = img.astype(np.uint8)
return img
def count_parameters(model, include_all=False):
return sum(p.numel() for p in model.parameters() if p.requires_grad or include_all)
def compute_errors(gt, pred):
"""Compute metrics for 'pred' compared to 'gt'
Args:
gt (numpy.ndarray): Ground truth values
pred (numpy.ndarray): Predicted values
gt.shape should be equal to pred.shape
Returns:
dict: Dictionary containing the following metrics:
'a1': Delta1 accuracy: Fraction of pixels that are within a scale factor of 1.25
'a2': Delta2 accuracy: Fraction of pixels that are within a scale factor of 1.25^2
'a3': Delta3 accuracy: Fraction of pixels that are within a scale factor of 1.25^3
'abs_rel': Absolute relative error
'rmse': Root mean squared error
'log_10': Absolute log10 error
'sq_rel': Squared relative error
'rmse_log': Root mean squared error on the log scale
'silog': Scale invariant log error
"""
thresh = np.maximum((gt / pred), (pred / gt))
a1 = (thresh < 1.25).mean()
a2 = (thresh < 1.25 ** 2).mean()
a3 = (thresh < 1.25 ** 3).mean()
abs_rel = np.mean(np.abs(gt - pred) / gt)
sq_rel = np.mean(((gt - pred) ** 2) / gt)
rmse = (gt - pred) ** 2
rmse = np.sqrt(rmse.mean())
rmse_log = (np.log(gt) - np.log(pred)) ** 2
rmse_log = np.sqrt(rmse_log.mean())
err = np.log(pred) - np.log(gt)
silog = np.sqrt(np.mean(err ** 2) - np.mean(err) ** 2) * 100
log_10 = (np.abs(np.log10(gt) - np.log10(pred))).mean()
return dict(a1=a1, a2=a2, a3=a3, abs_rel=abs_rel, rmse=rmse, log_10=log_10, rmse_log=rmse_log,
silog=silog, sq_rel=sq_rel)
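A tiny worked example (hypothetical values) of the delta1 and absolute-relative formulas above:

```python
import numpy as np

gt = np.array([2.0, 4.0])
pred = np.array([2.0, 5.0])

# delta1: fraction of pixels with max(gt/pred, pred/gt) < 1.25
thresh = np.maximum(gt / pred, pred / gt)   # [1.0, 1.25]
a1 = (thresh < 1.25).mean()                 # second pixel is exactly 1.25, so excluded
abs_rel = np.mean(np.abs(gt - pred) / gt)   # (0 + 0.25) / 2

assert a1 == 0.5
assert np.isclose(abs_rel, 0.125)
```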
def compute_metrics(gt, pred, interpolate=True, garg_crop=False, eigen_crop=True, dataset='nyu', min_depth_eval=0.1, max_depth_eval=10, **kwargs):
"""Compute metrics of predicted depth maps. Applies cropping and masking as necessary or specified via arguments. Refer to compute_errors for more details on metrics.
"""
if 'config' in kwargs:
config = kwargs['config']
garg_crop = config.garg_crop
eigen_crop = config.eigen_crop
min_depth_eval = config.min_depth_eval
max_depth_eval = config.max_depth_eval
if gt.shape[-2:] != pred.shape[-2:] and interpolate:
pred = nn.functional.interpolate(
pred, gt.shape[-2:], mode='bilinear', align_corners=True)
pred = pred.squeeze().cpu().numpy()
pred[pred < min_depth_eval] = min_depth_eval
pred[pred > max_depth_eval] = max_depth_eval
pred[np.isinf(pred)] = max_depth_eval
pred[np.isnan(pred)] = min_depth_eval
gt_depth = gt.squeeze().cpu().numpy()
valid_mask = np.logical_and(
gt_depth > min_depth_eval, gt_depth < max_depth_eval)
if garg_crop or eigen_crop:
gt_height, gt_width = gt_depth.shape
eval_mask = np.zeros(valid_mask.shape)
if garg_crop:
eval_mask[int(0.40810811 * gt_height):int(0.99189189 * gt_height),
int(0.03594771 * gt_width):int(0.96405229 * gt_width)] = 1
elif eigen_crop:
# print("-"*10, " EIGEN CROP ", "-"*10)
if dataset == 'kitti':
eval_mask[int(0.3324324 * gt_height):int(0.91351351 * gt_height),
int(0.0359477 * gt_width):int(0.96405229 * gt_width)] = 1
else:
# assert gt_depth.shape == (480, 640), "Error: Eigen crop is currently only valid for (480, 640) images"
eval_mask[45:471, 41:601] = 1
else:
eval_mask = np.ones(valid_mask.shape)
valid_mask = np.logical_and(valid_mask, eval_mask)
return compute_errors(gt_depth[valid_mask], pred[valid_mask])
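The Eigen crop above keeps rows 45:471 and columns 41:601 of a 480x640 NYU frame; a standalone sketch of the resulting evaluation mask:

```python
import numpy as np

# Eigen crop for NYU (480x640): evaluation is restricted to the central region
eval_mask = np.zeros((480, 640), dtype=bool)
eval_mask[45:471, 41:601] = True

# 426 rows x 560 columns of the frame are evaluated
assert eval_mask.sum() == (471 - 45) * (601 - 41)
```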
#################################### Model utils ################################################
def parallelize(config, model, find_unused_parameters=True):
if config.gpu is not None:
torch.cuda.set_device(config.gpu)
model = model.cuda(config.gpu)
config.multigpu = False
if config.distributed:
# Use DDP
config.multigpu = True
config.rank = config.rank * config.ngpus_per_node + config.gpu
dist.init_process_group(backend=config.dist_backend, init_method=config.dist_url,
world_size=config.world_size, rank=config.rank)
config.batch_size = int(config.batch_size / config.ngpus_per_node)
# config.batch_size = 8
config.workers = int(
(config.num_workers + config.ngpus_per_node - 1) / config.ngpus_per_node)
print("Device", config.gpu, "Rank", config.rank, "batch size",
config.batch_size, "Workers", config.workers)
torch.cuda.set_device(config.gpu)
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
model = model.cuda(config.gpu)
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[config.gpu], output_device=config.gpu,
find_unused_parameters=find_unused_parameters)
elif config.gpu is None:
# Use DP
config.multigpu = True
model = model.cuda()
model = torch.nn.DataParallel(model)
return model
#################################################################################################
class colors:
'''Colors class:
Reset all colors with colors.reset
Two subclasses fg for foreground and bg for background.
Use as colors.subclass.colorname.
i.e. colors.fg.red or colors.bg.green
Also, the generic bold, disable, underline, reverse, strikethrough,
and invisible work with the main class
i.e. colors.bold
'''
reset = '\033[0m'
bold = '\033[01m'
disable = '\033[02m'
underline = '\033[04m'
reverse = '\033[07m'
strikethrough = '\033[09m'
invisible = '\033[08m'
class fg:
black = '\033[30m'
red = '\033[31m'
green = '\033[32m'
orange = '\033[33m'
blue = '\033[34m'
purple = '\033[35m'
cyan = '\033[36m'
lightgrey = '\033[37m'
darkgrey = '\033[90m'
lightred = '\033[91m'
lightgreen = '\033[92m'
yellow = '\033[93m'
lightblue = '\033[94m'
pink = '\033[95m'
lightcyan = '\033[96m'
class bg:
black = '\033[40m'
red = '\033[41m'
green = '\033[42m'
orange = '\033[43m'
blue = '\033[44m'
purple = '\033[45m'
cyan = '\033[46m'
lightgrey = '\033[47m'
def printc(text, color):
print(f"{color}{text}{colors.reset}")
############################################
def get_image_from_url(url):
response = requests.get(url)
img = Image.open(BytesIO(response.content)).convert("RGB")
return img
def url_to_torch(url, size=(384, 384)):
img = get_image_from_url(url)
img = img.resize(size, Image.LANCZOS)  # LANCZOS replaces the removed ANTIALIAS alias
img = torch.from_numpy(np.asarray(img)).float()
img = img.permute(2, 0, 1)
img.div_(255)
return img
def pil_to_batched_tensor(img):
return ToTensor()(img).unsqueeze(0)
def save_raw_16bit(depth, fpath="raw.png"):
if isinstance(depth, torch.Tensor):
depth = depth.squeeze().cpu().numpy()
assert isinstance(depth, np.ndarray), "Depth must be a torch tensor or numpy array"
assert depth.ndim == 2, "Depth must be 2D"
depth = depth * 256 # scale for 16-bit png
depth = depth.astype(np.uint16)
depth = Image.fromarray(depth)
depth.save(fpath)
print("Saved raw depth to", fpath)
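The x256 scaling above lets a 16-bit PNG store depths up to 256 units with 1/256 precision; a round-trip sketch (in-memory buffer, hypothetical values):

```python
from io import BytesIO

import numpy as np
from PIL import Image

depth = np.array([[1.5, 10.0]], dtype=np.float32)  # hypothetical depths in meters
raw = (depth * 256).astype(np.uint16)              # same scaling as save_raw_16bit

buf = BytesIO()
Image.fromarray(raw).save(buf, format='PNG')       # 16-bit grayscale PNG
buf.seek(0)
recovered = np.asarray(Image.open(buf), dtype=np.float32) / 256.0
assert np.allclose(recovered, depth)               # exact for multiples of 1/256
```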

requirements.txt
torch
torchvision
opencv-python

run.py
import argparse
import cv2
import numpy as np
import os
import torch
import torch.nn.functional as F
from torchvision.transforms import Compose
from tqdm import tqdm
from depth_anything.dpt import DPT_DINOv2
from depth_anything.util.transform import Resize, NormalizeImage, PrepareForNet
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('--img-path', type=str)
parser.add_argument('--outdir', type=str, default='./vis_depth')
parser.add_argument('--encoder', type=str, default='vitl')
parser.add_argument('--load-from', type=str, required=True)
parser.add_argument('--localhub', dest='localhub', action='store_true', default=False)
args = parser.parse_args()
margin_width = 50
caption_height = 60
font = cv2.FONT_HERSHEY_SIMPLEX
font_scale = 1
font_thickness = 2
assert args.encoder in ['vits', 'vitb', 'vitl']
if args.encoder == 'vits':
depth_anything = DPT_DINOv2(encoder='vits', features=64, out_channels=[48, 96, 192, 384], localhub=args.localhub).cuda()
elif args.encoder == 'vitb':
depth_anything = DPT_DINOv2(encoder='vitb', features=128, out_channels=[96, 192, 384, 768], localhub=args.localhub).cuda()
else:
depth_anything = DPT_DINOv2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024], localhub=args.localhub).cuda()
total_params = sum(param.numel() for param in depth_anything.parameters())
print('Total parameters: {:.2f}M'.format(total_params / 1e6))
depth_anything.load_state_dict(torch.load(args.load_from, map_location='cpu'), strict=True)
depth_anything.eval()
transform = Compose([
Resize(
width=518,
height=518,
resize_target=False,
keep_aspect_ratio=True,
ensure_multiple_of=14,
resize_method='lower_bound',
image_interpolation_method=cv2.INTER_CUBIC,
),
NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
PrepareForNet(),
])
if os.path.isfile(args.img_path):
if args.img_path.endswith('txt'):
with open(args.img_path, 'r') as f:
filenames = f.read().splitlines()
else:
filenames = [args.img_path]
else:
filenames = os.listdir(args.img_path)
filenames = [os.path.join(args.img_path, filename) for filename in filenames]
filenames.sort()
for filename in tqdm(filenames):
raw_image = cv2.imread(filename)
image = cv2.cvtColor(raw_image, cv2.COLOR_BGR2RGB) / 255.0
h, w = image.shape[:2]
image = transform({'image': image})['image']
image = torch.from_numpy(image).unsqueeze(0).cuda()
with torch.no_grad():
depth = depth_anything(image)
depth = F.interpolate(depth[None], (h, w), mode='bilinear', align_corners=False)[0, 0]
depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
depth = depth.cpu().numpy().astype(np.uint8)
depth_color = cv2.applyColorMap(depth, cv2.COLORMAP_INFERNO)
os.makedirs(args.outdir, exist_ok=True)
filename = os.path.basename(filename)
split_region = np.ones((raw_image.shape[0], margin_width, 3), dtype=np.uint8) * 255
combined_results = cv2.hconcat([raw_image, split_region, depth_color])
caption_space = np.ones((caption_height, combined_results.shape[1], 3), dtype=np.uint8) * 255
captions = ['Raw image', 'Depth Anything']
segment_width = w + margin_width
for i, caption in enumerate(captions):
# Calculate text size
text_size = cv2.getTextSize(caption, font, font_scale, font_thickness)[0]
# Calculate x-coordinate to center the text
text_x = int((segment_width * i) + (w - text_size[0]) / 2)
# Add text caption
cv2.putText(caption_space, caption, (text_x, 40), font, font_scale, (0, 0, 0), font_thickness)
final_result = cv2.vconcat([caption_space, combined_results])
cv2.imwrite(os.path.join(args.outdir, filename[:filename.find('.')] + '_img_depth.png'), final_result)

semseg/README.md
# Depth Anything for Semantic Segmentation
We use our Depth Anything pre-trained ViT-L encoder to fine-tune downstream semantic segmentation models.
## Performance
### Cityscapes
Note that our results are obtained *without* Mapillary pre-training.
| Method | Encoder | mIoU (s.s.) | m.s. |
|:-:|:-:|:-:|:-:|
| SegFormer | MiT-B5 | 82.4 | 84.0 |
| Mask2Former | Swin-L | 83.3 | 84.3 |
| OneFormer | Swin-L | 83.0 | 84.4 |
| OneFormer | ConvNeXt-XL | 83.6 | 84.6 |
| DDP | ConvNeXt-L | 83.2 | 83.9 |
| **Ours** | ViT-L | **84.8** | **86.2** |
### ADE20K
| Method | Encoder | mIoU |
|:-:|:-:|:-:|
| SegFormer | MiT-B5 | 51.0 |
| Mask2Former | Swin-L | 56.4 |
| UperNet | BEiT-L | 56.3 |
| ViT-Adapter | BEiT-L | 58.3 |
| OneFormer | Swin-L | 57.4 |
| OneFormer | ConvNeXt-XL | 57.4 |
| **Ours** | ViT-L | **59.4** |
## Pre-trained models
- [Cityscapes-ViT-L-mIoU-86.4](https://huggingface.co/spaces/LiheYoung/Depth-Anything/blob/main/checkpoints_semseg/cityscapes_vitl_mIoU_86.4.pth)
- [ADE20K-ViT-L-mIoU-59.4](https://huggingface.co/spaces/LiheYoung/Depth-Anything/blob/main/checkpoints_semseg/ade20k_vitl_mIoU_59.4.pth)
## Installation
Please refer to [MMSegmentation](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/en/get_started.md#installation) for instructions.
After installation:
- move our [config/depth_anything](./config/depth_anything/) to mmseg's [config](https://github.com/open-mmlab/mmsegmentation/tree/main/configs)
- move our [dinov2.py](./dinov2.py) to mmseg's [backbones](https://github.com/open-mmlab/mmsegmentation/tree/main/mmseg/models/backbones)
- add DINOv2 in mmseg's [models/backbones/__init__.py](https://github.com/open-mmlab/mmsegmentation/blob/main/mmseg/models/backbones/__init__.py)
For training or inference with our pre-trained models, please refer to MMSegmentation [instructions](https://github.com/open-mmlab/mmsegmentation/blob/main/docs/en/user_guides/4_train_test.md).


@@ -0,0 +1,230 @@
_base_ = [
'../_base_/default_runtime.py', '../_base_/datasets/ade20k_640x640.py'
]
crop_size = (896, 896)
data_preprocessor = dict(
type='SegDataPreProcessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_val=0,
seg_pad_val=255,
size=crop_size)
num_classes = 150
model = dict(
type='EncoderDecoder',
data_preprocessor=data_preprocessor,
backbone=dict(
type='DINOv2',
version='large',
freeze=False,
load_from='../checkpoints/depth_anything_vitl14.pth'),
neck=dict(type='Feature2Pyramid', embed_dim=1024, rescales=[4, 2, 1, 0.5]),
decode_head=dict(
type='Mask2FormerHead',
in_channels=[1024, 1024, 1024, 1024],
# strides=[4, 8, 16, 32],
feat_channels=1024,
out_channels=1024,
num_classes=num_classes,
num_queries=200,
num_transformer_feat_level=3,
align_corners=False,
pixel_decoder=dict(
type='mmdet.MSDeformAttnPixelDecoder',
num_outs=3,
norm_cfg=dict(type='GN', num_groups=32),
act_cfg=dict(type='ReLU'),
encoder=dict( # DeformableDetrTransformerEncoder
num_layers=6,
layer_cfg=dict( # DeformableDetrTransformerEncoderLayer
self_attn_cfg=dict( # MultiScaleDeformableAttention
embed_dims=1024,
num_heads=32,
num_levels=3,
num_points=4,
im2col_step=64,
dropout=0.0,
batch_first=True,
norm_cfg=None,
init_cfg=None),
ffn_cfg=dict(
embed_dims=1024,
feedforward_channels=4096,
num_fcs=2,
ffn_drop=0.0,
act_cfg=dict(type='ReLU', inplace=True))),
init_cfg=None),
positional_encoding=dict( # SinePositionalEncoding
num_feats=512, normalize=True),
init_cfg=None),
enforce_decoder_input_project=False,
positional_encoding=dict( # SinePositionalEncoding
num_feats=512, normalize=True),
transformer_decoder=dict( # Mask2FormerTransformerDecoder
return_intermediate=True,
num_layers=9,
layer_cfg=dict( # Mask2FormerTransformerDecoderLayer
self_attn_cfg=dict( # MultiheadAttention
embed_dims=1024,
num_heads=32,
attn_drop=0.0,
proj_drop=0.0,
dropout_layer=None,
batch_first=True),
cross_attn_cfg=dict( # MultiheadAttention
embed_dims=1024,
num_heads=32,
attn_drop=0.0,
proj_drop=0.0,
dropout_layer=None,
batch_first=True),
ffn_cfg=dict(
embed_dims=1024,
feedforward_channels=4096,
num_fcs=2,
act_cfg=dict(type='ReLU', inplace=True),
ffn_drop=0.0,
dropout_layer=None,
add_identity=True)),
init_cfg=None),
loss_cls=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=False,
loss_weight=2.0,
reduction='mean',
class_weight=[1.0] * num_classes + [0.1]),
loss_mask=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=True,
reduction='mean',
loss_weight=5.0),
loss_dice=dict(
type='mmdet.DiceLoss',
use_sigmoid=True,
activate=True,
reduction='mean',
naive_dice=True,
eps=1.0,
loss_weight=5.0),
train_cfg=dict(
num_points=12544,
oversample_ratio=3.0,
importance_sample_ratio=0.75,
assigner=dict(
type='mmdet.HungarianAssigner',
match_costs=[
dict(type='mmdet.ClassificationCost', weight=2.0),
dict(
type='mmdet.CrossEntropyLossCost',
weight=5.0,
use_sigmoid=True),
dict(
type='mmdet.DiceCost',
weight=5.0,
pred_act=True,
eps=1.0)
]),
sampler=dict(type='mmdet.MaskPseudoSampler'))),
train_cfg=dict(),
test_cfg=dict(mode='slide', crop_size=crop_size, stride=(426, 426)))
# dataset config
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', reduce_zero_label=True),
dict(
type='RandomChoiceResize',
scales=[int(x * 0.1 * 896) for x in range(5, 21)],
resize_type='ResizeShortestEdge',
max_size=3584),
dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
dict(type='RandomFlip', prob=0.5),
dict(type='PhotoMetricDistortion'),
dict(type='PackSegInputs')
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', scale=(3584, 896), keep_ratio=True),
# add loading annotation after ``Resize`` because ground truth
# does not need to do resize data transform
dict(type='LoadAnnotations', reduce_zero_label=True),
dict(type='PackSegInputs')
]
train_dataloader = dict(batch_size=1, dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = val_dataloader
# optim_wrapper = dict(
# _delete_=True,
# type='OptimWrapper',
# optimizer=dict(
# type='AdamW', lr=3e-5, betas=(0.9, 0.999), weight_decay=0.05),
# constructor='LayerDecayOptimizerConstructor',
# paramwise_cfg=dict(num_layers=12, layer_decay_rate=0.9))
# set all layers in backbone to lr_mult=0.1
# set all norm layers, position_embedding,
# query_embedding, level_embedding to decay_mult=0.0
backbone_norm_multi = dict(lr_mult=0.1, decay_mult=0.0)
backbone_embed_multi = dict(lr_mult=0.1, decay_mult=0.0)
embed_multi = dict(lr_mult=1.0, decay_mult=0.0)
custom_keys = {
'backbone.dinov2': dict(lr_mult=0.1, decay_mult=1.0),
'backbone.dinov2.norm': backbone_norm_multi,
'pos_embed': backbone_embed_multi,
'query_embed': embed_multi,
'query_feat': embed_multi,
'level_embed': embed_multi
}
custom_keys.update({
f'backbone.dinov2.blocks.{block_id}.norm': backbone_norm_multi
for block_id in range(24)
})
# optimizer
optimizer = dict(
type='AdamW', lr=0.00003, weight_decay=0.05, eps=1e-8, betas=(0.9, 0.999))
optim_wrapper = dict(
type='OptimWrapper',
optimizer=optimizer,
clip_grad=dict(max_norm=0.01, norm_type=2),
paramwise_cfg=dict(custom_keys=custom_keys, norm_decay_mult=0.0))
find_unused_parameters=True
param_scheduler = [
dict(
type='LinearLR', start_factor=1e-6, by_epoch=False, begin=0, end=1500),
dict(
type='PolyLR',
power=1.0,
begin=1500,
end=160000,
eta_min=0.0,
by_epoch=False,
)
]
# training schedule for 160k
train_cfg = dict(
type='IterBasedTrainLoop', max_iters=160000, val_interval=5000)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=50, log_metric_by_epoch=False),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(
type='CheckpointHook', by_epoch=False, interval=5000, save_best='mIoU', max_keep_ckpts=1),
sampler_seed=dict(type='DistSamplerSeedHook'),
visualization=dict(type='SegVisualizationHook'))
# Default setting for scaling LR automatically
# - `enable` means enable scaling LR automatically
# or not by default.
# - `base_batch_size` = (8 GPUs) x (2 samples per GPU).
auto_scale_lr = dict(enable=False, base_batch_size=16)
work_dir = './work_dirs/depth_anything_large_mask2former_16xb1_160k_ade20k_896x896'


@@ -0,0 +1,222 @@
_base_ = [
'../_base_/default_runtime.py', '../_base_/datasets/cityscapes.py'
]
crop_size = (896, 896)
data_preprocessor = dict(
type='SegDataPreProcessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_val=0,
seg_pad_val=255,
size=crop_size)
num_classes = 19
model = dict(
type='EncoderDecoder',
data_preprocessor=data_preprocessor,
backbone=dict(
type='DINOv2',
version='large',
freeze=False,
load_from='../checkpoints/depth_anything_vitl14.pth'),
neck=dict(type='Feature2Pyramid', embed_dim=1024, rescales=[4, 2, 1, 0.5]),
decode_head=dict(
type='Mask2FormerHead',
in_channels=[1024, 1024, 1024, 1024],
# strides=[4, 8, 16, 32],
feat_channels=1024,
out_channels=1024,
num_classes=num_classes,
num_queries=200,
num_transformer_feat_level=3,
align_corners=False,
pixel_decoder=dict(
type='mmdet.MSDeformAttnPixelDecoder',
num_outs=3,
norm_cfg=dict(type='GN', num_groups=32),
act_cfg=dict(type='ReLU'),
encoder=dict( # DeformableDetrTransformerEncoder
num_layers=6,
layer_cfg=dict( # DeformableDetrTransformerEncoderLayer
self_attn_cfg=dict( # MultiScaleDeformableAttention
embed_dims=1024,
num_heads=32,
num_levels=3,
num_points=4,
im2col_step=64,
dropout=0.0,
batch_first=True,
norm_cfg=None,
init_cfg=None),
ffn_cfg=dict(
embed_dims=1024,
feedforward_channels=4096,
num_fcs=2,
ffn_drop=0.0,
act_cfg=dict(type='ReLU', inplace=True))),
init_cfg=None),
positional_encoding=dict( # SinePositionalEncoding
num_feats=512, normalize=True),
init_cfg=None),
enforce_decoder_input_project=False,
positional_encoding=dict( # SinePositionalEncoding
num_feats=512, normalize=True),
transformer_decoder=dict( # Mask2FormerTransformerDecoder
return_intermediate=True,
num_layers=9,
layer_cfg=dict( # Mask2FormerTransformerDecoderLayer
self_attn_cfg=dict( # MultiheadAttention
embed_dims=1024,
num_heads=32,
attn_drop=0.0,
proj_drop=0.0,
dropout_layer=None,
batch_first=True),
cross_attn_cfg=dict( # MultiheadAttention
embed_dims=1024,
num_heads=32,
attn_drop=0.0,
proj_drop=0.0,
dropout_layer=None,
batch_first=True),
ffn_cfg=dict(
embed_dims=1024,
feedforward_channels=4096,
num_fcs=2,
act_cfg=dict(type='ReLU', inplace=True),
ffn_drop=0.0,
dropout_layer=None,
add_identity=True)),
init_cfg=None),
loss_cls=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=False,
loss_weight=2.0,
reduction='mean',
class_weight=[1.0] * num_classes + [0.1]),
loss_mask=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=True,
reduction='mean',
loss_weight=5.0),
loss_dice=dict(
type='mmdet.DiceLoss',
use_sigmoid=True,
activate=True,
reduction='mean',
naive_dice=True,
eps=1.0,
loss_weight=5.0),
train_cfg=dict(
num_points=12544,
oversample_ratio=3.0,
importance_sample_ratio=0.75,
assigner=dict(
type='mmdet.HungarianAssigner',
match_costs=[
dict(type='mmdet.ClassificationCost', weight=2.0),
dict(
type='mmdet.CrossEntropyLossCost',
weight=5.0,
use_sigmoid=True),
dict(
type='mmdet.DiceCost',
weight=5.0,
pred_act=True,
eps=1.0)
]),
sampler=dict(type='mmdet.MaskPseudoSampler'))),
train_cfg=dict(),
test_cfg=dict(mode='slide', crop_size=crop_size, stride=(518, 518)))
# dataset config
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations'),
dict(
type='RandomChoiceResize',
scales=[int(x * 0.1 * 896) for x in range(5, 21)],
resize_type='ResizeShortestEdge',
max_size=896 * 4),
dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
dict(type='RandomFlip', prob=0.5),
dict(type='PhotoMetricDistortion'),
dict(type='PackSegInputs')
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', scale=(896 * 4, 896), keep_ratio=True),
# add loading annotation after ``Resize`` because ground truth
# does not need to do resize data transform
dict(type='LoadAnnotations'),
dict(type='PackSegInputs')
]
train_dataloader = dict(batch_size=1, dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = val_dataloader
# set all layers in backbone to lr_mult=0.1
# set all norm layers, position_embedding,
# query_embedding, level_embedding to decay_mult=0.0
backbone_norm_multi = dict(lr_mult=0.1, decay_mult=0.0)
backbone_embed_multi = dict(lr_mult=0.1, decay_mult=0.0)
embed_multi = dict(lr_mult=1.0, decay_mult=0.0)
custom_keys = {
'backbone.dinov2': dict(lr_mult=0.1, decay_mult=1.0),
'backbone.dinov2.norm': backbone_norm_multi,
'pos_embed': backbone_embed_multi,
'query_embed': embed_multi,
'query_feat': embed_multi,
'level_embed': embed_multi
}
custom_keys.update({
f'backbone.dinov2.blocks.{block_id}.norm': backbone_norm_multi
for block_id in range(24)
})
# optimizer
optimizer = dict(
type='AdamW', lr=0.00003, weight_decay=0.05, eps=1e-8, betas=(0.9, 0.999))
optim_wrapper = dict(
type='OptimWrapper',
optimizer=optimizer,
clip_grad=dict(max_norm=0.01, norm_type=2),
paramwise_cfg=dict(custom_keys=custom_keys, norm_decay_mult=0.0))
find_unused_parameters=True
param_scheduler = [
dict(
type='LinearLR', start_factor=1e-6, by_epoch=False, begin=0, end=1500),
dict(
type='PolyLR',
power=1.0,
begin=1500,
end=80000,
eta_min=0.0,
by_epoch=False,
)
]
# training schedule for 80k
train_cfg = dict(
type='IterBasedTrainLoop', max_iters=80000, val_interval=5000)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=50, log_metric_by_epoch=False),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(
type='CheckpointHook', by_epoch=False, interval=5000, save_best='mIoU', max_keep_ckpts=1),
sampler_seed=dict(type='DistSamplerSeedHook'),
visualization=dict(type='SegVisualizationHook'))
# Default setting for scaling LR automatically
# - `enable` means enable scaling LR automatically
# or not by default.
# - `base_batch_size` = (8 GPUs) x (2 samples per GPU).
auto_scale_lr = dict(enable=False, base_batch_size=16)
work_dir = './work_dirs/depth_anything_large_mask2former_16xb1_80k_cityscapes_896x896'


@@ -0,0 +1,244 @@
_base_ = [
'../_base_/default_runtime.py', '../_base_/datasets/cityscapes.py'
]
crop_size = (896, 896)
data_preprocessor = dict(
type='SegDataPreProcessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_val=0,
seg_pad_val=255,
size=crop_size,
test_cfg=dict(size_divisor=14))
num_classes = 19
model = dict(
type='EncoderDecoder',
data_preprocessor=data_preprocessor,
backbone=dict(
type='DINOv2',
version='large',
freeze=False,
load_from='../checkpoints/depth_anything_vitl14.pth'),
neck=dict(type='Feature2Pyramid', embed_dim=1024, rescales=[4, 2, 1, 0.5]),
decode_head=dict(
type='Mask2FormerHead',
in_channels=[1024, 1024, 1024, 1024],
# strides=[4, 8, 16, 32],
feat_channels=1024,
out_channels=1024,
num_classes=num_classes,
num_queries=200,
num_transformer_feat_level=3,
align_corners=False,
pixel_decoder=dict(
type='mmdet.MSDeformAttnPixelDecoder',
num_outs=3,
norm_cfg=dict(type='GN', num_groups=32),
act_cfg=dict(type='ReLU'),
encoder=dict( # DeformableDetrTransformerEncoder
num_layers=6,
layer_cfg=dict( # DeformableDetrTransformerEncoderLayer
self_attn_cfg=dict( # MultiScaleDeformableAttention
embed_dims=1024,
num_heads=32,
num_levels=3,
num_points=4,
im2col_step=64,
dropout=0.0,
batch_first=True,
norm_cfg=None,
init_cfg=None),
ffn_cfg=dict(
embed_dims=1024,
feedforward_channels=4096,
num_fcs=2,
ffn_drop=0.0,
act_cfg=dict(type='ReLU', inplace=True))),
init_cfg=None),
positional_encoding=dict( # SinePositionalEncoding
num_feats=512, normalize=True),
init_cfg=None),
enforce_decoder_input_project=False,
positional_encoding=dict( # SinePositionalEncoding
num_feats=512, normalize=True),
transformer_decoder=dict( # Mask2FormerTransformerDecoder
return_intermediate=True,
num_layers=9,
layer_cfg=dict( # Mask2FormerTransformerDecoderLayer
self_attn_cfg=dict( # MultiheadAttention
embed_dims=1024,
num_heads=32,
attn_drop=0.0,
proj_drop=0.0,
dropout_layer=None,
batch_first=True),
cross_attn_cfg=dict( # MultiheadAttention
embed_dims=1024,
num_heads=32,
attn_drop=0.0,
proj_drop=0.0,
dropout_layer=None,
batch_first=True),
ffn_cfg=dict(
embed_dims=1024,
feedforward_channels=4096,
num_fcs=2,
act_cfg=dict(type='ReLU', inplace=True),
ffn_drop=0.0,
dropout_layer=None,
add_identity=True)),
init_cfg=None),
loss_cls=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=False,
loss_weight=2.0,
reduction='mean',
class_weight=[1.0] * num_classes + [0.1]),
loss_mask=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=True,
reduction='mean',
loss_weight=5.0),
loss_dice=dict(
type='mmdet.DiceLoss',
use_sigmoid=True,
activate=True,
reduction='mean',
naive_dice=True,
eps=1.0,
loss_weight=5.0),
train_cfg=dict(
num_points=12544,
oversample_ratio=3.0,
importance_sample_ratio=0.75,
assigner=dict(
type='mmdet.HungarianAssigner',
match_costs=[
dict(type='mmdet.ClassificationCost', weight=2.0),
dict(
type='mmdet.CrossEntropyLossCost',
weight=5.0,
use_sigmoid=True),
dict(
type='mmdet.DiceCost',
weight=5.0,
pred_act=True,
eps=1.0)
]),
sampler=dict(type='mmdet.MaskPseudoSampler'))),
train_cfg=dict(),
test_cfg=dict(mode='slide', crop_size=crop_size, stride=(518, 518)))
# dataset config
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations'),
dict(
type='RandomChoiceResize',
scales=[int(x * 0.1 * 896) for x in range(5, 21)],
resize_type='ResizeShortestEdge',
max_size=896 * 4),
dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
dict(type='RandomFlip', prob=0.5),
dict(type='PhotoMetricDistortion'),
dict(type='PackSegInputs')
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', scale=(896 * 4, 896), keep_ratio=True),
# add loading annotation after ``Resize`` because ground truth
# does not need to do resize data transform
dict(type='LoadAnnotations'),
dict(type='PackSegInputs')
]
img_ratios = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]
# img_ratios = [1.0]
tta_pipeline = [
dict(type='LoadImageFromFile', backend_args=None),
dict(
type='TestTimeAug',
transforms=[
[
dict(type='Resize', scale_factor=r, keep_ratio=True)
for r in img_ratios
],
[
dict(type='RandomFlip', prob=0., direction='horizontal'),
dict(type='RandomFlip', prob=1., direction='horizontal')
], [dict(type='LoadAnnotations')], [dict(type='PackSegInputs')]
])
]
train_dataloader = dict(batch_size=1, dataset=dict(pipeline=train_pipeline))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline))
test_dataloader = val_dataloader
# set all layers in backbone to lr_mult=0.1
# set all norm layers, position_embeding,
# query_embeding, level_embeding to decay_multi=0.0
backbone_norm_multi = dict(lr_mult=0.1, decay_mult=0.0)
backbone_embed_multi = dict(lr_mult=0.1, decay_mult=0.0)
embed_multi = dict(lr_mult=1.0, decay_mult=0.0)
custom_keys = {
'backbone.dinov2': dict(lr_mult=0.1, decay_mult=1.0),
'backbone.dinov2.norm': backbone_norm_multi,
'pos_embed': backbone_embed_multi,
'query_embed': embed_multi,
'query_feat': embed_multi,
'level_embed': embed_multi
}
custom_keys.update({
f'backbone.dinov2.blocks.{block_id}.norm': backbone_norm_multi
for block_id in range(24)
})
# optimizer
optimizer = dict(
type='AdamW', lr=0.00003, weight_decay=0.05, eps=1e-8, betas=(0.9, 0.999))
optim_wrapper = dict(
type='OptimWrapper',
optimizer=optimizer,
clip_grad=dict(max_norm=0.01, norm_type=2),
paramwise_cfg=dict(custom_keys=custom_keys, norm_decay_mult=0.0))
find_unused_parameters=True
param_scheduler = [
dict(
type='LinearLR', start_factor=1e-6, by_epoch=False, begin=0, end=1500),
dict(
type='PolyLR',
power=1.0,
begin=1500,
end=80000,
eta_min=0.0,
by_epoch=False,
)
]
# training schedule for 80k
train_cfg = dict(
type='IterBasedTrainLoop', max_iters=80000, val_interval=5000)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=1, log_metric_by_epoch=False),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(
type='CheckpointHook', by_epoch=False, interval=5000, save_best='mIoU', max_keep_ckpts=1),
sampler_seed=dict(type='DistSamplerSeedHook'),
visualization=dict(type='SegVisualizationHook'))
# Default setting for scaling LR automatically
# - `enable` means enable scaling LR automatically
# or not by default.
# - `base_batch_size` = (8 GPUs) x (2 samples per GPU).
auto_scale_lr = dict(enable=False, base_batch_size=16)
work_dir = './work_dirs/depth_anything_large_mask2former_16xb1_80k_cityscapes_896x896_ms'

semseg/dinov2.py
import torch
from torch import nn
from mmseg.registry import MODELS
@MODELS.register_module()
class DINOv2(nn.Module):
"""Use DINOv2 pre-trained models
"""
def __init__(self, version='large', freeze=False, load_from=None):
super().__init__()
if version == 'large':
self.dinov2 = torch.hub.load('torchhub/facebookresearch_dinov2_main', 'dinov2_vitl14', source='local', pretrained=False)
else:
raise NotImplementedError
if load_from is not None:
d = torch.load(load_from, map_location='cpu')
new_d = {}
for key, value in d.items():
if 'pretrained' in key:
new_d[key.replace('pretrained.', '')] = value
self.dinov2.load_state_dict(new_d)
self.freeze = freeze
def forward(self, inputs):
B, _, h, w = inputs.shape
if self.freeze:
with torch.no_grad():
features = self.dinov2.get_intermediate_layers(inputs, 4)
else:
features = self.dinov2.get_intermediate_layers(inputs, 4)
outs = []
for feature in features:
C = feature.shape[-1]
feature = feature.permute(0, 2, 1).reshape(B, C, h // 14, w // 14).contiguous()
outs.append(feature)
return outs
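The forward pass above reshapes DINOv2's (B, N, C) patch tokens back into (B, C, h/14, w/14) feature maps; a standalone sketch of that reshape with hypothetical sizes:

```python
import torch

# ViT-L/14 on a 224x224 input: 16x16 patch grid, 1024-dim tokens
B, h, w, C = 2, 224, 224, 1024
tokens = torch.zeros(B, (h // 14) * (w // 14), C)   # (B, N, C) from get_intermediate_layers

# same permute + reshape as DINOv2.forward()
fmap = tokens.permute(0, 2, 1).reshape(B, C, h // 14, w // 14).contiguous()
assert fmap.shape == (2, 1024, 16, 16)
```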