Monocular Depth Estimation
I have read a little about monocular depth estimation. I want to look into it further.
References
Related
- Depth Estimation
    - Depth estimation is the task of measuring the distance of each pixel relative to the camera. Depth is extracted from either monocular (single) or stereo (multiple views of a scene) images. Traditional methods use multi-view geometry to find the relationship between the images.
Notes
Monocular Depth Estimation is the task of estimating the depth value (distance relative to the camera) of each pixel given a single (monocular) RGB image. This challenging task is a key prerequisite for scene understanding in applications such as 3D scene reconstruction, autonomous driving, and AR. State-of-the-art methods usually fall into one of two categories: designing a network powerful enough to directly regress the depth map, or splitting the prediction into bins or windows to reduce computational complexity. The most popular benchmarks are the KITTI and NYUv2 datasets.
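The bin-based idea can be made concrete in a few lines of tensor code. A minimal sketch (in the spirit of AdaBins-style methods, as I understand them): the network predicts a per-pixel distribution over depth bins, and the depth estimate is the expectation over bin centers. The bin count, depth range, and tensor shapes here are assumptions for illustration:

```python
import torch

# Sketch: per-pixel logits over B depth bins; depth is the
# softmax-weighted average of the bin centers (assumed values below).
B, H, W = 64, 240, 320
logits = torch.randn(1, B, H, W)            # dummy stand-in for a network output
bin_centers = torch.linspace(0.5, 10.0, B)  # assumed 0.5-10 m depth range
probs = logits.softmax(dim=1)               # per-pixel distribution over bins
depth = (probs * bin_centers.view(1, B, 1, 1)).sum(dim=1)  # [1, H, W] depth map
```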
Monocular depth estimation is a computer vision task that involves predicting the depth information of a scene from a single image. In other words, it is the process of estimating the distance of objects in a scene from a single camera viewpoint.
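For a quick hands-on look, an off-the-shelf model can be run in a few lines. A minimal sketch using the Hugging Face transformers depth-estimation pipeline, assuming the Intel/dpt-large checkpoint (any depth-estimation model would work; the input filename is hypothetical):

```python
from PIL import Image
from transformers import pipeline

# Off-the-shelf monocular depth model via the transformers pipeline.
depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")

image = Image.open("scene.jpg")   # hypothetical input image
result = depth_estimator(image)

# result["depth"] is a PIL image of per-pixel depth (relative for DPT);
# result["predicted_depth"] is the raw model output tensor.
result["depth"].save("scene_depth.png")
```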
Monocular depth estimation has various applications:
- 3D reconstruction
- augmented reality
- autonomous driving
- robotics
Accuracy can be affected by factors such as lighting conditions, occlusion, and texture.
Two main depth estimation categories:
- Absolute depth estimation: This task variant aims to provide exact depth measurements from the camera. The term is used interchangeably with metric depth estimation, where depth is provided in precise measurements in meters or feet.
- Relative depth estimation: Relative depth estimation aims to predict the depth order of objects or points in a scene without providing precise measurements; the sketch after this list shows how such predictions relate to metric ones.
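Since relative predictions are only defined up to an unknown scale and shift, a common way to compare them against metric ground truth is a least-squares scale-and-shift alignment (the trick used in MiDaS-style evaluation, to my understanding). A minimal sketch with toy data standing in for real predictions:

```python
import numpy as np

def align_scale_shift(pred, gt):
    """Find scale s and shift t minimizing ||s * pred + t - gt||^2,
    then return the aligned prediction."""
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)  # [N, 2]
    (s, t), *_ = np.linalg.lstsq(A, gt.ravel(), rcond=None)
    return s * pred + t

# Toy example: a relative prediction off by scale 2 and shift 0.5.
gt = np.array([[1.0, 2.0], [3.0, 4.0]])  # metric depth in metres (made up)
pred = (gt - 0.5) / 2.0                  # relative depth, unknown scale/shift
aligned = align_scale_shift(pred, gt)
print(np.allclose(aligned, gt))          # True: alignment recovers metric depth
```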