Difference between revisions of "DeepVO Towards end to end visual odometry with deep RNN"

Revision as of 00:20, 26 October 2018


Visual Odometry (VO) is a computer vision technique for estimating an object’s position and orientation from camera images. It is an important technique commonly used for “pose estimation and robot localization”, with notable applications on the Mars Exploration Rovers and autonomous vehicles [x1] [x2]. While the research field of VO is broad, this paper focuses on monocular visual odometry. In particular, the authors examine prominent VO methods and argue that mainstream geometry-based monocular VO methods should be supplemented with deep learning approaches. The paper then proposes a novel deep-learning-based end-to-end VO algorithm and empirically demonstrates its viability.

Related Work

Visual odometry algorithms can be grouped into two main categories. The first comprises the conventional methods, which are based on established principles of geometry. Specifically, an object’s position and orientation are obtained by identifying reference points and calculating how those points change over the image sequence. Algorithms in this category can be divided into two sub-categories, which differ in how they select reference points. Sparse feature-based methods establish reference points from salient image features, such as corners and edges [8], whereas direct methods make use of the whole image and consider every pixel as a reference point [11]. Semi-direct methods that combine both approaches have also been gaining popularity recently [16].
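The idea of recovering motion from how reference points change between frames can be illustrated with a deliberately simplified sketch. It assumes a purely 2D rigid motion and already-matched points, and fits the rotation and translation by least squares (a Kabsch-style alignment). The function name, the sample points, and the motion values are all hypothetical; real monocular VO works with 3D geometry and must also handle outliers and unknown scale, so this is not the method of the paper or of any cited algorithm.

```python
import numpy as np

def estimate_rigid_motion_2d(points_prev, points_curr):
    """Least-squares R, t such that points_curr is close to R @ p + t."""
    centroid_prev = points_prev.mean(axis=0)
    centroid_curr = points_curr.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (points_prev - centroid_prev).T @ (points_curr - centroid_curr)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = centroid_curr - R @ centroid_prev
    return R, t

# Hypothetical reference points (e.g. corners) observed in one frame.
points_prev = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [-1.0, 1.5]])

# Assumed ground-truth motion between the two frames (illustrative values).
angle = np.deg2rad(10.0)
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
t_true = np.array([0.5, -0.2])
points_curr = points_prev @ R_true.T + t_true

R_est, t_est = estimate_rigid_motion_2d(points_prev, points_curr)
angle_est = np.degrees(np.arctan2(R_est[1, 0], R_est[0, 0]))
print(angle_est, t_est)
```

With noise-free matches the estimate should agree with the assumed motion up to floating-point error. A sparse method would feed in only a handful of salient points, as here, while a direct method would instead compare pixel intensities across the whole image.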

Today, most state-of-the-art VO algorithms belong to the geometry family. However, they have significant limitations. For example, direct methods assume “photometric consistency” [11], whereas sparse feature methods are prone to “drifting” because of outliers and noise. As a result, the paper argues that geometry-based methods are difficult to engineer and calibrate, which limits their practicality. Figure 1 illustrates the general architecture of geometry-based algorithms and outlines the components they require, such as camera calibration, feature detection, feature matching (tracking), outlier rejection, motion estimation, scale estimation, and local optimization (bundle adjustment).

Figure 1: General architecture of geometry-based VO algorithms.
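The “drifting” mentioned above comes from the way odometry chains frame-to-frame motion estimates: each small error is carried into every later pose. The following sketch illustrates this under loud assumptions: motion is restricted to the 2D plane, the step size and frame count are arbitrary, and the per-frame error is modelled as a constant 0.1 degree heading bias rather than the outliers and noise that cause it in practice. The helper names `se2` and `integrate` are hypothetical.

```python
import numpy as np

def se2(dx, dy, dtheta):
    """Homogeneous 3x3 matrix for a planar relative motion."""
    c, s = np.cos(dtheta), np.sin(dtheta)
    return np.array([[c, -s, dx],
                     [s,  c, dy],
                     [0.0, 0.0, 1.0]])

def integrate(relative_motions):
    """Chain relative motions into a sequence of global (x, y) positions."""
    pose = np.eye(3)
    trajectory = [pose[:2, 2].copy()]
    for motion in relative_motions:
        pose = pose @ motion
        trajectory.append(pose[:2, 2].copy())
    return np.array(trajectory)

n_frames = 200
# Assumed true motion: 0.1 units straight ahead per frame.
true_step = se2(0.1, 0.0, 0.0)
# Assumed estimate: same step with a 0.1 degree heading error per frame.
biased_step = se2(0.1, 0.0, np.deg2rad(0.1))

true_traj = integrate([true_step] * n_frames)
est_traj = integrate([biased_step] * n_frames)

# Position error at each frame; it grows as the heading error accumulates.
errors = np.linalg.norm(est_traj - true_traj, axis=1)
print(errors[1], errors[50], errors[-1])
```

Even though every individual step is almost correct, the position error keeps increasing with the number of frames. This is why geometry-based pipelines include stages such as outlier rejection and local optimization (bundle adjustment).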

End-to-End Visual odometry through RCNN

Experiments and Results

Critiques and Discussions