Detection of feature points
Matching of feature points
Sparse point cloud building
All sets of matched points are triangulated to recover the 3D locations of the corresponding points on the object. Bundle adjustment is a non-linear least-squares optimization technique. It is used to add and refine triangulated 3D points in an iterative approach, by estimating the camera poses and the relative motion between image frames. All matched points are iteratively added to the triangulation, and the error propagated by triangulation is minimized as more and more matched points are used to triangulate each 3D point.
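As a concrete illustration of the triangulation step, here is a minimal linear (DLT) triangulation of a single correspondence from two known camera projection matrices, sketched in Python with NumPy. The two cameras, the 3D point, and the `project` helper are assumptions made up for this demo, not part of any particular SfM library:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence.
    P1, P2: 3x4 camera projection matrices; x1, x2: matched 2D points."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous 3D point is the right singular vector of A
    # associated with its smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 camera matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic check: identity intrinsics, second camera shifted along x.
K = np.eye(3)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.3, -0.2, 4.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

In a full pipeline, many such points would then be refined jointly with the camera parameters by bundle adjustment.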
IncrementalSfM : This algorithm is introduced in the paper "Adaptive Structure from Motion with a contrario model estimation". The incremental pipeline is a growing reconstruction process. It starts from an initial two-view reconstruction (the seed), which is iteratively extended by adding new views and 3D points, using pose estimation and triangulation. Due to the incremental nature of the process, successive steps of non-linear refinement, such as Bundle Adjustment (BA) with Levenberg-Marquardt iterations, are performed to minimize the accumulated error (drift).
compute correspondence tracks t
compute connectivity graph G (1 node per view, 1 edge when enough matches)
pick an edge e in G with sufficient baseline
    * robustly estimate the essential matrix from the images of e
    triangulate validated tracks, which provides an initial reconstruction
    contract edge e
while G contains an edge do
    pick the edge e in G that maximizes union(track(e), 3D points)
    * robustly estimate pose (external orientation / resection)
    triangulate new tracks
    contract edge e
    perform bundle adjustment
end while
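The view-selection logic of the loop above can be sketched in Python. This is a toy illustration over hypothetical track data (a track is the set of views observing the same scene point); the real pipeline performs robust essential-matrix estimation, resection, and bundle adjustment at each step, none of which is modeled here:

```python
from itertools import combinations

# Toy data: track_id -> set of views observing that track (an assumption
# for this sketch, not output of a real matcher).
tracks = {
    0: {"A", "B", "C"},
    1: {"A", "B"},
    2: {"B", "C", "D"},
    3: {"C", "D"},
    4: {"A", "D"},
}

def seed_pair(tracks):
    # Pick the view pair sharing the most tracks (a stand-in for the
    # baseline / essential-matrix check of the real pipeline).
    views = set().union(*tracks.values())
    return max(combinations(sorted(views), 2),
               key=lambda p: sum(p[0] in v and p[1] in v for v in tracks.values()))

def incremental_order(tracks):
    a, b = seed_pair(tracks)
    done, order = {a, b}, [a, b]
    # "Triangulate" the seed: tracks seen by both seed views.
    points = {t for t, v in tracks.items() if a in v and b in v}
    remaining = set().union(*tracks.values()) - done
    while remaining:
        # Next view: the one observing the most already-reconstructed points.
        nxt = max(remaining, key=lambda w: sum(w in tracks[t] for t in points))
        done.add(nxt); order.append(nxt); remaining.discard(nxt)
        # Triangulate new tracks now seen by >= 2 reconstructed views.
        points |= {t for t, v in tracks.items() if len(v & done) >= 2}
        # (a real pipeline would run pose estimation + bundle adjustment here)
    return order, points

order, points = incremental_order(tracks)
```

With this toy data the seed pair is (A, B), and views are then added in the order that maximizes overlap with the current reconstruction.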
Point Cloud
GlobalSfM : This algorithm is introduced in the paper "Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion". Multi-view structure from motion (SfM) estimates the position and orientation of images in a common 3D coordinate frame. When views are processed incrementally, this external calibration can be subject to drift, unlike global methods, which distribute residual errors evenly. The method proposes a new global calibration approach based on the fusion of relative motions between image pairs.
compute relative pairwise rotations
detect and remove false relative pairwise rotations
- using composition error of triplet of relative rotations
compute the global rotation
- using a dense least square and approximated rotations
compute relative translations
- using triplets of views for stability and collinear motion support
compute the global translation
- integration of the relative translation directions using an ℓ∞ method.
final structure and motion
- link tracks validated per triplet and compute the global structure by triangulation
- refine the estimated parameters in a two-step Bundle Adjustment:
  - refine structure and translations
  - refine structure and camera parameters (rotations, translations).
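The "dense least squares" rotation-averaging step can be made concrete with a deliberately simplified example. Here all rotations are about a single axis, so each view reduces to one angle and each pairwise measurement to an angle difference, and the linear least-squares structure becomes visible. This 1-DoF toy (with made-up angles and noise) is not the full 3D rotation averaging of the paper:

```python
import numpy as np

# Toy rotation averaging: rotations about one common axis, so view i is a
# single angle theta_i and each pair (i, j) measures theta_j - theta_i.
theta_true = np.array([0.0, 0.2, 0.5, 0.9])
pairs = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
rng = np.random.default_rng(0)
rel = np.array([theta_true[j] - theta_true[i] for i, j in pairs])
rel += rng.normal(0, 0.01, len(rel))   # noisy relative measurements

# Linear system A @ theta = rel, with theta_0 fixed to 0 (gauge freedom).
n = len(theta_true)
A = np.zeros((len(pairs), n))
for row, (i, j) in enumerate(pairs):
    A[row, i], A[row, j] = -1.0, 1.0
theta, *_ = np.linalg.lstsq(A[:, 1:], rel, rcond=None)  # drop the fixed column
theta = np.concatenate([[0.0], theta])
```

Fixing the first angle removes the global gauge ambiguity, just as global SfM fixes one camera's rotation; the redundant measurements (0, 2) and (1, 3) are what let least squares distribute the noise evenly instead of accumulating drift along a chain.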
Dense correspondence matching
The sparse point cloud generated by SfM contains only the 3D coordinates of the identified feature points. The resulting point cloud has low density and is inappropriate for reconstruction, as it does not contain the finer details of the object being reconstructed. While the sparse point cloud can be interpolated to build a dense point cloud, several finer details of the object are lost in the process, and the object geometry cannot be retained exactly by interpolation methods.
The goal of dense correspondence matching is a dense 3D point cloud, built from a large image sequence, that accurately describes the geometry of the scene or object being reconstructed.
An exhaustive search over all pixels in the images is computationally expensive and thus infeasible.
Here are example steps of dense stereo matching:
Depthmap
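To make the depth-map step concrete, here is a brute-force block-matching sketch: for each pixel of a rectified left image, search a range of disparities along the same scanline of the right image and keep the one minimizing the window SSD. The synthetic image pair and all parameter values are assumptions for the demo; production systems use far more sophisticated matchers (semi-global matching, PatchMatch-style propagation) than this exhaustive toy:

```python
import numpy as np

def disparity_map(left, right, max_disp=8, win=3):
    """Brute-force SSD block matching along the scanlines of a
    rectified stereo pair. Returns the per-pixel integer disparity."""
    h, w = left.shape
    pad = win // 2
    L = np.pad(left.astype(float), pad)
    R = np.pad(right.astype(float), pad)
    disp = np.zeros((h, w), dtype=int)
    for y in range(h):
        for x in range(w):
            patch = L[y:y + win, x:x + win]
            best, best_d = np.inf, 0
            for d in range(min(max_disp, x) + 1):
                cand = R[y:y + win, x - d:x - d + win]
                ssd = np.sum((patch - cand) ** 2)
                if ssd < best:
                    best, best_d = ssd, d
            disp[y, x] = best_d
    return disp

# Synthetic pair: the right image is the left image shifted by 2 pixels,
# so the true disparity away from the borders is 2.
rng = np.random.default_rng(42)
left = rng.uniform(size=(10, 12))
right = np.empty_like(left)
right[:, :-2] = left[:, 2:]
right[:, -2:] = 0.0
disp = disparity_map(left, right, max_disp=4)
```

Depth then follows from disparity as depth = focal_length * baseline / disparity for a calibrated rectified pair.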
Generation of 3D watertight surface
After obtaining the dense 3D point cloud, a watertight 3D surface needs to be constructed over it.
To construct a surface over a dense 3D point cloud, each point needs 3D coordinates (X, Y, Z) and a surface normal (nx, ny, nz).
The surface normal of each point in the point cloud can be calculated from its (X, Y, Z) coordinates and those of its neighboring points using principal component analysis (PCA).
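The PCA normal estimate can be sketched in a few lines: the normal of a point's neighborhood is the eigenvector of the neighborhood covariance matrix with the smallest eigenvalue, i.e. the direction of least spread. The planar test data is an assumption for the demo:

```python
import numpy as np

def pca_normal(neighbors):
    """Estimate a surface normal from a point's neighbors via PCA:
    the eigenvector of the covariance matrix with the smallest
    eigenvalue is the direction of least spread, i.e. the normal."""
    centered = neighbors - neighbors.mean(axis=0)
    cov = centered.T @ centered / len(neighbors)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, 0]

# Points sampled from the plane z = 0.1x + 0.2y, whose true normal is
# proportional to (-0.1, -0.2, 1).
rng = np.random.default_rng(1)
xy = rng.uniform(-1, 1, size=(50, 2))
pts = np.column_stack([xy, 0.1 * xy[:, 0] + 0.2 * xy[:, 1]])
n = pca_normal(pts)
n_true = np.array([-0.1, -0.2, 1.0])
n_true /= np.linalg.norm(n_true)
```

Note that PCA leaves the sign of the normal ambiguous; pipelines typically orient normals consistently afterwards, e.g. toward the camera that observed the point.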
Additionally, the dense point cloud generated from image sequence has color information (RGB) attached to each of the points.
Mesh generation : There are several algorithms for mesh generation. Here, two of them are introduced: ball pivoting and Poisson surface reconstruction.
Intuitively, think of a 3D ball with a given radius that we drop onto the point cloud. If it lands on any 3 points (and does not fall through them), it creates a triangle. The algorithm then pivots from the edges of the existing triangles, and every time the ball lands on 3 points without falling through, another triangle is created.
The radius is obtained empirically based on the size and scale of the input point cloud. In my experience, the following is a good starting point:
import numpy as np
import open3d as o3d

# pcd: an open3d.geometry.PointCloud with estimated normals
distances = pcd.compute_nearest_neighbor_distance()
avg_dist = np.mean(distances)
radius = 3 * avg_dist
mesh = o3d.geometry.TriangleMesh.create_from_point_cloud_ball_pivoting(
    pcd,
    o3d.utility.DoubleVector([radius, radius * 2]))
If the points are too far apart at some location, the ball may miss the appropriate point on the surface and instead hit another point on the object. In this case, we check that the normal of the new triangle facet is consistently oriented with the vertex normals of its points. If it is not, we reject that triangle and leave a hole.
An indicator function χ is defined over 3D space such that its value is 1 inside the surface to be reconstructed and 0 outside it. This indicator function needs to be approximated to reconstruct a watertight surface.
The gradient of the indicator function is a vector field. Since χ is piecewise constant, ∇χ is non-zero only at points near the surface, where the value of χ changes (χ is 1 inside the surface and 0 outside, so changes in χ occur only near the surface).
At the points where ∇χ is non-zero, its value equals the surface normal vector at that point. Thus the oriented normal vectors of the point samples can be taken as samples of ∇χ (the gradient of the indicator function).
So the problem is reduced to finding the indicator function χ whose gradient ∇χ best fits the vector field V defined by the oriented normals of the input point cloud: ∇χ ≈ V.
To convert this equation into a standard Poisson problem, the divergence operator is applied to both sides. Since the divergence of a gradient is the Laplacian, the problem becomes finding the function χ whose Laplacian equals the divergence of the vector field V: Δχ = ∇·V.
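The Poisson step Δχ = ∇·V can be illustrated on a 1D grid, where everything is checkable by hand: χ is 1 inside an interval and 0 outside, V is its (discrete) gradient, and solving the discrete Poisson equation recovers χ. This toy (with made-up grid values) only shows the math; actual Poisson surface reconstruction solves the 3D problem on an adaptive octree:

```python
import numpy as np

# 1D indicator: chi is 1 inside the "surface" interval, 0 outside.
n = 50
chi_true = np.zeros(n)
chi_true[15:35] = 1.0

# Samples of grad(chi): nonzero only near the boundary ("oriented normals").
V = np.diff(chi_true)               # forward difference
divV = np.zeros(n)
divV[1:-1] = np.diff(V)             # backward difference: div(V) = Laplacian(chi)

# Discrete Laplacian with chi fixed to 0 at both domain ends (Dirichlet).
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))
L[0], L[-1] = 0.0, 0.0
L[0, 0] = L[-1, -1] = 1.0
rhs = divV.copy()
rhs[0] = rhs[-1] = 0.0
chi = np.linalg.solve(L, rhs)       # recovers the indicator function
```

Extracting the surface then amounts to taking a level set of χ (here, the points where χ crosses 0.5), which in 3D is done with marching cubes over the octree.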
Texturing : wrapping a 2D image around a 3D object and defining how light would affect it.
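Wrapping a 2D image around a 3D object requires assigning each vertex a (u, v) texture coordinate. As a minimal sketch, here is a spherical UV mapping that projects each vertex direction onto a sphere; the function name and the test vertices are assumptions for the demo, and real texturing pipelines use much more careful unwrapping and per-view image selection:

```python
import numpy as np

def spherical_uv(vertices, center=None):
    """Map vertices to (u, v) in [0, 1] by projecting each vertex
    direction onto a sphere around the given center (default: centroid)."""
    c = vertices.mean(axis=0) if center is None else np.asarray(center, float)
    d = vertices - c
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    u = 0.5 + np.arctan2(d[:, 1], d[:, 0]) / (2 * np.pi)   # longitude
    v = 0.5 - np.arcsin(np.clip(d[:, 2], -1, 1)) / np.pi   # latitude
    return np.column_stack([u, v])

# Three vertices on the unit sphere around the origin.
verts = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
uv = spherical_uv(verts, center=np.zeros(3))
```

Each (u, v) pair then indexes into the texture image, and the renderer interpolates these coordinates across each triangle when shading the surface.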
References:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4929111/pdf/40064_2016_Article_2425.pdf
https://towardsdatascience.com/5-step-guide-to-generate-3d-meshes-from-point-clouds-with-python-36bad397d8ba
https://arxiv.org/pdf/1710.02726.pdf
https://openmvg.readthedocs.io/en/latest/
https://dreamfarmstudios.com/blog/getting-to-know-3d-texturing-in-animation-production/