Dense Mapping From an Accurate Tracking SLAM

2020-11-05 09:36 Weijie Huang, Guoshan Zhang, and Xiaowei Han
IEEE/CAA Journal of Automatica Sinica, 2020, Issue 6

Weijie Huang, Guoshan Zhang, and Xiaowei Han

Abstract: In recent years, reconstructing a sparse map from a simultaneous localization and mapping (SLAM) system on a conventional CPU has undergone remarkable progress. However, obtaining a dense map from the system often requires a high-performance GPU to accelerate computation. This paper proposes a dense mapping approach which can remove outliers and obtain a clean 3D model using a CPU in real time. The dense mapping approach processes keyframes and establishes data association by using multi-threading technology. Outliers are removed by change detection on associated vertices between keyframes. The implicit surface data of inliers is represented by a truncated signed distance function and fused with an adaptive weight. A global hash table and a local hash table are used to store and retrieve surface data for data reuse. Experimental results show that the proposed approach can precisely remove the outliers in a scene and obtain a dense 3D map with a better visual effect in real time.

I. Introduction

THE goal of visual simultaneous localization and mapping (VSLAM) is to reconstruct the scene from a camera, such as a monocular camera [1], a stereo camera, or an RGBD camera [2]. The obtained map uses a sparse representation for robot navigation, but it cannot provide occlusion information or a high-quality surface model [3]. With the development of large-scale parallel processors [4], high-quality dense mapping methods reconstruct rich 3D models in real time, especially with RGBD sensors.

Microsoft's Kinect sensor, which is a consumer-grade RGBD sensor, came out in 2010. Since then, many dense mapping methods based on this piece of equipment have been proposed. Newcombe et al. [5] proposed KinectFusion, a notable dense mapping approach that permits data fusion from the raw depth image to the 3D model. They used a truncated signed distance function (TSDF) to denote the model surface, and viewers were able to feel the visual impact [5] of the reconstructed result. Ren and Reid [6] proposed a novel objective function that takes advantage of the gradient of a 3D level set and can be efficiently solved by gradient-based optimization. Nießner et al. [7] used a spatial hashing scheme that compresses space and allows for real-time access and updates of implicit surface data, without the need for a regular or hierarchical grid data structure. Whelan et al. [8] combined color and depth information in the motion estimation to find six degree of freedom (DoF) parameters by minimizing the sum of the RGBD and ICP costs. Their approach utilized dense, fully colored models of spatially extended environments for robotics and virtual reality applications. Soon afterwards, Whelan et al. [9] presented another novel approach, capturing dense, consistent surfel-based maps without pose graph optimization.

The aforementioned dense mapping research runs in real time with a high-performance GPU. However, because GPUs are expensive, many researchers have considered scene reconstruction using only the CPU. ORB-SLAM, a feature-based monocular SLAM system running on a CPU, is robust to severe motion clutter, allows for wide-baseline loop closing and relocalization, and includes fully automatic initialization [10]. Mur-Artal et al. [11] extended ORB-SLAM to stereo and RGBD cameras while keeping the original performance. When these approaches are used to reconstruct a scene containing dynamic objects, the final dense maps are destroyed. Some approaches process dynamic objects in RGBD frames; for instance, [12] and [13] used the expectation-maximization (EM) algorithm and its extended form to segment moving hands and moving humans in RGBD frames, but they have not been used in a dense mapping approach.

This paper proposes a dense mapping approach which reconstructs a scene with an RGBD sensor using a CPU in real time and effectively removes outliers, which comprise noise and dynamic objects. Our contributions are listed as follows:

1) Prior information from an accurate tracking SLAM is used to associate dense vertices between keyframes, based on multi-threaded processing and multi-threaded priority settings.

2) The angle change and position change of the associated vertices are constructed, and then examined to determine whether they lie within two preset ranges in order to remove outliers. The two ranges are designed by using a rotation angle histogram and a beam-based environment measurement model, respectively.

3) An adaptive weight is assigned to each inlier, and the weighted fusion is implemented as the update process of the Kalman filter.

4) The surfaces of inliers are stored in a global hash table and a local hash table for fast data operation and data reuse.

This paper is organized as follows. Section II gives a brief review of an accurate tracking SLAM. Section III describes the whole process of the proposed approach. Section IV validates the proposed approach and compares it with existing mapping approaches. Section V draws conclusions.

Fig. 1. Block diagram of four threads, containing three threads from ORB-SLAM [11] and the proposed thread.

II. The Outline of ORB-SLAM

We propose a novel dense mapping approach which can remove outliers and obtain a clean 3D model using only a CPU in real time. Our approach adds a new thread to ORB-SLAM and uses prior information from the SLAM. Fig. 1 shows the block diagram of the proposed approach, where the left part is the dense mapping detailed in Section III and the right part shows the ORB-SLAM described in this section. ORB-SLAM consists of three threads: the tracking thread, the local mapping thread, and the loop closure thread. In the following part of this section, we review these three threads.

A. The Tracking Thread

The tracking thread contains the following steps:

2) Track the current frame using its reference keyframe. If tracking is lost, relocalize the current frame.

3) Optimize the current frame by using the local map.

4) Insert the current frame into the keyframe set if it is detected as a keyframe.

B. The Local Mapping Thread

The local mapping thread contains the following steps:

1) Calculate the map points of the current keyframe and insert them into the map.

2) Remove the unqualified map points in the keyframe.

3) Restore some map points by using triangulation between adjacent keyframes.

4) Optimize keyframes using local bundle adjustment.

5) If ninety percent of the map points of the current keyframe can be observed by adjacent keyframes, this keyframe will be culled.

When the culling process is finished, the remaining keyframes are passed to the loop closure thread.
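
The culling rule in step 5 can be sketched as a set comparison. This is a minimal illustration under stated assumptions, not ORB-SLAM's code: the function name `should_cull` is hypothetical, map points are represented by integer IDs, and a point counts as "observed by adjacent keyframes" if at least one neighbour sees it.

```python
def should_cull(keyframe_points, neighbour_points, ratio=0.9):
    """Return True if at least `ratio` of the keyframe's map points are
    also observed by at least one adjacent keyframe.

    keyframe_points: set of map-point IDs seen by the current keyframe.
    neighbour_points: list of sets, one per adjacent keyframe.
    (Hypothetical helper; the ID-based representation is an assumption.)
    """
    if not keyframe_points:
        return False
    # Union of everything the adjacent keyframes observe.
    seen_elsewhere = set().union(*neighbour_points) if neighbour_points else set()
    shared = len(keyframe_points & seen_elsewhere)
    return shared >= ratio * len(keyframe_points)
```

With the default ratio of 0.9, a keyframe holding ten map points is culled once nine of them are visible from its neighbours.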

C. The Loop Closure Thread

The loop closure thread involves the following steps:

1) Calculate the Sim3 transformation [15], optimizing DoF parameters between the current keyframe and the closed-loop keyframe to deal with scale drift.

2) Optimize the pose and map points based on the Sim3 transformation.

3) Update the covisibility graph of the keyframe, and obtain a new connection with this keyframe.

4) Optimize the essential graph [15] with the newly formed loop.

5) Optimize all poses and map points with global bundle adjustment.

III. Dense Mapping

The left part of Fig. 1 is the added dense mapping thread. The outline and details of this thread are described in this section.

A.The Outline of the Dense Mapping Thread

Fig. 2. Dense map from ORB-SLAM 2. (a) The whole map; (b) two locally enlarged views.

The valid pixels in RGBD frames cannot all be stacked directly into the final 3D model, because the result may have the following defects:

1) Noise has a great influence on the final 3D model, as can be seen in Fig. 2.

2) A moving object in the scene directly causes dense mapping to fail.

3) The final 3D model is not smooth.

4) The massive number of points in the dense map is hard to retrieve and store.

To address these defects, our approach adds a thread (the dense mapping thread) to the ORB-SLAM framework and processes the keyframes kept by the local mapping thread. All threads run in parallel, and the multi-threaded priority settings fall into two categories. If no loop is detected, the priority order is: loop closure thread → local mapping thread → dense mapping thread → tracking thread. If a loop is detected, the priority order is: loop closure thread → local mapping thread → tracking thread → dense mapping thread. The multi-threaded priority setting avoids a lengthy block of the tracking thread and improves real-time performance. Keyframes optimized by local mapping and loop closure are input into the dense mapping thread with a small delay [16] (10-15 keyframes), so that information from future keyframes can be used for the current keyframe. The map points are also input into the dense mapping thread and are treated as accurate prior information. The outline of the added thread is as follows:
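
The delayed hand-off of keyframes can be pictured as a small buffer. The sketch below is illustrative only: the class name `DelayedKeyframeQueue` and its interface are assumptions, not the authors' implementation, and the default delay of 10 is simply the lower end of the 10-15 keyframe range given above.

```python
from collections import deque


class DelayedKeyframeQueue:
    """Release a keyframe only after `delay` newer keyframes have arrived.

    Hypothetical helper: the name and interface are illustrative assumptions.
    """

    def __init__(self, delay=10):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        self.delay = delay
        self._pending = deque()

    def push(self, keyframe):
        """Add an optimized keyframe. Return the keyframe that is now old
        enough to be densely mapped, or None if none is ready yet."""
        self._pending.append(keyframe)
        if len(self._pending) > self.delay:
            return self._pending.popleft()
        return None

    def future_keyframes(self):
        """Keyframes still waiting in the buffer. Right after a release they
        are exactly the newer ('future') keyframes of the released one."""
        return list(self._pending)
```

A keyframe returned by `push` can then be processed together with `future_keyframes()`, which is the sense in which the delay gives the dense mapping thread access to later observations.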

1) The current keyframe is associated with adjacent keyframes using perspective projection (see Section III-B).

2) A rotation angle histogram is used to examine the angle change of the associated vertices, and a candidate inlier set is obtained. A beam-based environment measurement model is used to examine the position change of the candidate inliers, and a true inlier set is obtained (see Section III-C).

3) The TSDF value of each inlier is calculated to represent its implicit surface. We assume that each inlier conforms to a Gaussian model and is fused into the final 3D model as the update process of the Kalman filter. Weights are adjusted adaptively based on noise and depth (see Section III-D).

4) Dense points are exchanged between a global hash table and a local hash table to improve the efficiency of data access (see Section III-E).

B. Data Association
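
Step 1 of the outline associates vertices between keyframes by perspective projection. The following is a minimal sketch of that idea under stated assumptions: a pinhole camera with intrinsics `fx, fy, cx, cy`, a known relative pose `(R_ac, t_ac)` from the current to the adjacent keyframe, and hypothetical function names. It is not the authors' implementation.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with depth -> 3D vertex in the camera frame (pinhole model)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def transform(R, t, p):
    """Apply the rigid transform p' = R p + t (R is 3x3 row-major, t has length 3)."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def project(p, fx, fy, cx, cy):
    """3D point in the camera frame -> pixel coordinates, or None if behind the camera."""
    x, y, z = p
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def associate(u, v, depth, R_ac, t_ac, intrinsics, width, height):
    """Find the pixel in the adjacent keyframe that observes the same vertex.

    R_ac, t_ac map points from the current camera frame to the adjacent one
    (assumed to come from the SLAM poses). Returns integer pixel coordinates,
    or None if the depth is invalid or the vertex falls outside the image.
    """
    fx, fy, cx, cy = intrinsics
    if depth <= 0:  # invalid depth measurement
        return None
    p_adj = transform(R_ac, t_ac, back_project(u, v, depth, fx, fy, cx, cy))
    uv = project(p_adj, fx, fy, cx, cy)
    if uv is None:
        return None
    ua, va = round(uv[0]), round(uv[1])
    if 0 <= ua < width and 0 <= va < height:
        return (ua, va)
    return None
```

With an identity pose the function maps every valid pixel to itself; a sideways translation shifts the associated pixel by an amount that shrinks as depth grows.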

C. Removing Outliers

Map points in the covisibility graph are taken as accurate prior information. Comparing the associated vertices with the map points can sieve out outliers, because the change of an outlier is not consistent with the change of the map points.
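
The consistency check against map points can be illustrated with a simple histogram test. This is only a sketch under loud assumptions: the text here does not reproduce how the angle change is computed or how the accepted range is designed, so the sketch takes per-point angle changes (in degrees) as given inputs and picks the range from the most populated histogram bins. The function names, `bin_width`, and `keep_fraction` are all hypothetical.

```python
def angle_range_from_histogram(map_point_angles, bin_width=1.0, keep_fraction=0.9):
    """Derive an accepted angle-change range (degrees) from trusted map points.

    Assumption: bins are selected in order of decreasing count until they
    cover `keep_fraction` of the map points; the range spans the selected bins.
    """
    if not map_point_angles:
        raise ValueError("need at least one map point")
    counts = {}
    for a in map_point_angles:
        b = int(a // bin_width)
        counts[b] = counts.get(b, 0) + 1
    needed = keep_fraction * len(map_point_angles)
    covered, selected = 0, []
    # Most populated bins first; ties broken by bin index for determinism.
    for b, c in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        selected.append(b)
        covered += c
        if covered >= needed:
            break
    return min(selected) * bin_width, (max(selected) + 1) * bin_width


def candidate_inliers(vertex_angles, angle_range):
    """Indices of vertices whose angle change lies inside the accepted range."""
    lo, hi = angle_range
    return [i for i, a in enumerate(vertex_angles) if lo <= a < hi]
```

Vertices that survive this test would form the candidate inlier set; the second, position-based check with the beam-based measurement model is not sketched here.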

Fig. 3. The angle change of a vertex.

D. Fusing the TSDF Value
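
Step 3 of the outline fuses the TSDF value of each inlier with an adaptive weight, as in the update step of a Kalman filter. The sketch below shows the standard weighted running average, which coincides with a scalar Kalman update when the weight is an inverse variance. The weight model in `measurement_weight` (noise growing with the square of depth, with a made-up coefficient) is an assumption for illustration; the paper's own adaptive weight is not reproduced in this text.

```python
def truncated_sdf(surface_depth, voxel_depth, truncation):
    """Signed distance along the viewing ray, scaled and clamped to [-1, 1]."""
    sdf = surface_depth - voxel_depth
    return max(-1.0, min(1.0, sdf / truncation))


def measurement_weight(depth, noise_coeff=0.0012):
    """Assumed weight: inverse variance, with noise std growing as depth squared.

    `noise_coeff` is a hypothetical constant, not a value from the paper.
    """
    sigma = noise_coeff * depth * depth
    return 1.0 / (sigma * sigma)


def fuse(tsdf, weight, new_tsdf, new_weight):
    """Weighted running average of TSDF values.

    With weight = 1 / variance, this equals the update step of a scalar
    Kalman filter for a static state: the gain is new_weight / (weight + new_weight)
    and the fused variance is 1 / (weight + new_weight).
    """
    total = weight + new_weight
    return (weight * tsdf + new_weight * new_tsdf) / total, total
```

Under this weight model, a nearby measurement pulls the stored value more strongly than a distant one, which is the qualitative behaviour the text attributes to weights that adapt to noise and depth.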

E. Data Storage in a Hash Table
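
Step 4 of the outline exchanges dense points between a global and a local hash table. A minimal sketch of such a two-table scheme follows, using Python dictionaries keyed by integer voxel-block coordinates. The class name `VoxelStore`, the block size, and the distance-based eviction policy are assumptions made for illustration, not the authors' data layout.

```python
class VoxelStore:
    """Two hash tables keyed by integer voxel-block coordinates.

    `local` holds blocks near the current keyframe (the working set);
    `glob` holds everything else. Names and policy are illustrative assumptions.
    """

    def __init__(self, block_size=0.08):
        self.block_size = block_size
        self.local = {}
        self.glob = {}

    def key(self, point):
        """Integer block coordinates containing a 3D point (metres)."""
        return tuple(int(c // self.block_size) for c in point)

    def fetch(self, point):
        """Return the block for `point`, reusing stored data when it exists."""
        k = self.key(point)
        if k not in self.local:
            # Reuse the block from the global table, or start an empty one.
            self.local[k] = self.glob.pop(k, {"tsdf": 0.0, "weight": 0.0})
        return self.local[k]

    def stream_out(self, centre, radius_blocks):
        """Move local blocks farther than `radius_blocks` (Chebyshev distance,
        in blocks) from `centre` into the global table. Returns the count moved."""
        c = self.key(centre)
        far = [k for k in self.local
               if max(abs(a - b) for a, b in zip(k, c)) > radius_blocks]
        for k in far:
            self.glob[k] = self.local.pop(k)
        return len(far)
```

Keeping only nearby blocks in the local table keeps lookups during fusion small, while blocks moved to the global table are fetched back unchanged when the camera revisits them, which is one way to realize the data reuse mentioned above.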

IV. Experiment Results

In this section, we present the results of several experiments to verify the proposed approach and compare it with some existing approaches to show the improved performance.

A. Implementation Details

B. Removing Dynamic Objects

One of the main characteristics of our dense mapping approach is that dynamic objects can be effectively removed. The change of vertices corresponding to dynamic objects between keyframes differs from that of inliers. We establish a data association between keyframes and design two cascaded change detections to distinguish dynamic objects (see Section III-C). Figs. 4 and 5 show the results of removing dynamic objects. A hand is placed in front of the camera and keeps moving; it is discarded when the dense mapping runs, i.e., the pixels of the hand in the depth image are set to zero. Fig. 4 contains three original images of the scene, in which a moving hand can be seen. Fig. 5 shows the result images, in which the moving hand has been removed. In these images, pixels corresponding to the moving hand are set as invalid points. With the dynamic objects removed, the dense reconstruction is obtained successfully.
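
Setting the depth of rejected pixels to zero can be sketched in a few lines. The depth image is represented here as a list of rows and the outlier pixels as `(row, col)` pairs; both the representation and the function name `invalidate_pixels` are assumptions for illustration.

```python
def invalidate_pixels(depth, outlier_pixels):
    """Return a copy of `depth` with the given (row, col) pixels set to 0.

    A depth of 0 marks an invalid point, so later stages skip these pixels.
    (Hypothetical helper; the list-of-rows image format is an assumption.)
    """
    cleaned = [row[:] for row in depth]  # copy so the input stays untouched
    for r, c in outlier_pixels:
        cleaned[r][c] = 0.0
    return cleaned
```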

C. Removing Random Noise

D. Real-time Performance

Fig. 4. The original images of the scene.

E. Comparing the Proposed Approach With Some Dense Mapping Approaches

Fig. 7. Result maps of the proposed approach and [16] in fr1-desk2: (a) semi-dense map of [16]; (b) dense map of the proposed approach.

Fig. 8. Performance comparison of the proposed approach and InfiniTAM: (a) dense map of InfiniTAM; (b) dense map of the proposed approach.

TABLE I Reconstruction Times of the Proposed Approach

TABLE II Reconstruction Times of [16]

Fig. 10. The proposed approach and RGBD-SLAM can generate similar result maps in , which is a general scene without loop closure: (a) dense map of the proposed approach; (b) dense map of RGBD-SLAM.

V. Conclusions

This paper proposed a dense mapping approach, which is performed in real time on a CPU without a GPU, that

Fig. 11. The proposed approach and ORB-SLAM 2 can generate similar result maps in freiburg2_pioneer_360, which is a general scene with loop closure: (a) dense map of the proposed approach; (b) dense map of ORB-SLAM 2.

Fig. 12. Dense map of the proposed approach in freiburg3_walking_halfsphere: (a) one image of this scene; (b) another image of this scene; (c) final result.