Simultaneous localization and mapping (SLAM) is a technique used by autonomous vehicles that lets you build a map and, at the same time, localize your vehicle on that map. SLAM algorithms allow the vehicle to map unknown environments. Engineers use the map information to carry out tasks such as path planning and obstacle avoidance.
SLAM has been a subject of technical research for many years. With significant increases in computer processing speed and the widespread availability of low-cost sensors such as cameras and laser range finders, SLAM is now being used in a growing number of practical applications.
Why is SLAM important? To answer this question, we can look at the following examples to see what benefits and applications it has.
Consider a home robot vacuum. Without SLAM, it would just move randomly around the room and might not clean the entire floor. This approach would also consume more power, so the battery would run out faster. A robot with SLAM, by contrast, can use information such as the number of wheel revolutions and data from cameras and other imaging sensors to determine how far it has moved. This is called localization. The robot can also simultaneously use the camera and other sensors to create a map of the obstacles in its surroundings, so that it avoids cleaning the same area twice. This is called mapping.
The benefits of SLAM for floor sweepers
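As a rough illustration of localization from wheel revolutions, the sketch below integrates the pose of a differential-drive robot from left and right wheel revolution counts. The function name, the parameter names, and the midpoint-heading approximation are assumptions made for this example; they are not taken from any particular product or toolbox.

```python
import math


def update_pose(x, y, theta, left_revs, right_revs, wheel_radius, track_width):
    """Dead-reckoning update for a differential-drive robot (hypothetical helper).

    left_revs / right_revs: wheel revolutions since the last update.
    wheel_radius and track_width share one length unit; theta is in radians.
    """
    circumference = 2.0 * math.pi * wheel_radius
    d_left = circumference * left_revs
    d_right = circumference * right_revs

    d_center = (d_left + d_right) / 2.0        # distance travelled by the robot centre
    d_theta = (d_right - d_left) / track_width  # change in heading

    # Assume the robot moved along the average heading during this step.
    mid_heading = theta + d_theta / 2.0
    x += d_center * math.cos(mid_heading)
    y += d_center * math.sin(mid_heading)
    theta += d_theta
    return x, y, theta
```

Because each update adds a small error (wheel slip, uneven floors), a pose obtained this way drifts over time, which is why SLAM combines it with camera or range data.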
SLAM is also useful in many other applications, such as having a fleet of mobile robots move through a warehouse and arrange shelves, having a self-driving car park in an empty space, or having a drone complete a delivery in an unknown environment.
MATLAB and Simulink provide SLAM algorithms, functions, and analysis tools for developing such applications. You can implement simultaneous localization and mapping alongside other tasks such as sensor fusion, object tracking, path planning, and path following.
Two types of technology components are required to implement SLAM. The first is sensor signal processing (including front-end processing), which depends heavily on the sensors used. The second is pose graph optimization (including back-end processing), which is sensor-independent.
SLAM processing flow
To further understand the front-end processing techniques, it is helpful to look at two different SLAM approaches - visual SLAM and LiDAR SLAM.
Visual SLAM (vSLAM) uses images acquired from cameras and other image sensors. Visual SLAM can use simple cameras (wide-angle, fisheye, and spherical cameras), compound eye cameras (stereo and multi-camera rigs), and RGB-D cameras (depth and ToF cameras).
The cameras required for visual SLAM are relatively inexpensive, so it is less costly to implement. In addition, cameras provide a large amount of information and can therefore be used to detect landmarks (previously measured positions). Landmark detection can also be combined with graph-based optimization, which gives flexibility in how SLAM is implemented.
SLAM that uses a single camera as the only sensor is called monocular SLAM, and with it depth is difficult to determine. This problem can be solved by detecting AR markers, checkerboards, or other known objects in the image for localization, or by fusing the camera information with another sensor such as an inertial measurement unit (IMU), which measures physical quantities such as velocity and orientation. Technologies related to vSLAM include structure from motion (SfM), visual odometry, and bundle adjustment.
Visual SLAM algorithms can be broadly classified into two categories. Sparse methods match feature points of images and use algorithms such as PTAM and ORB-SLAM. Dense methods use the overall brightness of images and use algorithms such as DTAM, LSD-SLAM, DSO, and SVO.
RGB-D SLAM point cloud alignment
Light detection and ranging (LiDAR) is a method that primarily uses a laser sensor (or distance sensor).
Lasers are significantly more precise than cameras, ToF sensors, and other sensors. They are often used in applications involving high-speed moving vehicles such as self-driving cars and drones. The output of a laser sensor is typically a 2D (x, y) or 3D (x, y, z) point cloud. Laser sensor point clouds provide highly accurate distance measurements and are well suited to map construction with SLAM. In general, movement is first estimated sequentially by matching point clouds. The calculated movement (distance traveled) is then used to localize the vehicle. Laser point clouds are matched using registration algorithms such as iterative closest point (ICP) and the normal distributions transform (NDT). 2D or 3D point cloud maps can be represented as grid maps or voxel maps.
However, point clouds are less finely detailed than images in terms of density and do not always provide sufficient features for matching. For example, in places with few obstacles, point clouds are difficult to align, and the system may lose track of the vehicle's position. In addition, point cloud matching generally requires high processing power, so the processes must be optimized to improve speed. Given these challenges, localizing an autonomous vehicle may require fusing other measurements such as wheel odometry, GNSS, and IMU data. Applications such as warehouse robots typically use 2D LiDAR SLAM, while SLAM with 3D LiDAR point clouds can be used for UAVs and automated driving.
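To make the point cloud matching step more concrete, here is a minimal 2D point-to-point ICP sketch in plain Python. It is a teaching sketch under simplifying assumptions (brute-force nearest-neighbor search, no outlier rejection, a fixed iteration count), and all function names are hypothetical; production systems use spatial indexes and more robust variants.

```python
import math


def best_rigid_transform(src, dst):
    """Least-squares rotation angle and translation mapping paired src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    s_cos = s_sin = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # centred source point
        bx, by = qx - dx, qy - dy  # centred target point
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    angle = math.atan2(s_sin, s_cos)
    c, s = math.cos(angle), math.sin(angle)
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return angle, tx, ty


def apply_transform(points, angle, tx, ty):
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp(source, target, iterations=20):
    """Estimate (angle, tx, ty) that aligns the source scan with the target scan."""
    current = list(source)
    total_angle, total_tx, total_ty = 0.0, 0.0, 0.0
    for _ in range(iterations):
        # Pair every source point with its nearest target point.
        matches = [
            min(target, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in current
        ]
        angle, tx, ty = best_rigid_transform(current, matches)
        current = apply_transform(current, angle, tx, ty)
        # Compose this step with the transform accumulated so far.
        c, s = math.cos(angle), math.sin(angle)
        total_tx, total_ty = (
            c * total_tx - s * total_ty + tx,
            s * total_tx + c * total_ty + ty,
        )
        total_angle += angle
    return total_angle, total_tx, total_ty
```

The estimated transform between two consecutive scans is the vehicle's relative motion, which is what gets accumulated for localization. ICP of this kind only converges reliably when the initial misalignment is small compared with the spacing of the points.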
2D LiDAR SLAM
3D LiDAR SLAM
Common Challenges Facing SLAM
Although SLAM is already used in some practical applications, several technical challenges still prevent more general-purpose adoption. Each of these challenges has a countermeasure that can help overcome it.
1) Accumulation of positioning errors, resulting in deviations from actual values
SLAM estimates sequential movement, and each estimate includes some margin of error. These errors accumulate over time and lead to significant deviations from the actual values. They can also cause map data to collapse or distort, making subsequent searches difficult. Consider driving around a square-shaped passage. As the error accumulates, the robot's start and end points no longer match up. This is called the loop closure problem. Pose estimation errors of this kind are unavoidable, so it is important to detect loop closures and determine how to correct or cancel out the accumulated error.
One countermeasure is to remember certain features of a previously visited place as a landmark and use them to minimize the localization error. Pose graphs are constructed to help correct the errors. Solving error minimization as an optimization problem produces more accurate map data. In visual SLAM, this kind of optimization is called bundle adjustment.
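The idea of correcting accumulated error through optimization can be sketched with a deliberately tiny example: a one-dimensional pose graph in which each edge is a measured offset between two poses, and one extra edge encodes a loop closure. Everything here (the 1D simplification, the function names, the edge format, and the plain Gaussian elimination solver) is an assumption chosen to keep the sketch self-contained; real back ends optimize 2D or 3D poses with sparse nonlinear solvers.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination.

    No pivoting: adequate for the small symmetric positive-definite
    systems built below, not a general-purpose solver.
    """
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / pivot
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        total = a[row][n] - sum(a[row][k] * x[k] for k in range(row + 1, n))
        x[row] = total / a[row][row]
    return x


def optimize_pose_graph_1d(num_poses, edges):
    """Weighted least-squares positions for a 1D pose graph.

    edges: tuples (i, j, z, w) meaning "x_j - x_i was measured as z with weight w".
    Pose 0 is fixed at 0 to anchor the graph.
    """
    size = num_poses - 1  # unknowns are poses 1 .. num_poses - 1
    h = [[0.0] * size for _ in range(size)]
    b = [0.0] * size
    for i, j, z, w in edges:
        for node, sign in ((i, -1.0), (j, 1.0)):
            if node == 0:
                continue  # the anchored pose contributes no unknown
            h[node - 1][node - 1] += w
            b[node - 1] += sign * w * z
        if i != 0 and j != 0:
            h[i - 1][j - 1] -= w
            h[j - 1][i - 1] -= w
    return [0.0] + solve_linear_system(h, b)


# Odometry says: +1.1, +1.0, -0.9, -1.0 (drifted; the robot really returned to the start).
odometry = [(0, 1, 1.1, 1.0), (1, 2, 1.0, 1.0), (2, 3, -0.9, 1.0), (3, 4, -1.0, 1.0)]
loop_closure = [(0, 4, 0.0, 1.0)]  # pose 4 was recognised as the starting place
poses = optimize_pose_graph_1d(5, odometry + loop_closure)
```

With equal weights, the optimizer spreads the 0.2 of accumulated drift across all five constraints instead of leaving it entirely at the end of the trajectory. Raising the weight of the loop closure edge would pull the final pose closer to the start.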
2) Positioning failure and loss of position on the map
Image and point cloud mapping does not consider the characteristics of the robot's movement. In some cases, this approach generates discontinuous position estimates. For example, a calculation might show that a robot moving at 1 m/s suddenly jumped forward by 10 m. Such localization failures can be prevented either by using a recovery algorithm or by fusing the motion model with multiple sensors so that the calculation is based on the sensor data.
There are several ways to use a motion model with sensor fusion. One common approach is to use Kalman filtering for localization. Since most differential drive robots and four-wheeled vehicles generally use nonlinear motion models, extended Kalman filters and particle filters (Monte Carlo localization) are often used. In some cases, more flexible Bayes filters such as the unscented Kalman filter can also be used. Commonly used sensors include inertial measurement devices such as inertial measurement units (IMUs), attitude and heading reference systems (AHRS), inertial navigation systems (INS), accelerometers, gyro sensors, and magnetic sensors. Wheel encoders attached to the vehicle are often used for odometry.
When localization fails, one recovery countermeasure is to remember a keyframe from a previously visited place and use it as a landmark. When searching for a landmark, feature extraction is applied in a way that allows high-speed scanning. Some methods are based on image features, such as bag of features (BoF) and bag of visual words (BoVW). More recently, deep learning has also been used to compare distances between features.
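As a simplified illustration of fusing a motion model with sensor data, the sketch below shows one predict/update cycle of a one-dimensional linear Kalman filter, plus an innovation gate that rejects physically implausible jumps. The scalar state, the function names, and the gate threshold of 9 (roughly three standard deviations) are assumptions for this example; a real vehicle would use an extended or unscented filter over a nonlinear, multi-dimensional state.

```python
def predict(x, p, u, q):
    """Motion-model step: move by the commanded/odometry displacement u.

    x: position estimate, p: its variance, q: process noise variance.
    """
    return x + u, p + q


def update(x, p, z, r, gate=9.0):
    """Measurement step with a simple innovation gate.

    z: measured position, r: measurement noise variance.
    Returns (x, p, accepted). A measurement whose squared innovation,
    normalised by its expected variance, exceeds `gate` is ignored.
    """
    innovation = z - x
    s = p + r  # expected variance of the innovation
    if innovation * innovation / s > gate:
        return x, p, False
    k = p / s  # Kalman gain
    return x + k * innovation, (1.0 - k) * p, True
```

For example, starting from `x, p = predict(0.0, 1.0, 1.0, 0.1)`, a measurement of 1.2 is blended into the estimate, whereas a measurement of 11.0 (a sudden 10 m jump) fails the gate and leaves the estimate unchanged.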
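The keyframe lookup described above can be sketched with a toy bag-of-visual-words comparison. The sketch assumes each image has already been reduced to a list of integer "visual word" IDs (in practice these come from quantizing feature descriptors against a trained vocabulary, which is not shown). The function names and the 0.8 similarity threshold are hypothetical.

```python
import math
from collections import Counter


def word_histogram(word_ids):
    """Count how often each visual word occurs in one image."""
    return Counter(word_ids)


def cosine_similarity(h1, h2):
    """Cosine similarity between two word histograms (0.0 if either is empty)."""
    dot = sum(h1[w] * h2[w] for w in h1.keys() & h2.keys())
    n1 = math.sqrt(sum(v * v for v in h1.values()))
    n2 = math.sqrt(sum(v * v for v in h2.values()))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0
    return dot / (n1 * n2)


def best_keyframe(query_words, keyframes, min_score=0.8):
    """Return (name, score) of the most similar stored keyframe.

    keyframes: dict mapping a keyframe name to its list of visual word IDs.
    The name is None when no keyframe reaches min_score.
    """
    query = word_histogram(query_words)
    best_name, best_score = None, 0.0
    for name, words in keyframes.items():
        score = cosine_similarity(query, word_histogram(words))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_score:
        return None, best_score
    return best_name, best_score
```

Because only word counts are compared, the lookup is cheap enough to scan many keyframes quickly. A candidate found this way would normally still be verified geometrically before it is trusted for relocalization.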
3) High computational cost due to image processing, point cloud processing, and optimization
Computational cost is a problem when implementing SLAM on vehicle hardware. Computation is usually performed on compact, low-power embedded microprocessors with limited processing power. To achieve accurate localization, image processing and point cloud matching must be executed at high frequency. In addition, optimization calculations such as loop closure are computationally expensive. The challenge is how to run such expensive processing on an embedded microprocessor.
One countermeasure is to run different processes in parallel. Processes such as feature extraction, which is a preprocessing step for matching, are relatively well suited to parallelization. Using multicore CPUs, single instruction multiple data (SIMD) calculation, and embedded GPUs can further improve speed in some cases. Also, since pose graph optimization can be performed over a relatively long cycle, lowering its priority and carrying it out at regular intervals can also improve performance.
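The two ideas in the preceding paragraph, spreading per-image work across workers and running the back-end optimization only occasionally, can be sketched as follows. The tile-based "feature extraction" is a stand-in (it just counts bright pixels), and the function names and the `optimize_every` parameter are assumptions for this example. Note that in CPython a thread pool does not speed up pure-Python CPU-bound work; on real hardware the workers would be native code, separate processes, or GPU kernels.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_features(tile):
    """Placeholder front-end work: count bright pixels in one image tile."""
    return sum(1 for value in tile if value > 128)


def process_frame(tiles, pool):
    """Run feature extraction for all tiles of one frame on the worker pool."""
    return list(pool.map(extract_features, tiles))


def run_pipeline(frames, optimize_every=3, max_workers=4):
    """Track every frame, but schedule the costly optimization only every N frames.

    Returns a log of the steps taken so the schedule can be inspected.
    """
    log = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for index, tiles in enumerate(frames, start=1):
            features = process_frame(tiles, pool)
            log.append(("track", index, features))
            if index % optimize_every == 0:
                # A real system would trigger pose graph optimization here,
                # typically on a lower-priority thread or task.
                log.append(("optimize", index))
    return log
```

The interval trades map accuracy between optimizations against the processor time left for the high-frequency front end.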
MATLAB can help you implement SLAM applications on your target system and can help you meet the challenges of various known SLAM technologies.
SLAM front-end sensor signal and image processing
2D/3D LiDAR processing and scan matching using Lidar Toolbox and Navigation Toolbox
3D point cloud processing and point cloud alignment
Loop closure detection using bag of features and bag of visual words
Object detection and semantic segmentation using deep learning
Map generation with 3D LiDAR point clouds using Automated Driving Toolbox
Sensor fusion for localization and multi-object tracking using Sensor Fusion and Tracking Toolbox
SLAM back-end 2D/3D pose graphs
Generate 2D/3D pose graphs using Navigation Toolbox
Optimize pose graphs based on node and edge constraints
Bundle adjustment using Computer Vision Toolbox
Occupancy grid generation using the SLAM Map Builder app
Import 2D LiDAR data from the MATLAB workspace or rosbag files and create occupancy grids
Find and modify loop closures, then export the map as an occupancy grid for path planning
Use the output maps from SLAM algorithms for path planning and control.
Implement path planning algorithms such as RRT or Hybrid A* using Navigation Toolbox
Send control commands to follow planned paths and avoid obstacles.
Run computationally intensive processes, such as those related to image processing, in parallel using Parallel Computing Toolbox to speed up processing
Use the ROS Toolbox to deploy standalone ROS nodes from MATLAB and Simulink and communicate with ROS-enabled robots.
Deploy image processing and navigation algorithms developed in MATLAB and Simulink to embedded microprocessors using MATLAB Coder and GPU Coder