# Probabilistic Data Association for Semantic SLAM

## Introduction

- Motivation: Allows vehicles to autonomously navigate in an environment without apriori knowledge of the map and without access to independent position information
- Problem to solve:
- (What the enviroment looks like & Where is the robots, After given a path, how does the robot go)
- Sensing: How the robot measures properties of itself and its environment

- What is SLAM?
- simultaneous localization and mapping
- wiki
- Localization: inferring location given a map
- Mapping: inferring a map given locations

- SLAM is a chicken-or-egg problem:
- a map is needed for localization
- a pose estimate is needed for mapping

- Classical solutions:
- Landmark extraction
- data association
- State estimation
- state update
- landmark update

- Data Association:
- ascertaining which parts of one image correspond to which parts of another image, where differences are due to movement of the camera, the elapse of time, and/or movement of objects in the photos.
- Problems:
- You might not re-observe landmarks every time.
- You might observe something as being a landmark but fail to ever see it again.
- You might wrongly associate a landmark to a previously seen landmark.

- Loop Closure
- Loop closure is the problem of recognizing a previously visited location and updating the states accordingly.

- Limitation of traditional approaches:
- rely on low-level geometric features- > loop closure recognition based on low-level features is often viewpoint-dependent and subject to failure in ambiguous or repetitive environments
- object recognition methods can infer landmark classes and scales, resulting in a small set of easily recognizable landmarks, ideal for view-independent unambiguous loop closure

- Goal: address the metric and semantic SLAM problems jointly,
- providing a meaningful interpretation of the scene,
- semantically-labeled landmarks address two critical issues of geometric SLAM: data association (matching sensor observations to map landmarks) and loop closure (recognizing previously-visited locations).

- Other approches:
- filtering methods
- batch methods: pose graph optimization, iterative optimization methods
- use both spatial and semantic representation
- For localization: incorporate semantic observations in the metric optimization
- realtime implementation
- global optimization for 3d reconstruction and semantic parsing. 3d space is voxelized
- structure from motion
- semantic mapping

**Contributions:**- the first to tightly couple inertial, geometric, and semantic observations into a
**single optimization framework** **provide a formal decomposition of the joint metric-semantic SLAM problem**into continuous (pose) and discrete (data association and semantic label) optimiza-tion sub-problems,**carry experiments on several long-trajectory real indoor and outdoor datasets**which include odometry and visual measurements in cluttered scenes and varying lighting conditions.

- the first to tightly couple inertial, geometric, and semantic observations into a

## Methods

### EM Algorithms

- Wiki: link
- In statistics, an expectation–maximization (EM) algorithm is an iterative method to find maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables. The EM iteration alternates between performing an expectation (E) step, which creates a function for the expectation of the log-likelihood evaluated using the current estimate for the parameters, and a maximization (M) step, which computes parameters maximizing the expected log-likelihood found on the E step. These parameter-estimates are then used to determine the distribution of the latent variables in the next E step
- Coin flipping Examples: link

### Probabilistic Data Association in SLAM

- Formualtation of the problem
- $\mathcal{L} \triangleq$

### Semantic SLAM

## Experiments

- backend: GTSAM and its iSAM2 implementation, realtime
- frontend:
- every 15th camera frame as a keyframe
- ORB features
- outlier tracks: estimating the essential matrix using RANSAC
- assume timeframe is short => oritation differences using gyroscope measurements => only need to estimate unit translation vector (only using two point correspondence)
- object detector: deformable parts model detection algorithm
- a new landmark: Mahalanobis distance to all known landmarks, above a threshold, initial position: camera ray with estimation of depth(median depth of all geometry feature measurements)

- in practice, solve only once for the constraints weights
- datasets:
- experimental platform: VI-Sensor (IMU, left camera)
- medium length(~175m), one floor of office building, object classes(red office chairs and brown four legs chairs)
- long length(~625m), two different floors, object classes(red chairs and doors)
- several loops around one room(with vicon motion tracking system), object class(red chair)

- KITTI (05, 06)

- experimental platform: VI-Sensor (IMU, left camera)
Results:

- their own indoor dataset: compared with ROVIO, ORB-SLAM2
- Fig. 3, 4,
- Fig. 1, 2, 5 Fig. 6: bag-of-words based loop closure detections
- Fig. 7, 8

- KITTI outdooe dataset: campared to VISO2, ORB-SLAM2
- object: cars
- not inertial odometry => VISO2 odometry algorithm

- their own indoor dataset: compared with ROVIO, ORB-SLAM2
Future Work / Limitation:

- estimate full pose of the semantic objects (orientation in addition to positions)
- consider data associations for past keyframes
- consider multiple sensors (only one camera?)
- consider non-stationary objects

## Slides:

https://people.eecs.berkeley.edu/~jrs/speaking.html

**Motivation**: why come up this problem**Problem**: What problem is the paper trying to solve?**Background**: Give background so everyone is on the same page**Topics in class**: Connect ideas in paper to topics covered in class.**Sensors: gyroscope(IMU), cameras****Contributions**: What is the contribution compared to prior work?**New ideas**: What can now be done that couldn’t be done before?**New Methods / Approprate Tech Depth**: What new ideas enable this to be done?**Experiments & Results**: What evaluation/experiments were performed?**Limitation & Opinion for Effectiveness**: Give your opinion/analysis of effectiveness of proposed method