# Introduction

In densely populated cities with many buildings, locating individuals during an emergency is an important task. For emergency responders, time is of the essence, so accurately locating a 911 caller plays an integral role in the response.

The problem is motivated by several 911 scenarios: victims trapped in a tall building who need immediate medical attention, emergency personnel such as firefighters or paramedics who must themselves be located, or a minor calling on behalf of an incapacitated adult.

This paper presents a novel approach that accurately predicts the floor level for 911 calls by leveraging neural networks and sensor data from smartphones.

In large cities with tall buildings, GPS and Wi-Fi signals alone are not able to provide an accurate location for a caller.

This work makes two major contributions. First, the authors trained a recurrent neural network to classify whether a smartphone is inside or outside a building. Second, they used the output of that classifier to help estimate the change in barometric pressure between the point where the smartphone entered the building and its current location. In the final part of their algorithm, they predict the floor level by clustering the height measurements.

# Related Work

In general, previous work falls into two categories. The first category consists of classification methods based on the user's activity: these methods leverage the activity to make a prediction based on the offset in the user's movement [2]. The activities include running, walking, and riding an elevator. The second category relies on a barometer, which measures atmospheric pressure and can therefore reveal changes in altitude.

Avinash Parnandi and his coauthors used multiple classifiers to predict the floor level [2]. The steps in their algorithmic process are:

1. Classifier to predict whether the user is indoors or outdoors
2. Classifier to identify the activity of the user, e.g. walking or standing still
3. Classifier to measure the displacement

One downside of this work is that achieving high accuracy requires the user's step size, so the method relies heavily on pre-training for the specific user. This would not be practical in a real-world application.

Song and his colleagues model the mode of ascent, that is, whether the ascent was the result of taking the elevator, stairs, or escalator [3]. Then, using infrastructure support from the building as well as additional tuning, they are able to predict the floor level. This method also suffers from relying on data specific to the building.

Overall, these methods suffer from relying on pre-training for a specific user, needing additional infrastructure support, or requiring data specific to the building. The method proposed in this paper aims to predict floor level without these constraints.

# Method

In their paper the authors claim that to their knowledge "there does not exist a dataset for predicting floor heights" [4].

To collect data, they designed and developed an iOS application, targeting the iPhone 6s specifically. The app used the smartphone's sensors to record features such as barometric pressure, GPS course, GPS speed, RSSI strength, GPS longitude, GPS latitude, and altitude.

From [4] the data was collected as follows:

Their floor-level prediction algorithm is a three-part process:

1. Classifying whether the smartphone is indoors or outdoors
2. Detecting indoor/outdoor transitions
3. Estimating vertical height and resolving to an absolute floor level

## 1) Classifying Indoor/Outdoor

They use 6 features, which were selected through forests-of-trees feature reduction [5]. The features are the smartphone's barometric pressure, GPS vertical accuracy, GPS horizontal accuracy, GPS speed, device RSSI level, and magnetometer total reading.

The magnetometer total reading is calculated from the 3-dimensional reading $x, y, z$:

Total Magnetic field strength $= \sqrt{x^{2} + y^{2} + z^{2}}$
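As a small illustration of the formula above, the magnitude can be computed directly from the three axis values. The function name and the sample reading are hypothetical and are not taken from the paper.

```python
import math


def magnetic_field_strength(x: float, y: float, z: float) -> float:
    """Return the magnitude of a 3-axis magnetometer reading."""
    return math.sqrt(x * x + y * y + z * z)


# Hypothetical reading chosen only to make the arithmetic easy to check.
strength = magnetic_field_strength(3.0, 4.0, 12.0)
print(strength)  # 13.0
```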

They used a 3-layer LSTM whose inputs are $d$ consecutive time steps. The output is $y = 1$ if the smartphone is indoors and $y = 0$ if it is outdoors.

In their design they set $d = 3$, chosen by random search [6]. The idea is that the network should learn the relationship given a small amount of information from both the past and the future.

For the overall signal sequence $\{x_1, x_2, ..., x_j, ..., x_n\}$, the aim is to classify $d$ consecutive sensor readings $X_i = \{x_1, x_2, ..., x_d\}$ as $y = 1$ or $y = 0$, as noted above.

This is a critical part of their system, and they focus only on the predictions within the indoor subspace.

They trained the LSTM to minimize the binary cross-entropy between the true indoor state $y$ of example $i$ and the network's prediction for that example.
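One way to picture the inputs is as a sliding window over the signal sequence. The sketch below assumes overlapping windows with a stride of 1, which is an assumption and not a detail stated in this summary; the function name and the placeholder readings are hypothetical.

```python
from typing import List, Sequence


def sliding_windows(readings: Sequence, d: int = 3) -> List[list]:
    """Return every run of d consecutive readings (stride 1, assumed)."""
    if d <= 0:
        raise ValueError("d must be positive")
    # If there are fewer than d readings, the range is empty and no window is produced.
    return [list(readings[i:i + d]) for i in range(len(readings) - d + 1)]


# Placeholder sensor readings; each real reading would be a feature vector.
signal = ["x1", "x2", "x3", "x4", "x5"]
windows = sliding_windows(signal, d=3)
for window in windows:
    print(window)
```

Each window would then be passed to the classifier and labelled $y = 1$ or $y = 0$.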

The cost function is shown below:
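As a point of reference, the standard form of binary cross-entropy over $n$ examples is $-\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$, where $\hat{y}_i$ is the predicted indoor probability. The paper's exact notation may differ. The Python sketch below simply evaluates this standard form; the function name and the clipping constant `eps` are illustrative assumptions.

```python
import math
from typing import Sequence


def binary_cross_entropy(y_true: Sequence[int],
                         y_pred: Sequence[float],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip probabilities so that log(0) is never evaluated (eps is an assumption).
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Hypothetical labels (1 = indoor, 0 = outdoor) and predicted indoor probabilities.
loss_confident = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
loss_uncertain = binary_cross_entropy([1, 0, 1], [0.6, 0.4, 0.5])
print(loss_confident < loss_uncertain)  # True
```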

## 2) Transition Detector

Given the predictions from the previous step, the next task is to find when a transition into or out of a building has occurred.

In this figure, they convolve filters $V_1, V_2$ across the predictions $T$ and pick a subset $s_i$ such that the Jaccard distance (defined below) is $\geq 0.4$.

Jaccard distance:

After this process we are left with a set of $b_i$'s describing the index of each indoor/outdoor transition. The process is shown in the first figure.
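The transition search can be sketched roughly as follows. Several details here are assumptions rather than facts from the paper: the filter shapes (a run of ones followed by zeros, and the reverse), the `half_width` of 3, the Jaccard score being computed over (position, value) pairs, and the rule that merges neighbouring hits into one transition. Because the threshold is applied as a lower bound, the sketch treats the score as a similarity.

```python
from typing import List, Sequence, Tuple


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard score of two equal-length sequences viewed as sets of (position, value) pairs."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(1 for u, v in zip(a, b) if u == v)
    union = len(a) + len(b) - matches
    return matches / union if union else 1.0


def find_transitions(preds: Sequence[int],
                     half_width: int = 3,
                     threshold: float = 0.4) -> List[Tuple[int, str, float]]:
    """Slide two step-shaped templates over 0/1 predictions and return (index, label, score)."""
    templates = {
        "in->out": [1] * half_width + [0] * half_width,  # assumed shape of one filter
        "out->in": [0] * half_width + [1] * half_width,  # assumed shape of the other
    }
    width = 2 * half_width
    transitions: List[Tuple[int, str, float]] = []
    prev_start = None
    for start in range(len(preds) - width + 1):
        window = list(preds[start:start + width])
        label, score = max(((name, jaccard(window, t)) for name, t in templates.items()),
                           key=lambda item: item[1])
        if score < threshold:
            continue
        hit = (start + half_width, label, score)
        same_run = bool(transitions) and prev_start == start - 1 and transitions[-1][1] == label
        if same_run:
            # Neighbouring windows describe the same transition; keep the best match.
            if score > transitions[-1][2]:
                transitions[-1] = hit
        else:
            transitions.append(hit)
        prev_start = start
    return transitions


# Hypothetical prediction sequence: outdoors, then indoors, then outdoors again.
preds = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
for index, label, score in find_transitions(preds):
    print(index, label, round(score, 3))
```

In this sketch each returned index is the first sample after the change, which plays the role of a $b_i$.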

## 3) Vertical height and floor level

In the final part of the system, the vertical offset is computed relative to the smartphone's last known location, i.e. the last known transition, which is readily available from the set of transitions found in the previous step. This only requires taking the index of the most recent transition and setting $p_0$ to the lowest pressure within a ~15 second window around that index.

The second parameter is $p_1$, the current pressure reading. Together, $p_0$ and $p_1$ are used to compute the relative change in height $m_\Delta$.

Plugging these values into the barometric height formula yields a scalar value that represents the height displacement between the building's entrance and the smartphone's current location in the building [7].
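For illustration, the sketch below uses the widely used international barometric formula, $44330 \cdot (1 - (p_1 / p_0)^{1/5.255})$, to turn two pressure readings into a height difference in metres. Whether the paper uses exactly this form is an assumption, and the function name and sample pressures are hypothetical.

```python
def height_change_m(p0_hpa: float, p1_hpa: float) -> float:
    """Approximate height gained (in metres) going from pressure p0 to pressure p1.

    Uses the standard international barometric formula; this is an assumed
    stand-in for the formula referenced in the text.
    """
    if p0_hpa <= 0 or p1_hpa <= 0:
        raise ValueError("pressures must be positive")
    return 44330.0 * (1.0 - (p1_hpa / p0_hpa) ** (1.0 / 5.255))


# Hypothetical readings: pressure at the entrance and a slightly lower current pressure.
p0 = 1013.25
p1 = 1012.65
m_delta = height_change_m(p0, p1)
print(round(m_delta, 2))  # roughly 5 metres of ascent
```

Pressure falls with altitude, so a current reading below $p_0$ gives a positive displacement.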

To resolve to an absolute floor level, they use the index number of the clusters of $m_\Delta$'s. As seen above, $5.1$ falls in the third cluster, implying floor number 3.
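The idea of mapping a height to a cluster index can be sketched as below. The clustering rule (start a new cluster whenever the gap between sorted heights exceeds `max_gap`), the value of `max_gap`, and the sample heights are all assumptions made for illustration; the clustering method used in the paper is not specified in this summary.

```python
from typing import List, Sequence


def cluster_heights(heights: Sequence[float], max_gap: float = 1.5) -> List[float]:
    """Group sorted heights into clusters and return the cluster means in ascending order."""
    clusters: List[List[float]] = []
    for h in sorted(heights):
        if clusters and h - clusters[-1][-1] <= max_gap:
            clusters[-1].append(h)
        else:
            clusters.append([h])  # gap too large: start a new cluster
    return [sum(c) / len(c) for c in clusters]


def resolve_floor(m_delta: float, centers: Sequence[float]) -> int:
    """Return the 1-based index of the cluster center nearest to m_delta."""
    if not centers:
        raise ValueError("at least one cluster center is required")
    nearest = min(range(len(centers)), key=lambda i: abs(centers[i] - m_delta))
    return nearest + 1


# Hypothetical height displacements (metres) observed in one building.
observed = [0.1, -0.2, 0.3, 2.4, 2.7, 2.6, 5.0, 5.2, 5.1, 8.1, 7.9]
centers = cluster_heights(observed)
print(resolve_floor(5.1, centers))  # 3
```

With these made-up measurements, a displacement of 5.1 lands in the third cluster, mirroring the example in the text.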

# Experiments and Results

All of these classifiers were trained and validated on a total of 5082 data points, split 80% for training and 20% for validation. The LSTM was trained for 24 epochs with a batch size of 128, using an Adam optimizer with a learning rate of 0.006. Although the baselines performed considerably well, the objective was to show that an LSTM could be used in the future to model the entire system.

The above chart shows the success that their system is able to achieve in floor level prediction.

# Future Work

The first part of the system uses an LSTM for indoor/outdoor classification. Because this module is separate, it can be reused in many other location problems, and working on it as a standalone problem seems to be a direction the authors will take. They would also like to model the whole problem within the LSTM in order to generate floor-level predictions solely from sensor readings.

# Critique

In this paper, the authors presented a novel system that can predict a smartphone's floor level with 100% accuracy, which had not been done before. Previous work relied heavily on pre-training and on prior information about the building or its users. Their work can generalize well to many types of tall buildings with more than 19 stories. Another benefit of their system is that it needs no additional infrastructure support in advance, making it a practical solution for deployment.

A weakness is that the claimed 100% accuracy holds only when the floor-to-ceiling height is known, and their accuracy relies on this key piece of information. Otherwise, when conditioned on the height of the building, their accuracy drops by 35 percentage points, to 65%.

It is also not clear that the LSTM is the best approach, especially since a simple feedforward network achieved the same accuracy in their experiments.

They also contradict the claim made at the beginning of the paper, where they say the system "..does not require the use of beacons, prior knowledge of the building infrastructure...", since their clustering step in a way uses prior knowledge from previous visits [4].

# References

[1] Sepp Hochreiter and Jurgen Schmidhuber. Long short-term memory. Neural Computation, 9(8): 1735–1780, 1997.

[2] Parnandi, A., Le, K., Vaghela, P., Kolli, A., Dantu, K., Poduri, S., & Sukhatme, G. S. (2009, October). Coarse in-building localization with smartphones. In International Conference on Mobile Computing, Applications, and Services (pp. 343-354). Springer, Berlin, Heidelberg.

[3] Wonsang Song, Jae Woo Lee, Byung Suk Lee, Henning Schulzrinne. "Finding 9-1-1 Callers in Tall Buildings". IEEE WoWMoM '14. Sydney, Australia, June 2014.

[4] W Falcon, H Schulzrinne, Predicting Floor-Level for 911 Calls with Neural Networks and Smartphone Sensor Data, 2018

[5] Kawakubo, Hideko and Hiroaki Yoshida. “Rapid Feature Selection Based on Random Forests for High-Dimensional Data.” (2012).

[6] James Bergstra and Yoshua Bengio. 2012. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13 (February 2012), 281-305.

[7] Greg Milette, Adam Stroud: Professional Android Sensor Programming, 2012, Wiley India