# Solution Overview – Beacon-Enabled Museum Guides

I’ll explain how a beacon-enabled museum guide app uses an instance-based classification algorithm – a k nearest-neighbour algorithm – to locate one out of many exhibits in a room. Before diving into the details of the k nearest-neighbour algorithm, I argue why the obvious algorithm of choosing the beacon with minimum distance from the phone isn’t good enough and why we need a cleverer algorithm.

## Why the Obvious Location Solution Falls Short

The iOS location framework notifies interested apps about changes to the beacons in range of the visitor’s phone. This happens about once a second. The app receives a list of beacon objects storing the beacon ID, the signal strength, the distance of the beacon from the phone and the proximity of the beacon to the phone. We call such a list of beacons a fingerprint. The fingerprint last received by the app is the current fingerprint.

We are mostly interested in the distance (a.k.a. accuracy) of the beacon from the phone. The distance is measured in metres. It is calculated from the signal strength, which varies considerably depending on whether there are obstacles (people, walls, display cabinets, etc.) between a beacon and the phone. Hence, the distance of a beacon is only approximate. The proximity of the beacon to the phone divides the distance into four classes: Immediate (< 30cm), Near (< 2.5m), Far (>= 2.5m) and Unknown (no signal at the moment of reading).
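The four proximity classes can be sketched as a small Swift enum. This is just an illustration of the thresholds named above; the type and its initialiser are my assumptions, not the CoreLocation API.

```swift
// Proximity classes derived from the measured distance in metres.
// The thresholds follow the classes above; the type is illustrative.
enum Proximity {
    case immediate  // < 30cm
    case near       // < 2.5m
    case far        // >= 2.5m
    case unknown    // no signal at the moment of reading

    init(distance: Double) {
        switch distance {
        case ..<0:   self = .unknown    // a negative distance signals "no reading"
        case ..<0.3: self = .immediate
        case ..<2.5: self = .near
        default:     self = .far
        }
    }
}
```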

For the discussion of the obvious location solution and why it falls short, we assume that we have a room with three beacons. Each beacon is associated with an exhibit. A location solution should be able to locate the exhibit in front of which a visitor is standing with her phone.

### The Obvious Solution: Choosing Beacon with Minimum Distance

The location algorithm simply chooses the beacon with the minimum distance from the current fingerprint, looks up the exhibit associated with this beacon and starts playing the audio commentary of the associated exhibit. In an ideal world, this would work perfectly. Unfortunately, wireless signals as emitted by beacons live in the real world full of obstacles.
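The obvious algorithm fits in a few lines of Swift. The `Beacon` struct and the beacon-to-exhibit lookup table are assumptions for illustration, not the app’s actual types.

```swift
// A minimal sketch of the obvious algorithm: pick the beacon with the
// minimum reported distance and look up its exhibit. Types are illustrative.
struct Beacon {
    let id: String
    let distance: Double  // metres, as reported by the location framework
}

func obviousNearestExhibit(fingerprint: [Beacon],
                           exhibitForBeacon: [String: String]) -> String? {
    // Choose the beacon with the minimum distance from the current fingerprint ...
    guard let nearest = fingerprint.min(by: { $0.distance < $1.distance })
    else { return nil }
    // ... and look up the exhibit associated with this beacon.
    return exhibitForBeacon[nearest.id]
}
```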

Let us assume that our phone is 2.1m away from beacon A and 2.6m away from beacon B. There is a clear line of sight between the phone and the two beacons. The wireless signal can travel unhindered. Now, a person squeezes in between us and beacon A. As the beacon signal must pass through this person, the signal strength gets weaker and consequently the distance – calculated from the signal strength – increases. The distance between the phone and beacon A is now 2.7m. Therefore, beacon B is now the beacon with the minimum distance from the phone, although in reality the distance between the phone and beacon A did not change at all. The app plays the wrong audio commentary.

In the sample beacon deployment in my apartment, some beacons are fixed to the two sides of the same wall. Depending on where I am with my phone, the nearest beacon is on this side of the wall or on the other side. I was surprised how easily the Bluetooth Low Energy signals penetrated the walls. We would see the same problem in large open spaces with easily penetrable dividers or no dividers at all.

The location accuracy gets even worse if the beacon signal fluctuates heavily despite a clear line of sight between phone and beacons. Such a fluctuation of the signal strength and distance is caused by the beacon hardware or firmware. Beacons from the manufacturer Estimote show a very strong fluctuation, as a study by IT intouch News shows (an English translation is available here). The study also shows that beacons from Kontakt do not show that fluctuation. Signal stability is an important criterion when choosing a beacon manufacturer.

We can conclude that the obvious location algorithm is normally not suited for locating exhibits in a room, because obstacles between phone and beacons make the signal strength and distance fluctuate too much.

### Where the Obvious Solution May Be Good Enough

As the obvious solution is very simple to implement (in a couple of seconds), I didn’t want to give up on it right away. I found some contexts in which the obvious solution could be good enough. One way to keep the obvious solution is to ensure that beacon ranges do not overlap.

This non-overlapping criterion is certainly satisfied if the location algorithm selects the only beacon with Immediate proximity from the current fingerprint and if we place the beacons so far apart that there are never two beacons with Immediate proximity in the current fingerprint. A distance of 50cm or more between beacons is sufficient.
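The tap-to-locate variant could look like this in Swift; the tuple-based fingerprint representation and the function name are assumptions for illustration.

```swift
// A sketch of tap-to-locate: select a beacon only if it is the single
// beacon with Immediate proximity (< 30cm) in the current fingerprint.
// With beacons placed at least 50cm apart, at most one beacon can be
// in Immediate proximity at a time.
func tappedBeacon(fingerprint: [(id: String, distance: Double)]) -> String? {
    let immediate = fingerprint.filter { $0.distance >= 0 && $0.distance < 0.3 }
    return immediate.count == 1 ? immediate[0].id : nil
}
```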

Visitors would have to tap their phone on the beacon. Tapping would be the equivalent of reading the number from the exhibit’s information plate and entering it in the old-fashioned audio guide. It would save the visitor from entering the number in the audio guide. It would not be the best user experience, but already a step up from standard audio guides.

When tapping is used to locate an exhibit, RFID tags would be the cheaper alternative to beacons (less than 1 USD versus 20-25 USD). But there are far fewer phones with NFC support for reading RFID tags than phones with Bluetooth Smart support for receiving beacon updates. These days, basically every new phone sold supports Bluetooth Smart, but fewer and fewer support NFC. So, RFID tags – though much cheaper – are not a real alternative.

Now, it is easy to see how to generalise the location-by-tapping algorithm. We only have to place the beacons far enough apart. Big venues with few locations of interest like botanical gardens, parks, zoos or palaces (think Alhambra, Versailles or Neuschwanstein) could be suited. The app could, for example, start playing an audio commentary if it finds a beacon with a distance of less than 2.5 metres in the current fingerprint. By placing beacons far enough apart and by fine-tuning the transmission power, we can ensure that the beacon ranges don’t overlap.

We could use a slightly modified version of the obvious algorithm to identify which room a visitor is in, but not which exhibit in the room the visitor is looking at. In other words, we decrease the location accuracy (rooms instead of exhibits) for the sake of a simple location algorithm. To this end, we put at least three beacons in each room. The location algorithm selects the room that has a majority of beacons among the three closest beacons in the current fingerprint. With four beacons per room and a majority vote among the five nearest beacons, the accuracy is close to 100%.
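The room-level majority vote could be sketched as follows; the beacon-to-room map and all names are illustrative assumptions.

```swift
// A sketch of room-level location: take the three beacons closest to
// the phone and select the room that a majority of them belong to.
func currentRoom(fingerprint: [(id: String, distance: Double)],
                 roomForBeacon: [String: String]) -> String? {
    // The three nearest beacons in the current fingerprint.
    let nearest = fingerprint.sorted { $0.distance < $1.distance }.prefix(3)
    // Count how many of the nearest beacons lie in each room.
    var votes: [String: Int] = [:]
    for beacon in nearest {
        if let room = roomForBeacon[beacon.id] {
            votes[room, default: 0] += 1
        }
    }
    // The room with the most votes wins.
    return votes.max { $0.value < $1.value }?.key
}
```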

Locating rooms instead of exhibits can be enough for palaces or even museums. The audio commentary would tell visitors where to find the exhibits in the room. The three fairy-tale castles of Ludwig II in Bavaria – Neuschwanstein, Herrenchiemsee and Linderhof – would lend themselves very well to such an approach. So would many other palaces, castles and buildings all over the world.

## Using Nearest-Neighbour Classification Algorithm for Location

At this point, I knew what didn’t work, but I had no clue what an algorithm that reaches high location accuracy even in the presence of obstacles should look like. Then, one of these serendipitous moments happened. I watched a webinar about Large Beacon Deployments, where a guy from indoo.rs explained how to use beacon fingerprints for indoor location. He stayed pretty vague on the actual location algorithm, but searching for keywords like “bluetooth fingerprint indoor location” got me to nearest-neighbour algorithms. A research paper titled Rank based fingerprinting algorithm for indoor positioning was one of the better ones I found.

Based on this research paper, I implemented a first version of the nearest-neighbour algorithm. The results were considerably better than with the obvious algorithm: 75% accuracy versus 55%. As 75% accuracy was still not good enough, I started fine-tuning the heuristics of the algorithm and looking at how others had done it. That was when I stumbled upon the fact that nearest-neighbour algorithms are commonly used in machine learning and are an instance of instance-based classification (see the presentation Machine Learning and Data Mining: 13 Nearest Neighbor and Bayesian Classifiers by Pier Luca Lanzi for an excellent introduction). Looking at the problem of indoor location from this more general viewpoint gives us good insight into why nearest-neighbour algorithms are a well-suited solution.

Instance-based classification is done in two phases: the training phase and the prediction phase. In the training phase, we walk around the museum with a phone, record beacon fingerprints and assign each fingerprint to the correct exhibit. This gives us a database of fingerprints, where each fingerprint identifies an exhibit.

In the prediction phase, a visitor walks through the museum with her phone. The current fingerprint seen by the phone is compared to all fingerprints recorded during the training phase and stored in the fingerprint database. Some distance metric (e.g., Euclidean or Manhattan distance) is used to determine how near the current fingerprint is to each fingerprint in the database.
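As an illustration, a Euclidean distance between two fingerprints could look as follows, treating a fingerprint as a map from beacon ID to measured distance. The 10m penalty for a beacon missing from one of the fingerprints is my assumption; picking a good penalty is one of the heuristics that needs fine-tuning.

```swift
// A sketch of a Euclidean distance metric between two fingerprints.
// Each fingerprint maps a beacon ID to the measured distance in metres.
func euclideanDistance(_ a: [String: Double], _ b: [String: Double]) -> Double {
    // Consider every beacon that appears in either fingerprint.
    let beaconIDs = Set(a.keys).union(b.keys)
    var sum = 0.0
    for id in beaconIDs {
        // A beacon out of range is penalised with an assumed 10m reading.
        let da = a[id] ?? 10.0
        let db = b[id] ?? 10.0
        sum += (da - db) * (da - db)
    }
    return sum.squareRoot()
}
```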

We use the k fingerprints that are nearest to the current fingerprint to determine the exhibit at which the visitor is looking. k is a positive, typically odd integer. A simple way to select the most likely exhibit is to use a majority vote of the exhibits among the k nearest fingerprints. Another way is to compute a weight for each exhibit. The only way to determine which way is better is to run tests.

### Training Phase – Recording and Classifying Fingerprints

Before we can start the training phase of our instance-based classification algorithm, we need a “configuration” of the museum. The museum configuration defines the available floors, rooms and exhibits and their mutual relationships. It defines which exhibits are in which room, which rooms are on which floor and which floors are in the museum. For example:

```
Ground Floor
    Room 100
        Umbrellas by Pierre-Auguste Renoir
        Dance at the Mill by Pierre-Auguste Renoir
        Swing by Pierre-Auguste Renoir
    Room 200
        Small Meadows in Spring by Alfred Sisley
    ...
```
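One way to model this configuration in Swift could be the following; the struct names are my assumptions, while the hierarchy (museum → floors → rooms → exhibits) follows the text.

```swift
// Illustrative data model for the museum configuration.
struct Exhibit { let title: String; let artist: String }
struct Room    { let name: String; let exhibits: [Exhibit] }
struct Floor   { let name: String; let rooms: [Room] }
struct Museum  { let floors: [Floor] }

// The example configuration from above.
let museum = Museum(floors: [
    Floor(name: "Ground Floor", rooms: [
        Room(name: "Room 100", exhibits: [
            Exhibit(title: "Umbrellas", artist: "Pierre-Auguste Renoir"),
            Exhibit(title: "Dance at the Mill", artist: "Pierre-Auguste Renoir"),
            Exhibit(title: "Swing", artist: "Pierre-Auguste Renoir")
        ]),
        Room(name: "Room 200", exhibits: [
            Exhibit(title: "Small Meadows in Spring", artist: "Alfred Sisley")
        ])
    ])
])
```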

When the museum configuration is available, we run the guide app in a special setup mode to perform the training.

First, we must assign the beacons to their exhibits. We select an exhibit in the guide app and then tap the corresponding beacon with the phone. The app assigns the beacon to the exhibit. This procedure builds a one-to-one correspondence between exhibits and beacons, which the app will use in the prediction phase to map the nearest beacon to the nearest exhibit.

Second, we must build up the fingerprint database. Again, we select an exhibit first. Then, we walk around with the phone in the area where a visitor would stand to look at the exhibit – the catchment area, for short. The app records the fingerprints the phone sees while moving around in the catchment area. The app assigns the recorded fingerprints to the initially selected exhibit. Assigning a training sample (here: a fingerprint) to a class (here: an exhibit) is called classification in instance-based learning.

By repeating the above record-and-classify procedure for all exhibits, we build up the fingerprint database. The location algorithm will determine which fingerprint from the database is nearest to the current fingerprint seen by the phone and return the exhibit assigned to this fingerprint as the result.
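The record-and-classify procedure could be sketched as follows. All names are illustrative; the real app would feed the fingerprints received from the location framework into `record`.

```swift
// A sketch of the training phase: each recorded fingerprint (beacon ID
// → distance in metres) is stored together with the exhibit that was
// selected when it was recorded — its class label.
struct TrainingSample {
    let fingerprint: [String: Double]
    let exhibit: String
}

final class FingerprintDB {
    private(set) var samples: [TrainingSample] = []

    // Called for every fingerprint the phone sees while the operator
    // walks around the catchment area of the selected exhibit.
    func record(fingerprint: [String: Double], forExhibit exhibit: String) {
        samples.append(TrainingSample(fingerprint: fingerprint, exhibit: exhibit))
    }
}
```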

As we learned in the previous section about the shortcomings of the obvious algorithm, fingerprints depend considerably on whether obstacles are between the beacons and the phone. Hence, we should perform the classification at different times: when the museum is heavily crowded, averagely crowded and hardly crowded. This yields a more representative set of fingerprints and more accurate location results.

### Prediction Phase – Locating Exhibits with k-Nearest-Neighbour Algorithm

When a visitor walks around the museum, the guide app receives regular updates of the current fingerprint. These updates eventually result in calling the function `locateExhibit` with the `currentFingerprint`. The function `locateExhibit` implements the k nearest-neighbour algorithm or kNN algorithm, for short.

```
 1 func locateExhibit(currentFingerprint: Fingerprint) -> Exhibit {
 2    var neighbours: [Fingerprint: Double] = [:]
 3    for fingerprint in self.fingerprintDB {
 4        let distance = self.distance(currentFingerprint, fingerprint)
 5        neighbours[fingerprint] = distance
 6        if neighbours.count > k {
 7            neighbours[self.farthestFingerprint(neighbours)] = nil
 8        }
 9    }
10    return self.nearestExhibit(neighbours)
11 }
```

Line 2 initialises the `neighbours` dictionary, which contains the `k` neighbours that are currently nearest to the `currentFingerprint`.

Lines 3-9 are the main loop of the kNN algorithm iterating over all fingerprints in the fingerprint database. For each `fingerprint` from the `fingerprintDB`, the following steps are executed in the body of the loop.

Line 4 calculates the distance between the `currentFingerprint` and the `fingerprint` from the `fingerprintDB` using a distance metric like Euclidean or Manhattan distance.

Line 5 stores the just computed `distance` for the key `fingerprint` in the dictionary `neighbours`. If `neighbours` now contains at most `k` elements (the negative case of line 6), the function `locateExhibit` continues with the next `fingerprint` in the main loop. If not (the positive case of line 6), `neighbours` contains `k + 1` elements and line 7 removes the fingerprint with the maximum distance from the `neighbours` dictionary. This guarantees that `neighbours` either contains fewer than `k` elements or exactly the `k` fingerprints with the smallest distance from the `currentFingerprint`.

When the function `locateExhibit` has determined the `k` nearest neighbours of the `currentFingerprint`, the function `nearestExhibit` in line 10 returns the exhibit with the maximum weight in `neighbours`. The weight could be the number of times an exhibit occurs in `neighbours` or the sum of inverse distances of an exhibit.
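As an illustration, the inverse-distance variant of `nearestExhibit` could look like this. The simplified `neighbours` representation – exhibit label plus distance per neighbour – is an assumption made for the sketch.

```swift
// A sketch of nearestExhibit using inverse-distance weights: each of
// the k nearest neighbours votes for its exhibit with weight 1/distance,
// so nearer fingerprints contribute more. The exhibit with the maximum
// total weight wins.
func nearestExhibit(neighbours: [(exhibit: String, distance: Double)]) -> String? {
    var weights: [String: Double] = [:]
    for neighbour in neighbours {
        // A small epsilon avoids division by zero for a perfect match.
        weights[neighbour.exhibit, default: 0] += 1.0 / (neighbour.distance + 0.001)
    }
    return weights.max { $0.value < $1.value }?.key
}
```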

Although we now know the basic kNN prediction or classification algorithm, it takes more than just implementing this algorithm to come up with a good location solution. The algorithm needs quite a bit of fine-tuning. What is the best value of `k`? What is the best distance metric? What is the best weight for determining the nearest exhibit? “Best” means a tradeoff between location accuracy, speed and memory usage of calculating the nearest exhibit and battery life. We can only answer these questions by trying out different variants of the basic algorithm.

## Stay Tuned for More

I hope that I got you interested in a beacon-enabled museum guide realised as a phone app. I have actually built a prototype guide app for the iPhone over the last three months. My focus was on good location accuracy and an easy setup. I will share my learnings in a series of posts and the source code on GitHub.

I plan the following posts. If there is a link to a post, it’s not a plan any more 😉

• The post Motivation – Beacon-Enabled Museum Guides argues that beacon-enabled museum guide apps provide a much better user experience and cost much less than their traditional counterparts.
• The post Beacon Ranging in the Background shows how to enable beacon ranging for an iPhone app both in the foreground (easy) and in the background (not so easy).
• The current post Solution Overview – Beacon-Enabled Museum Guides shows how instance-based classification – implemented by a k nearest-neighbour algorithm – can be used to locate one out of many exhibits in a room. It also argues why the obvious algorithm of choosing the beacon/exhibit with the minimum distance to the phone isn’t good enough.
• The post “Training Phase – Beacon-Enabled Museum Guides” explains how to provide the content for a tour through a museum, how to associate beacons to exhibits and how to record reference fingerprints for the location algorithm.
• The post “Prediction Phase – Beacon-Enabled Museum Guides” describes the actual location algorithm – a k nearest-neighbour search algorithm commonly used for classification tasks in machine learning.