GIS Quest - Part II

With the data processing and visualization implemented in Part I, it’s now time to do some more interesting GIS work. Let’s see how much of our polygon on mount Fløyen we’ve explored, and identify any remaining trails yet to endeavour.

The K-D tree

The k-d tree is a space partioning data structure, basically it allows spatial queries on points, such as the nearest point, or what points are within a range from a given coordinate. An example is demonstrated below by querying a range of approximately 5 meters from the mouse cursor.

The data structure itself isn’t very complicated, a binary search tree is built from the coordinates. For each level the dimension is changed, so for coordinates all even levels are longitude and all odd levels latitude. At each level the coordinates are sorted based on the level’s dimension, and branched on the value of the median index, so the left and right subtrees are points lesser or greater to the median point, respectively.

const kdtree = (coordinates) => {
    const kdnode = (input, level) => {
        if (input.length === 0) {
            return undefined;

        const points = [...input].sort((a, b) => a[level & 1] - b[level & 1])
        const medianIndex = ~~(points.length / 2);
        const point = points[medianIndex];

        return {
            left: kdnode(points.slice(0, medianIndex), level + 1),
            right: kdnode(points.slice(medianIndex + 1), level + 1)

    return kdnode(coordinates, 0)

We can find points within an area by doing a range search on the kd-tree:

const kdRange = (root, bbox) => {
    const search = (node, depth, result) => {
        if (!node) return;

        const axis = depth & 1;
        const currentPoint = node.point;

        if (currentPoint[0] >= bbox[0][0]
            && currentPoint[0] <= bbox[1][0]
            && currentPoint[1] >= bbox[0][1]
            && currentPoint[1] <= bbox[1][1]) {

        const leftTree = currentPoint[axis] >= bbox[0][axis];

        const nearTree = leftTree ? node.left : node.right;
        const farTree = leftTree ? node.right : node.left;

        search(nearTree, depth + 1, result);

        if (bbox[0][axis] <= currentPoint[axis]) {
            search(farTree, depth + 1, result);
        return result;

    return search(root, 0, []);

Getting the trails - Overpass Turbo

Overpass turbo allows us to query the openstreetmap database, extracting features we’re interested in. For routes, we include highway types footway | unclassified | path | track | pedestrian | steps | service within our bounding box. Giving us this query:


out geom;

Plotting the result, after filtering the data to the polygon defined in Part I, we’re left with this:

Remaining trails

Feeding the above routes into the k-d-tree, we can now match points to their nearest trail. If a recorded GPS point is within a threshold distance to a point in the k-d-tree, we mark that point as complete. By removing all the points we’ve visited we can identify segments above a certain length that have yet to be explored:

Matching GPS traces against OSM routes

While kd-tree subtraction using GPS routes was useful in uncovering some unexplored areas, we can improve this even more by reducing the noise in the GPS data by matching them to actual routes. For this we can use the Open Source Routing Machine.

OSRM - Open Source Routing Machine

The Open Source Routing Machine is a high performance C++ routing engine to find paths and match points against existing paths. It’s available via a web service, but it’s also possible to install and run locally. Following the steps mentioned in the documentation I first downloaded open street map data for Norway and extracted walking routes from it.

$ osrm-extact -p /path/to/osrm/profiles/foot norway-latest.osm.pbf

I then ran the OSRM tooling to make the data more efficient for route processing:

$ osrm-partition norway-latest
$ osrm-customize norway-latest

And then exposed the routing engine’s API via HTTP (on port 8192) by running the osrm-routed service

$ osrm-routed norway-latest -p 8192 --max-matching-size 1000 --algorithm=MLD

We can now query the API for a route, for example, using this route from area used earlier:


the input route (red) is matched to the correct path (blue):

The full curl request was:

$ curl "http://localhost:8192/match/v1/foot/5.32839,60.39643;5.32847,60.3965;5.32859,60.3965;5.32874,60.39644;5.3288,60.39646;5.32881,60.39656;5.3287,60.39665;5.32906,60.39657;5.32914,60.39662;5.32887,60.39682;5.32888,60.39695;5.32925,60.39674;5.32934,60.3968;5.3291,60.39691;5.32918,60.39696;5.3294,60.39691;5.32947,60.39696;5.3292,60.39704;5.32927,60.39714?geometries=geojson&overview=full"

Custom Tooling

When running OSRM on all of my own routes I had varying success with matching, which is understandable as it’s a difficult problem. So to give OSRM a helping hand I wrote a non-destructive interactive editing tool to post-process GPS recordings. The tool lets one edit and test matching interactively and save a DAG describing the necessary operations to generate a good route.

For example, here’s a dag that fixes two problems:

  1. Some GPS coordinates deviating too far off the path
  2. Invalid gaps between features
  "name": "edit",
  "args": {
    "idx": 0,
    "edit": [{
      "name": "delete",
      "args": [71]
    }, {
      "name": "delete",
      "args": [72]
    }, {
      "name": "delete",
      "args": [73]
    }, {
      "name": "delete",
      "args": [74]
}, {
  "name": "merge",
  "args": [0, 1]

Another positive aspect of matching my own GPS traces against OSM was that I discovered some mistakes in OSM, such as misclassified routes preventing matching, which were easy to fix upstream:

After painstakingly inspecting all the routes within the pre-defined polygon, and correctly matching them, this is the result:

And after running them through the kd-tree, where the minimum length of a trail used in subtraction could be reduced to about 5m, this is what we’re left with:

Vanity Poster

The vanity poster, including completion progress: