Home Artificial Intelligence Map-Matching for Speed Prediction Introduction Proposed Solution Conclusion Future Work Licensing Information Reference Material

Map-Matching for Speed Prediction Introduction Proposed Solution Conclusion Future Work Licensing Information Reference Material

0
Map-Matching for Speed Prediction
Introduction
Proposed Solution
Conclusion
Future Work
Licensing Information
Reference Material

How briskly will you drive?

Towards Data Science
Photo by Julian Hochgesang on Unsplash

When planning future vehicle routes, you trust the digital map providers for accurate speed predictions. You do that when picking your phone up to organize for a automotive trip or, in knowledgeable setting, when planning routes in your vehicle fleet. The forecasted speeds are a vital part of the trip’s cost, especially as they’re one in every of the basic drivers of energy consumption in electric and combustion vehicles.

The digital mapping service providers collect information from live traffic and historical records to estimate how briskly you’ll be able to drive along a selected road at any given time. With this data and intelligent algorithms, they estimate how quickly a median vehicle will travel through the planned route. Some services will accept each vehicle’s features to tune the route path and the expected trip times.

But what about specific vehicles and drivers like yours? Do these predictions apply? Your cars and drivers might need particular requirements or habits that don’t fit the standardized forecasts. Can we do higher than the digital map service providers? We’ve a probability for those who keep a very good record of your historical telematics data.

In this text, we’ll improve the standard of speed predictions from digital map providers by leveraging the historical speed profiles from a telematics database. This database comprises records of past trips that we use to modulate the usual speed inferences from a digital map provider.

Central to that is map-matching, the method by which we “snap” the observed GPS locations to the underlying digital map. This correcting step allows us to bring the GPS measurements in keeping with the map’s road network representation, thus making all location sources comparable.

The Road Network

A road network is a mathematical concept that supports most digital mapping applications. Often implemented as a directed multi-graph, each node represents a known geospatial location, normally some noteworthy landmark resembling an intersection or a defining point on a road bend, and the connecting directed edges represent straight-line paths along the road. Figure 1 below illustrates the concept.

Figure 1 — The diagram above shows the simplified representation of a road segment as represented by a digital map. Each node, represented by a red dot, corresponds to a known geospatial location defined by latitude, longitude, and altitude coordinates. Each connecting line is an edge, and digital maps use these to represent the roads (the travel direction is omitted here). Once we ask a digital map provider for directions, we get a protracted sequence of those nodes and edges and an estimate of the time it’ll take to travel such roads. (Image source: Creator)

Once we request a route from a digital map provider, we get a sequence of road network nodes and their connecting edges. The service also provides the estimated travel times and corresponding speeds between all pairs of nodes (in some cases, the speed estimates cover a variety of nodes). We get the entire trip duration by adding all of the partial times together.

If we improve estimates for these times, we can even have higher speed estimates and a greater route prediction overall. The source for these higher estimates is your historical telematics data. But knowing the historic speeds is just an element of the method. Before we are able to use these speeds, we must make sure that that we are able to project them onto the digital map, and for this, we use map-matching.

Map-Matching

Map-matching projects sequences of GPS coordinates sampled from a moving object’s path to an existing road graph. The matching process uses a Hidden Markov Model to map the sampled locations to the most probably graph edge sequence. In consequence, this process produces each the sting projections and the implicit node sequence. You’ll be able to read a more detailed explanation in my previous article on map matching:

After reading the above article, you’ll understand that the Valhalla map-matching algorithm projects the sampled GPS locations into road network edges, to not the nodes. The service might also return the matched poly-line defined when it comes to the road network nodes. So, we are able to get each edge projections and the implicit node sequence.

However, when retrieving a route plan from the identical provider, we also get a sequence of road network nodes. By matching these nodes to the previously map-matched ones, we are able to overlay the known telematics information to the newly generated route and thus improve the time and speed estimates with actual data.

Before using the map-matched locations to infer actual speeds, we must project them to the nodes and adjust the known travel times, as illustrated in Figure 2 below.

Figure 2 — The diagram above illustrates the challenge of mapping the implicit speeds of sampled GPS locations, displayed as green dots, to the speeds between the map nodes, represented as red dots. The orange diamonds represent the map-matched sampled GPS locations. On this problem, we all know the typical speeds between the green dots and need to infer the typical speeds between the red dots. (Image source: Creator)

As a prerequisite, we must accurately sequence each sets of locations, the nodes, and the map matches. This process is depicted in Figure 2 above, where the map matches, represented by the orange diamonds, are sequenced together with the road network nodes, represented as red dots. The travel sequence is clearly from left to right.

We assume the time differences between the GPS locations are the identical because the map-matched ones. This assumption, illustrated by Figure 3 below, is important because there is no such thing as a approach to infer what effect in time the map matching has. This assumption simplifies the calculation while keeping a very good approximation.

Figure 3 — We assume that the time differences between consecutive samples are the identical because the corresponding map-matched ones. (Image source: Creator)

Now that we all know the time differences between consecutive orange diamonds, our challenge is to make use of this information to infer the time differences between consecutive red dots (nodes). Figure 4 below shows the connection between the 2 sequences of time differences.

Figure 4 — The diagram above helps us understand how we interpolate the time differences between road network nodes (red dots) using the time differences from the sequence of map-matched locations (orange diamonds). (Image source: Creator)

We will safely assume that the average speeds between consecutive orange diamonds are constant. This assumption is important for what comes next. But before we proceed, let’s define some terminology. We’ll use some simplifications as a result of Medium’s typesetting limitations.

We want to take care of two fundamental quantities: distances and timespans. Using Figure 4 above as a reference, we define the space between the orange diamond one and red dot one as d(n1, m1). Here, the letter “m” stands for “map-matched,” and the letter “n” stands for node. Similarly, the corresponding timespan is t(n1, m1), and the typical speed is v(n1, m1).

Allow us to give attention to the primary two nodes and see how we are able to derive the typical speed (and the corresponding timespan) using the known timespans from orange diamonds one to 4. The common speed of travel between the primary two map-matched locations is thus:

Because the typical speed is constant, we are able to now compute the primary timespan.

The second timespan is just t(m2, m3). For the ultimate period, we are able to repeat the method above. The full time is thus:

We must repeat this process, adapting it to the sequence of nodes and map matches to calculate the projected trip times between all nodes.

Now that we have now seen methods to project measured speeds onto a digital map let’s see where to get the information.

Telematics Database

This text uses a telematics database to infer unknown road segment average speeds. All of the geospatial data within the database is already map-matched to the underlying digital map. This characteristic helps us match future service-provided routes to the known or projected road segment speeds using the abovementioned process.

Here, we’ll use a tried-and-true open-source telematics database I actually have been exploring currently and presented in a previously published article, the Prolonged Vehicle Energy Dataset (EVED), licensed under Apache 2.0.

We develop the answer in two steps: data preparation and prediction. In the information preparation step, we traverse all known trips within the telematics database and project the measured trip times to the corresponding road network edges. These computed edge traversal times are then stored in one other database using maximum resolution H3 indices for faster searches during exploration. At the tip of the method, we have now collected traversal time distributions for the known edges, information that may allow us to estimate travel speeds within the prediction phase.

The prediction phase requires a source route expressed as a sequence of road network nodes, resembling what we get from the Valhalla route planner. We query each pair of consecutive nodes’ corresponding traversal time distribution (if any) from the speed database and use its mean (or median) to estimate the local average speed. By adding all edge estimates, we get the intended result, the expected total trip time.

Data Preparation

To organize the information and generate the reference time distribution database, we must iterate through all of the trips within the source data. Fortunately, the source database makes this easy by readily identifying all of the trips (see the article above).

Allow us to have a look at the code that prepares the sting traversal times.

Figure 5— The code above shows the foremost loop for the information preparation script. (Image source: Creator)

The code in Figure 5 above shows the foremost loop of the information preparation code. We use the previously created EVED database and save the output data to a brand new speed database. Each record is a time traversal sample for a single road network edge. For a similar edge, a set of those samples makes up for a statistical time distribution, for which we calculate the typical, median, and other statistics.

The decision on line 5 retrieves an inventory of all of the known trips within the source database as triplets containing the trajectory identifier (the table sequential identifier), the vehicle identifier, and the trip identifier. We want these last two items to retrieve the trip’s signals, as shown in line 10.

Lines 10 to 16 contain the code that retrieves the trip’s trajectory as a sequence of latitude, longitude, and timestamps. These locations don’t necessarily correspond to road network nodes; they are going to mostly be projections into the perimeters (the orange diamonds in Figure 2).

Now, we are able to ask the Valhalla map-matching engine to take these points and return a poly-line with the corresponding road network node sequence, as shown in lines 18 to 25. These are the nodes that we store within the database, together with the projected traversal times, which we derive in the ultimate lines of the code.

The traversal time projection from the map-matched locations to the node locations occurs in two steps. First, line 27 creates a “compound trajectory” object that merges the map-matched locations and the corresponding nodes within the trip sequence. The thing stores each map-matched segment individually for later joining. Figure 6 below shows the article constructor (source file).

Figure 6 — The compound trajectory constructor receives the map-matched trajectory and the map nodes as separate arrays of latitudes and longitudes. The code merges each trajectories into an inventory of segments and later converts these to a trajectory list. (Image source: Creator)

The compound trajectory constructor starts by merging the sequence of map-matched points to the corresponding road network nodes. Referring to the symbols in Figure 2 above, the code combines the orange diamond sequence with the red dot sequence in order that they keep the trip order. In step one, listed in Figure 7 below, we create an inventory of sequences of orange diamond pairs with any red dots in between.

Figure 7 — The above function merges the map-matched trajectory points with the corresponding road network node sequence. The function tries to assign sequential nodes, if any, between each pair of trajectory points. An inventory collects all merged sequences for a final consolidation step. (Image source: Creator)

Once merged, we convert the trajectory segments to node-based trajectories, removing the map-matched endpoints and recomputing the traversal times. Figure 8 below shows the function that computes the equivalent between-node traversal times.

Figure 8 — The conversion from a segment to a trajectory requires calculating the equivalent times between road network nodes. (Image source: Creator)

Using the symbology of Figure 2, the code above uses the traversal times between two orange diamonds and calculates the times for all sub-segment traversals, namely between node-delimited ones. This fashion, we are able to later reconstruct all between-node traversal times through easy addition.

The ultimate conversion step occurs on line 28 of Figure 5 once we convert the compound trajectory to a straightforward trajectory using the functions listed in Figure 9 below.

Figure 9 — The ultimate reconstruction step concatenates the projected trajectory segments. Note how arrays with greater than two elements are clipped. The clipped elements correspond to the map-matched positions and are thus removed. Arrays with only two components have map-matched positions that coincide with road network nodes. (Image source: Creator)

The ultimate step of the code in Figure 5 (lines 30–32) is to save lots of the computed edge traversal times to the database for posterior use.

Data Quality

How good is the information that we just prepared? Does the EVED allow for good speed predictions? Unfortunately, this database was not designed for this purpose so that we are going to see some issues.

The primary issue is the variety of single-edge records in the ultimate database, on this case, somewhat over two million. The full variety of rows is over 5.6 million, so the unusable single-edge records represent a large proportion of the database. Almost half the rows are from edges with ten or fewer records.

The second issue is trips with very low frequencies. When querying an ad-hoc trip, we may fall into areas of very low density, where edge time records are scarce or nonexistent. In such conditions, the prediction code tries to compensate for the information loss using a straightforward heuristic: assume the identical average speed as within the last edge. For larger road sections, and as we see below, we may even copy the information from the Valhalla route predictor.

The underside line is that a few of these predictions will likely be bad. A greater use case for this algorithm can be to make use of a telematics database from fleets that often travel through the identical routes. It will be even higher to get more data for a similar routes.

Prediction

To explore this time-prediction enhancement algorithm, we’ll use two different scripts: one interactive Streamlit application that lets you freely use the map and an analytics script that tries to evaluate the standard of the expected times by comparing them to known trip times in a LOOCV kind of approach.

Interactive Map

You run the interactive application by executing the next command line on the project root:

streamlit run speed-predict.py

The interactive application lets you specify endpoints of a route for Valhalla to predict. Figure 10 below shows what the user interface looks like.

LEAVE A REPLY

Please enter your comment!
Please enter your name here