Evaluating MapBox for OakCrime.org

The demise of MapZen has forced us to consider other services for OakCrime.org. This note describes experiments with MapBox, and compares both MapZen and MapBox with Google’s service.

TL;DR: MapBox seems a very serviceable replacement for MapZen vector tiles, and that is OakCrime.org’s most pressing need. Getting MapBox’s heatmaps to work did require a fair amount of fiddling. MapBox geocoding is significantly less satisfactory than MapZen’s. We’re going to need to lean on the kindness/hegemony of GoogleMaps until something better comes along.

Methods

Most incidents reported by OPD now carry geocoded (latitude, longitude) coordinates. Location strings elsewhere in OPD records (e.g., in Patrol Logs) do not. These location strings are normalized slightly and then submitted to all three geocoders (MapZen, MapBox and Google).

Returned results are filtered to include only those relevant to Oakland, CA. For Google, the string “Oakland CA” is appended to the location string, and then only results with a locality name of “Oakland” are kept. For MapBox, a proximity location (centered on Oakland) is provided to its forward() method.
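The Google-side filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code OakCrime.org actually runs: the helper names (google_query, locality_names, keep_oakland) are made up for this note, and the sample data is hand-written in the shape of a Google Geocoding API response, with placeholder coordinates.

```python
def google_query(location, suffix="Oakland CA"):
    """Append the city hint to a (lightly normalized) location string."""
    return f"{location.strip()} {suffix}"


def locality_names(result):
    """Return the long names of every 'locality' component in one result."""
    return [
        comp["long_name"]
        for comp in result.get("address_components", [])
        if "locality" in comp.get("types", [])
    ]


def keep_oakland(results, city="Oakland"):
    """Keep only the results whose locality matches the target city."""
    return [r for r in results if city in locality_names(r)]


# Hand-made sample (NOT real API output); coordinates are placeholders.
sample_results = [
    {
        "address_components": [
            {"long_name": "Oakland", "types": ["locality", "political"]},
        ],
        "geometry": {"location": {"lat": 37.75, "lng": -122.17}},
    },
    {
        "address_components": [
            {"long_name": "San Leandro", "types": ["locality", "political"]},
        ],
        "geometry": {"location": {"lat": 37.72, "lng": -122.15}},
    },
]

oakland_only = keep_oakland(sample_results)
```

Filtering on the locality component rather than on the formatted address string avoids false matches on street names that happen to contain “Oakland”.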

For what they’re worth, MapZen and MapBox both provide “confidence” / “relevance” (resp.)  measures on their returned geocoded coordinates.
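Since the two services report their scores in different places, it can help to flatten both into one record shape before comparing them. The sketch below assumes the usual GeoJSON-style responses: a MapZen feature carrying properties.confidence and a MapBox feature carrying relevance and center. The sample features, the helper names and the 0.8 threshold are all illustrative assumptions, and the two scores are not calibrated against each other.

```python
def from_mapzen(feature):
    """Flatten a MapZen-style GeoJSON feature (coordinates are [lon, lat])."""
    lon, lat = feature["geometry"]["coordinates"]
    return {"source": "mapzen", "lat": lat, "lon": lon,
            "score": feature["properties"]["confidence"]}


def from_mapbox(feature):
    """Flatten a MapBox-style geocoding feature (center is [lon, lat])."""
    lon, lat = feature["center"]
    return {"source": "mapbox", "lat": lat, "lon": lon,
            "score": feature["relevance"]}


def best_candidate(candidates, min_score=0.8):
    """Return the highest-scoring candidate at or above min_score, else None.

    min_score is an arbitrary illustrative threshold, not a recommended value.
    """
    kept = [c for c in candidates if c["score"] >= min_score]
    return max(kept, key=lambda c: c["score"]) if kept else None


# Hand-made features with placeholder coordinates (NOT real API output).
mapzen_feature = {
    "geometry": {"type": "Point", "coordinates": [-122.17, 37.75]},
    "properties": {"confidence": 0.6},
}
mapbox_feature = {"center": [-122.17, 37.75], "relevance": 0.9}

candidates = [from_mapzen(mapzen_feature), from_mapbox(mapbox_feature)]
chosen = best_candidate(candidates)
```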

Results

Vector tiling

Placing simple markers on a MapBox map was pretty straightforward.

The hard part began when trying to replicate the heatmap functionality everyone liked so much in the MapZen interface. The number of knobs to tweak for heatmaps using MapBox is much larger; maybe that’s a good thing. But it took a long time to get them dialed into a useful range. Dynamic range mapping was the hardest to get right. A good place to start is the heatmap-intensity property.

More generally, MapBox’s GL-JS library is idiosyncratic, awkward, and hard to debug, imho. You need to use its “expressions”, e.g., to make use of exponential ranges ('heatmap-intensity': ["interpolate", ["exponential", 2], ["zoom"], 0, 2, 14, 4]), rather than just calling JavaScript functionality.
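To see what that expression actually does to the intensity, it helps to evaluate it outside the browser. The sketch below re-implements exponential interpolation in Python, following my understanding of how the style specification defines it (it is not MapBox’s own code). The function name and the (zoom, value) stop list are just conveniences for this note.

```python
def exponential_interpolate(zoom, base, stops):
    """Evaluate an "interpolate"/"exponential" style ramp at a given zoom.

    stops is a list of (zoom, value) pairs in ascending zoom order.
    Inputs outside the stop range are clamped to the end values.
    """
    if zoom <= stops[0][0]:
        return stops[0][1]
    if zoom >= stops[-1][0]:
        return stops[-1][1]
    for (z0, v0), (z1, v1) in zip(stops, stops[1:]):
        if z0 <= zoom <= z1:
            span = z1 - z0
            progress = zoom - z0
            if base == 1:
                t = progress / span  # base 1 degenerates to linear
            else:
                t = (base ** progress - 1) / (base ** span - 1)
            return v0 + t * (v1 - v0)


# The stops from the heatmap-intensity expression above: zoom 0 -> 2, zoom 14 -> 4.
intensity_stops = [(0, 2), (14, 4)]
ramp = {z: exponential_interpolate(z, 2, intensity_stops) for z in (0, 7, 13, 14)}
```

Under this formula the intensity is still only about 2.02 at zoom 7 and about 3.0 at zoom 13, so with base 2 nearly all of the ramp happens in the last couple of zoom levels. That may be part of why the dynamic range was hard to dial in.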

Geocoding

A random sample of 70 recent location strings was used. Across these, MapZen missed 10 (i.e., was unable to provide accurate coordinates) and MapBox missed 15. Google found all of them.

In most situations, when MapZen is able to geocode an address, MapBox identifies approximately the same coordinates.  Typically, all three systems generated coordinates proximate to one another.  (cf. Figure 1: 1100-87th-ave)

Figure 1: 1100-87th-ave

(All of these images were generated by using GoogleMaps to place the points returned by MapZen, MapBox and/or Google geocoding.)
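The comparison here was done by eye on the maps, but “proximate” could also be given a number by measuring the great-circle distance between the coordinates each geocoder returns. A minimal sketch using the haversine formula follows; the function names are made up for this note and the sample coordinates are placeholders, not actual geocoder output.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def max_pairwise_distance_m(points):
    """Largest distance between any two geocoders' results.

    points maps a source name to a (lat, lon) tuple.
    """
    return max(
        (haversine_m(*points[a], *points[b]) for a, b in combinations(points, 2)),
        default=0.0,
    )


# Placeholder coordinates for one address (NOT real geocoder output).
results = {
    "mapzen": (37.7500, -122.1700),
    "mapbox": (37.7501, -122.1701),
    "google": (37.7500, -122.1702),
}
spread_m = max_pairwise_distance_m(results)
```

A spread of a few tens of meters would count as agreement; cases like Figure 3 would show up as spreads of hundreds of meters or more.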

There are some addresses where MapBox did geocode when MapZen could not.  (cf. Figure 2: 5300-shafter-st)

Figure 2: 5300-shafter-st

In other cases both MapZen and MapBox provided coordinates that were quite far from the (correct) ones identified by Google. (cf. Figure 3: 500-e-11th-st)

Figure 3: 500-e-11th-st

The most important deficiency in the current MapBox service is its inability to handle street intersections.  This type of location description is very common in OPD reporting (cf. Figure 4: 104th-ave&royal-anne-st).

Figure 4: 104th-ave&royal-anne-st

Summary, next steps

The dependence on MapZen for vector tiles was the most important, and (except for the hassles making heatmaps work) MapBox seems a very serviceable if awkward replacement for this task.

For geocoding, MapBox seems slightly inferior in general, and fatally flawed in terms of street intersections. This means that, at least for now, the system will depend exclusively on Google geocoding.

Because such a sole-source dependence is always a concern, an important development goal is to add street intersection identification to MapBox. This is the subject of a long-open (2015) issue on their GitHub repo:

You might be able to implement it client side too, splitting the query into two queries “5th Street” then “Main Street”, however for this you’d want the result to return a street with a `LineString` geometry, rather than a point in the centroid of the street.

Subsequent discussion of what keeps geometries from being exposed makes it seem as if this will remain a difficult problem because of intellectual property reasons:

Carmen doesn’t return full geometries on purpose; at Mapbox we’re often working with proprietary datasets that we don’t want to return in responses.
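For what it is worth, the client-side approach suggested in that issue can be sketched as below, on the assumption that LineString geometries for both streets could be obtained from somewhere (which, per the quote above, MapBox’s geocoder will not provide). The function names and the toy coordinates are hypothetical. Longitude and latitude are treated as planar x and y, which seems a tolerable approximation at city-block scale.

```python
def split_intersection(location):
    """Split an 'A&B' location string into two street names, else None."""
    parts = [p.strip() for p in location.split("&") if p.strip()]
    return tuple(parts) if len(parts) == 2 else None


def segment_intersection(p1, p2, p3, p4):
    """Intersection point of segments p1-p2 and p3-p4, or None.

    Points are (x, y) tuples. Parallel and collinear segments return None.
    """
    dx1, dy1 = p2[0] - p1[0], p2[1] - p1[1]
    dx2, dy2 = p4[0] - p3[0], p4[1] - p3[1]
    denom = dx1 * dy2 - dy1 * dx2
    if denom == 0:
        return None
    ex, ey = p3[0] - p1[0], p3[1] - p1[1]
    t = (ex * dy2 - ey * dx2) / denom  # position along p1-p2
    u = (ex * dy1 - ey * dx1) / denom  # position along p3-p4
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p1[0] + t * dx1, p1[1] + t * dy1)
    return None


def linestring_intersection(line_a, line_b):
    """First crossing point between two polylines, or None."""
    for a1, a2 in zip(line_a, line_a[1:]):
        for b1, b2 in zip(line_b, line_b[1:]):
            point = segment_intersection(a1, a2, b1, b2)
            if point is not None:
                return point
    return None


streets = split_intersection("104th-ave&royal-anne-st")

# Toy geometries standing in for the two streets (NOT real street data).
street_a = [(0.0, 0.0), (2.0, 2.0)]
street_b = [(0.0, 2.0), (2.0, 0.0)]
crossing = linestring_intersection(street_a, street_b)
```

Real streets would also need handling for multi-part geometries and for streets that meet at a shared endpoint without crossing, which this sketch only covers when the endpoints coincide exactly.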

Late note: Sounds like something called NextZen might be another alternative to check out.
