Crime @ OpenDataDay

Another high-energy OpenOakland event (23 Feb 13) in a nice new venue (the 81st Avenue Oakland Public Library). Thanks to all who made it possible.

I’m putting my notes from the group focused especially on crime up here. I think I have WordPress configured so that you can comment on it (as soon as I approve it the first time) and all will be publicly viewable. We may decide to move these materials somewhere else, but to keep the great energy we had today I’ll start here.

OaklandWiki: Crime resources

Create an Oakland wiki resource page for crime related resources

Just in our first conversation there were many references to
information resources that are hard to find. These include better
access to OPD’s web site information, as well as other resources like
Oakland’s Measure Y.

Reach-out to Community-based crime groups

Things like Neighborhood Crime Prevention Councils (NCPCs)
and the Neighborhood Block Watch
organizations exist across much of Oakland. We need to develop a
technology resource that we can provide to these various
organizations. It could include things like:

  • basic hardware info: strong door locks, window security, etc. (e.g.,
    Reed Brothers gave a great session at a recent neighborhood meeting at
    the Dimond Library)
  • more exotic open source solutions (cameras, motion sensors, drones!?)
    built on technologies that are becoming available
  • community building web tools (thanks Chris!); much better than Yahoo mailing lists!
  • coordinating picture exchanges among neighbors concerning crime events
  • coordination across block captains
  • anonymization of members’ contact information as requested: an
    important point raised by some participants is that some block
    captains are reluctant to be publicly identified in these roles.


Much of the group’s time was spent talking about data sets
generated by OPD and the Urban Strategies Council (USC). These feed into
requests for an API for new crime data provided by OPD in the future.

Add geocoding to OPD data

Several groups are pursuing the basic and critical task of producing a
table like the following from the OPD data:

05-024771,8000 INTERNATIONAL BLVD,37.756228,-122.181777
05-024770,10300 INTERNATIONAL BLVD,37.740329,-122.167777
05-024777,8800 INTERNATIONAL BLVD,37.750728,-122.175577
05-024797,3400 FOOTHILL BLVD,37.782728,-122.220279

That is, given the CASENUMBER incident ID (primary key) for incidents
and their addresses in the OPD data, compute a latitude and longitude
to be associated with it.
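
A minimal sketch of that step in Python, assuming the incidents are available as (CASENUMBER, address) pairs. The `geocode` argument is a placeholder for whichever geocoding service ends up being used; `geocode_incidents` and `lookup_geocoder` are names invented here for illustration, and the lookup table only stands in for a real service.

```python
import csv
import io
from typing import Callable, Iterable, Optional, Tuple

LatLon = Tuple[float, float]


def geocode_incidents(
    rows: Iterable[Tuple[str, str]],
    geocode: Callable[[str], Optional[LatLon]],
) -> str:
    """Return CSV text with one line per incident: CASENUMBER,address,lat,lon.

    `geocode` is any function mapping an address string to (lat, lon),
    or to None when the address cannot be resolved.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for case_number, address in rows:
        hit = geocode(address)
        if hit is None:
            # Keep the row so unresolved addresses can be retried later.
            writer.writerow([case_number, address, "", ""])
        else:
            lat, lon = hit
            writer.writerow([case_number, address, f"{lat:.6f}", f"{lon:.6f}"])
    return out.getvalue()


# Stand-in geocoder: a plain lookup table seeded with one row from the
# table above. A real run would call an actual geocoding service here.
KNOWN = {"8000 INTERNATIONAL BLVD": (37.756228, -122.181777)}


def lookup_geocoder(address: str) -> Optional[LatLon]:
    return KNOWN.get(address)
```

Keeping the geocoder as a plain function argument means the same loop can be pointed at different services without changing anything else.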

An important idea that came up early (maybe because we had several
Google employees as part of our group!) was approaching Google
concerning the licensing rights to their Geocoder. If
we did a preliminary, proof-of-principle project with our Oakland
brigade, this would provide data for Google to evaluate as part of a
national relationship with Code for America.

It wouldn’t be bad if multiple people tackled this task, because
comparing across geo-servers would be illuminating.
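
One way such a comparison might look, as a sketch only: compute the great-circle (haversine) distance between the coordinates two geocoders return for the same incident, and list the incidents where they disagree by more than some threshold. The function names and the 100 m default are assumptions made for this example.

```python
import math
from typing import Dict, List, Tuple

LatLon = Tuple[float, float]
EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def compare_geocoders(
    results_a: Dict[str, LatLon],
    results_b: Dict[str, LatLon],
    threshold_m: float = 100.0,
) -> List[Tuple[str, float]]:
    """Return (case_number, distance_m) for incidents geocoded by both
    services whose results differ by more than threshold_m, largest first."""
    disagreements = []
    for case_number in results_a.keys() & results_b.keys():
        distance = haversine_m(results_a[case_number], results_b[case_number])
        if distance > threshold_m:
            disagreements.append((case_number, distance))
    return sorted(disagreements, key=lambda item: item[1], reverse=True)
```

Incidents that only one service could resolve are skipped here; counting those separately would be another useful comparison.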

Statutes’ chapter&verse

Statute codes appear to be a key representation for crime
data. E.g., the USC data set has them but the OPD data does not.
Someone (who?!) is actively scraping textual passages associated with
statutes as a resource for further use.

Inductive maps

There are several potentially useful “machine learning” tasks in which
we learn from attributes in the USC data set to predict the
similar indicators in the OPD data set. Two specific tasks that would
be useful:

  • OPD_CTYPE -> Statute
  • OPD_CTYPE -> USC_Indicators (e.g., ‘Violence’, ‘Property’, ‘Homicide’,
    ‘Assaults’, ‘Robbery’, ‘Shootings’, ‘Burglary’,
    ‘MV_Theft’, ‘Rape’, ‘Weapons’, ‘Drugs’, ‘Sex’)
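
To make the second task concrete, here is a rough sketch of a tiny multinomial naive Bayes classifier over the words in a crime-type string, using only the Python standard library. Nothing here is taken from the real data: the class name is made up, and the training strings are invented placeholders, since the actual OPD_CTYPE values were not part of these notes. Other learning methods would work just as well; this one is only chosen because it is short.

```python
import math
from collections import Counter, defaultdict
from typing import Iterable, List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace and hyphens."""
    return text.lower().replace("-", " ").split()


class NaiveBayesLabeler:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self) -> None:
        self.label_counts: Counter = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples: Iterable[Tuple[str, str]]) -> "NaiveBayesLabeler":
        for text, label in examples:
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text: str) -> Optional[str]:
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented (crime-type text, indicator) pairs, for illustration only.
EXAMPLES = [
    ("BURGLARY-RESIDENTIAL", "Burglary"),
    ("BURGLARY-AUTO", "Burglary"),
    ("ROBBERY-ARMED", "Robbery"),
    ("ROBBERY-STRONG ARM", "Robbery"),
    ("STOLEN VEHICLE", "MV_Theft"),
]

labeler = NaiveBayesLabeler().fit(EXAMPLES)
```

With real data, the training pairs would come from records that carry both a crime-type description and a USC indicator, and some of them should be held out to check how often the predictions are right.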

2 thoughts on “Crime @ OpenDataDay”

  1. Hi,

    I enjoyed today’s discussions.

    A couple of things: we started using a private, neighborhood, Facebook-style online community in our neighborhood, and it has worked out much better than the Yahoo group promoted by the homeowners’ association.

    Thanks for the instructions on setting up Nominatim with a CA-specific subset of the data. That looks pretty straightforward. I don’t know whether these instructions address the issue of starting the server:

    Another option – they are hosting the OpenStreetMap dataset. As far as I can tell there aren’t any limits.

  2. (from Erick Tryzelaar)


    One of the problems my team was trying to address is the difficulty of
    converting an address from a crime report into a geo coordinate. Google
    allows you to use their service, but they allow you to convert only 1000
    addresses a day. Another approach I was working on was trying to take
    advantage of OpenStreetMap to resolve
    those addresses. They have a sub-project, called Nominatim,
    that does exactly this.
    It allows for forward resolution of an address:,+birmingham&format=xml&polygon=1&addressdetails=1

    Or to go from a geocoord to an address:

    Because we weren’t able to connect out to EC2, I spent my day trying to
    get Nominatim working on my Mac laptop. I eventually found this guide,
    which mostly works:

    Unfortunately it is not perfectly up to date. Here are some additions I
    had to make:

    1. Download the California dataset here:

    % wget **
    % bunzip2 california.osm.bz2

    2. Install things with:

    % brew install postgresql
    % brew install postgis
    % brew install protobuf
    % brew tap homebrew/dupes
    % brew tap josegonzalez/homebrew-php
    % brew install php54
    % pear install DB
    % launchctl load -w

    3. Create the nominatim database with:

    % initdb -E utf8 -D /usr/local/var/postgres
    % createdb nominatim
    % psql -d nominatim
    nominatim=# CREATE EXTENSION postgis;
    nominatim=# CREATE EXTENSION hstore;

    4. Create the table:

    % osm2pgsql --create --latlong --database nominatim --slim --prefix \
        planet_osm --cache 2048 california.osm

    I haven’t yet figured out how to start the webserver, but assuming it’s
    up, you should now be able to make http calls locally to resolve
    addresses without hitting the Google request limits.
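
As a rough sketch of what those local calls could look like from Python: the helpers below build forward (`/search`) and reverse (`/reverse`) request URLs and pull the first coordinate pair out of a JSON search response. The base URL `http://localhost/nominatim` is a guess at where a local install would be served, and the exact endpoint paths may differ between Nominatim versions, so treat both as assumptions. The function names are made up for this example.

```python
import json
from typing import Optional, Tuple
from urllib.parse import urlencode

# Hypothetical location of a local Nominatim install; adjust to your setup.
LOCAL_BASE = "http://localhost/nominatim"


def search_url(address: str, base: str = LOCAL_BASE) -> str:
    """URL for forward resolution: address -> coordinates."""
    query = urlencode({"q": address, "format": "json", "addressdetails": 1})
    return f"{base}/search?{query}"


def reverse_url(lat: float, lon: float, base: str = LOCAL_BASE) -> str:
    """URL for reverse resolution: coordinates -> address."""
    query = urlencode({"format": "json", "lat": lat, "lon": lon})
    return f"{base}/reverse?{query}"


def first_hit(response_text: str) -> Optional[Tuple[float, float]]:
    """Extract (lat, lon) from the first result of a JSON search response.

    Nominatim returns search results as a JSON list whose entries carry
    "lat" and "lon" as strings; an empty list means no match.
    """
    hits = json.loads(response_text)
    if not hits:
        return None
    return float(hits[0]["lat"]), float(hits[0]["lon"])
```

Fetching the URL itself can be done with `urllib.request.urlopen(search_url(...))` once the web server is actually running.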

