This version of the OPD crime data set should be the last ad hoc publication; from now on I intend to publish a new one approximately every month. This version captures Oakland Police Department data for the period 2007 – Mar’14. The number of unique case IDs (NumberCID) is 486663 and the number of incidents associated with these (NIncid) is 805187. Three formats are available:
- OPD_140403.csv.zip (23.2 MB, zipped): comma-separated, with a header line showing these fields:
  - Idx, OPD_RD, OIdx, Date, Time, CType, Desc, Beat, Addr, Lat, Long, UCR, Statute, CrimeCat
- OPD_140403_5.json (19.8 MB, zipped): JSON for a dictionary
  - cid -> [date, time, beat, addr, lat, long, [ctype, desc, ucr, statute, cc]+ ]
- OPD_140403.db.zip (34.1 MB, zipped): a SQLite database created via
  - CREATE TABLE INCIDENT (incididx int, rd text, date text, beat text, addr text, lat real, long real)
  - CREATE TABLE CHARGE (chgidx int, rd text, rdchgidx int, ctype text, desc text, ucr text, statute text, crimeCat text)
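As a rough sketch of how the two SQLite tables might be used together, the Python below counts charges per incident. It assumes that `rd` is the case ID shared by INCIDENT and CHARGE, and that the archive unzips to a file named `OPD_140403.db`; neither is stated above. The function name `charges_by_incident` is made up for illustration.

```python
import sqlite3


def charges_by_incident(conn, limit=5):
    """Return (rd, date, beat, n_charges) rows, incidents with most charges first.

    Assumes INCIDENT.rd and CHARGE.rd hold the same case ID (an assumption,
    not something the schema itself guarantees).
    """
    query = """
        SELECT i.rd, i.date, i.beat, COUNT(c.chgidx) AS n_charges
        FROM INCIDENT AS i
        LEFT JOIN CHARGE AS c ON c.rd = i.rd
        GROUP BY i.rd, i.date, i.beat
        ORDER BY n_charges DESC, i.rd
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


# Example use (file name is assumed):
#   conn = sqlite3.connect("OPD_140403.db")
#   for row in charges_by_incident(conn):
#       print(row)
#   conn.close()
```

The LEFT JOIN keeps incidents that have no matching CHARGE rows, so they show up with a count of 0 instead of silently disappearing.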
Note that only the JSON version is available here.
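For anyone starting from the JSON file, here is a minimal sketch of unpacking the dictionary into named records. It reads the `[ctype,desc,ucr,statute,cc]+` notation as one or more five-element charge lists trailing the six incident fields; if the charges are in fact wrapped in a single nested list, the slice would need adjusting. The names `Charge`, `Incident` and `parse_incidents`, and the unzipped file name in the comment, are assumptions for illustration.

```python
import json
from typing import Any, NamedTuple


class Charge(NamedTuple):
    ctype: Any
    desc: Any
    ucr: Any
    statute: Any
    cc: Any


class Incident(NamedTuple):
    cid: str
    date: Any
    time: Any
    beat: Any
    addr: Any
    lat: Any
    long: Any
    charges: list


def parse_incidents(raw):
    """Turn {cid: [date, time, beat, addr, lat, long, charge, ...]} into Incidents."""
    incidents = []
    for cid, rec in raw.items():
        # First six entries describe the incident itself.
        date, time, beat, addr, lat, lng = rec[:6]
        # Everything after that is assumed to be a five-element charge list.
        charges = [Charge(*c) for c in rec[6:]]
        incidents.append(Incident(cid, date, time, beat, addr, lat, lng, charges))
    return incidents


# Example use (unzipped file name is assumed):
#   with open("OPD_140403_5.json") as f:
#       incidents = parse_incidents(json.load(f))
```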
For the CSV or SQLite versions, go to data.OpenOakland.org. This is a CKAN site, with issues. Let me know if you have trouble getting the files and we’ll work something else out.
UPDATE 2 May 14: It seems data.OpenOakland.org will be broken for some time. I have put (compressed) JSON, SQLite and CSV files for this 140403 distribution under the data directory of my OakCrime GitHub project. I’ll continue to put monthly snapshots there until further notice.
thanks for your interest, Marco, and bringing East Bay concerns to the Netherlands.
the first thing to say is that there are much newer versions with a year of further data (150308); check http://data.openoakland.org/dataset/crime-reports for updates.
second, while i wish OPD provided their own geocoding, they do not. i have had to scavenge geo-coordinates for street addresses from a variety of sources. i appreciate the importance of specifying particular mapping systems, but just getting ANY mappable coordinates has been my primary concern. also, experiments i’ve done comparing geo-tagging alternatives have not generally found large errors. can you tell me how these differences will impact your questions?
also note that many addresses (and most of the recent ones) have been ’rounded’ to 100-blocks, sometimes with a “00” suffix and sometimes with a “50”, so coordinates will always have this level of ambiguity irrespective of geocoding source.
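a rough sketch of how one might flag these rounded addresses, assuming the address string starts with the street number (e.g. "1400 BROADWAY"); that format, and the helper name `block_rounding`, are assumptions for illustration. note that a genuinely exact address can also end in "00" or "50", so this only marks candidates.

```python
import re

# Street number at the start of the string, followed by a word boundary,
# so that ordinals such as "14TH ST" are not mistaken for house numbers.
_LEADING_NUMBER = re.compile(r"^\s*(\d+)\b")


def block_rounding(addr):
    """Return '00' or '50' if the leading street number ends that way, else None."""
    match = _LEADING_NUMBER.match(addr or "")
    if not match:
        return None
    number = match.group(1)
    if number.endswith("00"):
        return "00"
    if number.endswith("50"):
        return "50"
    return None
```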
Hi Rik, thanks for your work on this website! I am a former UCB student and Rockridge (Oakland) resident, currently working in the Netherlands. I got interested in the crime stats you posted and I have a specific technical question.
I see most of the crimes in OPD_140403.csv.zip are geocoded, i.e. longitude and latitude are reported. I wanted to do some spatial analysis using ArcGIS, but I could not find on the Oakland public website the two key pieces of information I need to set up the GIS: 1) which cartographic system they used, and 2) which geographic system they refer to. I will be happy to share the results of my studies with the community if I am able to get these data! Thank you and best regards, Marco