tPP Preliminary Statistical Report #1 of 5

The following report was completed by statistics students using a version of the tPP dataset as of March 13, 2019. These analyses focus on developing models for future use. Their interpretations and conclusions reflect a dataset still in development and only a superficial engagement with the wider literature on political violence. We continue to expand, improve and refine the data, so these analyses should be seen as preliminary and subject to change. The views expressed in these reports belong solely to the authors, do not necessarily reflect the findings of the tPP team, and are subject to further inquiry and revision.

Below you will find an analytical summary and selected visualizations for Team #1 (Descriptive Analysis) of 5.

This report was authored by Emma Ellis, Sikai Huang, Haiduan Tao & Haosen Yang. To download the complete report, click here.

The main question answered in this report is: How does the US legal system prosecute acts of political violence (descriptive) and how has this changed over time and space?

First, the data was extracted and cleaned using RStudio. The final dataset had 1,280 observations. The only observations removed were cases with ‘pending’ as values, because these carried no information and would have distorted the descriptive statistics. A table was created for each of the chosen variables. Each table reports Category, Number of Observations, Average Prison Sentence Length, Percentage of Life Sentences, and Percentage of Death Sentences. Several tables had many zeros in the death sentence column.
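The table-building step described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the records, the field names (`category`, `sentence_months`, `life`, `death`) and the choice to average only numeric prison terms are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative, not tPP data.
cases = [
    {"category": "A", "sentence_months": 120, "life": False, "death": False},
    {"category": "A", "sentence_months": "pending", "life": False, "death": False},
    {"category": "A", "sentence_months": 60, "life": False, "death": False},
    {"category": "B", "sentence_months": None, "life": True, "death": False},
    {"category": "B", "sentence_months": 240, "life": False, "death": False},
]


def summarize(records, key):
    """Return one summary row per value of `key`, skipping pending cases."""
    groups = defaultdict(list)
    for record in records:
        if record["sentence_months"] == "pending":
            continue  # pending cases carry no sentencing information
        groups[record[key]].append(record)

    rows = []
    for category, members in sorted(groups.items()):
        # Assumption: only numeric prison terms enter the average.
        terms = [
            m["sentence_months"]
            for m in members
            if isinstance(m["sentence_months"], (int, float))
        ]
        n = len(members)
        rows.append({
            "category": category,
            "n": n,
            "avg_sentence_months": sum(terms) / len(terms) if terms else None,
            "pct_life": 100 * sum(m["life"] for m in members) / n,
            "pct_death": 100 * sum(m["death"] for m in members) / n,
        })
    return rows


for row in summarize(cases, "category"):
    print(row)
```

Calling `summarize` once per variable of interest would produce one such table for each variable.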

After the initial tables were created, some categories were combined, depending on the variable. Location was the only variable without a table, because a geomap proved to be a more useful visualization. The geomap showed that states with larger populations also had more life and death sentences.

The states shown in white (Wyoming, Nebraska, Rhode Island, and Hawaii) have no cases in the data provided for the project. New York has by far the largest number of prosecutions. Overall, in about 87% of states the length of prison sentences is under 200 months. Oklahoma and New Hampshire have longer prison sentences than other states, but few prosecutions. Texas, California and New York also have relatively long prison sentences, along with more prosecutions. Oklahoma has the largest percentages of life sentences and death sentences. Nearly half of the states have life sentences, and 23% of states have death sentences.
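State-level shares like those quoted above can be computed from per-state summaries. The sketch below is illustrative only: the state labels, the numbers and the field names (`avg_sentence_months`, `life_sentences`, `death_sentences`) are hypothetical, and it is not the authors' code.

```python
# Hypothetical per-state summaries; labels and numbers are made up.
state_stats = {
    "State A": {"avg_sentence_months": 150, "life_sentences": 0, "death_sentences": 0},
    "State B": {"avg_sentence_months": 250, "life_sentences": 2, "death_sentences": 1},
    "State C": {"avg_sentence_months": 90, "life_sentences": 1, "death_sentences": 0},
    "State D": {"avg_sentence_months": 180, "life_sentences": 0, "death_sentences": 0},
}


def share_of_states(stats, predicate):
    """Percentage of states whose summary satisfies `predicate`."""
    matching = sum(1 for summary in stats.values() if predicate(summary))
    return 100 * matching / len(stats)


under_200 = share_of_states(state_stats, lambda s: s["avg_sentence_months"] < 200)
any_life = share_of_states(state_stats, lambda s: s["life_sentences"] > 0)
any_death = share_of_states(state_stats, lambda s: s["death_sentences"] > 0)

print(f"Under 200 months: {under_200:.0f}% of states")
print(f"At least one life sentence: {any_life:.0f}% of states")
print(f"At least one death sentence: {any_death:.0f}% of states")
```

States with no cases would have to be excluded from `state_stats` beforehand, since the denominator is the number of states with data.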

Since this analysis is wholly descriptive, no definite conclusions can be drawn about predicting the length of a prison sentence. The tables and the geomap do, however, show some trends regarding life and death sentences.

One major finding is that no death sentences were given in any case where the defendant was not a U.S. citizen.

Another notable finding was that no death sentence was given in any case where no deaths were involved. The most interesting part is that there were over 1,000 observations with zero people killed.

The last notable finding was that no case involving an informant resulted in the death penalty. One possible explanation is that a crime can be stopped if the police are informed beforehand.


Hadley Wickham (2017). tidyverse: Easily Install and Load the ‘Tidyverse’. R package version 1.2.1.

Garrett Grolemund, Hadley Wickham (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25.

David M Diez, Christopher D Barr and Mine Cetinkaya-Rundel (2017). openintro: Data Sets and Supplemental Functions from ‘OpenIntro’ Textbooks. R package version 1.7.1.

Paolo Di Lorenzo (2018). usmap: US Maps Including Alaska and Hawaii. R package version 0.4.0.

Carson Sievert (2018). plotly for R.
