A Comparison of Artificial Intelligence Methods for Predicting the Rate of Violent Crime

DOI : 10.17577/IJERTV13IS040077


Onyeachonam Dominic-Mario Chiadika

Department of Electronic and Computer Engineering

Brunel University, London.

Ujerekre Ekoko

Department of Computer Engineering

Delta State Polytechnic, Ogwashi-uku, Delta State.

Oghenerukevwe Affun

Department of Electrical & Electronic Engineering

Delta State Polytechnic, Ogwashi-uku, Delta State.

Ochuko Goodluck Utu

Department of Mechanical Engineering

Delta State Polytechnic, Ogwashi-uku, Delta State.


Every nation on the planet is deeply concerned about the rise in violent crime. Crime forecasting is one of the many crime analysis techniques that have been used to lower the frequency of violent crimes. It is a useful tool because it helps law enforcement agencies plan successful crime prevention measures. Researchers have recently come to favour artificial intelligence (AI) approaches in crime prediction and analysis. This development serves as the impetus for this study, which compares the effectiveness of three AI techniques, namely artificial neural network (ANN), support vector regression (SVR), and gradient tree boosting (GTB), in forecasting the rates of four different categories of crimes in the US. Quantitative error measurement was used to compare each AI technique's forecasting ability. Based on the results obtained, GTB outperformed ANN and SVR in terms of forecast accuracy, with the lowest observed error values.

Keywords: Crime, AI, ANN, GTB, SVR


All nations in the world are concerned to some degree about the recent rise in crime rates, particularly those of a violent nature. Crimes harm numerous parties and persons in addition to causing enormous financial damage [1]. Crimes also pose a serious danger to community stability. The erratic political and economic climate of late has also acted as a trigger for the rise in violent crimes. Researchers and criminologists have devised and used a variety of crime analysis techniques to examine and track violent crime patterns in an effort to decrease violent crimes. Among the applied techniques, crime forecasting or prediction has gained popularity since it can anticipate the likelihood of violent crimes occurring in the future in addition to analysing patterns of violent crime. Crime forecasting has the benefit of providing crucial information with which numerous federal and state law enforcement organisations, including police departments and security services, can design and oversee effective crime prevention initiatives. Previous research has demonstrated that real-world crime data consist of complicated nonlinear data structures with various representations. These features make it difficult for researchers to choose the right model or method to handle such complicated data.

However, owing to their recent rapid progress, artificial intelligence (AI) techniques have gained favour and are being explored by researchers. One benefit of AI is that it has nonlinear functions that can detect nonlinear trends in data and improve overall predictive ability [2]. This fact motivates the present study, which applies, compares, and analyses the forecasting capabilities of three particular AI techniques, artificial neural networks (ANN), support vector regression (SVR), and gradient tree boosting (GTB), on Nigeria's violent crime rate data.

  1. Synopsis of the Application of Artificial Intelligence (AI) Methods in Crime Prediction

    Researchers mostly use machine learning approaches to predict and assess the value of target data (crime rate) when forecasting crime using AI techniques. Because of its robustness and versatility, AI has become a popular tool for professionals, academics, and data analysts to evaluate and predict a wide range of data sets. Another factor is the capacity of AI techniques to comprehend and model complicated, non-linear relationships in data [3]. As previously stated, real-world crime data structures come in numerous distributions, representations, and forms. AI techniques are therefore a highly recommended solution for analysing and forecasting crime rate data without having to worry about data format restrictions. Three AI techniques, artificial neural networks (ANN), support vector regression (SVR), and gradient tree boosting (GTB), were chosen for this study to forecast the rates of violent crime. The performance of each chosen AI technique in terms of forecast accuracy was then examined and compared.

    (This work is licensed under a Creative Commons Attribution 4.0 International License.)

    1. ANN

      Artificial neural networks (ANNs) are an AI method inspired by the human nervous system. Numerous layers of neurons, each layer with its own weight matrix, bias vector, and output vector, are interconnected to form a massively parallel network [4]. ANN neurons serve as local memory or processing units that take in information from other neurons, process it, and pass it on to other neurons. Because ANNs are capable of self-learning, researchers have used them extensively in crime prediction for many years, since ANNs can produce outcomes that are not limited by the input data provided [5-7].
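As a rough illustration of the layered structure described above, the following Python sketch computes the forward pass of a small feed-forward network. It is not the Matlab model used in this study: the layer sizes (9 inputs, 10 hidden neurons, 1 output), the tanh activation, the random initial weights, and all function names are assumptions chosen for illustration, and training is not shown.

```python
import math
import random


def init_layer(n_inputs, n_neurons, rng):
    """Create one layer: a weight matrix (n_neurons x n_inputs) and a bias vector."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def layer_forward(inputs, weights, biases, activation):
    """Each neuron takes a weighted sum of its inputs, adds its bias, and applies the activation."""
    outputs = []
    for weight_row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(weight_row, inputs)) + bias
        outputs.append(activation(z))
    return outputs


def forward(inputs, layers):
    """Pass the signal through every layer; the last layer is linear (regression output)."""
    signal = inputs
    for index, (weights, biases) in enumerate(layers):
        is_last = index == len(layers) - 1
        activation = (lambda z: z) if is_last else math.tanh
        signal = layer_forward(signal, weights, biases, activation)
    return signal


rng = random.Random(0)
# Assumed sizes for illustration: 9 input factors -> 10 hidden neurons -> 1 output.
layers = [init_layer(9, 10, rng), init_layer(10, 1, rng)]
example_inputs = [0.5] * 9  # placeholder normalised inputs, not real data
prediction = forward(example_inputs, layers)
```

The output is a single untrained value; a training algorithm would adjust the weights and biases to reduce the forecast error.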

    2. SVR

      Support vector regression (SVR) is an AI method that uses a symmetrical loss function to estimate a real-valued function [4]. It is based on computing a linear regression function in a high-dimensional feature space into which the input data are mapped by a nonlinear function [8]. One benefit of SVR is that, given the right kernel function, it can handle a variety of data forms flexibly and generalise to previously unseen data [9]. Because of its excellent generalisation ability and high prediction accuracy, SVR is one of the AI approaches most widely used in crime prediction [10-12].
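The two ingredients mentioned above, a symmetrical loss and a kernel function, can be sketched in a few lines of Python. This is a minimal illustration of the standard epsilon-insensitive loss and Gaussian (RBF) kernel, not the solver used in the study; the function names and the `gamma` parameter are assumptions introduced here.

```python
import math


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Symmetrical SVR loss: errors within +/- epsilon cost nothing,
    larger errors are penalised linearly in either direction."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def gaussian_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: similarity of two input vectors in the
    implicit high-dimensional feature space. gamma is an assumed width parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


inside_tube = epsilon_insensitive_loss(1.0, 1.05)   # error smaller than epsilon
over_forecast = epsilon_insensitive_loss(1.0, 1.5)  # forecast too high
under_forecast = epsilon_insensitive_loss(1.0, 0.5) # forecast too low by the same amount
```

Over- and under-forecasts of the same size receive the same penalty, which is what makes the loss symmetrical.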

    3. GTB

      Gradient tree boosting (GTB) is an AI method first presented in [13]. It combines boosting and decision tree approaches. In GTB, the decision tree serves as the base learner, and the boosting strategy is applied to lower the error during the decision tree learning process. The boosting procedure is carried out iteratively, with each boosting iteration improving on the accuracy of the preceding iteration's tree [14]. One benefit of GTB is its ability to prevent overfitting when additional independent data are introduced [11]. GTB is still very new in the field of crime prediction, and there have not been many studies on its use in crime analysis [15-16].
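The iterative idea described above can be sketched as follows. This is a deliberately simplified Python illustration, assuming squared-error loss, a single input feature, and one-split trees (stumps) as base learners; the toy data, function names, and default settings are assumptions for illustration and do not reproduce the Matlab model used in the study.

```python
def fit_stump(xs, residuals):
    """Find the single split on a 1-D feature that minimises squared error."""
    distinct = sorted(set(xs))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gtb(xs, ys, n_trees=100, learning_rate=0.1):
    """Each new stump is fitted to the residuals left by the current ensemble."""
    base = sum(ys) / len(ys)  # initial prediction: the mean of the targets
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def gtb_predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


# Toy data for illustration only (not crime data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
model = fit_gtb(xs, ys)
fitted = [gtb_predict(model, x) for x in xs]
```

Each iteration corrects part of the error left by the trees before it, and the learning rate controls how large each correction is.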


        Each AI-based crime model for each crime type in this study was developed using the Statistics and Machine Learning Toolbox on the Matlab platform. The AI-based crime models (ANN, SVR, and GTB) were examined using a multivariate analysis that takes into consideration several factors that could affect crime rates. Each model addresses a regression problem, namely predicting the crime rate data for each type of crime.

        Before the crime rate values are predicted, each crime model is trained on the prepared training data set. Once trained (fitted), the crime model is used to forecast the crime rate values on the prepared testing data set. The predicted crime rate values are then used to analyse the effectiveness of each AI crime model. The predictive ability of each AI crime model in this study was determined and measured using three quantitative error measures: mean absolute deviation (MAD), mean absolute percentage error (MAPE), and root mean square error (RMSE).
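The three error measures named above can be sketched in Python using their standard textbook definitions. The paper does not state its formulas, so the definitions below (MAD taken as the mean of the absolute forecast errors, MAPE expressed in percent) and the sample numbers are assumptions for illustration.

```python
import math


def mad(actual, forecast):
    """Mean absolute deviation: average of |actual - forecast|."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error: square root of the average squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Made-up values for illustration only (not results from the study).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 380.0]
errors = {
    "MAD": mad(actual, forecast),
    "MAPE": mape(actual, forecast),
    "RMSE": rmse(actual, forecast),
}
```

Lower values indicate better forecasts for all three measures. RMSE penalises large errors more heavily than MAD, while MAPE is scale-free and so can be compared across crime types with different rate magnitudes.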

        1. Data Collection

          In this study, two types of data sets were used, i.e. the crime rate and factors data sets. The crime rate data set comprises the rates of four types of violent crime across Nigeria, namely murder and non-negligent murder, kidnapping, aggravated assault, and robbery. The crime rate data set was obtained from the Uniform Crime Reporting Statistics website provided by the Nigerian bureau for statistics. The factors data set comprises nine data series, namely unemployment rate, immigration rate, population rate, consumer price index, gross domestic product, tax revenue, poverty rate, inflation rate, and consumer sentiment index. The factors data set was obtained from numerous Nigerian government agencies and other related data repository websites. Each data series of both data sets consists of 56 data samples of annual time series data that were collected from 2014-2024. [

        2. Data Preprocessing

          During the experiment, the crime rate and factors data sets were split into two groups: training (in-sample) data and testing (out-of-sample) data. Each crime model is trained using the training data, and the trained crime model is then tested by predicting the crime rate values for the testing data. In this investigation, the training set contained 50 data samples from 1960 to 2009 and the testing set contained six data samples from 2010 to 2015. The ratio of the data was 15:1.
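A chronological split of this kind can be sketched in Python as follows. The year boundaries are taken from the paragraph above; the function name and the placeholder values are assumptions for illustration and are not real crime rates.

```python
def chronological_split(years, values, last_training_year):
    """Split an annual series by time: earlier years for training, later years for testing."""
    training = [(y, v) for y, v in zip(years, values) if y <= last_training_year]
    testing = [(y, v) for y, v in zip(years, values) if y > last_training_year]
    return training, testing


years = list(range(1960, 2016))                   # 56 annual samples
values = [float(i) for i in range(len(years))]    # placeholder values only
training, testing = chronological_split(years, values, last_training_year=2009)
```

Splitting by time rather than at random keeps the testing samples strictly later than the training samples, so the test mimics forecasting future years.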

          Each data set was normalised into a dimensionless form to prevent any unanticipated errors caused by differing measurement units. The feature scaling method was applied to normalise the raw data sets (crime rate and factors) to the range 0 to 1. Equation (1) defines the data normalisation.

          $$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

          In equation (1), $x'$ is the normalised value of the corresponding sample; $x_{\max}$ is the largest raw value in the corresponding data series $x$; $x_{\min}$ is the smallest raw value in the corresponding data series $x$; and $x$ is the raw value of the selected sample in the corresponding data series.

          Denormalisation returns the normalised predicted crime rate values of each crime model from their dimensionless form to their actual raw form after the forecasting process. The denormalisation computation, given in equation (2), is obtained by rearranging equation (1).

          $$x = x'\,(x_{\max} - x_{\min}) + x_{\min} \tag{2}$$
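The scaling to the range 0 to 1 and its inverse can be sketched in Python as below. This is a minimal illustration of standard min-max feature scaling; the function names and the sample series are assumptions, not part of the study.

```python
def min_max_normalise(series):
    """Scale a data series to the range 0 to 1 and return the bounds needed to undo it."""
    x_min, x_max = min(series), max(series)
    if x_max == x_min:
        raise ValueError("series is constant; min-max scaling is undefined")
    normalised = [(x - x_min) / (x_max - x_min) for x in series]
    return normalised, x_min, x_max


def denormalise(normalised, x_min, x_max):
    """Map values in the range 0 to 1 back to the original units."""
    return [x_norm * (x_max - x_min) + x_min for x_norm in normalised]


raw = [2.0, 4.0, 6.0, 10.0]  # made-up values for illustration
scaled, raw_min, raw_max = min_max_normalise(raw)
restored = denormalise(scaled, raw_min, raw_max)
```

The minimum and maximum must be kept from the normalisation step, because the same two values are needed to convert the model's normalised forecasts back to actual crime rates.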



        3. Parameter Setup for the AI Techniques

          Before the crime models were built and used for forecasting, several input parameters were configured for each chosen AI technique. In the ANN settings, the network type was feed-forward backpropagation, the training function was Levenberg-Marquardt, and the number of neurons and the number of layers were set to 10 and 2, respectively. For the SVR parameters, the kernel function, epsilon value, and optimisation solver were set to Gaussian, 0.1, and sequential minimal optimisation, respectively. Finally, in the GTB settings, the number of trees, individual tree size, and learning rate were set to 100, 3, and 1, respectively.

      2. RESULTS

The forecast crime rate values of the AI crime models for each crime type were evaluated using the quantitative error measurement analysis, and the results are presented in Table 1.

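Table 1. Quantitative error measurement (RMSE, MAD, and MAPE) of the ANN, SVR, and GTB crime models by crime type. Only the row labels "Armed Robbery" and "Aggravated assault" survive; the numerical values are not available in this text version.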

In this case study, GTB outperforms ANN and SVR in terms of forecasting performance, as shown by the results in Table 1. This is demonstrated by its RMSE, MAD, and MAPE values, which are the lowest among the AI approaches across all modelled crime categories. This shows that, even with the small data samples included in the study, GTB can effectively estimate and forecast crime rates for each type of crime. Thus, GTB has been shown to be reliable even with small amounts of data and able to generate more accurate forecasts. On the other hand, ANN performed the worst of the AI techniques, because its observed RMSE, MAD, and MAPE values were the highest for all crime categories. This suggests that small crime data sample sizes cause ANN to perform poorly. In terms of forecast accuracy, GTB yields the best overall result, with SVR coming second and ANN being the least accurate model for predicting Nigeria's violent crime rates in this particular case study.


  1. Kadar C, Maculan R and Feuerriegel S 2019 Decis. Support Syst. 119 pp 107-17

  2. Rather A M, Sastry V and Agarwal A 2017 OPSEARCH 54 pp 558-79

  3. Baliyan A, Gaurav K and Mishra S K 2015 Procedia Comput. Sci. 48 pp 121-25

  4. Awad M and Khanna R 2015 Efficient learning machines: theories, concepts, and applications for engineers and system designers (Berkeley: Apress)

  5. Huang Y L, Lin C T, Yu Y S, Hsieh W H and Pai S M 2015 Proc. of Ntl. Conf. on Information Technology Practice and Application (China)

  6. Corcoran J J, Wilson I D and Ware J A 2003 Int. J. of Forecasting 19 pp 623-34

  7. Wang Q, Jin G, Zhao X, Feng Y and Huang J 2019 Knowl-Based Syst. pp 105-20

  8. Wu J and Lu Z 2012 Proc. of 5th Int. Conf. on Advanced Computational Intelligence (Nanjing: IEEE) pp 999-1003

  9. Basak D, Pal S and Patranabis D C 2007 Neural Info. Process.-Lett. and Rev. 11 pp 203-24

  10. Kianmehr K and Alhajj R 2006 Proc. of Int. Conf. on Computer Systems and Applications (Dubai: IEEE) pp 952-59

  11. Alwee R, Shamsuddin H, Mariyam S and Salehuddin R 2013 The Scientific World J. 2013 951475

  12. Yang F, Wu C, Xiong N and Wu Y 2018 Int. J. Softw. Hardw. Res. Eng. 6 pp 1-10

  13. Friedman J H 2001 Ann. Statist. 29 pp 1189-232

  14. Budur E, Lee S and Kong V S 2015 Social and Information Networks arXiv:1507.05739

  15. Shi X, Paiement J F, Grangier D and Yu P S 2012 Proc. of the 2012 SIAM Int. Conf. on Data Mining (California: SIAM) pp 224-35

  16. Sumner C, Byers A, Boochever R and Park G J 2012 Proc. of 11th Int. Conf. on Machine Learning and Applications 2 (Washington: IEEE Computer Society) pp 386-93

  17. https://nigerianstat.gov.ng/elibrary/read/786

