Elections are important because they allow the electorate to decide who will make
decisions for their country for the next few years. The outcome of an election can be
forecast with reasonable accuracy. Forecasting elections from small polls is a very
common approach, but it often does not produce reasonable accuracy.
Data mining is a process that examines large preexisting databases in order to generate
new information. Various works use data mining approaches to predict different kinds of
outcomes, such as weather forecasting, sports result prediction, and future buying
decision prediction, but very few use data mining to predict voting patterns in
elections. In this work, we use data mining approaches to predict voting patterns in US
elections. We preprocess the data by handling missing values, identifying the best
attributes, and removing duplicate values. We split the dataset into training and test
datasets. We then apply four algorithms (Trees J48, Naïve Bayes Classifier, Trees Random
Forest, and Rules ZeroR Classifier) to predict voting patterns, compare the results of
the resulting models, and identify the best among them.
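The split into training and test data, together with the simplest of the four algorithms, can be sketched in a few lines. The sketch below is plain Python rather than WEKA, and it is only an illustration: the 30% test fraction, the seed, and the toy `rows` data are assumptions that do not come from the study. ZeroR ignores all attributes and always predicts the majority class, which is why it serves as a baseline for the other classifiers.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test).

    test_fraction and seed are illustrative assumptions.
    """
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


class ZeroR:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, labels):
        self.majority_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, n):
        return [self.majority_] * n


# Hypothetical toy data: (attribute dict, class label) pairs.
rows = [({"id": i}, "voted" if i % 4 else "abstained") for i in range(20)]

train, test = train_test_split(rows)
baseline = ZeroR().fit([label for _, label in train])
predictions = baseline.predict(len(test))

# Share of test rows whose true label equals the majority-class prediction.
baseline_accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
```

Any model worth keeping should beat `baseline_accuracy` on the same test set.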
II. Related Works
Gregg R. Murray and Anthony Scime used data mining approaches to predict individual
voting behavior, including abstention, with the intent of segmenting the electorate
in useful and meaningful ways [1]. In another study, Gregg R. Murray, Chris Riley, and
Anthony Scime used iterative expert data mining to build a likely voter model for
presidential elections in the USA [2]. Jung-Hwan Bae, Ji-Eun Son, and Min Song used
Twitter data to predict trends in the South Korean presidential election with text
mining techniques [3]. Tariq Mahmood, Tasmiyah Iqbal, Farnaz Amin, Waheeda Lohanna, and
Atika Mustafa used Twitter data to predict the winner of the 2013 Pakistan election [4].
III. Data Preprocessing
IV. Experimental Methodology
We used 4 algorithms and 8 models (2 models for each algorithm) to predict the voting
pattern in the US election. We then analysed and compared the results of those models
to find the most accurate ones. The algorithms applied for generating the models are
given below.
i. Trees J48
ii. Naïve Bayes Classifier
iii. Trees Random Forest
iv. Rules ZeroR
From the above table, the best model was identified based on the values of accuracy,
precision, recall, sensitivity, and specificity. The higher the accuracy, precision,
and recall, and with sensitivity greater than specificity, the higher the rank.
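The ranking parameters all derive from the confusion matrix of a binary classifier. The following minimal Python sketch shows how they relate; the function name `binary_metrics` and the example labels are hypothetical and are not taken from the study's data. Note that for a binary problem recall and sensitivity are the same quantity, TP / (TP + FN), so the sketch reports it once.

```python
def binary_metrics(actual, predicted, positive):
    """Compute accuracy, precision, recall (sensitivity), and specificity."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, truly positive
            else:
                fp += 1  # predicted positive, truly negative
        else:
            if a == positive:
                fn += 1  # predicted negative, truly positive
            else:
                tn += 1  # predicted negative, truly negative

    def ratio(numerator, denominator):
        # Guard against empty denominators (e.g. no positive predictions).
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),  # identical to sensitivity
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical labels for eight test instances.
actual = ["yes", "yes", "yes", "no", "no", "yes", "no", "yes"]
predicted = ["yes", "yes", "no", "no", "yes", "yes", "no", "yes"]
metrics = binary_metrics(actual, predicted, positive="yes")
```

In this toy example sensitivity (0.8) exceeds specificity (about 0.67), which is the condition the ranking rule above favours.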
VI. Conclusion
Although there are many techniques and methods for predicting voting patterns, data
mining is one of the most efficient and effective in this field. In our study, Trees
Random Forest performed best among the data mining algorithms we compared, with 98.17%
accuracy. In future work, we will extend our research to more recent datasets to
validate these findings.
Figure 1. Table 1.
Year 2019
I. Handling Missing Attributes: We replaced missing values with the mean, median, or
mode of the attribute. We chose this approach because it works better when the dataset
is small and it prevents data loss.
II. Removing Duplicates: We used the RemoveDuplicates filter in WEKA to remove duplicate
instances from the datasets.
III. Best Attribute Selection: We used GainRatioAttributeEval, which evaluates the worth
of an attribute by measuring the gain ratio with respect to the class, together with
Ranker, which ranks attributes by their individual evaluations. The top 12 attributes of
the whole dataset by rank are presented in Figure 1.
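The three preprocessing steps can be illustrated outside WEKA with a short Python sketch. It is a hand-written approximation for nominal attributes only, not WEKA's implementation: the `"?"` missing-value marker, the attribute names `party_id` and `region`, and the toy rows are all assumptions made for illustration, and only mode imputation is shown. Gain ratio is computed as information gain divided by the entropy of the attribute's own value distribution.

```python
import math
from collections import Counter

MISSING = "?"  # assumption: missing values are marked with "?"


def impute_mode(rows, attributes):
    """Replace MISSING in each listed attribute with its most common value."""
    modes = {}
    for a in attributes:
        counts = Counter(r[a] for r in rows if r[a] != MISSING)
        modes[a] = counts.most_common(1)[0][0] if counts else MISSING
    return [
        {k: (modes[k] if k in modes and v == MISSING else v) for k, v in r.items()}
        for r in rows
    ]


def remove_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def entropy(values):
    """Shannon entropy (base 2) of a list of nominal values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain of attribute w.r.t. target, divided by split information."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    conditional = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    info_gain = entropy([r[target] for r in rows]) - conditional
    split_info = entropy([r[attribute] for r in rows])
    return info_gain / split_info if split_info else 0.0


def rank_attributes(rows, attributes, target):
    """Order attributes from highest to lowest gain ratio."""
    return sorted(attributes, key=lambda a: gain_ratio(rows, a, target), reverse=True)


# Hypothetical toy dataset; "vote" is the class attribute.
rows = [
    {"party_id": "D", "region": "north", "vote": "D"},
    {"party_id": "D", "region": "south", "vote": "D"},
    {"party_id": "R", "region": "north", "vote": "R"},
    {"party_id": "R", "region": "south", "vote": "R"},
    {"party_id": "?", "region": "north", "vote": "D"},
    {"party_id": "D", "region": "south", "vote": "D"},
]
imputed = impute_mode(rows, ["party_id", "region"])
deduped = remove_duplicates(imputed)
ranking = rank_attributes(deduped, ["party_id", "region"], "vote")
```

Imputing before removing duplicates matters here: the row with the missing `party_id` only becomes a duplicate once its value has been filled in.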
Global Journal of Computer Science and Technology (C), Volume XIX, Issue II, Version I
[1] Gregg R. Murray and Anthony Scime. Microtargeting and Electorate Segmentation: Data
Mining the American National Election Studies. Journal of Political Marketing, 9(3), 2010.
[2] Gregg R. Murray, Chris Riley, and Anthony Scime. Pre-Election Polling: Identifying
Likely Voters Using Iterative Expert Data Mining. Public Opinion Quarterly, 73(1), 2009.
[3] Jung-Hwan Bae, Ji-Eun Son, and Min Song. Analysis of Twitter for 2012 South Korea
Presidential Election by Text Mining Techniques. Journal of Intelligence and Information
Systems, 19(3), 2013.