# I. Introduction

Stress or depression may lead to mental disorders. Work pressure, working environment, travelling distance, height, weight, food habits, etc. are some of the major factors that build up stress in people. Many researchers have tried to predict stress interruption using machine learning techniques including Decision Tree, Naïve Bayes, Random Forest, KNN and SVM. The primary objective of this chapter is to develop an enhanced Support Vector Machine (SVM) classifier for stress prediction. The research work implements a machine learning algorithm to predict whether or not a person is interrupted by stress. The stress dataset is classified with the Enhanced Support Vector Machine, and its performance is compared with KNN and SVM.

# II. Literature Study

Table 1 shows the performance of existing machine learning techniques [23]. The literature study was conducted by reviewing 23 articles published in reputed journals. According to the existing study, the highest accuracy is obtained by J48 (i.e., Decision Tree). The proposed system therefore concentrates on developing a model that provides higher accuracy than the existing works.

# III. Objectives

The primary objective of the chapter is to develop an enhanced Support Vector Machine (SVM) classifier for stress prediction. The Support Vector Machine is enhanced in this research by tuning its hyperparameters. The main hyperparameter of an SVM is its kernel function. This research uses the RBF kernel function, which is a way of computing the dot product of two vectors x and y in some (very high dimensional) feature space. The RBF kernel is tuned through two parameters: "Gamma" and the complexity parameter "C". "Gamma" can be seen as the inverse of the radius of influence of the samples selected by the model as support vectors. The "C" parameter sets the penalty for misclassified training points and is tuned together with "Gamma".
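
The "radius of influence" reading of Gamma can be made concrete with a small sketch. It assumes the common form of the RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2); the function name `rbf_kernel` and the two sample points are illustrative choices, not part of the original study.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical points: a support vector and a neighbour at squared distance 2.
support_vector = (0.0, 0.0)
neighbour = (1.0, 1.0)

# The same neighbour looks "close" for small gamma and "far" for large gamma.
for gamma in (0.01, 0.1, 1.0, 2.0):
    similarity = rbf_kernel(support_vector, neighbour, gamma)
    print(f"gamma={gamma}: K={similarity:.4f}")
```

A small Gamma keeps the kernel value near 1 for distant points (wide influence), while a large Gamma drives it towards 0 (narrow influence).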
The accuracy level increases when the RBF kernel is tuned with the "Gamma" and "C" parameters. The concerns identified in the existing studies are addressed by the proposed research work, i.e., the Enhanced SVM with the RBF kernel function. Finally, the efficiency is measured by the performance obtained by the Enhanced SVM classifier.

# IV. The Research Flow for Stress Prediction

The research framework comprises the steps taken to implement SVM for stress prediction. This section presents the Enhanced SVM methodology used by the research work, i.e., the model to predict stress. Figure 1 shows the methodology used in this research work. It has several steps.

The first step is collecting the dataset. The dataset for this research work is downloaded from the Kaggle repository and contains 951 instances and 21 attributes.

In the second step, data preprocessing is applied to the dataset, which converts the data to nominal values. This preprocessing is done in the WEKA tool using the "Discretize" filter.

Figure 1: The Research Flow for Stress Prediction

The third step is feature selection. This step selects a subset of attributes based on certain conditions. This research uses "CorrelationAttributeEval" as the "Attribute Evaluator" and the "Ranker" approach as the "Search Method". At the end of this step, the top-ranking attributes are grouped into a subset.

The fourth step is developing the Enhanced SVM classifier to predict stress interruption. The existing SVM classifier is enhanced by tuning the RBF (Radial Basis Function) kernel function through its hyperparameters. Two parameters are tuned to increase the efficiency of the RBF kernel function:

1. Gamma
2. C (complexity parameter)

After tuning these two parameters, SVM works more efficiently than the other methods applied to predict stress interruption. After implementing the Enhanced SVM classifier, the expected output is either 'Yes-1' or 'No-0'.
Finally, the performance is evaluated in terms of Accuracy, Precision, Recall and F-Measure and compared with existing methodologies.

# a) Data Collection

The data for the research is taken from the Kaggle repository. Table 2 shows the dataset, which relates to the stress of working people. There are several reasons for working people to be stressed.

# b) Data Pre-Processing

The dataset is pre-processed with the machine learning tool WEKA. In this step the data values are converted into nominal values. The dataset may contain numeric data, but the classifier handles only nominal values. In that case the data needs to be discretized, which can be done with the following filter:

weka.filters.supervised.attribute.Discretize

The "Discretize" filter is stored in the package "weka.filters.supervised.attribute". Here "weka" is the root package for all other sub-packages.

# c) Feature Selection

In machine learning, feature selection is also known as attribute selection or variable subset selection. It is the process of selecting a subset of relevant features for model construction. Feature selection involves two steps. In the first step an "Attribute Evaluator" is chosen. In the second step a suitable "Search Method" is selected for the "Attribute Evaluator" to select the highly relevant attributes from the dataset.

This research work uses the "CorrelationAttributeEval" approach as the "Attribute Evaluator" to choose the relevant attributes for the subset. To generate the subset, the "Ranker" method is chosen as the "Search Method", which ranks the attributes by their correlation values. An efficient machine learning technique requires only the top-ranking, i.e., dominant, attributes to predict stress accurately, because only the top-ranking attributes are highly relevant for predicting the class.
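
The ranking idea can be sketched in a few lines. This is a minimal illustration that assumes the score of an attribute is its Pearson correlation with a 0/1 class label; it is not WEKA's implementation, and the attribute names and values below are made up for the example. The optional `threshold` argument mimics the "Threshold" property of "Ranker" used in this section.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(columns, labels, threshold=None):
    """Rank attributes by correlation with the class, highest first.

    If a threshold is given, only attributes scoring above it are kept.
    """
    scores = {name: pearson(values, labels) for name, values in columns.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(name, score) for name, score in ranked if score > threshold]
    return ranked


# Hypothetical toy data: 1 = stressed, 0 = not stressed.
labels = [1, 1, 0, 0, 1, 0]
columns = {
    "work_hours": [9, 10, 6, 5, 11, 6],
    "sleep_hours": [5, 4, 8, 8, 5, 7],
    "office_floor": [7, 7, 7, 7, 7, 7],
}

full_ranking = rank_attributes(columns, labels)
positive_only = rank_attributes(columns, labels, threshold=0)
print(full_ranking)
print(positive_only)
```

With a threshold of 0, only attributes that are positively correlated with the class remain in the subset.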
To choose the top-ranking attributes, the "Ranker" method is tuned with a "Threshold" value.

Threshold value for ranking: in "Ranker", "Threshold" is a property that takes a number as its value. The threshold is used to select the subset of ranked attributes, from either the positive or the negative side, according to their rank value. This research work uses a threshold value of 0, which keeps only positively ranked attributes for feature selection.

Figure 3 shows the list of attributes in the subset after the "Threshold" value is assigned to the "Ranker" method. Figure 2 shows both positive and negative ranked values. To remove the negative values, set Threshold = 0; this filters out the attributes that are negatively ranked. Finally, out of 18 attributes in the subset, only 10 attributes are chosen for the new subset after applying the "Threshold" value. After feature selection is complete, the new subset is given as input to the proposed classifier, SVM.

# V. Enhanced Support Vector Machine for Predicting Stress

This research work is carried out to enhance SVM for the accurate prediction of stress interruption. To reach this objective, SVM is enhanced with the RBF (Radial Basis Function) kernel function and by tuning the parameters of RBF. The RBF kernel works by mapping the data to a higher-dimensional feature space using an appropriate kernel function, and a maximum-margin separating hyperplane is then found in that feature space [15]. Accuracy is usually expressed as the proportion of correct classifications.

A soft margin can be obtained in two different ways: by adding a constant factor to the kernel function output whenever the given input vectors are identical, or by bounding the size of the weights. In either case, the magnitude of the constant factor added to the kernel, or the bound on the size of the weights, controls the number of training points that the system misclassifies. The setting of this parameter depends on the specific data at hand.
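
For orientation, the trade-off controlled by the soft-margin penalty is usually written as the optimization problem below. This is the standard textbook form, not a formula taken from the original study; the symbols $w$, $b$, $\xi_i$, $\phi$ and the labels $y_i \in \{-1, +1\}$ are assumptions introduced for illustration, and $C$ plays the role of the complexity parameter tuned in this work.

```latex
\min_{w,\, b,\, \xi} \quad \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\qquad \text{subject to} \qquad
y_i \bigl( w^{\top} \phi(x_i) + b \bigr) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n
```

Each slack variable $\xi_i$ measures how far training point $i$ violates the margin, so a larger $C$ makes violations more expensive and a smaller $C$ tolerates more misclassified training points.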
To completely specify the support vector machine, two parameters must be given: a) the kernel function and b) the magnitude of the penalty for violating the soft margin. Hence, to improve the accuracy of SVM, the RBF kernel function is applied in this research, since it gave the best results among the kernels evaluated. The next subsection describes the procedure of the Enhanced SVM methodology.

# a) Enhanced SVM Algorithm

The following algorithm lists the steps needed to improve the performance of the Support Vector Machine.

Step 1: Collect the stress dataset S
Step 2: Pre-process the data using "Discretize"
Step 3: Select the subset of attributes using "CorrelationAttributeEval" and the "Ranker" method
Step 4: Eliminate the minimum-ranked attributes by using "Threshold". Set Threshold = 0
Step 5: Update the subset after eliminating the minimum-ranked attributes
Step 6: Implement the Enhanced SVM classifier on the subset
Step 7: Tune the parameters of SVM
Step 7.1: Select the RBF (Radial Basis Function) kernel function
Step 7.2: Use the "Gamma" parameter. Set Gamma = 1
Step 7.3: Tune "Gamma" together with the complexity parameter "C". Set C = 10
Step 8: Evaluate the performance
Step 9: End

The proposed approach applies the RBF kernel function with the gamma factor and the complexity factor C in the Support Vector Machine algorithm. This parameter tuning helps to improve the efficiency of the Support Vector Machine algorithm in the proposed work.

# b) Kernel Function

Kernel functions are used to map the input data, linearly or nonlinearly, to a high-dimensional space (the feature space). The idea of the kernel function is to enable operations to be performed in the input space rather than in the potentially high-dimensional feature space. Hence the inner product does not need to be evaluated in the feature space. This research work chooses the RBF kernel function in SVM for searching values in the feature space.
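
The claim that the inner product never has to be evaluated in the feature space can be checked numerically. The sketch below uses a degree-2 polynomial kernel rather than RBF, purely because its feature map is small enough to write out by hand; the function names and the two sample vectors are illustrative assumptions.

```python
import math


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Degree-2 polynomial kernel, computed entirely in the 2-D input space."""
    return dot(x, y) ** 2


def poly2_feature_map(x):
    """Explicit 3-D feature map whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x = (1.0, 2.0)
y = (3.0, -1.0)

in_input_space = poly2_kernel(x, y)
in_feature_space = dot(poly2_feature_map(x), poly2_feature_map(y))
print(in_input_space, in_feature_space)
```

Both routes give the same number up to floating-point rounding, but the kernel route never builds the higher-dimensional vectors. For the RBF kernel the corresponding feature space is infinite-dimensional, so the kernel route is the only practical one.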
The RBF kernel on two samples x and x', represented as feature vectors in some input space, is defined as

$$K(x, x') = \exp\left(-\gamma \, \lVert x - x' \rVert^{2}\right)$$

where $\lVert x - x' \rVert^{2}$ is the squared Euclidean distance between the two data points x and x'. An SVM classifier using an RBF kernel has two parameters: gamma and C.

# c) Gamma Parameter

Gamma is a parameter of the RBF kernel and can be thought of as the 'spread' of the kernel and therefore of the decision region. When gamma is low, the 'curve' of the decision boundary is very low and thus the decision region is very broad. When gamma is high, the 'curve' of the decision boundary is high, which creates islands of decision boundaries around data points.

With a low gamma such as 0.01, the decision boundary is not very 'curvy'; rather, it is just one big sweeping arch. When gamma is increased to 1.0, the curve changes markedly and the decision boundary starts to cover the spread of the data better. The research therefore chooses 1.0 as the best Gamma value after experimenting with successive increments of the "Gamma" parameter.

# d) C-Complexity Parameter

The C parameter in a support vector machine trades off correct classification of training examples against maximization of the decision function's margin. The only thing that C changes is the penalty for misclassification. For larger values of C, a smaller margin is accepted if the decision function is better at classifying all training points correctly. Therefore, the complexity parameter is increased from 1 to 10 in this research work. When C = 1, the classifier is clearly tolerant of misclassified data points. When C = 10, the classifier is much less tolerant of misclassified data points.

From Table 3, it is observed that the accuracy increases up to a certain level of the Gamma factor and the complexity parameter. The most dangerous and most common effect of increasing the gamma parameter is overfitting. The experiment starts from Gamma = 0.01 with the complexity parameter C left unspecified.
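
The choice of Gamma and C can be replayed from the reported figures. The sketch below only looks up the best row among the values of Table 3 as transcribed here (the last row is read as C = 1); it does not retrain any classifier, and `TABLE_3`, `best_setting` and `accuracy_by_gamma` are hypothetical helper names.

```python
# Rows of Table 3 as transcribed: (gamma, C, accuracy in %, execution time in s).
TABLE_3 = [
    (2.0, 10, 92.76, 0.98),
    (1.0, 10, 96.33, 0.33),
    (0.9, 10, 91.0, 0.30),
    (0.07, 10, 90.1, 0.28),
    (0.05, 10, 88.19, 0.21),
    (0.01, 10, 82.13, 0.17),
    (0.01, 1, 62.01, 0.16),
]


def best_setting(rows):
    """Return the row with the highest reported accuracy."""
    return max(rows, key=lambda row: row[2])


def accuracy_by_gamma(rows, c_value):
    """List (gamma, accuracy) pairs for one C value, sorted by gamma."""
    return sorted((gamma, acc) for gamma, c, acc, _ in rows if c == c_value)


gamma, c, accuracy, seconds = best_setting(TABLE_3)
print(f"best: gamma={gamma}, C={c}, accuracy={accuracy}%, time={seconds}s")
print(accuracy_by_gamma(TABLE_3, 10))
```

Sorted by Gamma at C = 10, the reported accuracy rises up to Gamma = 1 and drops again at Gamma = 2, which is the pattern one would expect if larger Gamma values begin to overfit.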
With Gamma = 0.01 and C unspecified, the classifier produced low accuracy, although the time taken was also very low. To increase the accuracy and to penalize misclassified values, the complexity parameter C is set to 10 after experimenting with the C value. The accuracy is 82% when Gamma = 0.01 and C = 10, which is better than the result obtained without tuning C. The research work therefore increased the "Gamma" factor while keeping the "C" parameter constant. The highest accuracy (96%) is produced by the Enhanced SVM when Gamma = 1 and C = 10.

This study also compared the performance of the RBF kernel with the polynomial and linear kernel functions in terms of accuracy and execution time. This section implemented the parameter tuning in the Enhanced Support Vector Machine; its efficiency is measured by evaluating its performance against the existing methodologies SVM and KNN.

# VI. Performance Evaluation

For the experimental work, the open-source machine learning tool WEKA is used. Accuracy, Precision, Recall and F-Measure are used to evaluate the performance of the proposed machine learning algorithm, as discussed in detail in the research methodology.

# VII. Result and Discussion

Various experiments are conducted with the stress dataset to evaluate the performance of the proposed Enhanced Support Vector Machine algorithm. To assess its performance, the results are compared with those of the earlier techniques, i.e., SVM and KNN.

Figure 5 shows the precision rate of Enhanced SVM, KNN and SVM. The proposed SVM algorithm achieves a precision of 93%, which is higher than that of KNN (90%) and SVM (90%) on the stress dataset. Figure 7 summarizes the comparison of all the performance metrics used on the stress dataset. Among the different categories of machine learning algorithms, Enhanced SVM produces better results than existing machine learning algorithms such as SVM and KNN.

# VIII. Conclusion

In this research, an Enhanced SVM is proposed which improves the efficiency of the machine learning algorithm for the prediction of stress.
The performance of the Enhanced SVM is compared with the existing SVM and KNN methods. These techniques are studied and evaluated using the stress dataset. The analysis shows that, by tuning the RBF kernel with the Gamma and complexity parameters, Enhanced SVM can outperform KNN and earlier works. The proposed SVM algorithm achieves better accuracy, i.e., 96%, compared with KNN (91%) and SVM (92%) on the stress dataset, with minimum execution time. This research work also suggests that the evaluated Enhanced SVM classifier can be used for real-time prediction of stress, so that early-stage heart failure can be avoided. However, more training data, whether from hospitals or from domain experts, can be added to increase the prediction performance of the classifiers.

![Figure 2: Ranking for Attribute](image-2.png "Figure 2 :")

Table 1: Performance of existing machine learning techniques

| Classifier | Accuracy | Precision | Recall |
|---|---|---|---|
| Bayes Net | 88.59% | 0.824 | 0.834 |
| Multilayer perceptron | 85.43% | 0.836 | 0.867 |
| Naive Bayes | 84.2105% | 0.717 | 0.890 |
| Logistic regression | 84.9649% | 0.824 | 0.838 |
| J48 | 86.42% | 0.871 | 0.879 |
| Random Forest | 83.333% | 0.833 | 0.825 |

[Figure 1, diagram text: Stress Dataset; Preprocessing (Discretization); Feature Selection (1. Attribute Evaluator: CorrelationAttributeEval, 2. Search Method: Ranker); Enhanced SVM Classifier (Kernel Function: RBF kernel; Parameters: Gamma and Complexity parameter); Performance Evaluation (Accuracy, Precision, Recall, F-measure)]

Table 3: Accuracy and execution time for different Gamma and Complexity parameter values

| S. No. | Gamma value | Complexity parameter | Accuracy | Execution Time (in seconds) |
|---|---|---|---|---|
| 1 | 2 | 10 | 92.76 | 0.98 |
| 2 | 1 | 10 | 96.33 | 0.33 |
| 3 | 0.9 | 10 | 91 | 0.30 |
| 4 | 0.07 | 10 | 90.1 | 0.28 |
| 5 | 0.05 | 10 | 88.19 | 0.21 |
| 6 | 0.01 | 10 | 82.13 | 0.17 |
| 7 | 0.01 | 1 | 62.01 | 0.16 |

Table 4: Comparison of kernel functions

| Kernel function | Accuracy (%) | Execution Time (in seconds) |
|---|---|---|
| RBF Kernel | 96.33 | 0.33 |
| Polynomial Kernel | 91.69 | 0.71 |
| Linear Kernel | 85 | 0.32 |

It is observed from Table 4 that the performance of SVM with the RBF kernel is higher than that of the polynomial and linear kernels in the prediction of stress. The SVM with the RBF kernel produced 96% accuracy, compared with 92% for the polynomial kernel.
Table 5: Stress dataset

| S. No. | Techniques | Accuracy | Precision | Recall |
|---|---|---|---|---|
| 1 | Enhanced SVM | 96.33% | 92.63% | 90.26% |
| 2 | SVM | 91.69% | 89.96% | 88.25% |
| 3 | KNN | 90.78% | 89.68% | 87.21% |

© 2020 Global Journals

# References

* Kiyong Noh, Heongyu Lee, Ho-Sun Shon, Bum Ju Lee, Keun Ho Ryu. Associative Classification Approach for Diagnosing Cardiovascular Disease. Springer, 2006, 345.
* Hongyu Lee, Ki Yong Noh, Keun Ho Ryu. Mining Biosignal Data: Coronary Artery Disease Diagnosis using Linear and Nonlinear Features of HRV. LNAI 4819: Emerging Technologies in Knowledge Discovery and Data Mining, May 2007.
* Niti Guru, Anil Dahiya. Decision Support System for Heart Disease Diagnosis Using Neural Network. Delhi Business Review, 8(1), January-June 2007.
* Hai Wang. Medical Knowledge Acquisition through Data Mining. Proceedings of the 2008 IEEE International Symposium on IT in Medicine and Education, 978-1-4244-2511-2/08 ©2008 Crown.
* Sellappan Palaniappan, Rafiah Awang. Intelligent Heart Disease Prediction System Using Data Mining Techniques. IJCSNS, 8(8), August 2008.
* Latha Parthiban, R. Subramanian. Intelligent Heart Disease Prediction System using CANFIS and Genetic Algorithm. International Journal of Biological, Biomedical and Medical Sciences, 3(3), 2008.
* Chaitrali S. Dangare, Sulabha S. Apte. Improved Study of Heart Disease Prediction System using Data Mining Classification Techniques. International Journal of Computer Applications (0975-888), 47(10), June 2012.
* S. Vijiyarani. An Efficient Classification Tree Technique for Heart Disease Prediction. International Conference on Research Trends in Computer Technologies (ICRTCT-2013), Proceedings published in International Journal of Computer Applications (IJCA).
* Latha Parthiban, R. Subramanian. Intelligent heart disease prediction system using CANFIS and genetic algorithm. International Journal of Biological, Biomedical and Medical Sciences, 3(3), 2008.
* Martin Gjoreski, Anton Gradišek, Matjaž Gams, Monika Simjanoska, Ana Peterlin, Gregor Poglajen. Chronic Heart Failure Detection from Heart Sounds Using a Stack of Machine-Learning Classifiers. 13th International IEEE Conference on Intelligent Environments, 2017.
* Ketut Agung Enriko, Muhammad Suryanegara, Dadang Gunawan, et al. Heart Disease Diagnosis System with k-Nearest Neighbors Method Using Real Clinical Medical Records. 4th International Conference, June 2018.
* Abdallah Kassem, Mustapha Hamad, Chady El Moucary, Elie Fayad. A Smart Device for the Detection of Heart Abnormality using R-R Interval. 28th IEEE International Conference on Microelectronics (ICM), 2016.
* Madhura Patil, Rima Jadhav, Vishakha Patil, Aditi Bhawar, Geeta Chillarge. Prediction and Analysis of Heart Disease Using SVM Algorithm. International Journal for Research in Applied Science & Engineering Technology, 7, January 2019.
* Sheik Abdullah, Rajalaxmi. A Data mining Model for Predicting the Coronary Heart Disease Using Random Forest Classifier. International Journal of Computer Applications, 2019.
* Jagdeep Singh, Amit Kamra, Harbhag Singh. Prediction of Heart Diseases Using Associative Classification. 5th International Conference on Wireless Networks and Embedded System, 2016.
* Tahira Mahboob, Rida Irfan, Bazelah Ghaffar. Evaluating Ensemble Prediction of Coronary Heart Disease using Receiver Operating Characteristics. IEEE Internet Technologies and Application, 2017.
* V. Balasundar, T. Devi, N. Saravan. Development of a Data Clustering Algorithm for Predicting Heart. International Journal of Computer Applications, 48, 2012.
* Sayali D. Jadhav, H. P. Channe. Comparative Study of KNN, Naive Bayes and Decision Tree Classification Techniques. International Journal of Science and Research, 5(1), 2016. Paper ID: NOV153131.
* P. Sai Chandrasekhar Reddy, Puneet Palagi, Jaya. Heart Disease Prediction Using ANN Algorithm in Data Mining. IJCSMC, 6, 2016.
* Dursun Delen, Asil Oztekin, Leman Tomak. An analytic approach to better understanding and management of coronary surgeries. Decision Support Systems, 52, 2012.
* David L. Olson, Dursun Delen, Yanyan Meng. Comparative analysis of data mining methods for bankruptcy prediction. Decision Support Systems, 52, 2012.
* Akhilesh Kumar Yadav, Divya Tomar, Sonali Agarwal. Clustering of Lung Cancer Data Using Foggy K-Means. International Conference on Recent Trends in Information Technology (ICRTIT), 2013, 21.
* Reshma, Supriya Kinariwala. Detection and Analysis of Stress using Machine Learning Techniques. International Journal of Engineering and Advanced Technology (IJEAT), ISSN 2249-8958, 9.