Abstract-Soft computing (SC) techniques support a wide variety of applications in data processing, analysis, and interpretation. SC plays a key role in the geosciences because of the immense size of the data and the uncertainty associated with it. SC assists the oil industry in oil exploration and in the optimization of oil wells. The oil industry has changed significantly owing to complex techniques and modern equipment. Intelligent systems such as neurocomputing and artificial intelligence are available for oil exploration, and popular evolutionary algorithms offer effective methods for optimizing oil wells, but the processing of vague data creates problems for the existing techniques. Uncertain data plays a crucial role in making vital decisions. This research uses seismic data in the process of oil exploration and compares the proposed method with existing methods; the results favor the proposed method.

# I. Introduction

The process of drilling into the Earth to bring petroleum hydrocarbons to the surface produces what is termed an oil well. Modern directional drilling tools allow for strongly deviated wells that, given sufficient depth and the proper equipment, can actually become horizontal. This is of great value because the reservoir rocks that contain hydrocarbons are usually horizontal; a horizontal wellbore placed in a production zone has more surface area in the zone than a vertical well, which increases the production rate. Deviated and horizontal drilling also allows the production team to reach reservoirs at some distance from the drilling location, so that hydrocarbons can be produced from beneath locations where, depending on the environment, it is difficult to place a rig.

The introduction of data mining (DM) in computer science in the late 1980s led to much research in data analysis and to the development of many statistical and data-crunching tools. The massive growth of data in all kinds of business created a need for data management and data warehousing applications. The growth of data mining over the last three decades can be divided into three parts. In the first, programmers developed machine learning algorithms that train machines to handle huge amounts of data and generate patterns from it. In the second, many business people recognized the applications of DM tools, began implementing them in their businesses, and made decisions based on them. In the third, DM tools were applied to the web, which is a huge database of semi-structured data. Web mining is a concept derived from DM and has been used successfully for web management. DM is a domain-expertise tool and is considered one of the pioneering applications of SC. Continuing research indicates that decision tree algorithms produce the best results among data mining algorithms.

The remainder of the paper reviews the literature on oil exploration and then presents the results and a discussion of the methods employed in this research.

# II. Review of Literature

In [1], recent applications of soft computing in oil exploration are reviewed. The study highlights artificial neural networks (ANN), fuzzy logic, probabilistic reasoning, and Bayesian belief networks (BBN). ANN can handle both linear and non-linear problems and is therefore well suited to oil exploration. Fuzzy logic can manipulate symbolic information more effectively than other methods; it has been employed for seismic data interpretation and for lithology identification in oil reservoirs. Data in oil exploration are dynamic and uncertain, and probabilistic reasoning is used to handle uncertainty in decision making. BBN is suitable for causal rules and represents probabilistic relationships. The review also explains the activities involved in oil exploration, such as data acquisition, pattern recognition, and prediction.

In [2], research on oil exploration using big data is presented. Seismic data management and analysis were the tasks optimized by the methods used in that research. The SEMMA process was used to disclose patterns hidden in the large volume of data. Oil exploration through data analysis is the paramount task. Insight, prediction, and optimization are the big data processes used to explore for oil in the huge amount of data.

In [3], big data analysis is applied to safety mechanisms and to real-time analysis of oil exploration. The research employed business intelligence tools, data warehouses, and other transactional applications, and produced better results than existing methods. A Hadoop system was used to derive results from website logs and complex databases. The system performed real-time analytics and made recommendations using big data and Hadoop.

In [4], the overall maintenance of the oil industry is discussed. The paper describes the activities involved in equipment maintenance: collecting data from pumps and wells, then adjusting the repair schedule to prevent failures. Big data was employed to optimize production volumes.

# III. Results and Discussion

Implementing a data analytics and prediction tool is a complex task because of scalability and time complexity. The proposed method and the other methods used in this research were implemented in Java on an i7 processor. K-means, Naïve Bayes, K-NN, and SVM are the methods compared with the proposed J48 method. The algorithms were taken from Google algorithms, and the dataset for the experiment was downloaded from international well data (www.ihs.com). Well data from two locations, Saudi Arabia and Canada, were used to show the ability of the proposed method.
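To make the decision tree approach behind J48 more concrete, the following is a minimal sketch of how an ID3-style learner scores a candidate split by information gain. It is not the implementation used in this research: the function names (`entropy`, `information_gain`), the thresholds, and the toy values are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric attribute at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(labels) - weighted


# Hypothetical toy data: hydrocarbon percentage per site and whether oil was found.
hydrocarbon_pct = [12.0, 15.0, 31.0, 35.0, 40.0, 44.0]
oil_found = ["no", "no", "no", "yes", "yes", "yes"]

# A threshold that separates the classes cleanly scores higher than one that does not.
gain_good = information_gain(hydrocarbon_pct, oil_found, threshold=33.0)
gain_poor = information_gain(hydrocarbon_pct, oil_found, threshold=13.0)
```

A tree learner would evaluate such gains over every attribute and candidate threshold and split on the best one, then recurse on each side. C4.5-style learners refine this by normalizing the gain by the split information (the gain ratio).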
Machine learning and automated tools need training before they can generate results. During the training phase, selected data from the dataset were therefore given to the methods so that they could learn the environment. During the testing phase, performance was evaluated by measuring the time taken. Tables 1 and 2 show the training-phase data, and Figures 1 and 2 show the corresponding graphs.

J48 is the Weka implementation of the C4.5 decision tree classification algorithm, a successor of ID3 (Iterative Dichotomiser 3). It showed better training time than the other methods and can produce results in a short time with improved performance. Saudi Arabia is the world's largest oil producer and Canada is the fifth largest. The Saudi Arabia location has more oil wells than the Canada location. The training-phase data show that the employed methods took more time to learn the scenario. The attributes used for training were seismic data, percentage of hydrocarbon, distance from ground, and the latitude and longitude of the oil well. The trained attributes are vital for predicting the location of an oil well.

Tables 3 and 4 show the testing time of the methods; the proposed method has better performance. Figures 3 and 4 show the graphs of the testing time of the methods for the two locations.

# IV. Conclusion

SC is a combination of machine learning algorithms employed to develop applications for real-world problems. Data mining algorithms have been implemented successfully in all kinds of business to support decisions in complex situations. Oil exploration is a complex problem in which the data are vague and information is difficult to derive; the proposed method achieved an average accuracy of 92% on the dataset employed in this research. J48 achieved better accuracy and shorter time than the other methods employed in the research. Future work will extend the study to other regions of the world.

![Figure 1: Life cycle of oil](image-2.png "Figure 1")

![Figure 1: Training Time (in seconds) for the location in Saudi Arabia](image-3.png "Figure 1")

![Figure 2: Training Time (in seconds) for the location in Canada](image-4.png "Figure 2")

![Figure 3: Testing Time (in seconds) for the location in Saudi Arabia](image-5.png "Figure 3")

![Figure 4: Testing Time (in seconds) for the location in Canada](image-7.png "Figure 4")

![Figure 5: Accuracy of results (in percentage)](image-6.png "Figure 5")

Table 1: Training Time (in seconds) for the location in Saudi Arabia

| Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
|---|---|---|---|---|
| J48 | 0.178 | 0.261 | 0.272 | 0.314 |
| K-Means | 0.214 | 0.291 | 0.283 | 0.364 |
| Naïve Bayes | 0.192 | 0.242 | 0.286 | 0.412 |
| K-NN | 0.191 | 0.260 | 0.281 | 0.292 |
| SVM | 0.184 | 0.312 | 0.317 | 0.319 |

Table 2: Training Time (in seconds) for the location in Canada

| Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
|---|---|---|---|---|
| J48 | 0.098 | 0.101 | 0.085 | 0.145 |
| K-Means | 0.125 | 0.154 | 0.114 | 0.189 |
| Naïve Bayes | 0.137 | 0.138 | 0.142 | 0.189 |
| K-NN | 0.115 | 0.119 | 0.121 | 0.162 |
| SVM | 0.099 | 0.128 | 0.126 | 0.149 |

Table 3: Testing Time (in seconds) for the location in Saudi Arabia

| Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
|---|---|---|---|---|
| J48 | 0.091 | 0.101 | 0.160 | 0.147 |
| K-Means | 0.112 | 0.132 | 0.192 | 0.241 |
| Naïve Bayes | 0.101 | 0.212 | 0.242 | 0.312 |
| K-NN | 0.098 | 0.174 | 0.174 | 0.180 |
| SVM | 0.094 | 0.118 | 0.181 | 0.174 |

Table 4: Testing Time (in seconds) for the location in Canada

Table 5: Accuracy of results (in percentage)

| Methods | Saudi Arabia | Canada |
|---|---|---|
| J48 | 95 | 97 |
| K-Means | 91 | 92 |
| Naïve Bayes | 90 | 90 |
| K-NN | 93 | 92 |
| SVM | 93 | 95 |

© 2017 Global Journals Inc. (US)
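Tables 1 to 4 report training and testing times in seconds. As a rough sketch of how such timings might be collected, the harness below times the training and testing phases of a classifier separately and also reports accuracy. The `MajorityClassifier` stand-in, the `timed_phases` helper, and the toy records are hypothetical; they are not the methods or data used in this research.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, rows, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        return [self.label for _ in rows]


def timed_phases(model, train_rows, train_labels, test_rows, test_labels):
    """Return (training seconds, testing seconds, accuracy in percent)."""
    start = time.perf_counter()
    model.fit(train_rows, train_labels)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = model.predict(test_rows)
    test_seconds = time.perf_counter() - start

    correct = sum(p == y for p, y in zip(predictions, test_labels))
    accuracy = 100.0 * correct / len(test_labels)
    return train_seconds, test_seconds, accuracy


# Hypothetical records: (seismic value, hydrocarbon %, distance from ground) plus a label.
train_rows = [(0.4, 12.0, 900.0), (0.7, 35.0, 1500.0), (0.8, 41.0, 1700.0)]
train_labels = ["no", "yes", "yes"]
test_rows = [(0.6, 30.0, 1400.0), (0.3, 10.0, 800.0)]
test_labels = ["yes", "no"]

train_s, test_s, acc = timed_phases(
    MajorityClassifier(), train_rows, train_labels, test_rows, test_labels
)
```

Any model exposing `fit` and `predict` could be passed in place of the stand-in. Because sub-second timings are noisy, averaging over repeated runs would give steadier figures than a single measurement.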
* M. S. Chen, J. W. Han, and P. S. Yu. Data Mining: An Overview from a Database Perspective. IEEE Transactions on Knowledge and Data Engineering, 8(6), December 1996.
* P. Berkhin. Survey of clustering data mining techniques. Technical Report, Accrue Software, 2002.
* E. K. Biegert. From black magic to swarms: hydrocarbon exploration using non-seismic technologies. EGM 2007 international workshop: innovation in EM, grav and mag methods: a new perspective for exploration, Capri, Italy, 2007.
* C. M. Bishop. Neural networks for pattern recognition. Oxford University Press, Oxford, 1999.
* G. Biswas, J. B. Weinberg, and D. H. Fisher. ITERATE: a conceptual clustering algorithm for data mining. IEEE Trans Syst Man Cybern Part C Appl Rev, 28, 1998.
* J. H. Bodine. Waveform analysis with seismic attributes. Oil Gas J, 84, 1984.
* R. D. Bott. Evolution of Canada's oil and gas industry. Canadian Center for Energy Information, Canada, 2004.
* Jiawei Han and Micheline Kamber. Data Mining: Concepts and Techniques. China Machine Press, 2006.
* Y. Y. Yao, N. Zhong, and Y. Zhao. A Conceptual Framework of Data Mining. Studies in Computational Intelligence (SCI), 118, 2008.
* Y. Y. Yao, N. Zhong, and Y. Zhao. A Three-layered Conceptual Framework of Data Mining.
* Y. Y. Yao. A Step Towards the Foundations of Data Mining. In B. V. Dasarathy (ed.), Data Mining and Knowledge Discovery: Theory, Tools, and Technology V, 2003.
* H. Qu, W. Z. Zhao, and S. Y. Hu. Oil & Gas Resources Status and the Exploration Fields in China. China Petroleum Exploration, 4, 2006.
* J. P. Pan and Z. J. Jin. Potentials of petroleum resources and exploration strategy in China. Acta Petrolei Sinica, 25, 2004.
* M. Stundner and J. S. Al-Thuwaini. How Data-Driven Modeling Methods Like Neural Networks can Help to Integrate Different Types of Data into Reservoir Management. 68163, 2001.
* Y. D. Cai, J. W. Gong, I. R. Gan, and L. S. Yao. Hydrocarbon reservoir prediction using artificial nerve network method. Oil Geophys Prospect, 28, 1993.
* G. Camps-Valls, L. Gomez-Chova, J. Calpe-Maravilla, E. Soria-Olivas, J. D. Martín-Guerrero, and J. Moreno. Support vector machines for crop classification using hyper spectral data. LNCS 2652, Berlin, 2003.
* D. Chakrabarti and C. Faloutsos. Graph mining: laws, generators and algorithms. ACM Comput Surv, 38(2), 2006.
* M. Chandra, A. K. Srivastava, V. Singh, D. N. Tiwari, and P. K. Painuly. Lithostratigraphic interpretation of seismic data for reservoir characterization. AAPG international conference, Barcelona, 2003.
* O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee. Choosing multiple parameters for support vector machines. Mach Learn, 46, 2002.