# I. Introduction

Drilling into the Earth brings petroleum hydrocarbons to the surface; the resulting borehole is termed an oil well.

Modern directional drilling tools allow for strongly deviated wells that, given sufficient depth and the proper equipment, can become horizontal. This is of great value because the reservoir rocks that contain hydrocarbons are usually horizontal; a horizontal wellbore placed in a production zone has more surface area in contact with the zone than a vertical well, which increases the production rate. Deviated and horizontal drilling tools also allow the production team to reach reservoirs some distance away from the drilling location, so that hydrocarbons can be produced from beneath sites where it is difficult to place a rig because of the environment.

The introduction of data mining (DM) in computer science in the late 1980s prompted much research in data analysis and led to many statistical and data-processing tools. The massive growth of data in all kinds of business has made data management and data warehousing applications necessary. The growth of data mining over the last three decades can be divided into three phases. In the first phase, programmers developed machine learning algorithms that train machines to handle huge amounts of data and extract patterns from it. In the second phase, businesses recognized the value of DM tools, adopted them, and began basing decisions on their output. In the third phase, DM tools were applied to the web, which is a huge database of semi-structured data; web mining is derived from DM and has been used successfully for web management. DM relies on domain expertise and is considered one of the pioneering applications of soft computing (SC). Ongoing research indicates that decision tree algorithms produce the best results among DM algorithms.

The remainder of the paper reviews the literature on oil exploration and then presents and discusses the results of the methods employed in this research.


# II. Review of Literature

Reference [1] reviews recent applications of soft computing in oil exploration. The study highlights artificial neural networks (ANN), fuzzy logic, probabilistic reasoning, and Bayesian belief networks (BBN). ANNs can handle both linear and non-linear problems, which makes them well suited to oil exploration. Fuzzy logic can manipulate symbolic information more effectively than other methods and has been employed for seismic data interpretation and oil reservoir lithology identification. Data in oil exploration are dynamic and uncertain, and probabilistic reasoning is used to handle that uncertainty in decision making. BBNs are suitable for causal rules and represent probabilistic relationships. The review also explains the activities involved in oil exploration, such as data acquisition, pattern recognition, and prediction.
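As a small illustration of the probabilistic reasoning described above, the following sketch applies Bayes' rule to a single piece of evidence. It is not taken from [1]; the function name and all probabilities are hypothetical values chosen only to show the arithmetic.

```python
def posterior(prior, p_evidence_given_oil, p_evidence_given_no_oil):
    """Bayes' rule: probability of oil after observing one piece of evidence."""
    evidence = (p_evidence_given_oil * prior
                + p_evidence_given_no_oil * (1.0 - prior))
    return p_evidence_given_oil * prior / evidence


# Hypothetical numbers: 20% of prospects hold oil; a seismic anomaly appears
# over 80% of oil-bearing prospects and over 10% of dry ones.
p_oil_given_anomaly = posterior(0.2, 0.8, 0.1)
print(f"P(oil | anomaly) = {p_oil_given_anomaly:.3f}")
```

With these assumed inputs the belief in oil rises from 0.2 to 2/3 once the anomaly is observed, which is the kind of update a Bayesian belief network chains across many variables.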

Reference [2] studies oil exploration using big data. The methods used in that research optimize seismic data management and analysis. The SEMMA process is used to disclose patterns hidden in large volumes of data. Data analysis is a paramount task in oil exploration. Insight, prediction, and optimization are the big data processes used to explore for oil in huge amounts of data.

Abstract: Soft computing (SC) techniques provide a wide variety of applications in data processing, analysis, and interpretation. SC plays a key role in the geosciences because of the immense size and uncertainty of the data. The nature of SC assists the oil industry in oil exploration and in the optimization of oil wells. The oil industry has changed significantly because of complex techniques and modern equipment. Intelligent systems such as neurocomputing and artificial intelligence are available for oil exploration, and popular evolutionary algorithms offer effective methods for optimizing oil wells, but processing vague data creates problems for the existing techniques. Uncertain data play a crucial role in making vital decisions. This research uses seismic data for oil exploration and compares the proposed method with existing methods; the results favor the proposed method.

Reference [3] applies big data analysis to safety mechanisms and real-time analysis in oil exploration. The research employed business intelligence tools, data warehouses, and other transactional applications, and reported better results than existing methods. A Hadoop system was used to derive results from website logs and complex databases, and the system produced real-time analytics and recommendations using big data and Hadoop.

Reference [4] addresses the overall maintenance of the oil industry. The paper describes equipment maintenance activities: data are collected from pumps and wells, and the repair schedule is adjusted accordingly to prevent failures. Big data is also employed to optimize production volumes.


# III. Results and Discussion

Implementing a data analytics and prediction tool is a complex task because of scalability and time complexity. The proposed method and the other methods used in this research were implemented in Java and run on an i7 processor. K-means, Naïve Bayes, K-NN, and SVM are compared with the proposed J48 method. The algorithms were taken from Google algorithms, and the dataset for the experiment was downloaded from international well data (www.ihs.com). We used well data from two locations, Saudi Arabia and Canada, to demonstrate the ability of the proposed method. Machine learning and automated tools need training to generate results; therefore, during the training phase, selected data from the dataset are given to the methods so that they learn the environment. During the testing phase, performance is evaluated by measuring the time taken. Tables 1 and 2 show the training-phase data, and Figures 1 and 2 show the corresponding graphs. The testing times of the methods show that the proposed method performs better. Figures 3 and 4 show the testing times of the methods for the two locations.
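The training and testing phases described above can be sketched as a small timing harness. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper used Java, while the sketch is in Python; a nearest-centroid classifier stands in for the compared methods; and the feature rows and labels are made up.

```python
import time


def train_centroids(rows):
    """Training phase: average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Testing phase: return the label of the closest class centroid."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical rows: (seismic attribute, hydrocarbon %, depth in km) -> label
train_rows = [
    ((0.9, 62.0, 2.1), "oil"),
    ((0.8, 58.0, 2.4), "oil"),
    ((0.2, 5.0, 1.0), "dry"),
    ((0.3, 9.0, 1.3), "dry"),
]
test_rows = [((0.85, 60.0, 2.2), "oil"), ((0.25, 7.0, 1.1), "dry")]

model, train_seconds = timed(train_centroids, train_rows)
predictions, test_seconds = timed(
    lambda: [predict(model, features) for features, _ in test_rows]
)
correct = sum(p == label for p, (_, label) in zip(predictions, test_rows))
accuracy = 100.0 * correct / len(test_rows)
print(f"train {train_seconds:.6f} s, test {test_seconds:.6f} s, accuracy {accuracy:.0f}%")
```

Timing each phase separately mirrors the split between the training-time and testing-time tables; on a real dataset the measurement would normally be repeated and averaged, since a single run is noisy.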


# IV. Conclusion

SC is a combination of machine learning algorithms employed to develop applications for real-world problems. Data mining algorithms have been implemented successfully in all kinds of business to support decisions in complex situations. Oil exploration is a complex problem in which the data are vague and information is difficult to derive; the proposed method achieved an average accuracy of 92% on the dataset employed in this research. J48 achieved better accuracy and shorter time than the other methods employed in the research. Future work will extend the study to other regions of the world.

![Figure 1: Life cycle of oil](image-2.png "Figure 1")

![Figure 1: Training Time (in seconds) for the location in Saudi Arabia](image-3.png "Figure 1")

![Figure 2: Training Time (in seconds) for the location in Canada](image-4.png "Figure 2")

![Figure 3: Testing Time (in seconds) for the location in Saudi Arabia](image-5.png "Figure 3")

Table 4: Testing Time (in seconds) for the location in Canada

![Figure 4: Testing Time (in seconds) for the location in Canada](image-7.png "Figure 4")

![Figure 5: Accuracy of results (in percentage)](image-6.png "Figure 5")

J48 is an implementation of the C4.5 decision tree classification algorithm, the successor of ID3 (Iterative Dichotomiser 3). It has shown better training time than the other methods and can produce results in a short time with improved performance. Saudi Arabia is the largest oil producer in the world and Canada is the fifth. Saudi Arabia has more oil wells than Canada. The training-phase data show that the employed methods took more time to learn the scenario. The attributes used for training were seismic data, percentage of hydrocarbon, distance from ground, and the latitude and longitude of the oil well; these attributes are vital for predicting the location of an oil well. Tables 3 and 4 show the testing times.
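To make the decision tree idea behind J48 concrete, the sketch below computes the gain ratio, the splitting criterion used by C4.5-style trees, for two categorical attributes. It is an illustration only: the rows are hypothetical, the attribute names are assumptions, and the paper's numeric attributes are replaced here by simple categories, whereas C4.5 handles numeric attributes by searching for thresholds.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` divided by its split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical, hand-made rows (not from the paper's dataset).
wells = [
    {"seismic": "strong", "depth": "deep", "oil": "yes"},
    {"seismic": "strong", "depth": "shallow", "oil": "yes"},
    {"seismic": "weak", "depth": "deep", "oil": "no"},
    {"seismic": "weak", "depth": "shallow", "oil": "no"},
]

ratios = {name: gain_ratio(wells, name, "oil") for name in ("seismic", "depth")}
best_attribute = max(ratios, key=ratios.get)
print(ratios, best_attribute)
```

In this toy data the seismic attribute separates the classes perfectly while depth carries no information, so a tree builder would split on the seismic attribute first.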
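
Table 1: Training Time (in seconds) for the location in Saudi Arabia

| Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
|---|---|---|---|---|
| J48 | 0.178 | 0.261 | 0.272 | 0.314 |
| K-Means | 0.214 | 0.291 | 0.283 | 0.364 |
| Naïve Bayes | 0.192 | 0.242 | 0.286 | 0.412 |
| K-NN | 0.191 | 0.260 | 0.281 | 0.292 |
| SVM | 0.184 | 0.312 | 0.317 | 0.319 |

Table 2: Training Time (in seconds) for the location in Canada

| Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
|---|---|---|---|---|
| J48 | 0.098 | 0.101 | 0.085 | 0.145 |
| K-Means | 0.125 | 0.154 | 0.114 | 0.189 |
| Naïve Bayes | 0.137 | 0.138 | 0.142 | 0.189 |
| K-NN | 0.115 | 0.119 | 0.121 | 0.162 |
| SVM | 0.099 | 0.128 | 0.126 | 0.149 |

Table 3: Testing Time (in seconds) for the location in Saudi Arabia

| Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
|---|---|---|---|---|
| J48 | 0.091 | 0.101 | 0.160 | 0.147 |
| K-Means | 0.112 | 0.132 | 0.192 | 0.241 |
| Naïve Bayes | 0.101 | 0.212 | 0.242 | 0.312 |
| K-NN | 0.098 | 0.174 | 0.174 | 0.180 |
| SVM | 0.094 | 0.118 | 0.181 | 0.174 |

Table 5: Accuracy of results (in percentage)

| Methods | Saudi Arabia | Canada |
|---|---|---|
| J48 | 95 | 97 |
| K-Means | 91 | 92 |
| Naïve Bayes | 90 | 90 |
| K-NN | 93 | 92 |
| SVM | 93 | 95 |
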
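The accuracy figures in Table 5 can be summarized with a few lines of arithmetic. The sketch below transcribes the table values and computes the mean accuracy per method across the two locations; the variable names are illustrative, and the unweighted mean is an assumption, since the paper does not state how its average was formed.

```python
# Accuracy (in percent) transcribed from Table 5.
accuracy = {
    "J48": {"Saudi Arabia": 95, "Canada": 97},
    "K-Means": {"Saudi Arabia": 91, "Canada": 92},
    "Naive Bayes": {"Saudi Arabia": 90, "Canada": 90},
    "K-NN": {"Saudi Arabia": 93, "Canada": 92},
    "SVM": {"Saudi Arabia": 93, "Canada": 95},
}

# Unweighted mean over the two locations for each method.
mean_by_method = {
    method: sum(scores.values()) / len(scores) for method, scores in accuracy.items()
}
best_method = max(mean_by_method, key=mean_by_method.get)
overall_mean = sum(mean_by_method.values()) / len(mean_by_method)

for method, mean in sorted(mean_by_method.items(), key=lambda item: -item[1]):
    print(f"{method:12s} {mean:.1f}")
print(f"best: {best_method}, mean over all methods: {overall_mean:.1f}")
```

Under this unweighted averaging, J48 has the highest mean accuracy (96.0), and the mean over all five methods is 92.8.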
			© 2017 Global Journals Inc. (US) 1
		
		
* Chen, M. S., Han, J. W., Yu, P. S. Data Mining: An Overview from a Database Perspective. IEEE Transactions on Knowledge and Data Engineering, 8(6), December 1996.
* Air Resources Board, California Environmental Protection Agency. "... treatment tanks." Report.
* Berkhin, P. Survey of clustering data mining techniques. Technical Report, Accrue Software, 2002.
* Biegert, E. K. From black magic to swarms: hydrocarbon exploration using non-seismic technologies. EGM 2007 International Workshop: Innovation in EM, Grav and Mag Methods: A New Perspective for Exploration, Capri, Italy, 2007.
* Bishop, C. M. Neural networks for pattern recognition. Oxford University Press, Oxford, 1999.
* Biswas, G., Weinberg, J. B., Fisher, D. H. ITERATE: a conceptual clustering algorithm for data mining. IEEE Trans Syst Man Cybern Part C Appl Rev, 28, 1998.
* Bodine, J. H. Waveform analysis with seismic attributes. Oil Gas J, 84, 1984.
* Bott, R. D. Evolution of Canada's oil and gas industry. Canadian Center for Energy Information, Canada, 2004.
* Han, Jiawei, Kamber, Micheline. Data Mining: Concepts and Techniques. China Machine Press, 2006.
* Yao, Y. Y., Zhong, N., Zhao, Y. A Conceptual Framework of Data Mining. Studies in Computational Intelligence (SCI), 118, 2008.
* Yao, Y. Y., Zhong, N., Zhao, Y. A Three-layered Conceptual Framework of Data Mining.
* Yao, Y. Y., Dasarathy, B. V. A Step Towards the Foundations of Data Mining. Data Mining and Knowledge Discovery: Theory, Tools, and Technology V, 2003.
* Qu, H., Zhao, W. Z., Hu, S. Y. Oil & Gas Resources Status and the Exploration Fields in China. China Petroleum Exploration, 4, 2006.
* Pan, J. P., Jin, Z. J. Potentials of petroleum resources and exploration strategy in China. Acta Petrolei Sinica, 25, 2004.
* Stundner, M., Al-Thuwaini, J. S. How Data-Driven Modeling Methods Like Neural Networks can Help to Integrate Different Types of Data into Reservoir Management. 68163, 2001.
* Cai, Y. D., Gong, J. W., Gan, I. R., Yao, L. S. Hydrocarbon reservoir prediction using artificial nerve network method. Oil Geophys Prospect, 28, 1993.
* Camps-Valls, G., Gomez-Chova, L., Calpe-Maravilla, J., Soria-Olivas, E., Martín-Guerrero, J. D., Moreno, J. Support vector machines for crop classification using hyperspectral data. LNCS 2652, Berlin, 2003.
* Chakrabarti, D., Faloutsos, C. Graph mining: laws, generators and algorithms. ACM Comput Surv, 38(2), 2006.
* Chandra, M., Srivastava, A. K., Singh, V., Tiwari, D. N., Painuly, P. K. Lithostratigraphic interpretation of seismic data for reservoir characterization. AAPG International Conference, Barcelona, 2003.
* Chapelle, O., Vapnik, V., Bousquet, O., Mukherjee, S. Choosing multiple parameters for support vector machines. Mach Learn, 46, 2002.