The process of drilling into the Earth brings petroleum hydrocarbons to the surface, and the resulting well is termed an oil well.
Modern directional drilling tools allow for strongly deviated wells that, given sufficient depth and the proper equipment, can actually become horizontal. This is of great value because the reservoir rocks that contain hydrocarbons are usually horizontal; a horizontal wellbore placed in a production zone has more surface area in contact with the zone than a vertical well, which increases the production rate. Deviated and horizontal drilling tools also allow the production team to reach reservoirs some distance away from the drilling location, so that hydrocarbons can be produced from beneath sites where it is difficult to place a rig, depending on the environment.
The introduction of data mining (DM) in the field of computer science in the late 1980s led to a great deal of research in data analysis and to the development of many statistical and data-crunching tools. The massive growth of data in all kinds of business created a need for data management and data warehousing applications. The growth of data mining over the last three decades can be divided into three parts. In the first part, programmers developed machine learning algorithms that train machines to handle huge amounts of data and to generate patterns from it. In the second part, many business people recognized the value of DM tools, started applying them in their businesses, and made decisions based on the results. In the third part, the web emerged as a huge database of semi-structured data, and DM tools were applied to study it. Web mining is a concept derived from DM and has been used successfully for web management. DM is a domain-expertise tool and is considered one of the pioneering applications of soft computing (SC). Ongoing research indicates that decision tree data mining algorithms produce the best results.
The remainder of the paper reviews the literature on oil exploration and presents the results and discussion of the methods employed in this research.
In [1], the authors reviewed recent applications of soft computing in oil exploration. Artificial neural networks (ANN), fuzzy logic, probabilistic reasoning, and Bayesian belief networks (BBN) were the methods highlighted in the study. ANN can deal with both linear and non-linear problems, which makes it well suited to oil exploration. Fuzzy logic can manipulate symbolic information more effectively than other methods, and it has been employed for seismic data interpretation and for identifying oil reservoir lithology. Data in oil exploration are dynamic and uncertain, and probabilistic reasoning is used to handle uncertainty in decision making. BBN is suitable for causal rules and represents probabilistic relationships. The review explained the activities involved in oil exploration, such as data acquisition, pattern recognition, and prediction.
In [2], the authors studied oil exploration using big data. Seismic data management and analysis were the tasks optimized by the methods used in that research. The SEMMA process was used to disclose patterns hidden in the large volume of data. Data analysis is a paramount task in oil exploration. Insight, prediction, and optimization are the big data processes used to explore for oil in the huge amount of data.
Abstract-Soft computing (SC) techniques provide a wide variety of applications in data processing, analysis and interpretation. SC plays a key role in the geosciences because of the immense size of the data and the uncertainty associated with it. SC assists the oil industry in oil exploration and in the optimization of oil wells. The oil industry has changed significantly as a result of complex techniques and modern equipment. Intelligent systems such as neurocomputing and artificial intelligence are available for oil exploration, and popular evolutionary algorithms offer effective methods for the optimization of oil wells, but the processing of vague data creates problems for the existing techniques. Uncertain data play a crucial role in making vital decisions. The research uses seismic data in the process of oil exploration and compares the proposed method with existing methods; the results favor the proposed method.
In [3], big data analysis was applied to safety mechanisms and to real-time analysis of oil exploration. The research employed business intelligence tools, data warehouses, and other transactional applications, and produced better results than existing methods. A Hadoop system was used to derive results from website logs and complex databases. The system produced real-time analytics and recommendations using big data and Hadoop.
In [4], a paper on the overall maintenance of the oil industry described the activities involved in equipment maintenance: collecting data from pumps and wells, then adjusting the repair schedule to prevent failures. Big data was also employed to optimize production volumes.
Implementing a data analytics and prediction tool is a complex task because of scalability and time complexity. The proposed method and the other methods used in the research were implemented in Java on an i7 processor. K-means, Naïve Bayes, K-NN, and SVM are the methods compared with the proposed J48 method. The algorithms were taken from Google algorithms, and the dataset for the experiment was downloaded from international well data (www.ihs.com). We used well data from two locations, Saudi Arabia and Canada, to show the ability of the proposed method. Machine learning and automated tools need training to generate results; therefore, during the training phase, selected data from the dataset were given to the methods so that they could learn the environment. During the testing phase, performance was evaluated by measuring the time taken. Tables 1 and 2 show the training phase data, and Figures 1 and 2 show the corresponding graphs of the data generated during the training phase. The proposed method performs better than the other methods. Figures 3 and 4 show the graphs of the testing time of the methods for the two locations.
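The training and testing phases described above can be timed with a small harness. The following is a minimal sketch in Python rather than the authors' Java implementation; `MajorityClassifier`, `time_phases`, and the sample rows are hypothetical names and data introduced only for illustration, and any learner exposing `fit` and `predict` could be substituted.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Toy stand-in for a real learner (hypothetical, not J48)."""

    def fit(self, rows, labels):
        # Remember the most frequent training label.
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        # Predict the remembered label for every row.
        return [self.label_ for _ in rows]


def time_phases(model, train_rows, train_labels, test_rows):
    """Return (training_seconds, testing_seconds, predictions)."""
    start = time.perf_counter()
    model.fit(train_rows, train_labels)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = model.predict(test_rows)
    test_seconds = time.perf_counter() - start
    return train_seconds, test_seconds, predictions


# Hypothetical feature rows and labels, not taken from the well dataset.
train_rows = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
train_labels = ["oil", "oil", "dry"]
test_rows = [[0.15, 0.85], [0.8, 0.2]]

train_s, test_s, preds = time_phases(
    MajorityClassifier(), train_rows, train_labels, test_rows
)
print(f"training: {train_s:.6f} s, testing: {test_s:.6f} s")
```

Running the same harness over each method and each location would give one training time and one testing time per cell, which is the kind of comparison the tables report.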
SC is a combination of machine learning algorithms employed to develop applications for real-world problems. Data mining algorithms have been implemented successfully in all kinds of business to support decisions in complex situations. Oil exploration is a complex problem in which the data are vague and information is difficult to derive; the proposed method achieved an average accuracy of 92% on the dataset employed in the research. J48 achieved better accuracy and shorter times than the other methods employed in the research. Future work will extend the study to other regions of the world.
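J48 is Weka's implementation of the C4.5 decision tree algorithm, which selects the attribute to split on by gain ratio. The sketch below illustrates that criterion for categorical attributes only; it is not the authors' code, it omits the thresholds for continuous attributes and the pruning that C4.5 also performs, and the function names and sample values are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(items).values()
    )


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by a categorical attribute."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical attribute values and labels, not taken from the well dataset.
labels = ["oil", "oil", "dry", "dry"]
informative = ["high", "high", "low", "low"]  # separates the classes
uninformative = ["a", "b", "a", "b"]          # tells us nothing

print(gain_ratio(informative, labels))
print(gain_ratio(uninformative, labels))
```

An attribute that separates the classes perfectly scores highest, while one that leaves the class mix unchanged scores zero, so the tree splits on the former first.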
Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
J48 | 0.178 | 0.261 | 0.272 | 0.314 |
K-Means | 0.214 | 0.291 | 0.283 | 0.364 |
Naïve Bayes | 0.192 | 0.242 | 0.286 | 0.412 |
K-NN | 0.191 | 0.260 | 0.281 | 0.292 |
SVM | 0.184 | 0.312 | 0.317 | 0.319 |
Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
J48 | 0.098 | 0.101 | 0.085 | 0.145 |
K-Means | 0.125 | 0.154 | 0.114 | 0.189 |
Naïve Bayes | 0.137 | 0.138 | 0.142 | 0.189 |
K-NN | 0.115 | 0.119 | 0.121 | 0.162 |
SVM | 0.099 | 0.128 | 0.126 | 0.149 |
Methods | Seismic Data | Percentage of Hydrocarbon | Distance from Ground | Latitude & Longitude |
J48 | 0.091 | 0.101 | 0.160 | 0.147 |
K-Means | 0.112 | 0.132 | 0.192 | 0.241 |
Naïve Bayes | 0.101 | 0.212 | 0.242 | 0.312 |
K-NN | 0.098 | 0.174 | 0.174 | 0.180 |
SVM | 0.094 | 0.118 | 0.181 | 0.174 |
Methods | Saudi Arabia | Canada |
J48 | 95 | 97 |
K-Means | 91 | 92 |
Naïve Bayes | 90 | 90 |
K-NN | 93 | 92 |
SVM | 93 | 95 |
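As a worked check on the accuracy table above, the per-method mean over the two locations can be computed directly from the reported values. This is a small Python sketch for the arithmetic only; the variable names are illustrative and the numbers are copied from the table.

```python
# Accuracy (%) per method as reported: (Saudi Arabia, Canada).
accuracy = {
    "J48": (95, 97),
    "K-Means": (91, 92),
    "Naïve Bayes": (90, 90),
    "K-NN": (93, 92),
    "SVM": (93, 95),
}

# Mean accuracy of each method across the two locations.
mean_accuracy = {
    method: sum(values) / len(values) for method, values in accuracy.items()
}

# Methods ordered from highest to lowest mean accuracy.
ranking = sorted(mean_accuracy, key=mean_accuracy.get, reverse=True)

for method in ranking:
    print(f"{method}: {mean_accuracy[method]:.1f}")
```

On these values J48 has the highest mean accuracy across the two locations, followed by SVM.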
Graph mining: laws, generators and algorithms. ACM Comput Surv 2006. 38 (2).
From black magic to swarms: hydrocarbon exploration using non-seismic technologies. EGM 2007 international workshop, innovation in EM, grav and mag methods: a new perspective for exploration, Capri, Italy, 2007.
ITERATE: a conceptual clustering algorithm for data mining. IEEE Trans Syst Man Cybern Part C Appl Rev 1998. 28.
Waveform analysis with seismic attributes. Oil Gas J 1984. 84.
Lithostratigraphic interpretation of seismic data for reservoir characterization. AAPG international conference, Barcelona.
Data Mining: An Overview from a Database Perspective. IEEE Transactions on Knowledge and Data Engineering December 1996. 8 (6).
Choosing multiple parameters for support vector machines. Mach Learn 2002. 46.
Evolution of Canada's oil and gas industry. Canadian Center for Energy Information 2004. Canada.
Hydrocarbon reservoir prediction using artificial nerve network method. Oil Geophys Prospect 1993. 28.
A Step Towards the Foundations of Data Mining. Data Mining and Knowledge Discovery: Theory Tools Technology V, 2003.
A Conceptual Framework of Data Mining. Studies in Computational Intelligence (SCI), 2008. 118.