# INTRODUCTION ince accessibility of huge amounts of data and knowledge is on the rise, data mining has occupied a significant place in the field of information industry [1]. Data mining is a vital part of the process of Knowledge Discovery in Databases (KDD) [22] which is a non-trivial excavation of hidden, implied, never revealed data and comparatively has a large usage [6]. In broad, data mining methods are categorized into two ways: # Descriptive mining It is an account of putting forward a group of data and its characteristics in a succinct and summarized way. One of the most significant of the descriptive kind mining is Association Rule Mining (ARM) which was introduced by Agarwal et.al. [2]. # Predictive mining This method makes to surmise the outline of the information to give some assumptions [13]. Progressively many network side scholars and researchers of computer science, particularly those who are dedicatedly working in the area of Knowledge Discovery in Data (KDD), mainly concentrate and accentuate on Association Rule Mining (ARM) [14]. Association rule mining [2,3] has been extensively utilized to detect and unravel data mining complicated issues which involve financial troubles, and business dealings [4]. The difficulties of using the mining association rules are divided into two steps. 1. First step is to evaluate the item sets that often occur in the databases. 2. The second one is to produce the association rules. Just the once if the some item sets are found occurring often, then the production of the association rules is uncomplicated and can be accomplished in a specified time [5]. Conventional ARM algorithms judge all the data items in the similar passion by assuming that their weight-age as always 1 if they are identified or 0 if they are not identified which intangibly drives to miss some of the very functional outlines of the data [7]. In order to trounce these drawbacks of the traditional way of mining, utility mining method [9] [10] and weighted association rule mining [16] has come into existence. Utility data mining is a latest field of development entranced in every type of utility factors in data mining procedures and accentuated to assimilate the services also called as utilities in the data mining methods [15]. Utility of any particular data or item is a reliant on the individuals and is measured in terms of aesthetic values or other expressions of individual inclination [7]. When an action in the database and its related minimum utility threshold value and a utility chart are monitored then the objective of the utility mining is to identify and determine each and every high utility item set [12]. In general pros and cons of the item set in the business is not possible to be derived by the utilization of the values that shore up the rules. So this rationally proves that utility mining can be more advantageous than the conventional association rule mining [12]. But while considering the weighted association rule mining (WARM) [16], there has been a modification of not counting the item sets that occurred in an action of the database which made a compulsion to acclimatize the conventional help to the weighted one [23]. This method also segments the consumers on the basis of their reliable nature and impending count of procurements [8]. For an illustration, one consumer may buy 15 items and some other may have 5 items at a time but the conventional association rule method considers these couple of actions in a similar way. Hence the procedure of considering the actions in a same conduct in the conventional association rule method mislays some of the vital information [8]. As a result, Weighted ARM deals mainly with the magnitude of individual substances in a database [17,18,19]. For a case in point, goods that has more advantages or which are under the process of endorsement are given prior able constraints in comparison to rest [20]. Incorporating the mutual characteristics (weightage and utility) for excavation of rules is treated as an addition to the weighted association rule mining which means that the data weights are most important in a particular set of actions besides it also deals with the number of possible appearances of the data in those actions. This has made a concern in categorizing the data appearances and their weights and also in detecting the prior able data which put in more to the benefits of the business [21]. By considering this Sandhu et al [28] proposed a model that identifying association rules based on UW-Score, which calculated based on characteristics weightage and utility. In their proposal they were not considering diminution occurred in contextual factors Here, we put forward an efficient method that makes the utilization of the traditional Apriori algorithm to engender a group of association rules from a database out of which a pooled Exact Utility Weighted Score (EUW-Score) is calculated. In due course, sub values of the priority given weighted, utility and diminution considered constraints are derived on the source of the EUW-score and the tentative outcome exhibit the effectiveness. The later part of the paper is structured as given: The section 2 explains the methods involved. The proposed methodology based on usage of mining utility-oriented association rules is explained in section 3. Section 4 concludes the paper. # II. # METHODS INVOLVED Apriori, a notable algorithm for ARM, is one of the frequently used processes of discovering an assortment of data properties which functionally is associated to bring about the data and are chiefly based on range of occurrences. But the extraction through the number of occurrences does not bring in the attention of the scholars, and to overcome this some more measures are included in the Apriori algorithm for an efficient mining of association rules. They are: a) Weightage Unlike the general transaction database which projects the total amount of characteristics by some number, the traditional algorithms like Apriori mine association rules utilizes a binary mapped database that depicts the occurrence of the data or an item in one course of an action, thus allowing to gather and verify some good number of information related to the characteristics of the data, that results in recurrent but few number of weight-age rules. Even in an ordinary user transaction, some times the data that has a good weight-age occurs rarely, but it must also be involved in the recurrent item set. This procedure is followed in our approach, for mining a subset that has high significance. # b) Utility The individual utility (Gain) of the characteristics is the subsequent measure involved in the approach to give a good standard to the ARM. # c) Diminution The individual Diminution that occurs when item failed to raise the utility (Gain), which balance the utility and provide actual gain of the characteristics is the subsequent measure involved to the approach to give a good standard to the ARM. Some of the service standards in the business would be neglected in a process of mining. As these rules, when mined without these service standards will lead to a plausible loss. Those standards are attained by this method through utility measure (U-gain). Weight-age and utility measures are individually incorporated in copious researches [21,24,25,26] so as to make their methods more efficient but those procedures need high capability. These procedures are effectively utilized in this methodology to extract the association rules from a database. # III. # PROPOSED METHODOLOGY Assuming D as a database having n number of transactions T and m number of attributes I= [i1,i2 ,.....,im] with positive real number weights Wi. Ui specifies the profit associated with the i attribute of utility table U with m count of utility values. The methodology based on weight-age and utility involves some key steps like: Step1: Extraction of the association rules from D by utilizing Apriori. Step 2: Generation of W-gain value. Step 3: Generation of U-gain value. Step 4: Generation of D-sum value. Step 5: Generation of DUW-score through W-gain and Ugain. Step 6: Deriving the vital association rules by taking UWscore into consideration. # a) Extraction of the Association Rules through Apriori To begin with, association rules are excavated from a transaction database D with n transactions. We represent the database D as: (1) Each transaction T in D encompasses with 'm' number of attributes I= [i1,i2 ,.....,im] related to it and each attribute i is symbolized by weights Wi. To extract the rules, a typical Apriori algorithm is used in our methodology. A binary mapped database BT is applied for extracting the association rules in conventional Apriori process by which the initial database D is converted to binary mapped database BT such that it comprises of binary 0 and 1 denoting the non-existence and existence of attributes. By the succeeding equation the weights Wi are mapped onto the binary values. ( Consequently, an input to the Apriori algorithm [2] is produced by the binary mapped database BT for extraction of association rules which are processed in two steps of Apriori as follows: ? Recurrent Item set Generation: Produces minsupport which signifies that each and every feasible set of attributes that comprise support value higher than a predefined threshold. ? Association Rule Generation: Produces minconfidence which signifies that association rules from the item sets that comprise confidence higher than a predefined threshold. The composition of a typical association constraint is: A ? B, where A symbolizes the antecedent and B symbolizes the consequent and these are subset of the items in the binary mapped database, such that A ? I, B ? I and A ? B= ? and is construed as B co-existing if A exists. The support S and confidence C clasps the constraint A ? B in the transaction database D, if item sets A and B are contained in S% and C % of the transactions. Therefore: Support (3) Confidence (A ? B) = P(B|A)= support (A ? B)/ support (A)(4) The pseudo code for the Apriori algorithm is: A k amount of association rules R=[R1,R2 ,.....,Rk] are produced with apriori algorithm and is sent as input to the successive part in the methodology for weight-age and utility calculation. Each attribute of k association rules of R the is determined with some measures, that is for an association rule R i of the form, [A, B] => C, where, A, B and C are considered as the attributes, the derivations U-gain, W-gain and UW-score are evaluated for each attribute A, B and C independently. At first, a decremented arrangement is done for the produced k association rules considering their respected confidence level. The listing of rearranged association rules is specified by b) Generating the value of W-gain At the start the initial rule R1' is chosen from the rearranged list S and the independent attributes of R1' are derived followed by the computation of W-gain. Definition 1: Item weight (Wi): Item weight value Wi, is a nonnegative integer which is termed as the total magnitude evaluation of the attribute present in the transaction database D. Definition 2: Weighted Gain (W-gain): W-gain is termed as the summation of weights of each item W i of an attribute that is involved in each and every transaction of the database D as referred in the given equation: Correspondingly the initial rule R1' is preferred from the rearranged list S and the independent attributes of R1' are derived. By considering the U-factor and the utility value Ui ,the value of U-gain for each character attribute is determined. T T D T n ? ? ? ? ? ? ? ? = ? ? ? ? ? ? ? ? 0 0 1 0 { if W T i k B r if W T i k = ? = ? ? (A ? B) = P (A ? B) 1 1 1 { arg 1 };( 2; 0; Definition 3: Item Utility (Ui): In general every character has a precincts of the gain related to that particular character or attribute and is delineated as the Item utility Ui. Definition 4: Utility table U: A quantity of 'm' utility values Ui are encompassed in the utility table U with the attributes related in the transaction database D. We represent the utility table as: (6) Definition 5: Utility factor (U-factor): The constant value of utility factor (U-factor) is derived by the addition of every utility items (Ui) of the utility table U .We define it as: (7) Consider, m is the amount of attributes involved in the transaction database. Definition 6: Utility Gain (U-gain): The calculation of an attribute's authentic utility by considering its U-factor is referred as the Utility Gain and we define it as follows: (8) For every attribute in the association rule R1' the value of U-gain is calculated. Definition 7: Diminution table: A quantity of 'm' diminution values DMi are encompassed in the diminution table DM with the attributes related in the transaction database D. We represent the diminution table as: Definition 8: Diminution factor (D-factor): The constant value of diminution factor (D-factor) is derived by the addition of every diminution items ( i DM ) of the diminution table DM. We define it as: (10) Consider, m is the no of attributes involved in the transaction database. Definition 9: Diminution Sum (D-sum): The calculation of an attribute's authentic utility by considering its U-factor is referred as the Utility Gain and we define it as follows: (11) For every attribute in the association rule R1' the value of D-sum is calculated. # d) Generation of EUW-score through W-gain, U-gain and D-sum From the values derived by calculating W-gain, U-gain and D-sum for the each attribute, they are merged together into one value named as UW-score for every independent association rule. Definition 10: Exact Utility Weighted Score (EUW-score): EUW-score is derived by computing the proportion between the addition of products of W-gain, U-gain and D-sum for each attribute in the association rule to the total amount of attributes present in the rule. (12) Here, | R | denotes the amount of attributes present in the association rule. The equations ( 5),( 8) and ( 11) and ( 12) aimed to determine the W-gain, U-gain, D-sum and EUWscore These equations are looped for the remaining association rules R2 ' to R k' involved in the rearranged list S. And for total 'k' number of association rules in the rearranged list S will be calculated with a EUW-Score related to it and the association rules in the rearranged list S are consequently rearranged by taking EUW-score into consideration to get 5![Extended Apriori for association rule mining: Diminution based utility weightage measuring approach © 2011 Global Journals Inc. (US) Global Journal of Computer Science and Technology Volume XI Issue XXII Version I](image-2.png "( 5 )") ![R1', R2 ',.., Rk ' }, S R ? , where conf (R1 ') ? conf (R2 ') ? ?conf (R3 ')?.. ? ?conf (Rk ').](image-3.png "") ![Here we term, i w as the weight of item in an attribute and | | T as the amount of transactions in the database D.c) Generating the value of U-gain](image-4.png "") December © 2011 Global Journals Inc. (US) Global Journal of Computer Science and Technology Volume XI Issue XXII Version I 26 2011 December © 2011 Global Journals Inc. (US) Global Journal of Computer Science and Technology Volume XI Issue XXII Version I 28 2011 December © 2011 Global Journals Inc. (US) Global Journal of Computer Science and Technology Volume XI Issue XXII Version I 29 2011 December © 2011 Global Journals Inc. (US) consequential values of the weighted and utility related association rules is given by , where and The significant improvement in minimizing number of rules can be observable in following graphs. ## CONCLUSION By considering the weight factor, utility and diminution, the methodology used by us has given a chance to provide a proficient high utility association rules. At the outset, the planned methodology has enabled to make utilization of the conventional Apriori algorithm to create a group of association rules from a database. Depending on weightage (W-gain), utility (Ugain) and diminution (D-sum) complications a joint Exact Utility Weighted Score (EUW-Score) is generated for each association rule extracted. Considering the EUW-Score generated, eventually a subset of notable association rules are derived at. * Recognizing & rioritizing Of Critical Success Factors (CSFs) On Data Mining Algorithm's Implementation In Banking Industry: Evidence From Banking Business System AliRajabzadeh Ghatari NasibehMohamadi AidaHonarmand ParvizAhmadi NasibehMohamadi Proceedings of EABR & TLC EABR & TLCPrague, Czech Republic 2009 * Mining association rules between sets of items in large databases RAgrawal TImielinski ASwami proceedings of the international Conference on Management of Data, ACM SIGMOD the international Conference on Management of Data, ACM SIGMODWashington, DC May 1993 * Fast algorithms for mining association rules RAgrawal RSrikant Proceedings of 20th International Conference on Very Large Data Bases 20th International Conference on Very Large Data BasesSantiago, Chile September 1994 * Data mining: an overview from a database perspective MSChen JHan PSYu IEEE Transactions on Knowledge and Data Engineering 8 6 1996 * Mining temporal rare utility Itemsets in large databases using relative utility thresholds Chun-JungChu VincentSTseng TyneLiang International Journal of Innovative Computing, Information and Control 4 11 November 2008 * Knowledge Discovery in Databases: An Overview WFrawley GPiatetsky-Shapiro C;Matheus Ai Magazine 1992. 1992 * Mining Long High Utility Itemsets in Transaction Databases GuangzhuYu ShihuangShao XianhuiZeng WSEAS Transactions On Information Science & Applications 5 2 February 2008 * Weighted Support Association Rule Mining using Closed Itemset Lattices in Parallel AM JMd PZubair Rahman Balasubram International Journal of Computer Science and Network security 9 3 March 2009 * A Foundational Approach to Mining Itemset Utilities from Databases HongYao HowardJHamilton CoryJButz Proceedings of the Third SIAM International Conference on Data Mining the Third SIAM International Conference on Data MiningOrlando, Florida 2004 * Pushing Frequency Constraint to Utility Mining Model JingWang YingLiu LinZhou YongShi XingquanZhu Proceedings of the 7th international conference on Computational Science the 7th international conference on Computational ScienceBeijing, China 2007 * High-utility pattern mining: A method for discovery of high-utility item sets JianyingHu AleksandraMojsilovic Pattern Recognition 40 11 November 2007 * Isolated items discarding strategy for discovering high utility itemsets Yu-ChiangLi Jieh-ShanYeh Chin-ChenChang Data & Knowledge Engineering 64 1 January 2008 * Attribute-Oriented Induction in Data Mining JHan YFu Advances in Knowledge Discovery and Data Mining AAAI Press/The MIT Press 1996 * Visually Aided Exploration of Interesting Association Rules BingLiu WynneHsu KeWang ShuChen Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining 1999 * A Fast Algorithm for Mining High Utility Itemsets SShankar TBabu Nishanth JayanthiSPurusothaman proceedings of IEEE International Advance Computing Conference IEEE International Advance Computing ConferencePatiala,India 2009 * Mining Weighted Association Rules without Preassigned Weights KeSun FengshanBai IEEE Transactions on Knowledge and Data Engineering 20 4 April 2008 * Mining Association Rules with Weighted Items C HCai A W CFu CCheng WKwong Proceedings of the International Symposium on Database Engineering and Applications the International Symposium on Database Engineering and ApplicationsCardiff, Wales, UK July 1998 * Efficient Mining of Weighted Association Rules (WAR) WWang YuPYang Proceedings of the KDD the KDDBoston, MA August 2000 * Mining Weighted Association Rules SLu HHu FLi Intelligent Data Analysis August 2001 5 * Fuzzy Weighted Association Rule Mining with Weighted Support and Confidence Framework MSulaiman Khan MaybinMuyeba FransCoenen International Workshops on New Frontiers in Applied Data Mining Osaka, Japan May 20-23, 2009 * A Weighted Utility Framework for Mining Association Rules MSulaiman Khan MaybinMuyeba FransCoenen proceedings of European Symposium on Computer Modeling and Simulation European Symposium on Computer Modeling and SimulationLiverpool September 2008 * Decomposition Methodology for Knowledge Discovery and Data Mining: Theory and Applications ZOded LiorMaimon Rokach May 2005 World Scientific Publishing Company * Weighted Association Rule Mining using Weighted Support and Significance Framework FengTao FionnMurtagh MohsenFarid Proceedings of the International Conference on Knowledge Discovery and Data Mining the International Conference on Knowledge Discovery and Data MiningWashington 2003 * Mining itemset utilities from transaction databases HongYaoa HowardJHamilton Data & Knowledge Engineering 59 3 December 2006 * An efficient algorithm for mining high utility itemsets with negative item values in large databases Chun-JungChua VincentSTsengb TyneLiang Applied Mathematics and Computation 215 2 2009 * Efficient mining of weighted interesting patterns with a strong weight and/or support affinity UnilYuna Information Sciences 177 17 September 2007 * An Improvement in Apriori Algorithm Using Profit and Quantity PSSandhu DSDhaliwal SNPanda ABisht Computer and Network Technology (ICCNT), 2010 Second International Conference on April 2010