# I. Introduction

Software remains buggy, and testing is still the leading approach for detecting software errors. Incorrect and buggy behaviour in deployed software costs up to $70 billion each year in the US [1]. Thus debugging, testing, maintaining, optimizing, refactoring, and documenting software, while time-consuming, remain critically important. Such maintenance is reported to consume up to 90% of the total cost of software projects [2]. Much of that maintenance time is spent studying the existing software, since a chief maintenance concern is incomplete documentation.

Verification tools, however, typically require specifications that describe some aspect of program correctness. Creating accurate specifications is difficult, time-consuming, and error-prone. Verification tools can only point out disagreements between the program and the specification. Even assuming a sound and complete tool, a defective specification can still yield false positives, by reporting non-bugs as bugs, or false negatives, by failing to report real bugs. Crafting specifications typically requires program-specific knowledge.

Specification mining can be compared to learning the rules of English grammar by reading essays written by high school students; we propose to focus on the essays of passing students and to be doubtful of the essays of failing students. We claim that existing miners have high false positive rates in large part because they treat all code equally, even though not all code is created equal. For example, consider an execution trace through a recently modified, rarely-executed piece of code that was copied-and-pasted by an inexperienced developer. We argue that such a trace is a poor guide to correct behaviour when compared with a well-tested, infrequently-changed, and commonly-executed trace.

Many pre-existing software projects are not yet formally specified [3]. Formal program specifications are difficult for humans to construct [4], and incorrect specifications are difficult for humans to debug and modify [5]. Accordingly, researchers have developed techniques to automatically infer specifications from program source code or execution traces [6], [7], [8], [9]. These techniques typically produce specifications in the form of finite state machines that describe legal sequences of program behaviours.

Unfortunately, these existing mining techniques are insufficiently precise in practice. Some miners produce large but approximate specifications that must be corrected manually [5]. Because such large specifications are imprecise and difficult to debug, this article focuses on a second class of techniques that produce a larger set of smaller, more precise candidate specifications that may be easier to evaluate for correctness. These specifications typically take the form of two-state finite state machines that describe temporal properties, e.g. "if event a happens during program execution, event b must eventually happen during that execution" (one way to represent and check such a property is sketched below). Two-state specifications are limited in their expressive power; comprehensive API specifications cannot always be expressed as a collection of smaller machines [8].

We identify and illustrate lightweight, automatically collected software features that approximate source code quality for the purpose of mining specifications. We explain how to lift code quality metrics to metrics on traces, and we empirically measure the utility of our lifted quality metrics when applied to previous static specification mining techniques.
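As a concrete illustration of the kind of property these candidates express, the following minimal Python sketch encodes a single two-state machine for a pair of events ⟨a, b⟩ and checks whether a finite trace satisfies it. The event names (`open`, `read`, `close`, `error`) are hypothetical examples, and the function illustrates only the general idea, not any particular miner's implementation.

```python
# A two-state temporal property <a, b>: once `a` occurs, the machine moves
# to an "obligated" state and only returns to the accepting state on `b`.
def adheres(trace, spec):
    a, b = spec
    obligated = False           # True while an `a` still awaits a later `b`
    for event in trace:
        if event == a:
            obligated = True
        elif event == b:
            obligated = False
    return not obligated        # the trace must end in the accepting state

# Hypothetical resource-handling traces: every `open` must eventually be
# followed by `close` during the same execution.
print(adheres(["open", "read", "close"], ("open", "close")))  # True
print(adheres(["open", "read", "error"], ("open", "close")))  # False
```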
To avoid false positives, we propose two novel specification mining techniques that use our automated quality metrics to learn temporal safety specifications. The main contributions of this project are:

* A set of source-level features related to software engineering processes that capture the trustworthiness of code for specification mining. We analyze the relative predictive power of each of these features.
* Experimental evidence that our notions of trustworthy code serve as a basis for evaluating the trustworthiness of traces. We provide a characterization for such traces and show that off-the-shelf specification miners can learn just as many specifications using only 60% of the traces.
* A novel automatic mining technique that uses our trust-capturing features to learn temporal safety specifications with few false positives in practice. We evaluate it on over 800,000 lines of code and explicitly compare it to two previous approaches.

# II. Ongoing methodology

Our basic mining technique learns specifications that locate more safety-policy violations than previous miners (740 vs. 426) while presenting far fewer false positive specifications (107 vs. 567). When focused on precision, our technique obtains a low 5% false positive rate, an order-of-magnitude improvement over previous work, while still finding specifications that locate 265 violations. To our knowledge, this is the first specification miner that produces multiple candidate specifications and has a false positive rate under 90%.

# i. Approach

We present a specification miner that works in three stages:

1. Statically estimate the trustworthiness of each code fragment.
2. Lift that judgment to traces by considering the code visited along a trace.
3. Weight the contribution of each trace by its trustworthiness when counting event frequencies for specification mining.

Code is most trustworthy when it has been written by experienced programmers who are familiar with the project at hand, when it has been well tested, and when it has been mindfully written.

# b) Mining Temporal Specifications for Error Detection

Using implicit language-based specifications (e.g., null pointers should not be dereferenced) or reusing standard library specifications can reduce the cost of writing specifications. More recently, however, a variety of attempts have been made to infer program-specific temporal specifications and API usage rules automatically. These specification mining techniques take programs (and possibly dynamic traces, or other hints) as input and produce candidate specifications as output. Such specifications can also be used for documenting, refactoring, testing, debugging, maintaining, and optimizing a program. Our focus is on finding and evaluating specifications in a particular context: given a program and a generic verification tool, what specification mining technique should be used to find bugs in the program and thereby improve software quality? Thus we are concerned both with the number of "real" and "false positive" specifications produced by the miner and with the number of "real" and "false positive" bugs found using those "real" specifications. We propose a novel technique for temporal specification mining that uses information about program error handling; a minimal sketch of this intuition appears below.
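To make the error-handling intuition concrete, the following minimal Python sketch scores a candidate pair ⟨a, b⟩ by measuring adherence on normal paths and violation on error-handling paths separately. The trace representation, the `on_error_path` flag, and the scoring formula are illustrative assumptions, not the published algorithm.

```python
# Each trace is (events, on_error_path). The flag marking error paths is a
# hypothetical input, e.g. derived from catch blocks or error-return branches.
def follows(events, a, b):
    """True if every `a` in the event list is eventually followed by `b`."""
    obligated = False
    for e in events:
        if e == a:
            obligated = True
        elif e == b:
            obligated = False
    return not obligated

def error_handling_score(traces, a, b):
    """Favour candidates obeyed on normal paths but broken near error handling."""
    normal = [ev for ev, on_error_path in traces if not on_error_path]
    errors = [ev for ev, on_error_path in traces if on_error_path]
    if not normal or not errors:
        return 0.0
    normal_adherence = sum(follows(ev, a, b) for ev in normal) / len(normal)
    error_violation = sum(not follows(ev, a, b) for ev in errors) / len(errors)
    return normal_adherence * error_violation

traces = [(["lock", "work", "unlock"], False),
          (["lock", "fail"], True)]
print(error_handling_score(traces, "lock", "unlock"))  # 1.0: strong candidate
```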
Our miner assumes that programs will generally adhere to specifications along normal execution paths, but that programs will likely violate specifications in the presence of some run-time errors or exceptional situations. Intuitively, error-handling code may not be tested as often, or the programmer may be unaware of certain sources of run-time errors. Taking advantage of this information turns out to be more important than ranking candidate policies.

# i. Contributions

* We propose a novel specification mining technique based on the observation that programmers often make mistakes in exceptional circumstances or along uncommon code paths.
* We present a qualitative comparison of five miners and show that some miner assumptions are not well supported in practice.
* Finally, we give a quantitative comparison of our technique's bug-finding power to that of generic "library" policies. For our domain of interest, mining finds 250 more bugs. We also show the relative unimportance of ranking candidate policies.

In all, we find 69 specifications that lead to the discovery of over 430 bugs in 1 million lines of code.

# III. Proposed system for quantitative analysis of fault and failure

In the proposed system, we aim to develop a system that measures the quality of code by considering the different aspects that affect it. Code quality can be characterized using factors such as code clones, author rank, code churn, code readability, and path feasibility.

We present a new specification miner that works in three stages. First, it statically estimates the quality of source code fragments. Second, it lifts those quality judgments to traces by considering all code visited along a trace. Finally, it weights each trace by its quality when counting event frequencies for specification mining; a small sketch of these stages appears at the end of this section.

This work develops an automatic specification miner that balances true positives (required behaviours) with false positives (non-required behaviours). We claim that one important reason previous miners have high false positive rates is that they falsely assume all code is equally likely to be correct. For example, consider an execution trace through a recently modified, rarely-executed piece of code that was copied-and-pasted by an inexperienced developer. We believe that such a trace is a poor guide to correct behaviour, especially when compared with a well-tested, stable, and commonly-executed piece of code. Patterns of specification adherence may also be useful to a miner: a candidate that is violated in high quality code but adhered to in low quality code is less likely to represent required behaviour than one that is adhered to in high quality code but violated in low quality code. We assert that a combination of lightweight, automatically collected quality metrics over source code can usefully provide both positive and negative feedback to a miner attempting to distinguish between true and false specification candidates. Code quality information may be gathered either from the source code itself or from related artefacts, such as version control history.
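The following minimal Python sketch shows the three stages in miniature, assuming hypothetical per-function quality scores in [0, 1] (stage 1): a trace's quality is the average quality of the functions it visits (stage 2), and a candidate's support is the quality-weighted sum over adhering traces rather than a raw count (stage 3). The names, the averaging choice, and the default score for unseen code are assumptions made for illustration only, not the exact published formulation.

```python
# Stage 1 (assumed done elsewhere): per-function quality scores in [0, 1],
# e.g. derived from churn, readability, author rank, or test coverage.
def follows(events, a, b):
    obligated = False
    for e in events:
        if e == a:
            obligated = True
        elif e == b:
            obligated = False
    return not obligated

def trace_quality(visited_functions, quality):
    """Stage 2: lift per-function quality to a single score for the trace."""
    scores = [quality.get(f, 0.5) for f in visited_functions]  # 0.5 = unknown
    return sum(scores) / len(scores) if scores else 0.0

def weighted_support(traces, quality, a, b):
    """Stage 3: quality-weighted count of traces on which <a, b> is followed."""
    return sum(trace_quality(visited, quality)
               for events, visited in traces
               if follows(events, a, b))

# Hypothetical input: each trace is (event list, functions visited).
traces = [(["open", "close"], ["parse_config", "main"]),
          (["open", "close"], ["hastily_patched_helper"])]
quality = {"parse_config": 0.9, "main": 0.8, "hastily_patched_helper": 0.2}
print(weighted_support(traces, quality, "open", "close"))  # 0.85 + 0.2 = 1.05
```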
By augmenting the trace language to include information from the software engineering process, we can evaluate the quality of every piece of evidence supporting a candidate specification (both traces that adhere to the candidate and traces that violate it, on both high and low quality code) and more accurately estimate the likelihood that the candidate is valid. The system architecture is shown in the following figure, which explains the modules to be generated.

# Conclusion

Testing, maintenance, optimization, refactoring, documentation, and program repair are among the applications of formal specifications, yet human programmers should not have to produce and verify such specifications manually. Existing techniques are also problematic because they treat all parts of a program as equally indicative of correct behaviour. We encode our intuition about code quality using dependability metrics such as predicted execution frequency, copy-paste code measurements, code duplication, software readability, and path feasibility, and we compare the bug-finding power of various miners. Our technique improves the performance of existing trace-based miners by focusing on high quality traces, and it is also useful for improving the quality of code through specification mining.

![a) Specification Mining With Few False Positives: this methodology presents a new automatic specification miner that uses artifacts from the software engineering process to capture the reliability of its input traces.](image-2.png "")

![Figure 1](image-3.png "Figure 1")

# References

[1] National Institute of Standards and Technology, "The economic impact of inadequate infrastructure for software testing," Tech. Rep. 02-3, May 2002.
[2] R. C. Seacord, D. Plakosh, and G. A. Lewis, Modernizing Legacy Systems, 2003.
[3] M. Das, "Formal specifications on industrial-strength code - from myth to reality," in Computer-Aided Verification, 2006.
[4] H. Chen, D. Wagner, and D. Dean, "Setuid demystified," in USENIX Security Symposium, 2002.
[5] G. Ammons, D. Mandelin, R. Bodík, and J. R. Larus, "Debugging temporal specifications with concept analysis," in Programming Language Design and Implementation, 2003.
[6] G. Ammons, R. Bodík, and J. R. Larus, "Mining specifications," in Principles of Programming Languages, 2002.
[7] D. R. Engler, D. Y. Chen, and A. Chou, "Bugs as deviant behavior: A general approach to inferring errors in systems code," in Symposium on Operating Systems Principles, 2001.
[8] M. Gabel and Z. Su, "Symbolic mining of temporal specifications," in ICSE, 2008.
[9] J. Whaley, M. C. Martin, and M. S. Lam, "Automatic extraction of object-oriented component interfaces," in ISSTA, 2002.
[10] C. Le Goues and W. Weimer, "Measuring code quality to improve specification mining," IEEE Trans. Software Eng.
[11] M. Kayed and C.-H. Chang, "FiVaTech: Page-level web data extraction from template pages," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 2, Feb. 2010.
[12] S. R. Chidamber and C. F. Kemerer, "A metrics suite for object oriented design," IEEE Trans. Softw. Eng., vol. 20, no. 6, 1994.
[13] D. Detlefs, G. Nelson, and J. B. Saxe, "Simplify: a theorem prover for program checking," J. ACM, vol. 52, no. 3, 2005.
[14] M. Di Penta and D. M. German, "Who are source code contributors and how do they change?" in Working Conference on Reverse Engineering, IEEE Computer Society, 2009.
[15] C. Kapser and M. W. Godfrey, "Cloning considered harmful," in WCRE, 2006.
[16] J. Krinke, "A study of consistent and inconsistent changes to code clones," in WCRE, IEEE Computer Society, 2007.
[17] C. Le Goues and W. Weimer, "Specification mining with few false positives," in TACAS, 2009.
[18] T. J. McCabe, "A complexity measure," IEEE Trans. Software Eng., vol. 2, no. 4, 1976.
[19] N. Nagappan and T. Ball, "Using software dependencies and churn metrics to predict field failures: An empirical case study," in ESEM, 2007.
[20] J. C. Sanchez, L. Williams, and E. M. Maximilien, "On the sustained use of a test-driven development practice at IBM," in Agile 2007, IEEE Computer Society, Aug. 2007.
[21] W. Weimer and N. Mishra, "Privately finding specifications," IEEE Trans. Software Eng., vol. 34, no. 1, 2008.
[22] W. Weimer and G. C. Necula, "Mining temporal specifications for error detection," in TACAS, 2005.