# I. INTRODUCTION

A data warehouse is a mechanism for data storage and data retrieval. Data can be stored and retrieved with a multidimensional (hypercube) structure, a relational star-schema structure, or several other data storage techniques. The task of transitioning from a procedural mindset to an object-oriented paradigm can seem overwhelming; however, the transition does not require developers to step into another dimension or go to Mars in order to grasp a new way of doing things. In many ways, the object-oriented approach to development more closely mirrors the world we have been living in all along: we each know quite a bit about objects already. It is that knowledge we must discover and leverage in transitioning to object-oriented tools and methodologies [8].

Our research proceeds from a different point of view: our primary motivation is to show how existing applications can be enhanced using object-oriented technology. Like many new ideas, object-oriented programming does not have a universally accepted definition [1], [2]. Ideas on the subject do, however, seem to be converging; the best definition we have seen to date is "object-oriented = objects + classes + inheritance" [3]. OOP can also be defined as an extension of the idea of the abstract data type.

# II. ENTROPY IN DATA COMPRESSION

Data compression is of interest in business data warehousing, both because of the cost savings it offers and because of the large volume of data manipulated in many business applications. The types of local redundancy present in business data files include runs of zeros in numeric fields, sequences of blanks in alphanumeric fields, and fields which are present in some records and null in others. Run-length encoding can be used to compress sequences of zeros or blanks. Null suppression may be accomplished through the use of presence bits. Another class of methods exploits cases in which only a limited set of attribute values exists: dictionary substitution entails replacing alphanumeric representations of information such as bank account type, insurance policy type, sex, month, etc. by the few bits necessary to represent the limited number of possible attribute values.
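To make the first of these techniques concrete, the following minimal sketch run-length encodes zero runs in a byte stream, in the spirit of compressing runs of zeros in numeric fields. The output format (a 0x00 marker byte followed by a run length) is an assumption chosen for illustration, not a format prescribed by any warehouse product.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal run-length encoder for zero runs (illustrative format):
// each run of zero bytes becomes a 0x00 marker followed by the run
// length (1-255); non-zero bytes are copied through unchanged.
std::vector<std::uint8_t> rleEncodeZeros(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i < in.size(); ) {
        if (in[i] == 0) {
            std::size_t run = 0;
            while (i < in.size() && in[i] == 0 && run < 255) { ++i; ++run; }
            out.push_back(0x00);                           // marker byte
            out.push_back(static_cast<std::uint8_t>(run)); // run length
        } else {
            out.push_back(in[i++]);                        // literal byte
        }
    }
    return out;
}
```

A numeric field holding, say, forty consecutive zero bytes shrinks to two bytes under this scheme; null suppression via presence bits and dictionary substitution can be layered over the same stream in further passes.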
The problem of compressing digital data can be decoupled into two subproblems: modeling and entropy coding. Whatever the given data may represent in the real world, in digital form it exists as a sequence of symbols, such as bits. The modeling problem is to choose a suitable symbolic representation for the data and to predict, for each symbol of the representation, the probability that it takes each of the allowable values for that symbol. The entropy-coding problem is to code each symbol as compactly as possible, given this knowledge of probabilities. (In the realm of lossy compression, there is a third subproblem: evaluating the relative importance of various kinds of errors.)

For example, suppose it is required to transmit messages composed of the four letters a, b, c, and d. A straightforward scheme for coding these messages in bits would be to represent a by "00", b by "01", c by "10", and d by "11". However, suppose it is known that for any letter of the message (independent of all other letters), a occurs with probability .5, b occurs with probability .25, and c and d occur with probability .125 each. Then a shorter representation might be chosen for a, at the necessary cost of accepting longer representations for the other letters: a could be represented by "0", b by "10", c by "110", and d by "111". This representation is more compact on average than the first one; indeed, it is the most compact representation possible (though not uniquely so). In this simple example, the modeling part of the problem is determining the probabilities for each symbol; the entropy-coding part is determining the representations in bits from those probabilities. The probabilities associated with the symbols play a fundamental role in entropy coding.
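Since every probability in this source is an integer power of .5, the expected length of the variable-length code coincides exactly with the Shannon entropy of the distribution. The minimal sketch below, using the probabilities and code lengths from the example, confirms that both come to 1.75 bits per symbol, against 2 bits per symbol for the fixed-length code.

```cpp
#include <cmath>
#include <cstdio>

int main() {
    // Probabilities and code lengths for a, b, c, d from the example
    // above: codes "0", "10", "110", "111".
    const double p[]   = {0.5, 0.25, 0.125, 0.125};
    const int    len[] = {1, 2, 3, 3};

    double entropy = 0.0, avgLen = 0.0;
    for (int i = 0; i < 4; ++i) {
        entropy -= p[i] * std::log2(p[i]); // Shannon entropy term
        avgLen  += p[i] * len[i];          // expected code length
    }
    std::printf("entropy         = %.2f bits/symbol\n", entropy); // 1.75
    std::printf("avg code length = %.2f bits/symbol\n", avgLen);  // 1.75
    return 0;
}
```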
One well-known method of entropy coding is Huffman coding, which yields an optimal coding provided all symbol probabilities are integer powers of .5. Another method, yielding optimal compression performance for any set of probabilities, is arithmetic coding. In spite of the superior compression given by arithmetic coding, it has so far not been a dominant presence in real data-compression applications. This is most likely due to concerns over speed and complexity, as well as patent issues; a rapid, simple algorithm for arithmetic coding is therefore potentially very useful. An algorithm which allows rapid encoding and decoding in a fashion akin to arithmetic coding is known as the Q-coder; the QM-coder is a subsequent variant. However, since these algorithms are protected by patents, new algorithms with competitive performance continue to be of interest. The ELS algorithm is one such algorithm.

The ELS-coder works only with an alphabet of two symbols (0 and 1). Symbols from larger alphabets can certainly be encoded, but they must be converted to a two-symbol format first. The necessity for this conversion is a disadvantage, but the restriction to a two-symbol alphabet facilitates rapid coding and rapid probability estimation. The ELS-coder decoding algorithm has already been described. The encoder must use its knowledge of the decoder's inner workings to create a data stream which will manipulate the decoder into producing the desired sequence of decoded symbols. As a practical matter, the encoder need not actually consider the entire coded data stream at one time. One can partition the coded data stream at any time into three portions; from end to beginning of the data stream they are: preactive bytes, which as yet exert no influence over the current state of the decoder; active bytes, which affect the current state of the decoder and have more than one consistent value; and postactive bytes, which affect the current state of the decoder and have converged to a single consistent value. Each byte of the coded data stream goes from preactive to active to postactive; the earlier a byte's position in the stream, the earlier these transitions occur. A byte is not actually moved to the external file until it becomes postactive. Only the active portion of the data stream need be considered at any time. Since the internal buffer of the decoder contains two bytes, there are always at least two active bytes. The variable backlog counts the number of active bytes in excess of two. In theory backlog can take arbitrarily high values, but higher values become exponentially less likely.

# III. METHODOLOGY

The following steps will be taken in future work:

1. Creation of databases of different sizes in Oracle.
2. Employment of object-oriented programming for compression using data warehousing.
3. Further compression of database .csv files using C++ (a sketch of this step follows the list).
4. Comparison of time taken and compression efficiency for the different database sizes.
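As a purely hypothetical sketch of step 3 (the column data, variable names, and one-byte code width are assumptions made for illustration, not the project's actual implementation), dictionary substitution over one low-cardinality CSV column might look as follows:

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical dictionary substitution for one CSV column: each
// distinct value in a low-cardinality column (e.g. "month" or "sex")
// is replaced by a small integer code; the dictionary is stored once.
int main() {
    std::vector<std::string> column = {"Jan", "Feb", "Jan",
                                       "Mar", "Feb", "Jan"};

    std::unordered_map<std::string, std::uint8_t> dict; // value -> code
    std::vector<std::uint8_t> codes;                     // coded column
    for (const auto& v : column) {
        auto it = dict.find(v);
        if (it == dict.end())
            it = dict.emplace(v, static_cast<std::uint8_t>(dict.size())).first;
        codes.push_back(it->second);
    }
    // Six 3-byte strings (18 bytes) become six 1-byte codes plus a
    // 3-entry dictionary; the saving grows with column length.
    std::cout << "distinct values: " << dict.size()
              << ", coded bytes: " << codes.size() << '\n';
    return 0;
}
```

Run-length encoding of the resulting code stream, as sketched in Section II, could then squeeze out remaining local redundancy before the timing and efficiency comparisons of step 4.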
# CONCLUSION

A data warehouse is an essential component of a decision support system. The traditional data warehouse provides only numeric and character data analysis, but as information technologies progress, complex data such as semi-structured and unstructured data become widely used [2], [3]. Data compression is of interest in business data warehousing, both because of the cost savings it offers and because of the large volume of data manipulated in many business applications. Entropy has been applied in many areas, such as image processing and document images; in our research we apply it to object-oriented data warehousing. The planned work comprises the creation of databases of different sizes in Oracle, the employment of object-oriented programming for compression using data warehousing, the further compression of database .csv files using C++, and the comparison of time taken and compression efficiency for the different database sizes.

![Comparison of time taken and compression efficiency for different sizes of databases](image-2.png)

# REFERENCES

1. Wei-Chou Chen, Tzung-Pei Hong, Wen-Yang Lin, "Using the compressed data model in object-oriented data warehousing," IEEE SMC '99 Conference Proceedings, 1999 IEEE International Conference on Systems, Man, and Cybernetics, vol. 5, 1999.
2. Wei-Chou Chen, Tzung-Pei Hong, Wen-Yang Lin, "A composite data model in object-oriented data warehousing," Proceedings of Technology of Object-Oriented Languages and Systems (TOOLS 31), 1999.
3. J. C. Shieh, H. W. Lin, "The Novel Model of Object-Oriented Data Warehouses," Workshop on Databases and Software Engineering, 2006.
4. Wei-Chou Chen, Tzung-Pei Hong, Wen-Yang Lin, "Three maintenance algorithms for compressed object-oriented data warehousing."
5. Boqiang Huang, Yuanyuan Wang, Jianhua Chen, "2-D Compression of ECG Signals Using ROI Mask and Conditional Entropy Coding," IEEE Transactions on Biomedical Engineering, vol. 56, no. 4, April 2009.
6. Y. Gong, M. K. H. Fan, C.-M. Huang, "On entropy-constrained residual vector quantization design," Proceedings of the Data Compression Conference (DCC '99), p. 526, March 1999.
7. I. De, J. Sil, "Wavelet entropy based no-reference quality prediction of distorted/decompressed images," 2nd International Conference on …, vol. 3, April 2010.
8. I. De, J. Sil, "ANFIS tuned no-reference quality prediction of distorted/decompressed images featuring wavelet entropy," 2010 International Conference on Computer Information Systems and Industrial Management Applications (CISIM), 8-10 Oct. 2010.
9. L. Liu, Y. Dong, X. Song, G. Fan, "An entropy based segmentation algorithm for computer-generated document images," Proceedings of the 2003 International Conference on …, vol. 1, 2003.
10. C. Tu, T. D. Tran, "Context-based entropy coding of block transform coefficients for image compression," IEEE Transactions on Image Processing, vol. 11, no. 11, Nov. 2002.
11. S. Chen, J. H. Reif, "Using difficulty of prediction to decrease computation: fast sort, priority queue and convex hull on entropy bounded inputs," Proceedings of the 34th Annual Symposium on Foundations of Computer Science, Nov. 1993.
12. Kim Sang Hyun, Rae-Hong Park, "A novel approach to scene change detection using a cross entropy," Proceedings of the 2000 International Conference on …, vol. 3, 2000.
13. A. Scales, W. Roark, F. Kossentini, M. J. T. Smith, "Lossless Compression Using Conditional Entropy-Constrained Subband Quantization," Proceedings of the Data Compression Conference (DCC '95), p. 498, March 1995.
14. H. Jegou, C. Guillemot, "Entropy coding with variable length re-writing systems," Proceedings of the International Symposium on Information Theory (ISIT 2005), Sept. 2005.
15. Hua Xie, A. Ortega, "Entropy- and complexity-constrained classified quantizer design for distributed image classification," IEEE Workshop on Multimedia Signal Processing, 9-11 Dec. 2002.