# I. Introduction

Uncompressed text, graphics, audio and video data require considerable storage capacity with today's storage technology. Similarly, in multimedia communications, the transfer of uncompressed images and video over digital networks requires very high bandwidth. For example, an uncompressed still image of 640x480 pixels with 24 bits of colour requires about 7.37 Mbits of storage, and uncompressed full-motion video (30 frames/sec) of 10 sec duration needs 2.21 Gbits of storage and a bandwidth of 221 Mbits/sec. Even where sufficient storage capacity is available, it is impossible to transmit a large number of images or to play video (a sequence of images) in real time because of insufficient data transfer rates and limited network bandwidth. At the present state of the art, the only practical solution is therefore to compress multimedia data before storage and transmission and to decompress it at the receiver for playback [1].

Encryption should be distinguished from compression-related content protection. Encryption algorithms aim at achieving confidentiality, not at inhibiting unauthorized content duplication, and the requirements of these two applications are different. Systems for inhibiting unauthorized content duplication attempt to prevent unauthorized users and devices from obtaining multimedia data of usable quality; such a system can be considered successful if an attacker without the correct key obtains only highly degraded content. Most selective encryption algorithms in the literature are adequate for this purpose. Encryption for confidentiality, on the other hand, must prevent attackers without the correct key from obtaining any intelligible data at all. Such a system fails if the attacker, after considerable effort, can make out a few words of the encrypted speech or a vague partial image from the encrypted video.

The Discrete Cosine Transform (DCT) is the transform of choice in image compression standards such as JPEG. The DCT has the advantages of simplicity and of being implementable in hardware, which improves its performance. However, it suffers from blocky artifacts around sharp edges at low bit rates. Wavelets, by contrast, have in recent years gained widespread acceptance in signal processing in general and in image compression in particular. Wavelet-based image coders comprise three major components: a wavelet filter bank decomposes the image into wavelet coefficients, which are then quantized in a quantizer, and finally an entropy encoder encodes the quantized coefficients into an output bit stream (the compressed image). Although the interplay among these components is important and one is free to choose each of them from a pool of candidates, it is often the choice of wavelet filter that is crucial in determining the ultimate performance of the coder.

A wide variety of wavelet-based image compression schemes have been developed in recent years [2]. Most of these well-known image coding algorithms use novel quantization and encoding techniques to improve coding performance (PSNR), but they use a fixed wavelet filter, built into the algorithm, for coding and decoding all types of colour images, whether natural, synthetic, medical, scanned or compound. In this work we instead propose dynamic selection of a suitable wavelet for different types of images, to achieve better PSNR, excellent false acceptance and rejection ratios and a better recognition rate with minimum computational complexity. Wavelets provide a powerful new class of algorithms: they can be used for noise reduction, edge detection and compression.
Indeed, wavelets have superseded the DCT in the JPEG2000 image compression standard.

This paper is organized as follows. The importance and procedural steps of the wavelet transform are explained in Section II. The calculation of statistical parameters such as IAM and SF, and their importance, is explained in Section III. Training of the counter propagation neural network is presented in Section IV. The multilayer feed forward neural network (MLFFNN) with the error back propagation (EBP) training algorithm is explained in Section V. The proposed method for dynamic selection of a suitable wavelet and effective compression with MLFFNN with EBP and modified RLC is explained in Section VI. Simulation results are presented in Section VII, and the conclusion and future scope are given in Section VIII.

# II. Wavelet Transform of an Image

The wavelet transform decomposes an input signal into a series of successively lower resolution reference signals and their associated detail coefficients, which contain the information needed to reconstruct the reference signal at the next higher resolution level. In the discrete wavelet transform, an image signal is analyzed by passing it through an analysis filter bank followed by a decimation operation. This analysis filter bank, which consists of a low pass and a high pass filter at each decomposition stage, is commonly used in image compression. When the signal passes through these filters, it is split into two bands. The low pass filter, which corresponds to an averaging operation, extracts the coarse information of the signal; the high pass filter, which corresponds to a differencing operation, extracts the detail information. The output of each filtering operation is then decimated by two.

A two-dimensional transform is accomplished by performing two separate one-dimensional transforms (Fig. 1). First, the image is filtered along the X-dimension using the low pass and high pass analysis filters and decimated by two. Low pass filtered coefficients are stored on the left part of the matrix and high pass filtered coefficients on the right. Because of the decimation, the total size of the transformed image is the same as that of the original image. The sub-images are then filtered along the Y-dimension and decimated by two. After this first level of decomposition (two stages of filtering), the image is split into four bands: LL1, HL1, LH1 and HH1. The LL1 band is then split again into four bands, viz. LL2, HL2, LH2 and HH2, through a second level of decomposition.
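As a concrete illustration of the two-level decomposition just described, the sketch below uses the PyWavelets package (an assumed dependency; any DWT library would serve) on a placeholder grayscale image. The filter name 'db4' is only one candidate; the point of the proposed method is that this choice should depend on the image.

```python
import numpy as np
import pywt  # PyWavelets, assumed to be available

# Placeholder grayscale image standing in for the input.
image = np.random.randint(0, 256, size=(256, 256)).astype(np.float64)

# First-level decomposition: row and column filtering, each followed by decimation by two.
# The approximation band corresponds to LL1; the three detail bands to HL1, LH1 and HH1.
LL1, (detail_1a, detail_1b, detail_1c) = pywt.dwt2(image, 'db4')

# Second-level decomposition applied only to the coarse LL1 band,
# giving LL2 and the detail bands HL2, LH2 and HH2.
LL2, (detail_2a, detail_2b, detail_2c) = pywt.dwt2(LL1, 'db4')

print(image.shape, LL1.shape, LL2.shape)  # each level roughly halves both dimensions
```

An equivalent multi-level decomposition can be obtained in a single call with pywt.wavedec2(image, 'db4', level=2).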
# III. Statistical Features of an Image

Many image features have been evaluated in the literature: range, mean, median, difference (mean minus median), standard deviation, variance, coefficient of variation, skewness, kurtosis, brightness energy [6], gray/colour energy, zero-order entropy, first-order entropy and second-order entropy. Other spatial characteristics that have been explored include the image gradient [6][7][8][9], spatial frequency (SF) [10] and the spectral flatness measure (SFM) [3]. The results show that almost none of these characteristics correlate well with codec performance. However, the image gradient (IAM) and the spatial frequency (SF) do correlate strongly with the performance of wavelet-based compression [6][7][8][9][11].

The image gradient measure characterizes the strength and direction of image boundaries. An edge is defined by a change in gray level in a gray-scale image or in colour level in a colour image, and the image gradient provides an indication of the activity of an image in terms of its edges. Saha and Vemuri [7] define the image gradient (IAM) as

$$\mathrm{IAM} = \frac{1}{MN}\left[\sum_{i=1}^{M-1}\sum_{j=1}^{N}\bigl|I(i,j)-I(i+1,j)\bigr| + \sum_{i=1}^{M}\sum_{j=1}^{N-1}\bigl|I(i,j)-I(i,j+1)\bigr|\right] \qquad (1)$$

where $I(i,j)$ is the intensity value of the pixel at position $(i,j)$ in an $M \times N$ image.

The other feature with strong correlation is the spatial frequency in the spatial domain. SF [10] is the mean difference between neighbouring pixels and specifies the overall image activity. It is defined as

$$\mathrm{SF} = \sqrt{\frac{1}{MN}\sum_{j=1}^{M}\sum_{k=2}^{N}\bigl(X_{j,k}-X_{j,k-1}\bigr)^{2} + \frac{1}{MN}\sum_{j=2}^{M}\sum_{k=1}^{N}\bigl(X_{j,k}-X_{j-1,k}\bigr)^{2}} \qquad (2)$$

where $X_{j,k}$ is the intensity of the pixel at position $(j,k)$.

Since these studies show a strong correlation between IAM, SF and codec performance, it was decided to use these two features as inputs to the neural network. The intention is that these image features are used to select the most appropriate wavelet for compressing a specific image.
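As a sanity check on how these two features would be computed, the following NumPy sketch transcribes eqs (1) and (2) directly; the function names and the random test image are illustrative and not part of the original scheme.

```python
import numpy as np

def image_activity_measure(img):
    """IAM of eq. (1): summed absolute differences between vertical and horizontal
    neighbours, normalized by the image size M x N."""
    img = np.asarray(img, dtype=np.float64)
    M, N = img.shape
    vertical   = np.abs(img[:-1, :] - img[1:, :]).sum()   # |I(i,j) - I(i+1,j)|
    horizontal = np.abs(img[:, :-1] - img[:, 1:]).sum()   # |I(i,j) - I(i,j+1)|
    return (vertical + horizontal) / (M * N)

def spatial_frequency(img):
    """SF of eq. (2): root of the mean squared row-wise and column-wise first differences."""
    img = np.asarray(img, dtype=np.float64)
    M, N = img.shape
    row_term = np.square(img[:, 1:] - img[:, :-1]).sum() / (M * N)   # (X(j,k) - X(j,k-1))^2
    col_term = np.square(img[1:, :] - img[:-1, :]).sum() / (M * N)   # (X(j,k) - X(j-1,k))^2
    return np.sqrt(row_term + col_term)

# The two features that are fed to the neural network for wavelet selection.
img = np.random.randint(0, 256, size=(128, 128))
print(image_activity_measure(img), spatial_frequency(img))
```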
# IV. Counter Propagation Neural Network

The counter propagation network is a two-layer network consisting of two feed forward layers. It performs vector-to-vector mapping, similar to hetero-associative memory networks. Compared with the Bidirectional Associative Memory (BAM), there is no feedback or delayed activation during the recall mode. The advantage of the counter propagation network is that it can be trained to perform associative mappings much faster than a typical two-layer network; it is useful in pattern mapping and association, data compression, and classification. The network is essentially a partially self-organizing look-up table that maps $R^n$ into $R^q$ and is taught from a set of training examples. The objective of the counter propagation network is to map input data vectors $x_i$ into bipolar binary responses $z_i$, for $i = 1, 2, \ldots, p$. We assume that the data vectors can be arranged into $p$ clusters and that the training data are noisy versions of the vectors $x_i$. The essential part of the counter propagation network structure is shown in Fig. 2. Counter propagation combines two different, novel learning strategies, neither of which is a gradient descent technique, and the network's recall operation also differs from the architectures seen previously.

The first layer of the network is the Kohonen layer, which is trained in the unsupervised winner-take-all mode. Each Kohonen-layer neuron represents an input cluster or pattern class, so if the layer works in local representation, that particular neuron's input and response are the largest. Similar input vectors belonging to the same cluster activate the same m-th neuron of the Kohonen layer among all $p$ neurons available in this layer. Note that the first-layer neurons are assumed to have continuous activation functions during learning. During recall, however, they respond with binary unipolar values 0 and 1; specifically, when recalling with an input representing cluster $m$, the output vector $y$ of the Kohonen layer becomes

$$\mathbf{y} = \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix}^{t} \qquad (3)$$

with the single 1 in the m-th position. Such a response can be generated as a result of lateral inhibition within the layer, which would be activated during recall in a physical system.

The second layer is called the Grossberg layer because of its outstar learning mode. This layer, with weights $v_{ij}$, functions in a familiar manner:

$$\mathbf{z} = \Gamma[\mathbf{V}\mathbf{y}] \qquad (4)$$

where the diagonal elements of the operator $\Gamma$ are $\mathrm{sgn}(\cdot)$ functions operating component-wise on the entries of the vector $\mathbf{V}\mathbf{y}$. Let us denote the column vectors of the weight matrix $\mathbf{V}$ as $v_1, v_2, \ldots, v_m, \ldots, v_p$; each weight vector $v_m$ contains the entries fanning out from the m-th neuron of the Kohonen layer. Substituting (3) into (4) gives

$$\mathbf{z} = \Gamma[\mathbf{v}_m] \qquad (5)$$

where $\mathbf{v}_m = [v_{1m}\; v_{2m}\; \cdots\; v_{qm}]^{t}$. The operation of this layer with bipolar binary neurons is thus simply to output $z_i = 1$ if $v_{im} > 0$ and $z_i = -1$ if $v_{im} < 0$, for $i = 1, 2, \ldots, q$. By assigning suitable positive and negative values to the weights $v_{im}$ highlighted in Fig. 2, a desired vector-to-vector mapping $x \rightarrow y \rightarrow z$ can be implemented by this architecture. This is done under the assumption that the Kohonen layer responds as expressed in (3). The target vector $z$ for each cluster must be available for learning, so that

$$\mathbf{v}_m = \mathbf{z} \qquad (6)$$

This, however, is an oversimplified weight learning rule for this layer, of batch rather than incremental type; it would only be appropriate if no statistical relationship existed between the input and output vectors within the training pairs $(x, z)$. In practice such relationships often exist and also need to be established in the network during training. The training rule of the Kohonen layer involves adjustment of the weight vectors in proportion to the probability of occurrence and distribution of winning events. If the Grossberg layer of eqn (4) is trained incrementally, rather than in the single-step manner of eqn (6), stationary additive noise in the output $z$ can be treated in a manner similar to the way distributed clusters were handled during training of the Kohonen layer with "noisy" inputs. The outstar learning rule exploits the fact that the learning of the vector pairs $\{(x_1, z_1), \ldots, (x_p, z_p)\}$ is done gradually and thus involves an eventual statistical balancing within the weight matrix $\mathbf{V}$. The supervised learning rule for this layer therefore becomes incremental and takes the form of the outstar learning rule:

$$\Delta\mathbf{v}_m = \beta\,(\mathbf{z} - \mathbf{v}_m) \qquad (7)$$

where $\beta$ is set to approximately 0.1 at the beginning of learning and is reduced gradually during the training process, and the index $m$ denotes the winning neuron of the Kohonen layer. The vectors $z_i$, $i = 1, 2, \ldots, p$, used for training are stationary random vectors with statistical properties that make the training possible. Note that the supervised outstar learning of eqn (7) starts only after completion of the unsupervised training of the first layer. Also, as indicated, a weight of the Grossberg layer is adjusted if and only if it fans out from a winning neuron of the Kohonen layer. As training progresses, the weights of the second layer tend to converge to the average value of the desired outputs.

Let us also note that the unsupervised training of the first layer produces active outputs at indeterminate positions; the second layer introduces ordering into the mapping, so that the network becomes a desirable look-up memory table. During the normal recall mode, the Grossberg-layer weight values $z = v_m$ connect each output node to the winning neuron of the first layer. No processing, except for the summation and the sgn(net) computation, is performed by the output-layer neurons when the outputs are binary bipolar vectors.
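A minimal NumPy sketch of the two-phase training and recall just described is given below. The synthetic clustered data, layer sizes and learning-rate schedules are illustrative assumptions, not the authors' implementation; the sketch only follows the procedure of the unsupervised Kohonen phase followed by the outstar update of eqn (7).

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, q = 2, 4, 3                                    # input size, Kohonen neurons, output size (assumed)
centers = rng.normal(scale=3.0, size=(p, n))         # cluster prototypes x_i
codes = rng.choice([-1.0, 1.0], size=(p, q))         # bipolar target code z_i for each cluster
labels = rng.integers(0, p, size=200)
X = centers[labels] + 0.3 * rng.normal(size=(200, n))  # noisy versions of the prototypes
Z = codes[labels]

W = rng.normal(size=(p, n))                          # Kohonen (first-layer) weight vectors w_i
V = np.zeros((q, p))                                 # Grossberg (outstar) weights, columns v_m

# Phase 1: unsupervised winner-take-all training of the Kohonen layer
# (winner chosen here by minimum Euclidean distance to its weight vector).
alpha = 0.5
for epoch in range(20):
    for x in X:
        m = int(np.argmin(np.linalg.norm(x - W, axis=1)))  # winning neuron
        W[m] += alpha * (x - W[m])                          # move the winner towards the input
    alpha *= 0.9

# Phase 2: supervised outstar training of the Grossberg layer, eqn (7),
# started only after the Kohonen layer has been trained.
beta = 0.1
for epoch in range(20):
    for x, z in zip(X, Z):
        m = int(np.argmin(np.linalg.norm(x - W, axis=1)))
        V[:, m] += beta * (z - V[:, m])                     # only the winner's fan-out weights change
    beta *= 0.95

# Recall: y is a unit vector with 1 at the winning position (eqn (3)),
# so the output is sgn(V y) = sgn(v_m), as in eqn (5).
x_test = X[0]
m = int(np.argmin(np.linalg.norm(x_test - W, axis=1)))
print(np.sign(V[:, m]), Z[0])
```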
The network discussed so far and shown in Fig. 2(a) is simply feed forward and does not yet reflect the counterflow of signals for which the original network was named. The full version of the counter propagation network makes use of bidirectional signal flow: the entire network consists of the network of Fig. 2(a) doubled, and it can be trained and operated in recall mode simultaneously in this arrangement. This makes it possible to use the network as an auto-associator, in the sense

$$\begin{bmatrix} \mathbf{x}' \\ \mathbf{z}' \end{bmatrix} \cong \begin{bmatrix} \mathbf{x} \\ \mathbf{z} \end{bmatrix} \qquad (8)$$

The input signals, generated by the input vector $x$ and by the desired output vector $z$, propagate through the bidirectional network in opposite directions. The vectors $x'$ and $z'$ are the respective outputs, intended to be approximations, or auto-associations, of $x$ and $z$.

Let us summarize the main features of this architecture in its simple feed forward version. In recall mode the counter propagation network functions as a nearest-match look-up table: the input vector $x$ finds the weight vector $w_m$ that is its closest match among the $p$ vectors available in the first layer, and the weights that are the entries of the vector $v_m$ fanning out from the winning m-th Kohonen neuron become, after the sgn(·) computation, the binary outputs. Because of the specific training of the counter propagation network, it outputs the statistical average of the vectors $z$ associated with input $x$. In practice, the network performs as well as a look-up table can in approximate vector matching.

Counter propagation can also be used as a continuous function approximator. Assume the training pairs are $(x_i, z_i)$ with $z_i = g(x_i)$, where $g$ is a continuous function on the set of input vectors $\{x\}$. The mean square error of the approximation can be made as small as desired by choosing a sufficiently large number $p$ of Kohonen-layer neurons. For continuous function approximation, however, the network is not as efficient as networks trained by error back propagation. Since counter propagation networks typically require orders of magnitude fewer training cycles than error back propagation training usually needs, they can be used for rapid prototyping of mappings and to speed up system development.

The counter propagation network can also use a modified competitive training condition for the Kohonen layer. So far it has been assumed that the winning neuron, for which the weights are adjusted, is the one yielding the maximum scalar product of its weights with the training pattern vector. An alternative is to choose the winning neuron of the Kohonen layer using the minimum-distance criterion directly:

$$\left\|\mathbf{x}-\mathbf{w}_{m}\right\| = \min_{i=1,2,\ldots,p}\left\{\left\|\mathbf{x}-\mathbf{w}_{i}\right\|\right\} \qquad (9)$$

The remaining aspects of weight adaptation, training and recall are unchanged; the only difference is that the weights do not have to be renormalized after each step of this training procedure.

# V. Multi Layer Feed Forward Neural Network

Consider a feed forward neural network with a single hidden layer, denoted N-h-N, where N is the number of units in the input and output layers and h is the number of units in the hidden layer. The input-layer units are fully connected to the hidden-layer units, which are in turn fully connected to the output units. The output $y_j$ of the j-th hidden unit is given by

$$y_{j} = f\!\left(\sum_{i=1}^{N} W_{ji}\,x_{i} + b_{j}\right) \qquad (10)$$

$$O_{k} = f\!\left(\sum_{j=1}^{h} W_{kj}\,y_{j} + b_{k}\right) \qquad (11)$$

where, in equation (10), $W_{ji}$ is the synaptic weight connecting the i-th input node $x_i$ to the j-th hidden unit, $b_j$ is the bias of the j-th unit, N is the number of input nodes, $f$ is the activation function and $y_j$ is the output of the hidden layer. Analogously, eqn (11) describes the subsequent layer, where $O_k$ is the k-th output of the second layer. The network is trained using a variation of the back propagation learning algorithm that minimizes the error between the network's output and the desired output. This error is given by

$$E = \sum_{k=1}^{N}\left(o_{k}-d_{k}\right)^{2} \qquad (12)$$

where $o_k$ and $d_k$ are the present output and the desired output of the k-th unit of the output layer.
For image compression, the number of units in the hidden layer h should be smaller than that in the input and output layers (i.e. h < N), so that the hidden-layer activations form a compressed representation of the input.
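To make eqs (10)-(12) and the N-h-N compression configuration concrete, here is a minimal NumPy sketch of the forward pass and the error term. The block size, hidden-layer width, activation choice and random weights are illustrative assumptions; in the actual scheme the weights would be obtained by error back propagation training.

```python
import numpy as np

rng = np.random.default_rng(1)

N, h = 64, 16                 # e.g. 8x8 pixel blocks squeezed through a narrower hidden layer (h < N)
f = np.tanh                   # one possible choice of activation function

# Illustrative weights and biases; in the coder they would come from EBP training.
W1, b1 = rng.normal(scale=0.1, size=(h, N)), np.zeros(h)   # input  -> hidden (eq. 10)
W2, b2 = rng.normal(scale=0.1, size=(N, h)), np.zeros(N)   # hidden -> output (eq. 11)

def forward(x):
    y = f(W1 @ x + b1)        # hidden outputs y_j: the compressed representation
    o = f(W2 @ y + b2)        # reconstructed outputs o_k
    return y, o

x = rng.random(N)             # one normalized image block as input
y, o = forward(x)
E = np.sum((o - x) ** 2)      # eq. (12) with the desired output d equal to the input block
print(y.shape, o.shape, E)
```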