# Introduction

Chloroplasts, the organelles responsible for photosynthesis, are in many respects similar to mitochondria. Both chloroplasts and mitochondria function to generate metabolic energy, evolved by endosymbiosis, contain their own genetic systems, and replicate by division. However, chloroplasts are larger and more complex than mitochondria, and they perform several critical tasks in addition to the generation of ATP. Most importantly, chloroplasts are responsible for the photosynthetic conversion of carbon dioxide to carbohydrates. In addition, chloroplasts synthesize amino acids, fatty acids, and the lipid components of their own membranes. The reduction of nitrite to ammonia, an essential step in the incorporation of nitrogen into organic compounds, also occurs in chloroplasts. Moreover, chloroplasts are only one of several types of related organelles (plastids) that play a variety of roles in plant cells [1][2][3][4][5][6][7].

Microsatellites (sometimes referred to as variable number tandem repeats, or VNTRs) are short segments of DNA in which a short sequence motif is repeated in tandem. In some microsatellites the repeated unit may occur four times; in others it may occur seven, two, or three times [8]. These repeats are ubiquitous in nature and have been implicated in several diseases and cancers [9][10]. They are used in applications such as DNA fingerprinting, DNA forensics, and paternity studies, and have been considered as potential markers for identifying species, for establishing phylogenetic relationships, and for studying evolution [11]. Microsatellites are found in both coding and non-coding regions of all organisms, and their distribution in coding regions (genes) is known to affect protein formation and gene regulation [12]. Next-generation sequencing has enabled researchers to study biological systems at a level never before possible.
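To make the definition of a microsatellite concrete, the following is a minimal Python sketch that scans a DNA string for perfect tandem repeats with motif sizes from 1 to 6 bp. It is an illustration only and not the method used in this study, which relies on imperfect repeats taken from a database. The function name `find_perfect_ssrs` and the 12 bp minimum tract length are assumptions chosen for the example.

```python
import re

def find_perfect_ssrs(sequence, min_motif=1, max_motif=6, min_tract=12):
    """Return (start, motif, copies, tract_length) for perfect tandem repeats.

    Hypothetical helper: the thresholds are illustrative assumptions,
    not parameters taken from the paper.
    """
    seq = sequence.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by one or more immediate copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so that each tract is reported once under its smallest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            tract = match.end() - match.start()
            if tract >= min_tract:
                hits.append((match.start(), motif, tract // k, tract))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "AT" * 7 + "CCG" + "A" * 13 + "T"
    for start, motif, copies, tract in find_perfect_ssrs(demo):
        print(start, motif, copies, tract)
```

On the toy sequence in the `__main__` block, the sketch reports a dinucleotide tract (`AT` repeated 7 times, 14 bp) and a mononucleotide tract (`A` repeated 13 times). Imperfect repeats, which tolerate point mutations inside the tract, need a more elaborate search than this regular-expression approach.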
Studying mutations in chloroplast microsatellite repeats can help answer various biological questions and supports the use of these repeats in diverse applications. A few earlier studies [13][14][15][16] analyzed the distribution of microsatellites in chloroplast genomes, but they were confined to a single genome or a very small number of genomes. This paper describes an analysis of microsatellite repeats in 370 chloroplast genomes and presents the details.

# II. Materials & Methods

Imperfect microsatellites were extracted from ChloroMitoSSRDB [17] version 2.0, an open-source microsatellite repository of sequenced organelle genomes. A total of 370 chloroplast genome sequences were used in this study; they belong to various classes, as shown in Table 1.

# Discussion

# a) Genome Size Analysis

We first carried out a preliminary analysis of the genome sizes of all chloroplasts. Chloroplast genome sizes vary from a few kilobases to a maximum of about 1 Mb. The smallest genome in the dataset is the apicoplast of Plasmodium falciparum HB3 (ID: NC_017928), at 29529 bp, which belongs to the Non-Viridiplantae category. The largest is the chromatophore of Paulinella chromatophora (ID: NC_011087), at 1021616 bp, which belongs to Rhizaria. In Viridiplantae, the smallest genome is the Helicosporidium sp. ex Simulium jonesii plastid (ID: NC_008100), at 37454 bp, whereas the largest is the Floydiella terrestris chloroplast (ID: NC_014346), at 521168 bp. In Non-Viridiplantae, the smallest and largest genomes are the two overall extremes: the Plasmodium falciparum HB3 apicoplast (29529 bp) and the Paulinella chromatophora chromatophore (ID: NC_011087; 1021616 bp). The largest Non-Viridiplantae genome is therefore larger than any Viridiplantae genome.
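The size comparison above can be sketched in a few lines of Python. The example below uses only the four genomes named in the text, assigns each to a size range with boundaries at 10 kb, 50 kb, 100 kb, and 500 kb, and finds the smallest and largest genome per class. The names `size_bin` and `summarize` are hypothetical, and the sketch assumes 1 kb = 1000 bp.

```python
from collections import Counter

# (accession, class, length in bp) for the four genomes named in the text.
GENOMES = [
    ("NC_017928", "Non-Viridiplantae", 29529),
    ("NC_011087", "Non-Viridiplantae", 1021616),
    ("NC_008100", "Viridiplantae", 37454),
    ("NC_014346", "Viridiplantae", 521168),
]

# Lower bounds of the size ranges, largest first (assumption: 1 kb = 1000 bp).
BINS = [
    (500_000, ">= 500 kb"),
    (100_000, "100-500 kb"),
    (50_000, "50-100 kb"),
    (10_000, "10-50 kb"),
]


def size_bin(length_bp):
    """Return the label of the size range that contains length_bp."""
    for lower, label in BINS:
        if length_bp >= lower:
            return label
    return "< 10 kb"


def summarize(genomes):
    """Count genomes per (class, size range) and find per-class extremes."""
    counts = Counter((cls, size_bin(n)) for _, cls, n in genomes)
    extremes = {}
    for acc, cls, n in genomes:
        smallest, largest = extremes.get(cls, ((acc, n), (acc, n)))
        if n < smallest[1]:
            smallest = (acc, n)
        if n > largest[1]:
            largest = (acc, n)
        extremes[cls] = (smallest, largest)
    return counts, extremes


if __name__ == "__main__":
    counts, extremes = summarize(GENOMES)
    for cls, (smallest, largest) in extremes.items():
        print(cls, "smallest:", smallest, "largest:", largest)
```

Applied to the full set of 370 genome lengths instead of these four records, the same two functions would yield the per-range counts and the per-class extremes discussed in this section.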
When the average genome sizes are considered by category, the average length of Viridiplantae chloroplast genomes is slightly higher than that of Non-Viridiplantae genomes (see Fig. 1). Table 2 summarizes the number of genomes in each size range for the two classes of chloroplast. The majority of genome sizes lie between 10 kb and 500 kb; only two genomes, the Floydiella terrestris chloroplast (NC_014346) and the Paulinella chromatophora chromatophore (NC_011087), are larger than 500 kb. Within Viridiplantae, 311 genomes have sizes between 100 kb and 500 kb.

# b) Distribution of Microsatellites

Microsatellites in or near genes (coding regions) are known to affect protein formation and gene regulation. Overall, about 58% of microsatellite repeats fall in the coding regions of chloroplast genomes: of the total 78536 chloroplast microsatellites, 45518 fall in gene regions, whereas the remaining 33018 fall in non-coding regions. The distribution differs somewhat when the two classes are considered separately (see Fig. 2). Non-Viridiplantae genomes have the majority of their microsatellites (64%) in coding regions, whereas green plants (Viridiplantae) have around 57% of their microsatellites in coding regions. When the chloroplast categories are compared (see Fig. 3), they exhibit a broadly similar distribution of microsatellites between coding and non-coding regions. It would be interesting to study the reason behind the large number of microsatellite repeats in Viridiplantae.

# c) Motif-size wise Analysis

We further analyzed the distribution of chloroplast microsatellites based on their motif sizes.
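As a worked illustration of the motif-size breakdown, the sketch below converts the per-motif counts reported in Table 3 into percentage shares and adds up the tri- and tetranucleotide shares for each class. The counts are copied from Table 3; the function names `percentage_shares` and `combined_share` are hypothetical.

```python
# Microsatellite counts per motif size, as reported in Table 3.
MOTIF_COUNTS = {
    "Non-Viridiplantae": {
        "mono": 159, "di": 840, "tri": 3506,
        "tetra": 3300, "penta": 623, "hexa": 365,
    },
    "Viridiplantae": {
        "mono": 8602, "di": 7909, "tri": 17055,
        "tetra": 26796, "penta": 5680, "hexa": 3701,
    },
}


def percentage_shares(counts):
    """Return each motif size's share of the class total, in percent."""
    total = sum(counts.values())
    return {motif: 100.0 * n / total for motif, n in counts.items()}


def combined_share(counts, motifs):
    """Return the summed percentage share of the given motif sizes."""
    shares = percentage_shares(counts)
    return sum(shares[m] for m in motifs)


if __name__ == "__main__":
    for cls, counts in MOTIF_COUNTS.items():
        tri_tetra = combined_share(counts, ("tri", "tetra"))
        print(f"{cls}: total={sum(counts.values())}, tri+tetra={tri_tetra:.2f}%")
```

The class totals computed this way (8793 and 69743) add up to the 78536 microsatellites reported for the whole dataset.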
Table 3 lists the proportional distribution of chloroplast microsatellites by motif size. Chloroplast genomes are rich in tri- and tetranucleotide repeats, which together account for more than 77% of repeats in Non-Viridiplantae and about 63% in Viridiplantae. Mono-, penta- and hexanucleotide repeats are comparatively few.

When the microsatellite tract lengths were analyzed, the genomes showed a few interesting tract lengths for almost all motif sizes (Table 4). The average tract length is below 20 bp for every motif size. However, some tri- and tetranucleotide repeats have exceptionally long tracts, as large as 276 bp. Based on the results in Table 4, we searched for repeats in chloroplast genomes with exceptional tract lengths. We found 10 repeats with tract lengths of 100 bp or more; of those, two have tract lengths of 200 bp or more. The two longest tracts, 276 bp and 203 bp, were found in the genomes with IDs NC_020321 and NC_008117, respectively.

# IV. Conclusion

In this paper, we have presented a brief description of the distribution of microsatellite repeats in all sequenced chloroplast genomes. This study forms the first comprehensive analysis of microsatellite repeats in chloroplast genomes, and its statistics can be a useful resource for biologists.

![Figure 1: Bar graph representing the average genome sizes of Viridiplantae and Non-Viridiplantae](image-2.png "Figure 1")

Figure 1 data (chart title: Average genome sizes of chloroplast; y-axis: Genome size): Viridiplantae 148178.28 bp, Non-Viridiplantae 136551.53 bp.

![Figure 2: Distribution of microsatellite repeats in coding and non-coding regions of Viridiplantae and Non-Viridiplantae](image-3.png "Figure 2")

![Figure 3: Distribution of microsatellite repeats in coding and non-coding regions for all chloroplast categories](image-4.png "Figure 3")

Table 1

| Category | Total No. |
|---|---|
| Alveolata | 9 |
| Cryptophyta | 3 |
| Euglenozoa | 5 |
| Glaucocystophyceae | 1 |
| Haptophyceae | 4 |
| Rhizaria | 2 |
| Rhodophyta | 9 |
| Stramenopiles | 14 |
| Viridiplantae | 323 |
| Total Genomes | 370 |

Among the 370 genomes, 323 belong to Viridiplantae (green plants) and 47 belong to Non-Viridiplantae, which includes genomes of Alveolata, Cryptophyta, Euglenozoa, Glaucocystophyceae, Haptophyceae, Rhizaria, Rhodophyta, and Stramenopiles.

Table 2

| Size range | Class | No. of genomes |
|---|---|---|
| >= 10 kb and < 50 kb | Non-Viridiplantae | 5 |
| | Viridiplantae | 2 |
| >= 50 kb and < 100 kb | Non-Viridiplantae | 10 |
| | Viridiplantae | 9 |
| >= 100 kb and < 500 kb | Non-Viridiplantae | 31 |
| | Viridiplantae | 311 |

Table 3

| Motif size | Non-Viridiplantae | Viridiplantae |
|---|---|---|
| Mono | 159 (1.80%) | 8602 (12.33%) |
| Di | 840 (9.55%) | 7909 (11.34%) |
| Tri | 3506 (39.87%) | 17055 (24.45%) |
| Tetra | 3300 (37.52%) | 26796 (38.42%) |
| Penta | 623 (7.08%) | 5680 (8.14%) |
| Hexa | 365 (4.15%) | 3701 (5.31%) |
| Total | 8793 | 69743 |

Table 4 (tract lengths in bp)

| Motif size | Non-Viridiplantae High | Non-Viridiplantae Low | Non-Viridiplantae Avg | Viridiplantae High | Viridiplantae Low | Viridiplantae Avg |
|---|---|---|---|---|---|---|
| Mono | 25 | 12 | 13.93 | 46 | 12 | 14.49 |
| Di | 54 | 11 | 12.90 | 83 | 11 | 13.24 |
| Tri | 51 | 11 | 12.19 | 276 | 11 | 12.38 |
| Tetra | 29 | 11 | 11.91 | 203 | 11 | 12.13 |
| Penta | 65 | 14 | 15.27 | 100 | 14 | 15.41 |
| Hexa | 42 | 17 | 18.74 | 145 | 17 | 19.70 |

© 2015 Global Journals Inc. (US). Global Journal of Computer Science and Technology, Volume XV, Issue III, Version I.

# References

[1] J. D. Palmer, W. F. Thompson. Chloroplast DNA rearrangements are more frequent when a large inverted repeat sequence is lost. Cell 29(2), 1982.

[2] G. J. Bryan, J. McNicoll, G. Ramsay, R. C. Meyer, W. S. De Jong. Polymorphic simple sequence repeat markers in chloroplast genomes of Solanaceous plants. Theoretical and Applied Genetics 99(5), 1999.

[3] W. Powell, M. Morgante, R. McDevitt, G. G. Vendramin, J. A. Rafalski. Polymorphic simple sequence repeat regions in chloroplast genomes: applications to the population genetics of pines. Proceedings of the National Academy of Sciences 92(17), 1995.

[4] M. Reith, J. Munholland. Complete nucleotide sequence of the Porphyra purpurea chloroplast genome. Plant Molecular Biology Reporter 13(4), 1995.

[5] J. Shaw, E. B. Lickey, E. E. Schilling, R. L. Small. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: the tortoise and the hare III. American Journal of Botany 94(3), 2007.

[6] M. E. Cosner, R. K. Jansen, J. D. Palmer, S. R. Downie. The highly rearranged chloroplast genome of Trachelium caeruleum (Campanulaceae): multiple inversions, inverted repeat expansion and contraction, transposition, insertions/deletions, and several repeat families. Current Genetics 31(5), 1997.

[7] R. Cronn, A. Liston, M. Parks, D. S. Gernandt, R. Shen, T. Mockler. Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology. Nucleic Acids Research 36(19), 2008.

[8] C. Schlötterer. Evolutionary dynamics of microsatellite DNA. Chromosoma 109, 2000.

[9] D. Tautz, C. Schlötterer. Simple sequences. Curr. Opin. Genet. Dev. 4, 1994.

[10] S. N. Thibodeau, G. Bren, D. Schaid. Microsatellite instability in cancer of the proximal colon. Science 260(5109), 1993.

[11] D. B. Goldstein, C. Schlötterer. Microsatellites: evolution and applications. Oxford University Press, Oxford, 2001.

[12] Y. C. Li, A. B. Korol, T. Fahima, E. Nevo. Microsatellites within genes: structure, function, and evolution. Molecular Biology and Evolution 21(6), 2004.

[13] P. Rajendrakumar, A. K. Biswal, S. M. Balachandran, K. Srinivasarao, R. M. Sundaram. Simple sequence repeats in organellar genomes of rice: frequency and distribution in genic and intergenic regions. Bioinformatics 23, 2007.

[14] P. Rajendrakumar, A. K. Biswal, S. M. Balachandran, R. M. Sundaram. In silico analysis of microsatellites in organellar genomes of major cereals for understanding their phylogenetic relationships. In Silico Biol 8, 2008.

[15] H. Kuntal, V. Sharma, H. Daniell. Microsatellite analysis in organelle genomes of Chlorophyta. Bioinformation 8, 2012.

[16] E. Kassai-Jáger, C. Ortutay, G. Tóth, T. Vellai, Z. Gáspári. Distribution and evolution of short tandem repeats in closely related bacterial genomes. Gene 410(1), 2008.

[17] G. Sablok, S. B. Mudunuri, S. Patnana, M. Popova, M. A. Fares, N. La Porta. ChloroMitoSSRDB: open source repository of perfect and imperfect repeats in organelle genomes for evolutionary genomics. DNA Research 20(2), 2013.