, 2004) and RnaViz 2.0 (De Rijk & De Wachter, 1997), with experimentally defined 5S rRNA used as the reference. The number of 5S rRNA genes present in a genome was determined by whole-genome BLAST search based on the known 5S rRNA sequence. Genomes that contained only a single 5S rRNA gene operon were not further analyzed. Copies of 5S rRNA genes from each remaining genome were aligned with ClustalW (Thompson et al., 1994). To calculate diversity, we normalized the number of revealed mismatches and indels by the total number of positions, including gaps in the alignment. To compare two related secondary structures, a mismatch was defined as conserved
if it did not cause a stem/loop transition (Pei et al., 2009, 2010). For example, a mismatch located in a loop was considered conserved because it maintained the loop structure, and a mismatch located in a stem but causing GC/GT conversions or covariation was also considered conserved because it did not change base-pairing or disrupt the stem. In contrast, a nonconserved mismatch was one that altered base-pairing and converted a loop to a stem or a stem to a loop.

In total, 1161 complete prokaryotic genomes were available for analysis, 86 from Archaea and 1075 from Bacteria, representing 779 unique species (75 Archaea and 704 Bacteria) (Supporting Information, Table S1). Of the 779 species, 174 had genomes containing only a single 5S rRNA gene. The remaining 605 unique species (40 Archaea and 565 Bacteria) had genomes containing multiple 5S rRNA genes and represented 27 phyla. Proteobacteria was the most abundant phylum (344 species) in the dataset, followed
by Firmicutes (123 species), Actinobacteria (82 species), Euryarchaeota (53 species), and Bacteroidetes/Chlorobi (36 species). The remaining 22 phyla were represented by only 141 species. The 605 genomes examined contained 2–19 copies of 5S rRNA genes [median = 4 copies per genome, interquartile range (IQR) = 2–6]; 388 genomes had 5S rRNA genes that were identical, and 217 had 5S rRNA genes that were diversified. For each of the 217 diversified species, the most dissimilar 5S rRNA gene pair was identified by pairwise analysis of all possible pairs. Maximal diversity ranged from 0.60% to 26.15% (median = 2.50%, IQR = 0.88–5.91%) (Wonacott & Wonacott, 1990). Sixteen genomes with > 13.44% diversity between the most dissimilar pair of 5S rRNA genes – Staphylococcus saprophyticus ssp. saprophyticus, Actinobacillus pleuropneumoniae, Thermoanaerobacter pseudethanolicus, Desulfotomaculum acetoxidans, Bifidobacterium adentium, Lactococcus lactis ssp. cremoris, Francisella novicida, Syntrophomonas wolfei ssp. wolfei, Methanosphaerula palustris, Francisella tularensis ssp. holarctica, Psychromonas ingrahamii, Bacillus megaterium, Actinobacillus succinogenes, Symbiobacterium thermophilum, Aggregatibacter aphrophilus, and Haemophilus influenzae – were classified as outliers, using Tukey’s boxplot (Wonacott & Wonacott, 1990).
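
The diversity measure, the search for the most dissimilar gene pair, and the boxplot outlier rule described above can be illustrated with a minimal Python sketch. This is not the code used in the study. The function names (`percent_diversity`, `most_dissimilar_pair`, `tukey_upper_fence`) are hypothetical, the input is assumed to be gapped sequences of equal length taken from an alignment, columns that are gaps in both sequences are assumed to be dropped, and the quartile convention (`method="inclusive"`) is an assumption, since the text does not state which one was used.

```python
from itertools import combinations
from statistics import quantiles


def percent_diversity(a: str, b: str) -> float:
    """Percent of aligned positions at which two gapped sequences differ.

    Mismatches and indels (a gap opposite a base) both count as differences.
    Assumption: columns that are gaps in both rows are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(a.upper(), b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    differences = sum(1 for x, y in columns if x != y)
    return 100.0 * differences / len(columns)


def most_dissimilar_pair(copies: dict[str, str]) -> tuple[float, str, str]:
    """Return (diversity, name_1, name_2) for the most divergent gene pair.

    `copies` maps a gene-copy label to its aligned sequence.
    """
    if len(copies) < 2:
        raise ValueError("need at least two gene copies")
    scored = (
        (percent_diversity(copies[i], copies[j]), i, j)
        for i, j in combinations(copies, 2)
    )
    return max(scored, key=lambda item: item[0])


def tukey_upper_fence(values: list[float]) -> float:
    """Upper outlier fence of Tukey's boxplot: Q3 + 1.5 * IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)
```

Applying the fence formula to the rounded quartiles reported above gives 5.91 + 1.5 × (5.91 − 0.88) ≈ 13.46%, which is close to the 13.44% cutoff; the small difference presumably reflects rounding of the reported quartiles.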