
60% of the genes into the GO database (Additional file 1: Figure S1) [28], 73.50% of the genes into COG (Additional file 1: Figure S2) [29], 66.69% of the genes into KEGG (Additional file 1: Figure S3) [25], 97.34% of the genes into the NR database, 69.07% of the genes into SwissProt [30] and 97.34% of the genes into TrEMBL [31] (see Methods for details). Moreover, 321 genes were identified in the CAZy (Carbohydrate-Active enZYmes) database [32], 210 genes in PHI-base (Pathogen-Host Interaction database) [33], 6 genes in DBETH (Database of Bacterial Exotoxins for Human) [34] and 387 genes in VFDB (Virulence Factors Database) [35]. In addition, we screened the genome for genomic islands, prophages and CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats); no CRISPRs were found. The genome map of E. faecium strain LCT-EF90 is shown in Figure 1.

Figure 1 Genome map of E. faecium strain LCT-EF90 (ncRNA, COG annotation, GC content and GC skew). From outer to inner: the 1st circle shows the ncRNAs on the positive strand (tRNA, rRNA and sRNA); the 2nd circle shows the COG functions of the positive strand along the scaffolds, with each colour representing one functional category; the 3rd circle shows the ncRNAs on the negative strand; the 4th circle shows the COG functions of the negative strand; the 5th circle shows the GC content (black); the 6th circle shows the GC skew ((G-C)/(G+C); green > 0, purple < 0). The 5th and 6th circles are plotted relative to the average value.

Comparative genomic analysis

We used LCT-EF90 as the reference strain and detected variations between LCT-EF258 and LCT-EF90, including SNPs, InDels and structural variations (SVs) (Figure 2). For SNP identification, the query sequence was aligned with the reference sequence using MUMmer software (Version 3.22) [36] (see Methods for details). The raw variation sites were identified and then filtered with strict criteria to detect potential SNP sites. Finally, one SNP was detected in E. faecium LCT-EF258; it was located in the functional gene LCT-EF90GL001983 (Additional file 1: Table S2). This SNP was a non-synonymous substitution in dprA, a gene that encodes a DNA processing protein according to KEGG pathway analysis, and it may play an important role in phenotypic variation.

Figure 2 Comparative genomic analysis. We used BRIG software to display the alignment results of three genomes. The gray circle is LCT-EF90 and the blue circle is LCT-EF258. The white regions in the two circles are gaps in the genomes. The triangles indicate the approximate positions of the SNP and InDel mutations, which were annotated to the genes dprA and arpU.

To detect more variations, we used the LASTZ (Version 1.01.50) tool to identify InDels less than or equal to 10 bp (see Methods for details).
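
As a rough illustration of the InDel size cutoff described above, the following Python sketch scans a pairwise alignment and keeps only gap runs of 10 bp or fewer. It is not the LASTZ-based pipeline used in this study: the input format (two equal-length gapped strings with "-" as the gap character), the function name find_short_indels and the InDel record are assumptions made for this example only.

```python
from typing import List, NamedTuple

MAX_INDEL_BP = 10  # size cutoff stated in the text (<= 10 bp)


class InDel(NamedTuple):
    kind: str     # "insertion" or "deletion", relative to the reference
    ref_pos: int  # number of reference bases preceding the event (0-based)
    length: int   # length of the gap run in bp


def find_short_indels(ref_aln: str, qry_aln: str,
                      max_len: int = MAX_INDEL_BP) -> List[InDel]:
    """Return gap runs no longer than max_len from a gapped pairwise alignment.

    Hypothetical helper: ref_aln and qry_aln are assumed to be the two rows
    of one alignment, with '-' marking gaps.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    indels: List[InDel] = []
    ref_pos = 0  # reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], qry_aln[i]
        if r == "-" and q != "-":
            kind = "insertion"   # extra bases in the query
        elif q == "-" and r != "-":
            kind = "deletion"    # bases missing from the query
        else:
            # Aligned column (match or mismatch): just advance.
            if r != "-":
                ref_pos += 1
            i += 1
            continue

        start, start_ref = i, ref_pos
        if kind == "insertion":
            while i < n and ref_aln[i] == "-" and qry_aln[i] != "-":
                i += 1
        else:
            while i < n and qry_aln[i] == "-" and ref_aln[i] != "-":
                i += 1
                ref_pos += 1  # deleted bases still consume the reference

        length = i - start
        if length <= max_len:  # apply the size cutoff
            indels.append(InDel(kind, start_ref, length))
    return indels
```

For example, find_short_indels("ACG--TACGT", "ACGTTTA-GT") reports a 2 bp insertion and a 1 bp deletion, whereas a 12 bp gap run would be dropped under the default cutoff.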
