INTRODUCTION
The discovery and characterization of microbial species continues to be a long and painstaking process. Scientists have spent decades (well... really centuries) carefully studying the distinct properties of different microbes, examining the evolutionary relationships between them, and organizing them into taxonomic groups based on various commonalities. One of the many benefits of this research program, and especially of the DNA sequence characterization and organization of microbes, has been the resulting collection of reference databases that are used in sequence-based microbiome studies.
In recent years there has been a lot of excitement growing around the use of these DNA sequence reference databases to characterize microbial communities. While the spotlight has really been on microbial communities as a whole, the importance of discovering and characterizing individual microbial species has been somewhat overlooked. A recent paper in PNAS did a really nice job of highlighting the important synergy between molecular microbial community studies and studies focusing on individual microbe discovery [1]. Most importantly, this study highlights the benefits that microbiome studies provide to microbe discovery studies, instead of the other way around (i.e. microbiome studies benefiting from microbe discovery studies).
In this post, I first want to briefly go over what microbiome studies are actually doing, just to make sure we are all on the same page. Then I am going to talk about the interesting implications of the paper by Kang et al. [1]. For another brief overview of the paper by Kang et al., check out the PNAS commentary by Culley [2].
THE TWO BASIC TYPES OF MICROBIOME STUDIES
There are basically two types of molecular techniques that make use of high-throughput sequencing platforms to study microbial communities (the microbiomes). The first is the amplicon sequencing approach, where a region of a ubiquitous gene is PCR amplified (this amplified gene region is called the amplicon), sequenced, and compared to other sequences. To illustrate this technique, we can think about bacterial community analysis using the 16S rRNA gene.
The 16S rRNA gene encodes the RNA component of the small ribosomal subunit, which is part of the protein synthesis machinery within the bacterial cell, and the gene is found in all bacteria. The 16S rRNA gene contains both conserved and variable regions, with the conserved regions allowing for general PCR primer binding and the variable regions allowing for bacterial identification and community analyses (taxonomic and phylogenetic analyses). Scientists use this gene to study bacterial communities by PCR amplifying certain regions of the gene, sequencing those PCR products (amplicons), and using those sequences in community analyses.
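To make the conserved/variable idea a bit more concrete, here is a minimal Python sketch of an "in silico PCR". Everything in it is made up for illustration: the tiny sequences, the primers, and the function names are my own assumptions, not real 16S sequences or any published primer set. It also only accepts exact primer matches, whereas real primers tolerate mismatches and often contain degenerate bases.

```python
# Toy "in silico PCR": primers bind conserved flanks, the variable middle differs.
# All sequences and primers below are invented for illustration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the region spanning both primer sites, or None if either is missing.

    Only exact matches are considered (a simplifying assumption).
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer binds the opposite strand, so on this strand we
    # look for its reverse complement downstream of the forward primer.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Two hypothetical "species" sharing conserved flanks around a variable region.
species_a = "GG" + "ACGTGG" + "AATCGA" + "TTCAGC" + "CC"
species_b = "TA" + "ACGTGG" + "CCGGTA" + "TTCAGC" + "AT"

forward_primer = "ACGTGG"                      # matches the conserved left flank
reverse_primer = reverse_complement("TTCAGC")  # binds the conserved right flank

amplicon_a = extract_amplicon(species_a, forward_primer, reverse_primer)
amplicon_b = extract_amplicon(species_b, forward_primer, reverse_primer)

# Strip the primer sites to compare only the variable regions.
variable_a = amplicon_a[len(forward_primer):-len(reverse_primer)]
variable_b = amplicon_b[len(forward_primer):-len(reverse_primer)]
print(variable_a, variable_b)
```

The same primer pair pulls an amplicon out of both toy sequences, and it is the stretch between the primer sites that tells the two apart.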
Some populations, notably virus and bacteriophage populations, do not have ubiquitous genes that can be used in amplicon-based molecular techniques, so scientists have to use whole metagenome shotgun sequencing techniques instead. Rather than focusing on a certain genetic region like amplicon-based techniques do, this second approach involves randomly sequencing fragments from all of the genomes within the population, reassembling them (based on sequence similarity) into large sections of genomes, and using those genomic sections for community analyses. The video below does a great job illustrating the basic concept behind whole metagenome shotgun sequencing.
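For readers who like to see the idea in code, the reassembly step can be sketched with a toy greedy overlap assembler in Python. This is only a sketch under strong assumptions: the "genome" and reads are invented, the reads are error-free and all come from one strand, and the function names are my own. Real assemblers deal with sequencing errors, repeats, reverse complements, and many genomes at once.

```python
# Toy greedy overlap assembly: repeatedly merge the two reads with the
# longest suffix/prefix overlap. Invented data, for illustration only.


def overlap(a: str, b: str, min_len: int = 4) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 4):
    """Merge reads by best overlap until no overlap of at least min_len remains."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


# A made-up 25-base "genome" cut into overlapping 10-base fragments.
genome = "ATGGCGTACGTTAGCCTAGGCATCG"
reads = [genome[start:start + 10] for start in range(0, 16, 5)]

contigs = greedy_assemble(reads)
print(contigs)
```

In this toy case the four fragments overlap their neighbors by five bases, so the merged result is a single contig covering the whole made-up genome. In a real metagenome, reads come from many genomes at uneven coverage, so the output is typically many partial contigs rather than complete genomes.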
GIVING A SINGLE VIRUS CONTEXT USING THE MICROBIOME
In their somewhat recent study, Kang et al. discovered and characterized the HMO-2011 bacteriophage, which infects bacteria of the SAR116 clade, one of the most abundant marine bacterial lineages. At first, their study goes into the usual virus characterization (high points include electron microscope images, a genome sequence overview, etc.) and also goes over how the HMO-2011 phage interacts with its SAR116 host. This is all really interesting of course, but the overall highlight of the paper was what happened when they searched for matches to the HMO-2011 phage genome in existing phage metagenome data sets.
Virus metagenomes often contain a large number of unknown sequences (sometimes up to 99% of sequences are unknown), meaning they do not match any known virus, or even any known organism. This clearly limits the utility of some virus metagenomic data sets. To understand how their newly characterized phage fit into known virus communities, Kang et al. searched for matches to their new phage genome in existing phage metagenome data sets, and found something quite interesting. The group's phage matched up to 30% more sequences, meaning that the addition of their single new phage genome to the reference database provided 30% more sequences with matches to known viruses (overview in figure below) [1,2]. Said another way, this study used existing marine virus metagenomes to give context to the phage discovery, and likewise added important information to the existing marine phage metagenomes.
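The bookkeeping behind a statement like "one new genome gives more sequences a match" can be sketched in a few lines of Python. To be clear, this is a toy under my own assumptions: the sequences and numbers are invented and are not the data from Kang et al., the names are hypothetical, and exact substring matching is a crude stand-in for the similarity searches used in real studies.

```python
# Toy illustration: how adding one genome to a reference database changes
# the share of metagenome reads that have a match. All data are invented.


def fraction_matched(reads, references):
    """Share of reads found as an exact substring of any reference genome."""
    hits = sum(any(read in genome for genome in references.values())
               for read in reads)
    return hits / len(reads)


# Hypothetical reference database before the new phage is characterized.
references = {"known_phage": "ATGGCGTACGTTAGCCTAGG"}

# Hypothetical newly characterized phage genome.
new_phage = "CCGATTTGACCAGTAAGCTT"

# Ten made-up metagenome reads: 2 from the known phage, 2 from the new
# phage, and 6 that match neither genome.
metagenome_reads = [
    "GCGTACG", "TTAGCC",
    "GATTTGA", "CCAGTAA",
    "GGGGGGG", "TTTTTTT", "ACACACA", "CACGTGT", "AAAAAAA", "GTGTGTG",
]

before = fraction_matched(metagenome_reads, references)
after = fraction_matched(metagenome_reads, {**references, "new_phage": new_phage})

print(f"matched before: {before:.0%}, after: {after:.0%}")
```

The metagenome itself never changes here. Only the reference database grows, which is why previously sequenced data sets can become more informative every time a new microbe is characterized.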
The second reason this discovery is important is that it really supports the value of research that focuses on discovering novel microbes. There are a lot of microbes that have not yet been discovered or characterized, so this research must continue if we are to improve our understanding of microbial communities. Microbiome studies have somewhat overshadowed single microbe studies, but this paper really shows how important the single microbe studies are.
Finally, this paper's discoveries are important because they shed some light on how much unknown data microbiome sequences contain, and on how previous microbiome studies will continue to be important when integrated into future analyses. Much like how species discovery aids microbiome studies by providing extensive reference collections, microbiome studies will aid microbe discovery by giving context to new discoveries and allowing a greater understanding of how they fit into the bigger picture. Additionally, as more microbiomes are characterized, I predict we will see them used more often to give context to newly characterized microbes.
2. Culley AI (2013). Insight into the unknown marine virus majority. Proceedings of the National Academy of Sciences of the United States of America, 110 (30), 12166-7 PMID: 23842091