tag:blogger.com,1999:blog-89710270810519369892024-03-18T09:19:41.650-04:00ProphageThe Blog for Bacteria, Phages, Computers, and ScienceAnonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.comBlogger93125tag:blogger.com,1999:blog-8971027081051936989.post-83574004679493692542017-11-18T19:47:00.000-05:002017-11-18T19:47:15.214-05:00We Are Moving!After a couple awesome years and over 100,000 views, Prophage is moving to a new site. Check out the first post on the new site <a href="http://microbiology.github.io/blog/hello-world/">here</a>. Follow the link below for the new site.<br />
<br />
<a href="http://microbiology.github.io/">New Prophage Blog Site</a><br />
<br />
Improving Your Skill Set: Tips for Learning New Programming Languages (2017-06-17)<br />
<div style="text-align: justify;">
<div class="separator" style="clear: both; text-align: center;">
</div>
<a href="http://ebookskart.com/wp-content/uploads/2016/11/Computer-Programming.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://ebookskart.com/wp-content/uploads/2016/11/Computer-Programming.jpg" data-original-height="546" data-original-width="728" height="240" width="320" /></a>I spend a considerable amount of my time with scientists who are starting to learn to code (either IRL or online), often in the hope that it will open future research and career doors. One of the major barriers I have found in my own experience, and observed in others, is learning and implementing a new programming language after already learning one. If we are honest, learning a programming language is very challenging but also incredibly rewarding. So do we really want to go through that again with a new language, and will the new language open as many analytical doors as the first? This week I want to discuss the process of learning new programming languages and offer some tips, if that is something that interests you.</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<h3 style="text-align: justify;">
Why Learn a New Programming Language?</h3>
<div style="text-align: justify;">
Once you know how to program with a language like <a href="https://www.python.org/">Python</a> or <a href="https://www.r-project.org/">R</a>, why bother learning other languages? This is a logical question, and one answer is that different languages offer very different strengths and weaknesses. Some languages are faster than others, some use memory more efficiently, some are just easier to read and write (we're looking at you, <a href="https://www.perl.org/">Perl</a>), and some have stronger communities to support your applications (e.g. many scientific applications are <a href="https://cran.r-project.org/">supported by the R community</a>). Being familiar with, if not proficient in, multiple languages lets you take advantage of the strengths of each and apply the tool that best fits the job.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Another reason for learning new languages is that most languages are not around forever and are eventually replaced. This makes "keeping up" a very valuable skill. Yet another good reason is that exposure to different structures allows you to think about all of your code and data in new and challenging ways. This is a great exercise for becoming a better all-around scientist.</div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
Don't Sell Yourself Short</h3>
<div style="text-align: justify;">
Through various conversations and my own experiences, I have learned that it is really easy for us to sell ourselves short. We see a new and unfamiliar language and think, "I can't learn how to use that; I barely learned the language I know now." The fact is that the first language is the hardest, and you will be surprised how much faster you pick up new languages. So we really need to "go for it" and check out the new languages we think will be helpful. The worst thing we can do is disqualify ourselves before giving it a try.</div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
Skills For Learning New Languages (In 30 Minutes)</h3>
<div style="text-align: justify;">
Last week I gave a presentation to my lab about this topic. The main goal was to illustrate how each of us can learn new programming languages faster and more effectively than we give ourselves credit for. To this end, I proposed that we break into small groups (although you can do this by yourself as well) and solve the following simplified Fizz Buzz test.</div>
<div style="text-align: justify;">
<br /></div>
<blockquote class="tr_bq" style="text-align: justify;">
Write a program that prints the numbers 1 to 100. Print the word "FIZZ" next to every number divisible by 7, and the word "BUZZ" by every number that is not divisible by 7.</blockquote>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The trick here is that we solved the problem using <a href="https://julialang.org/">Julia</a>, a language that nobody else in our lab had used (if you want to do this yourself, pick a different language if you are familiar with Julia). I mentioned to the group that, when I start familiarizing myself with a new language to solve a specific task, I begin by breaking the problem up into smaller, "Google-able" problems. These can then be solved and put together to form the overall code. Smaller problems within our Fizz Buzz test include:</div>
<div style="text-align: justify;">
<br /></div>
<ul>
<li style="text-align: justify;">How do I print words and numbers?</li>
<li style="text-align: justify;">How do I create a range of numbers (1, 2, 3, ...)?</li>
<li style="text-align: justify;">How do I loop tasks?</li>
<li style="text-align: justify;">How do I perform conditional evaluations?</li>
</ul>
<br />
<div style="text-align: justify;">
After going through this, we broke up into groups and everyone was able to solve the problem within 30 minutes. When we came back together, we discussed our solutions while reflecting on the following points:</div>
<div style="text-align: justify;">
<br /></div>
<ul>
<li style="text-align: justify;">How did we break up the tasks?</li>
<li style="text-align: justify;">What was the most challenging part of the problem?</li>
<li style="text-align: justify;">What does the solution look like (show the code to the rest of the lab)?</li>
</ul>
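<div style="text-align: justify;">
To give a feel for how the small "Google-able" pieces fit together, here is a minimal sketch of one possible solution. It is written in Bash rather than Julia so that the Julia exercise stays unspoiled. The function name "FizzBuzz" and its optional upper-limit argument are illustrative choices for this sketch, not the code our groups wrote.</div>
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;">FizzBuzz () {
    # Upper limit of the range (defaults to 100)
    local last="${1:-100}"
    local n
    # Create the range of numbers and loop over it
    for n in $(seq 1 "${last}"); do
        # Conditional evaluation: is the number divisible by 7?
        if [ $((n % 7)) -eq 0 ]; then
            # Print the number and the word together
            echo "${n} FIZZ"
        else
            echo "${n} BUZZ"
        fi
    done
}

FizzBuzz 100
</pre>
</div>
<div style="text-align: justify;">
Each comment lines up with one of the smaller questions from the list above, which is really the whole point: once those are answered, the full program is just the answers stacked together.</div>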
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<a href="https://www.roberthalf.com.hk/sites/roberthalf.com.hk/files/best-programming-language-sg.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="537" data-original-width="800" height="214" src="https://www.roberthalf.com.hk/sites/roberthalf.com.hk/files/best-programming-language-sg.jpg" width="320" /></a>And there you have it. After a meeting of about 45 minutes, everyone in the room was able to solve an analytical task using a totally new language. I definitely encourage you to give it a try too. If you run into problems, or have more questions about how you can use this approach as a teaching tool, reach out in the comments below, on Twitter, or by email (my contact info is on the right side of the page). As always, please also get in touch if you have any other questions, comments, or concerns. I always love hearing from people.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
I also wanted to include a note about the state of this blog. You may have noticed that posting has become less frequent lately. I have a lot of other projects going on right now, which are awesome but also mean I don't have a ton of time to devote to blogging. I look forward to getting back into a more regular routine, but for now the posts are going to be more spread out. Thanks for reading!</div>
<br />
<br />
A Primer on Downloading Sequencing Data from MG-RAST & the SRA (2017-05-08)<br />
<div style="text-align: justify;">
<a href="https://nmap.org/movies/matrix/trinity-nmapscreen-hd-crop-1200x728.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="194" src="https://nmap.org/movies/matrix/trinity-nmapscreen-hd-crop-1200x728.jpg" style="cursor: move;" width="320" /></a>One of the best sets of resources we have for bioinformatics, and especially microbiome research, is the collection of extensive and freely available DNA sequence archives. For the past few years, most studies have archived (and in most cases have been required to archive) their relevant sequence datasets so that they are freely available to the public and other researchers. This is becoming an increasingly valuable resource for data mining and meta-analyses now that we have about a decade of archiving behind us. Just as these datasets can be highly valuable research tools, they can also be particularly difficult resources to download and prepare for analysis. I have been meaning to get to this for a while, so this week I want to go through an introduction to downloading these datasets. My goal is to equip you to easily get the sequence sets onto your own computer and start your own analysis.</div>
<br />
<br />
<a name='more'></a><br />
<h3>
The Sequence Read Archive (SRA)</h3>
<div style="text-align: justify;">
One of the largest (if not the largest) sequence dataset archives available to the public is the <a href="https://www.ncbi.nlm.nih.gov/sra">United States National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA)</a>. This sequence archive has years of DNA sequencing studies readily available, but getting the reads can be a little bit of a challenge. They do have instructions (and other tools for downloading) in their documentation, but to make things easier, we will go through it here while including some custom scripts that you can use.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
An easy way to get SRA datasets using command line tools is to download the data from their FTP site (no worries if you don't know what that is; it's just a site to download data from). As long as you are downloading a small-ish dataset, the wget tool works great. A nice subroutine you can use is as follows.</div>
<!-- HTML generated using hilite.me --><br />
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;">DownloadFromSRA <span style="color: #333333;">()</span> <span style="color: #333333;">{</span>
<span style="color: #996633;">line</span><span style="color: #333333;">=</span><span style="background-color: #fff0f0;">"${1}"</span>
<span style="color: #007020;">echo </span>Processing SRA Accession Number <span style="background-color: #fff0f0;">"${line}"</span>
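<span style="color: #888888;"># Output is assumed to be set beforehand (name of the output subdirectory)</span>
mkdir -p ./data/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">Output</span><span style="color: #008800; font-weight: bold;">}</span>/<span style="background-color: #fff0f0;">"${line}"</span>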
<span style="color: #996633;">shorterLine</span><span style="color: #333333;">=</span><span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">line</span>:<span style="color: #996633;">0</span>:<span style="color: #996633;">3</span><span style="color: #008800; font-weight: bold;">}</span>
<span style="color: #996633;">shortLine</span><span style="color: #333333;">=</span><span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">line</span>:<span style="color: #996633;">0</span>:<span style="color: #996633;">6</span><span style="color: #008800; font-weight: bold;">}</span>
<span style="color: #007020;">echo </span>Looking <span style="color: #008800; font-weight: bold;">for</span> <span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">shorterLine</span><span style="color: #008800; font-weight: bold;">}</span> with <span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">shortLine</span><span style="color: #008800; font-weight: bold;">}</span>
<span style="color: #888888;"># Recursively download the contents of the study directory</span>
wget -r --no-parent -A <span style="background-color: #fff0f0;">"*"</span> ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByStudy/sra/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">shorterLine</span><span style="color: #008800; font-weight: bold;">}</span>/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">shortLine</span><span style="color: #008800; font-weight: bold;">}</span>/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">line</span><span style="color: #008800; font-weight: bold;">}</span>/
mv ./ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByStudy/sra/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">shorterLine</span><span style="color: #008800; font-weight: bold;">}</span>/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">shortLine</span><span style="color: #008800; font-weight: bold;">}</span>/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">line</span><span style="color: #008800; font-weight: bold;">}</span>/*/*.sra ./data/<span style="color: #008800; font-weight: bold;">${</span><span style="color: #996633;">Output</span><span style="color: #008800; font-weight: bold;">}</span>/<span style="background-color: #fff0f0;">"${line}"</span>
rm -r ./ftp-trace.ncbi.nih.gov
<span style="color: #333333;">}</span>
</pre>
<pre style="line-height: 125%; margin: 0;"><span style="color: #333333;">
</span></pre>
<pre style="line-height: 125%; margin: 0;"><span style="color: #007020;">export</span> -f DownloadFromSRA</pre>
</div>
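<div style="text-align: justify;">
The two substring lines in the function are what build the nested directory path on the FTP site. As a small illustration of how that expansion works, here is a sketch where "SraStudyPath" is a hypothetical helper name and SRP012345 is a made-up accession used only for the example.</div>
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;">SraStudyPath () {
    local line="${1}"
    # First three characters of the accession, e.g. SRP
    local shorterLine="${line:0:3}"
    # First six characters of the accession, e.g. SRP012
    local shortLine="${line:0:6}"
    # Join the pieces the same way the download URL does
    echo "${shorterLine}/${shortLine}/${line}"
}

SraStudyPath SRP012345
# prints SRP/SRP012/SRP012345
</pre>
</div>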
<br />
<div style="text-align: justify;">
If you copy and paste the DownloadFromSRA function above into your command line (Linux/Mac), you can just type the subroutine name "DownloadFromSRA", followed by the project ID that you want to use, and it will download all of the samples for you. If you are using a Mac, be sure to install wget using something like <a href="https://brew.sh/">Homebrew</a> (which I highly suggest for installing tools in general). The files you get will be in the SRA format, so remember to convert them to fastq format using the tools in NCBI's SRA Toolkit.</div>
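<div style="text-align: justify;">
For the conversion step, a minimal sketch might look like the following. It assumes the SRA Toolkit is installed and that its fastq-dump command is on your PATH. The name "ConvertSraToFastq" and the directory names in the usage comment are hypothetical, and the --split-files and --gzip options are just one reasonable choice (separate files for paired reads, compressed output), so adjust them to your own data.</div>
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;">ConvertSraToFastq () {
    local inputDir="${1}"
    local outputDir="${2}"
    mkdir -p "${outputDir}"
    local sraFile
    for sraFile in "${inputDir}"/*.sra; do
        # Skip the literal pattern when the directory has no .sra files
        if [ ! -e "${sraFile}" ]; then
            continue
        fi
        echo Converting "${sraFile}"
        # Write gzipped fastq files, splitting paired reads into separate files
        fastq-dump --split-files --gzip --outdir "${outputDir}" "${sraFile}"
    done
}

# Example usage (hypothetical paths):
# ConvertSraToFastq ./data/MyStudy/SRP012345 ./data/MyStudy/fastq
</pre>
</div>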
<div style="text-align: justify;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://i.ytimg.com/vi/KEkrWRHCDQU/maxresdefault.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="223" src="https://i.ytimg.com/vi/KEkrWRHCDQU/maxresdefault.jpg" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">You don't have to be a superhero hacker to get DNA data from public archives.</td></tr>
</tbody></table>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
The Metagenomics RAST Server (MG-RAST)</h3>
<div style="text-align: justify;">
Although it is used less than the SRA, the Metagenomics RAST Server (MG-RAST) is another major archive available for free public use. MG-RAST is a nice sequence repository, but it is unfortunately more difficult to use than the SRA (for downloading sequences at least). Downloading MG-RAST data with command line tools is honestly complicated at first, and the key steps are sort of hidden in the documentation. Again, to make things easier, we can use some custom scripts to make things happen.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The trick to getting the MG-RAST sequence files using a project ID is that you have to first download the project metadata, and then use the parsed metadata information to download the actual files (this is done in the while loop below). The actual URL to use with their API is also kind of confusing, but once you get it you are ready to go.</div>
<!-- HTML generated using hilite.me --><br />
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;">DownloadFromMGRAST <span style="color: #333333;">()</span> <span style="color: #333333;">{</span>
<span style="color: #996633;">line</span><span style="color: #333333;">=</span><span style="background-color: #fff0f0;">"${1}"</span>
<span style="color: #007020;">echo </span>Processing MG-RAST Accession Number <span style="background-color: #fff0f0;">"${line}"</span>
mkdir -p ./data/<span style="background-color: #fff0f0;">"${line}"</span>
<span style="color: #888888;"># Download the raw information for the metagenomic run from MG-RAST</span>
wget -O ./data/<span style="background-color: #fff0f0;">"${line}"</span>/tmpout.txt <span style="background-color: #fff0f0;">"http://api.metagenomics.anl.gov/1/project/mgp${line}?verbosity=full"</span>
<span style="color: #888888;"># Parse the raw metagenome information for individual sample IDs</span>
sed <span style="background-color: #fff0f0;">'s/metagenome_id\"\:\"/\nmgm/g'</span> ./data/<span style="background-color: #fff0f0;">"${line}"</span>/tmpout.txt <span style="background-color: #fff0f0; color: #666666; font-weight: bold;">\</span>
| sed <span style="background-color: #fff0f0;">'s/\".*//'</span> <span style="background-color: #fff0f0; color: #666666; font-weight: bold;">\</span>
| grep mgm <span style="background-color: #fff0f0; color: #666666; font-weight: bold;">\</span>
> ./data/<span style="background-color: #fff0f0;">"${line}"</span>/SampleIDs.tsv
<span style="color: #888888;"># Get rid of the raw metagenome information now that we are done with it</span>
rm ./data/<span style="background-color: #fff0f0;">"${line}"</span>/tmpout.txt
<span style="color: #888888;"># Now loop through all of the accession numbers from the metagenome library</span>
<span style="color: #008800; font-weight: bold;">while </span><span style="color: #007020;">read </span>acc; <span style="color: #008800; font-weight: bold;">do</span>
<span style="color: #008800; font-weight: bold;"> </span><span style="color: #007020;">echo </span>Loading MG-RAST Sample ID <span style="background-color: #fff0f0;">"${acc}"</span>
<span style="color: #888888;"># file=050.1 means the raw input that the author meant to archive</span>
wget -O ./data/<span style="background-color: #fff0f0;">"${line}"</span>/<span style="background-color: #fff0f0;">"${acc}"</span>.fa <span style="background-color: #fff0f0;">"http://api.metagenomics.anl.gov/1/download/${acc}?file=050.1"</span>
<span style="color: #008800; font-weight: bold;">done</span> < ./data/<span style="background-color: #fff0f0;">"${line}"</span>/SampleIDs.tsv
<span style="color: #888888;"># Get rid of the sample list file</span>
rm ./data/<span style="background-color: #fff0f0;">"${line}"</span>/SampleIDs.tsv
<span style="color: #333333;">}</span>
<span style="color: #007020;">export</span> -f DownloadFromMGRAST
</pre>
</div>
<br />
<div style="text-align: justify;">
These files will be in the fasta format instead of the sra format you get from the SRA. Also note that this uses GNU sed, which is not installed on Mac computers by default (Mac has a different version of sed. I know, it's kind of annoying). So if you are running this on a Mac, make sure to install GNU sed using Homebrew first.</div>
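<div style="text-align: justify;">
If you would rather not install GNU sed, the parsing step can probably be done with options that behave the same on Linux and Mac. The sketch below is an alternative to the sed pipeline in the function, not a copy of it. "ExtractSampleIDs" is a hypothetical helper name, the JSON in the example is made up, and it assumes the metadata contains the same metagenome_id":" pattern that the sed command matches.</div>
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;">ExtractSampleIDs () {
    # 1. grep -o prints each metagenome_id":"VALUE" match on its own line
    # 2. cut keeps the third quote-delimited field, which is VALUE
    # 3. sed adds the mgm prefix used in the download URL
    grep -o 'metagenome_id":"[^"]*"' \
        | cut -d '"' -f 3 \
        | sed 's/^/mgm/'
}

# Example with made-up metadata
echo '{"metagenomes":[{"metagenome_id":"4440001.3"},{"metagenome_id":"4440002.3"}]}' | ExtractSampleIDs
# prints mgm4440001.3 and mgm4440002.3 on separate lines
</pre>
</div>
<div style="text-align: justify;">
The helper reads the metadata from standard input, so it could take the place of the two sed commands and the grep in the function, with the tmpout.txt file piped into it.</div>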
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
To give it a try, copy and paste the DownloadFromMGRAST subroutine into your command line, and then type the subroutine name followed by the project ID, like below.</div>
<br />
<!-- HTML generated using hilite.me --><br />
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;">DownloadFromMGRAST 4843
</pre>
</div>
<br />
<h3>
Conclusions</h3>
<div style="text-align: justify;">
So there you have it: a very brief introduction to downloading SRA and MG-RAST datasets, with an emphasis on providing you the tools to do it yourself. Go ahead and give it a try. Let me know how it works, and if you run into problems or have any other questions, comments, or concerns, feel free to reach out!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Finally, thanks for reading! If you are a frequent reader, you might have noticed that my posts have been less frequent lately. I apologize for that. This has been an eventful year, which is great in general but bad for keeping up with the blog. As usual, it means I have some other exciting projects going on, and I am excited to share those experiences on here later. So for now the posts will be less frequent, but I look forward to getting back in a more frequent writing groove in the near future.</div>
<br />
Publication Alert: High Nucleotide Resolution Study of the Skin Virome (2017-04-08)<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://dfzljdn9uc3pi.cloudfront.net/2017/2959/1/fig-4-2x.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto; text-align: justify;"><img border="0" height="261" src="https://dfzljdn9uc3pi.cloudfront.net/2017/2959/1/fig-4-2x.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">We identified diversity generating retroelements as a<br />
potential mechanism driving targeted genomic diversity.</td></tr>
</tbody></table>
<br />
<div style="text-align: justify;">
A few weeks ago some colleagues and I published a <a href="https://peerj.com/articles/2959/">new manuscript</a> looking at the diversity of the human skin virome. In <a href="http://mbio.asm.org/content/6/5/e01578-15.abstract">our previous work</a>, we evaluated the diversity of viruses on the skin. Other groups have looked at virus diversity at other body sites including the gut, lungs, and oral cavity. Our new paper focused on the diversity within viruses on the skin. It provided initial insight into the genomic variability associated with major viruses in the skin virome. In other words, it was a "high resolution" study of the virome.</div>
<br />
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
One of the highlights of the manuscript was identifying numerous <a href="https://en.wikipedia.org/wiki/Hypervariable_region">hyper-variable loci</a> within the skin virus genomes that we investigated. We did this using a SNP geometric distribution approach instead of a sliding window because it allowed us to establish regions that were more variable than would be expected by random chance, and did not require us to arbitrarily establish a window size for the loci. The loci identified using this approach were associated with stronger evolutionary pressure than their adjacent regions, suggesting they are functionally important. We followed up on this, but I will let you get the details from the manuscript.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
A methodological highlight was the validation of our findings using an existing dataset from a different lab. We performed our analytical workflow a second time using another skin metagenomic dataset from a different skin microbiome lab. Even though the second dataset did not undergo virus purification, we were still able to pull out enough viral reads to perform our targeted analysis. We replicated our findings in the second dataset, thereby supporting the strength and biological ubiquity of our findings. My hope is that more groups will perform this type of validation in their future studies, especially since there is so much archived data just waiting to be utilized.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The challenge with this study was writing the analytical tools that I needed to answer our questions. In the end, I built a lot of tools that allowed me to answer evolutionary and functional virome questions. I think these are pretty easy to use, and made them freely available on GitHub. If you are interested in performing similar evolutionary analyses on your virome datasets, <a href="https://github.com/Microbiology/ViromeVarScripts">check the code out here</a> and let me know if you have any questions.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
In the end, I think this is a pretty cool study and I really enjoyed working on it. If this summary sounds interesting, I suggest you check out the paper. It is freely available online and easy to download.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
As always, if you have any questions, comments, or concerns, please let me know in the comments section below, shoot me an email, or find me on Twitter (links are to the right). I always love to hear from readers!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<h3>
References</h3>
<br /></div>
<br />
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=PeerJ&rft_id=info%3Apmid%2F28194314&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Evolutionary+and+functional+implications+of+hypervariable+loci+within+the+skin+virome.&rft.issn=&rft.date=2017&rft.volume=5&rft.issue=&rft.spage=&rft.epage=&rft.artnum=&rft.au=Hannigan+GD&rft.au=Zheng+Q&rft.au=Meisel+JS&rft.au=Minot+SS&rft.au=Bushman+FD&rft.au=Grice+EA&rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CMedicine%2CHealth">Hannigan GD, Zheng Q, Meisel JS, Minot SS, Bushman FD, & Grice EA (2017). Evolutionary and functional implications of hypervariable loci within the skin virome. <span style="font-style: italic;">PeerJ, 5</span> PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/28194314" rev="review">28194314</a></span><br />
<br />
<br />
Correlations In Random Genomic Data: A Simple Biology Pitfall (2017-03-12)<br />
<div style="text-align: justify;">
<a href="https://www.washcoll.edu/live/image/gid/74/width/650/11500_math2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="213" src="https://www.washcoll.edu/live/image/gid/74/width/650/11500_math2.jpg" width="320" /></a>Wow, it has been a long time since we have had a post on here! As always, that means other projects are in the works and the blog has taken a bit of a back seat, but we are back and ready to talk science. This week I want to cover a topic I have been meaning to get to for a while. If you are a frequent reader, you know that every now and then I like to go over some basic statistics topics that cause confusion among biologists, as well as scientists in other fields. This time it is a common statistical pitfall, and my hope is that covering it will prevent readers from making simple mistakes. The topic for this post is obtaining statistically significant correlations from random gene expression data.</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<div style="text-align: justify;">
To kick things off, let's imagine I have a gene expression dataset. More specifically, I have expression data for gene 1 and gene 2, as well as a housekeeping gene (these genes are usually used for experimental controls). Ultimately I want to compare expression of gene 1 and gene 2 in 50 different people, with my hypothesis being that the expression of these genes is positively correlated (when gene 1 is highly expressed, gene 2 is also highly expressed).</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
We could answer this type of question with real data, but what if we use random data? In <a href="https://www.r-project.org/about.html">R (an awesome statistical programming language)</a>, I can generate two random sets of 50 numbers, representing the expression of gene 1 and gene 2 among 50 different people. Please note that for consistency, I set the random seed so that the code always returns the same result. Also note that the visualization uses log values to make it clearer, but the correlations are calculated on the raw values, not the log-transformed ones.<br />
<br /></div>
</div>
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;"><span style="color: #888888;"># Load the ggplot2 library</span>
library(ggplot2)
<span style="color: #888888;"># Set the random seed</span>
set.seed(<span style="color: #6600ee; font-weight: bold;">1234</span>)
<span style="color: #888888;"># Create two sets of 50 random numbers</span>
x <span style="color: #333333;"><-</span> sample(<span style="color: #6600ee; font-weight: bold;">1</span><span style="color: #333333;">:</span><span style="color: #6600ee; font-weight: bold;">10000</span>, <span style="color: #6600ee; font-weight: bold;">50</span>)
y <span style="color: #333333;"><-</span> sample(<span style="color: #6600ee; font-weight: bold;">1</span><span style="color: #333333;">:</span><span style="color: #6600ee; font-weight: bold;">10000</span>, <span style="color: #6600ee; font-weight: bold;">50</span>)
<span style="color: #888888;"># Put them together in a data frame</span>
df <span style="color: #333333;"><-</span> data.frame(gene1 <span style="color: #333333;">=</span> x, gene2 <span style="color: #333333;">=</span> y)
<span style="color: #888888;"># Plot the results</span>
qplot(log10(gene1), log10(gene2), data <span style="color: #333333;">=</span> df) <span style="color: #333333;">+</span> theme_classic()
<span style="color: #888888;"># Test the correlation (Pearson by default) on the raw values</span>
cor.test(df<span style="color: #333333;">$</span>gene1, df<span style="color: #333333;">$</span>gene2)
</pre>
</div>
<div>
<div style="text-align: justify;">
<br />
<br /></div>
</div>
<div class="separator" style="clear: both; text-align: justify;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-wks91BA49_s/WMWMdNYAP7I/AAAAAAAABgg/wfJ3GwAicvk7rArl6ZNHoGKNLd_lnCnaACLcB/s1600/randomscatter1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://1.bp.blogspot.com/-wks91BA49_s/WMWMdNYAP7I/AAAAAAAABgg/wfJ3GwAicvk7rArl6ZNHoGKNLd_lnCnaACLcB/s320/randomscatter1.png" width="320" /></a></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The resulting Pearson correlation coefficient (r) was 0.077, with a p-value of 0.59. In other words, we generated a set of random expression values for genes 1 and 2, and when we plotted them against each other, we got a random scatter of points. The expression of these two genes was not correlated.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
But if this were actually an experimental result, we might think "of course there is no correlation, we forgot to correct our results with our housekeeping gene control". For readers unfamiliar with this practice, gene expression data can sometimes be skewed by differences in loading the machine, preparing samples, etc. This can mean some samples look more abundant due to experimental variability instead of biological variability. To correct for this, we can use "housekeeping genes", which are genes that we expect to be expressed at about the same level in each person.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
To correct our data for experimental variability, we may <b><i>divide each of our gene 1 and gene 2 expression values by the housekeeping gene value from that sample</i></b>. If sample 1 had twice as much material loaded as sample 2, its housekeeping gene value would also be about twice as high, so dividing by it would put both samples on approximately the same scale. In our example, we can make this correction using a third set of randomly generated numbers.<br />
<br />
<!-- HTML generated using hilite.me --><br />
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;"><span style="color: #888888;"># Building off of the previous code</span>
<span style="color: #888888;"># Create random housekeeping gene expression data</span>
z <span style="color: #333333;"><-</span> sample(<span style="color: #6600ee; font-weight: bold;">1</span><span style="color: #333333;">:</span><span style="color: #6600ee; font-weight: bold;">10000</span>, <span style="color: #6600ee; font-weight: bold;">50</span>)
<span style="color: #888888;"># Divide each gene by the housekeeping value from the same sample</span>
df2 <span style="color: #333333;"><-</span> data.frame(gene1 <span style="color: #333333;">=</span> x<span style="color: #333333;">/</span>z, gene2 <span style="color: #333333;">=</span> y<span style="color: #333333;">/</span>z)
<span style="color: #888888;"># Plot the "corrected" values with a linear fit</span>
qplot(log10(gene1), log10(gene2), data <span style="color: #333333;">=</span> df2) <span style="color: #333333;">+</span>
theme_classic() <span style="color: #333333;">+</span>
geom_smooth(method<span style="color: #333333;">=</span>lm, se <span style="color: #333333;">=</span> <span style="color: #008800; font-weight: bold;">FALSE</span>)
<span style="color: #888888;"># Test the correlation on the corrected (not log-transformed) values</span>
cor.test(df2<span style="color: #333333;">$</span>gene1, df2<span style="color: #333333;">$</span>gene2)
</pre>
</div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-6Eoj99C-8XE/WMWMi1vadHI/AAAAAAAABgk/O0-UBQlWe58Ij1SnUB3q1CjkflwBqu13wCLcB/s1600/randomscatter2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://1.bp.blogspot.com/-6Eoj99C-8XE/WMWMi1vadHI/AAAAAAAABgk/O0-UBQlWe58Ij1SnUB3q1CjkflwBqu13wCLcB/s320/randomscatter2.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Because the correction also used random data, we would expect to see another lack of correlation. Instead, the resulting correlation (r) was 0.72, with a p-value of 3.4e-9. Even though we were using entirely random data, we obtained a much higher and statistically significant correlation between expression of gene 1 and 2. Here we can see that we did something terribly wrong to get this result, but if we had done this on a biological dataset, we might think that we had found a great result and may even push to publish it.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So what did we do wrong? Ultimately it was that "correction" that hurt us. When we divided each person's gene 1 and gene 2 expression values by the same "housekeeping" value for that person, we introduced a shared term into both variables. A person with a small housekeeping value ends up with large corrected values for both genes, and a person with a large housekeeping value ends up with small corrected values for both, so the two corrected genes rise and fall together even though the underlying expression values are unrelated. This is why it is problematic to apply a function (such as division by a common value) to each sample individually and then correlate the results. It is also why we have to think carefully about how we handle our data in correlation analyses, and about what correlations we might be introducing by mistake.</div>
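<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
One way to convince ourselves that this was not just an unlucky random seed is to repeat the whole experiment many times and look at the spread of correlation coefficients. The sketch below is only an illustration built on the same sample() calls used above; the names n_reps, n_people, raw_r, ratio_r, and log_ratio_r are my own and are not part of the original analysis. It also checks the log-transformed ratios, where both variables share the same -log10(z) term, so in theory the correlation should sit around 0.5 when all three sets of numbers come from the same distribution.</div>
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;"># Repeat the random-data experiment many times (illustrative sketch)
set.seed(1234)
n_reps <- 1000   # number of simulated experiments (assumed value)
n_people <- 50   # same sample size as in the example above

# Pre-allocate vectors to hold the correlation from each experiment
raw_r <- numeric(n_reps)
ratio_r <- numeric(n_reps)
log_ratio_r <- numeric(n_reps)

for (i in seq_len(n_reps)) {
  # Three independent sets of random "expression" values
  x <- sample(1:10000, n_people)
  y <- sample(1:10000, n_people)
  z <- sample(1:10000, n_people)
  # Correlation of the raw values
  raw_r[i] <- cor(x, y)
  # Correlation after dividing both genes by the same housekeeping value
  ratio_r[i] <- cor(x / z, y / z)
  # The same comparison on the log scale
  log_ratio_r[i] <- cor(log10(x / z), log10(y / z))
}

# Compare the distributions of correlation coefficients
summary(raw_r)
summary(ratio_r)
summary(log_ratio_r)
</pre>
</div>
<div style="text-align: justify;">
If the reasoning above holds, the raw correlations should be centered near zero while both sets of "corrected" correlations should be shifted clearly toward positive values. Looking at the whole distribution rather than a single seed makes it easier to see that the effect comes from the shared divisor and not from chance.</div>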
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Of course this is just a simple example and the principle could apply to many scenarios. The main point here is to outline a potential way we might skew our results without knowing it. Overall I hope this summary was informative and will help in thinking about analyses in future experiments.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Any questions, comments, or concerns? Always feel welcome to reach out in the comment section below, or reach out to me on Twitter (my Twitter link is on the right).</div>
<br />
<h2>
A Model for Phage Communication and the Implications for the Human Microbiome</h2>
<div style="text-align: justify;">
<i>Posted January 28, 2017</i></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/--InHvXNHmW0/WI0sXp0hPwI/AAAAAAAABew/GAw1mmM7Uec4fhLsYu2fJow66aGbXualACLcB/s1600/phagemodel.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://1.bp.blogspot.com/--InHvXNHmW0/WI0sXp0hPwI/AAAAAAAABew/GAw1mmM7Uec4fhLsYu2fJow66aGbXualACLcB/s400/phagemodel.png" width="167" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The research group prepared<br />
two types of media to test phage<br />
infection efficacy.</td></tr>
</tbody></table>
<div style="text-align: justify;">
Well, we took a bit of a break these past couple of weeks, but we are back for the new year! Welcome to the Prophage blog 2017! The year has actually been off to a good start, with a lot of interesting papers being published this January. This week I want to kick things off by covering a very cool 2017 study by Erez <i>et al</i> that described a new and interesting mechanism by which bacteriophages communicate using their bacterial hosts. This really is a well-written and elegant study that I highly suggest you read. In this post, I want us to cover the highlights of the study, and then discuss what this will mean for future research endeavors.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Erez <i>et al</i> began their work by testing the hypothesis that <b>"bacteria secrete communication molecules to alert other bacteria of phage infection"</b>, but what they ended up finding was arguably much more interesting. They began their series of experiments by simply growing bacteria in liquid media with and without bacteriophages (see the figure to the right). They let the mixture sit long enough for the phages to infect their bacterial hosts for a couple of replication cycles (3 hours), and then removed all of the bacteria and phages from the liquid by filtration. At this point, if a signaling molecule had been released during the infection, it would still be in the media even though the phages and bacteria were removed. In that case, repeating an infection in that same media would result in altered growth patterns (for example, fewer bacteria killed when the molecule is present). As it turns out, this is exactly what they observed.</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<div style="text-align: justify;">
The group found that phage infections were much less efficient when done in media that had already been used for phage infections. After careful study, the researchers found that the signaling molecule was in fact a small peptide (a very short protein) that was associated with the phage, not the bacterial host. The signal was highly phage specific. This meant that their observation was not of bacterial warning as initially hypothesized, but rather of phages signaling to other phages. This is the first time such an extracellular signaling mechanism has been described between phages (at least as far as I know), which is pretty significant.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Following further characterization, the group found that this kind of signal is, or at least could be, used by many different phages to tell other phages of the same kind whether they should enter a lytic replication cycle (reproduce and kill the bacterial host) or a lysogenic cycle (integrate into the bacterial genome and exist silently). The authors end their paper with a proposed mechanistic model for the phage-to-phage communication. They call this system the arbitrium system, after the Latin word for decision.</div>
<div style="text-align: justify;">
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-E0VRXizq3mU/WI0szVi77yI/AAAAAAAABe0/bNNhCMdnIUgdH4jugXZmiViQa-Pf8dYTgCLcB/s1600/mechmodel.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="218" src="https://1.bp.blogspot.com/-E0VRXizq3mU/WI0szVi77yI/AAAAAAAABe0/bNNhCMdnIUgdH4jugXZmiViQa-Pf8dYTgCLcB/s400/mechmodel.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The authors propose this mechanistic model for phage signaling.</td></tr>
</tbody></table>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
What really makes this study cool is the implications it could have for microbiology and associated clinical applications. I think this finding could be especially important for our understanding of the human microbiome and virome. As we study the microbiome we strive, in part, to understand how bacteria and phages interact in human systems such as the gut, and understanding phage to phage signaling will be important for obtaining a more accurate picture of the system.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
These findings could lead to some very interesting experiments. How would a cocktail of this type of signaling molecule (the authors identify many) alter gut virus or bacterial communities? How would this impact microbiome stability? Would a decrease in phage lytic capabilities significantly disrupt the kill-the-winner dynamics we see in the human microbiome, and result in low bacterial diversity with some un-checked bacteria taking over? The human microbiome is a complicated system, but this could be a step toward better understanding its dynamics, and maybe even contribute toward therapeutic applications.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
I also think that these findings could be important for phage engineering and phage therapy. One of the big challenges in phage therapy is obtaining lytic bacteriophages that can effectively kill the pathogenic bacterial target. Lysogenic phages can also be effective in phage therapy, although they may be more effective if lysogeny could be avoided. It may therefore be advantageous to knock the gene for this signal out of phage therapy candidates.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
In the end, this study has a lot of implications and I bet microbiologists are already thinking of hundreds of experiments they can conduct. And that is really what makes this study cool. It not only offers important information to the field, but it really captures and inspires the imagination of other scientists who read it. So if you have not read it yet, I highly suggest you go check it out.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
What were your thoughts about the study? What implications do you think this will have for microbiology and the human microbiome? Let us know in the comments section, along with all of your questions, comments, and concerns. You can always reach out by Twitter or email as well.</div>
<div style="text-align: justify;">
<br /></div>
<h3>
References</h3>
<div style="text-align: justify;">
<br /></div>
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Nature&rft_id=info%3Adoi%2F10.1038%2Fnature21049&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Communication+between+viruses+guides+lysis%E2%80%93lysogeny+decisions&rft.issn=0028-0836&rft.date=2017&rft.volume=541&rft.issue=7638&rft.spage=488&rft.epage=493&rft.artnum=http%3A%2F%2Fwww.nature.com%2Fdoifinder%2F10.1038%2Fnature21049&rft.au=Erez%2C+Z.&rft.au=Steinberger-Levy%2C+I.&rft.au=Shamir%2C+M.&rft.au=Doron%2C+S.&rft.au=Stokar-Avihail%2C+A.&rft.au=Peleg%2C+Y.&rft.au=Melamed%2C+S.&rft.au=Leavitt%2C+A.&rft.au=Savidor%2C+A.&rft.au=Albeck%2C+S.&rft.au=Amitai%2C+G.&rft.au=Sorek%2C+R.&rfe_dat=bpr3.included=1;bpr3.tags=Biology">Erez, Z., Steinberger-Levy, I., Shamir, M., Doron, S., Stokar-Avihail, A., Peleg, Y., Melamed, S., Leavitt, A., Savidor, A., Albeck, S., Amitai, G., & Sorek, R. (2017). Communication between viruses guides lysis–lysogeny decisions. <span style="font-style: italic;">Nature, 541</span> (7638), 488-493. DOI: <a href="http://dx.doi.org/10.1038/nature21049" rev="review">10.1038/nature21049</a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<h2>
How to Write a Manuscript Submission Cover Letter</h2>
<div style="text-align: justify;">
<i>Posted December 11, 2016</i></div>
<div style="text-align: justify;">
<a href="https://www.writersandartists.co.uk/assets/users/admin_1/admin_1-asset-503789627f596.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://www.writersandartists.co.uk/assets/users/admin_1/admin_1-asset-503789627f596.jpg" width="299" /></a>The communication of our research findings is a foundational pillar to our careers as scientists. One of the most common ways we scientists share information is by publishing papers in peer-reviewed journals. This primary method of information dissemination allows us to share our research findings both to our colleagues as well as the public at large. When preparing a manuscript for submission to a journal for peer review and subsequent publication, a lot of work goes into preparing a variety of documents. One of the important documents is a cover letter to the editor. This letter represents a significant hurdle for new and young researchers because it is often unclear what a cover letter should actually look like, and what information should be included. In this week's post I want to go over what a good cover letter <i>could</i> look like and how you can write your own. I say this is what it <i>could</i> look like because there is certainly a lot of room for interpretation and personal style, and there are many correct ways to do it. Here I am just going to cover one potential way to tackle the problem.</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
Before we get into the specifics, let's first discuss what a cover letter actually is. Again the exact answer can vary between people, but I think most could agree that it is an opportunity to introduce the journal editor to the manuscript you are submitting. This is an opportunity for you to briefly introduce the problem you are addressing, explain why your manuscript is important, and discuss why your manuscript should be published in that journal. Additionally, you can provide some of the subtle information associated with the paper, such as suggested reviewers and whether the article is already available in pre-print. This is not supposed to be a repeat of your abstract, but really just a brief letter providing an introduction to the entire work you are submitting.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So this description is fine and you can probably find something like that on some journal websites, but it is still vague. What does all of that look like in practice? To make it clearer, let's go through an example that I wrote out for this blog. The content is just a fictional example for a manuscript written by Jane Appleseed (first author) and Marissa Mayer (corresponding author). While the specific content is nonsense, the structure and themes for each section are real. Here is the general structure that you could follow for your own manuscript submission.</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://4.bp.blogspot.com/-J3mq1FVkJxE/WE3VXIa0lEI/AAAAAAAABeI/Pe-f29YF_2cH3i9Mj7LearBcNgOX9OT5gCLcB/s1600/Example-CL.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="370" src="https://4.bp.blogspot.com/-J3mq1FVkJxE/WE3VXIa0lEI/AAAAAAAABeI/Pe-f29YF_2cH3i9Mj7LearBcNgOX9OT5gCLcB/s400/Example-CL.png" width="400" /></a></div>
<br />
<div style="text-align: justify;">
So there you have it, an example of how to write a cover letter for your next manuscript submission. As I said above, this is meant to be an example of how you could do it, but there are many good ways to write submission cover letters. The best way to learn how to write a good cover letter is to ask to read many of your colleagues' letters to see what you like about their style and structure.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
If you have your own advice on how to write a successful cover letter, or have further questions, let us know in the comments below. As always, you can feel free to reach out to me on Twitter and by email as well. Happy submitting!</div>
<br />
<br />
<h2>
Summary of the 2016 International Human Microbiome Congress</h2>
<div style="text-align: justify;">
<i>Posted November 13, 2016</i></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-CTb_szNE6a8/WCequhFzbnI/AAAAAAAABdc/0xe7Ta4oV68QCoIgb1uK2IdprPc6Jx1wwCLcB/s1600/IMG_6534.JPG" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="240" src="https://2.bp.blogspot.com/-CTb_szNE6a8/WCequhFzbnI/AAAAAAAABdc/0xe7Ta4oV68QCoIgb1uK2IdprPc6Jx1wwCLcB/s320/IMG_6534.JPG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Kicking off the IHMC meeting for 2016.</td></tr>
</tbody></table>
<div style="text-align: justify;">
This week I had the privilege of attending the <a href="http://ihmc2016.org/">2016 International Human Microbiome Congress</a> which was hosted in Houston, Texas in the United States. The goal of this recurring meeting is to get the worldwide human microbiome community together to discuss recent progress, current challenges, and future directions. In this post I want to give a summary of the meeting for anyone who could not attend.</div>
<div style="text-align: justify;">
<a name='more'></a></div>
<div>
</div>
<h3 style="text-align: justify;">
<br /></h3>
<h3 style="text-align: justify;">
Top Three Research Picks</h3>
<div style="text-align: justify;">
Of course I cannot go into all of the meeting in detail, but I will provide some highlights and encourage you to keep a close eye on the literature as the work presented was either published or near publication. Here are my top three picks for the talks. I should also mention that this is based on the talks I was able to attend. I missed many of the talks during the concurrent sessions (as did everyone since many talks are given at the same time), and because I had to leave before the end of the last day. So these are the top three of what I saw.</div>
<div style="text-align: justify;">
<br /></div>
<div>
<ol>
<li style="text-align: justify;"><b><i>Kjersti Aagaard</i></b> is well known for her placenta microbiome work, which has been met with skepticism around whether the results truly represent a placenta microbiome or whether they are contaminants. It was clear that she is aware of this criticism and is working to address it (in addition to her other very cool research projects). The coolest part was that she is using microscopy techniques like <a href="https://en.wikipedia.org/wiki/Fluorescence_in_situ_hybridization">FISH</a> to visualize bacteria that appear to be colonizing. Unfortunately it was a fast talk, so I'm not going to try going too much into it. They are anticipating publishing the results in the near future, however, so it will be worth reading for sure.</li>
<li style="text-align: justify;"><b><i>Ami Bhatt</i></b> was doing some very cool work with <a href="https://en.wikipedia.org/wiki/Triclosan">Triclosan</a> and the microbiome. This is an interesting study on a unique cohort since Triclosan is now banned by the FDA in the US. She was also presenting some interesting FMT work. Not only were the results cool, but I thought her use of metagenomics was interesting, refreshing, and represented an understanding I wish I could say was ubiquitous throughout the meeting. Not only was she using shotgun metagenomic sequencing to get at the presence of functional genes, but was doing some cool work to look at SNP concordance between FMT donors and long-term recipients. She demonstrated a unique and informative approach that I really appreciated seeing.</li>
<li style="text-align: justify;"><b><i>Morgan Langille</i></b> presented some very cool work utilizing a wide range of techniques to detect microbiome signatures that can be used to predict inflammatory bowel disease. Not only was the machine learning presented well, but I thought this was a really cool example of how we can effectively use multiple techniques to understand disease and the human microbiome. We often see a push to use different "-omics" techniques (for lack of a better term), but the studies are often implemented poorly, I think because of the difficulty in understanding how to effectively use them together. This seems like a good example of how it can be done well. The techniques can be used together to classify disease states using stool, and then we can go back to determine which factors were most important, and how much more information we really get from each technique. It was another refreshing metagenomics approach that I appreciated seeing.</li>
</ol>
</div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
The Virome</h3>
<div style="text-align: justify;">
I know that this is mostly expected, but I feel it is worth mentioning again. There was a large focus on bacteria without many talks on fungi and viruses (including bacteriophages). There were a couple, but the almost exclusive focus on bacteria has been a common theme in human microbiome research and I'm not surprised this conference also focused on bacteria. I just feel it is worth mentioning that the future of the human microbiome does not only include bacteria.</div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
Metagenomics and the Microbiome</h3>
<div style="text-align: justify;">
<a href="http://ihmc2016.org/wp-content/uploads/2016/03/IHMC-LOGO-updated.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://ihmc2016.org/wp-content/uploads/2016/03/IHMC-LOGO-updated.png" height="156" width="320" /></a>The theme for the meeting was <i>"frontiers of microbiome science and metagenomic medicine"</i>. This meant that there was a heavy focus on microbiome studies that utilized metagenomic shotgun sequencing to understand the human microbiome. I honestly felt throughout the meeting that this choice might have been a little too restrictive and had the focus too much on the method and not enough on the actual biology and medicine. There was certainly some excellent science, but it would have been nice to have the focus more on how we can use tools to answer important questions instead of looking for questions we can answer with a tool. But I could do a whole post on this so for now I am going to leave it at that. In the end, I think that the next meeting could really benefit from a broader theme that focuses less on a specific method. For example, I preferred the broad theme last year: "future directions for human microbiome research in health and disease".</div>
<div style="text-align: justify;">
<br /></div>
<h3 style="text-align: justify;">
Wrap Up</h3>
<div style="text-align: justify;">
So there you have it, my almost criminally short summary of the 2016 International Human Microbiome Congress. It was a meeting with highs and lows, and I was happy I was able to meet some cool people and see some interesting science. If you are interested in seeing the live tweeting archives, check out <a href="https://twitter.com/search?src=typd&q=%23ihmc2016">#IHMC2016 on Twitter</a>. Questions, comments, or concerns? Please leave a post in the comment section, or reach out via Twitter or email. I always love hearing from readers.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<h2>
Global Online Office Hours</h2>
<div style="text-align: justify;">
<i>Posted October 23, 2016</i></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-lBz2ihHdYqA/VN08KkI95iI/AAAAAAAAAQo/4Q7vPLBvAfM/s1600/OfficeHours_02032015-logo-on-board%2B(1).jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="208" src="https://4.bp.blogspot.com/-lBz2ihHdYqA/VN08KkI95iI/AAAAAAAAAQo/4Q7vPLBvAfM/s320/OfficeHours_02032015-logo-on-board%2B(1).jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Global online office hours will be held monthly<br />
through Google Chat.</td></tr>
</tbody></table>
<span style="text-align: justify;">Interest in the microbiome has continued to skyrocket. It seems like there is a new microbiome commercialization strategy every day, and more and more scientists are looking to incorporate the microbiome into their research programs. It is certainly an exciting time for the microbiome. Unfortunately the increased demand has been met with a somewhat insufficient supply of information and resources. Of course there are some excellent resources out there, but a lot of people don't have access to a "microbiome researcher" to answer their questions. Sometimes this means newcomers to the field make some crucial mistakes because they are forced to "go it alone". In an effort to generate even more unique resources for all of the microbiome folks out there, I decided to hold </span><b style="text-align: justify;">Global Online Office Hours</b><span style="text-align: justify;">.</span><br />
<br />
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
As the name suggests, these office hours will be held online and are open to anybody in the world. The idea is that anybody who has questions about microbiome research, the current state of the field, a recent study, or anything else, now has an opportunity to ask a real life microbiome researcher. Of course I encourage students to attend, but anybody can join in. This includes academics, industry scientists, journalists, etc.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Right now this is still in the experimental stage. I am trying to evaluate both the level of interest (is this a good idea?) and what the most effective format will be. For now I am holding monthly office hours through the rest of the year. If there is a lot of interest, I am totally ready to bump it up to more frequent times. I also currently have these scheduled for a fixed time (in my afternoon), which means some time zones will have a hard time attending (e.g., it will be 03:00 for some people). So again, if there is interest I will try to stagger my times to allow for more general audience participation.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
If this sounds cool and you would like to attend and ask some questions, feel free to read more on <a href="http://microbiology.github.io/openofficehours.html">the website</a>. I also encourage you to sign up for <a href="https://groups.google.com/forum/#!forum/online-microbiome-office-hours/join">the Google Group here</a> because that is how I will communicate with the group members (by email).</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So that's it for this week. If you have any questions, comments, or concerns, please feel free to let me know in the comments below, on Twitter, by email, or even in office hours! Hope to see you there!</div>
<br />
<h2>
A New Look At Irritable Bowel Disease and Viruses: The Core Human "Phageome"</h2>
<div style="text-align: justify;">
<i>Posted October 2, 2016</i></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.nature.com/nrmicro/journal/v13/n3/images/nrmicro3404-f2.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://www.nature.com/nrmicro/journal/v13/n3/images/nrmicro3404-f2.jpg" height="201" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">An illustration of the core protein clusters (PCs; groups<br />
of similar genes) found in the photic and aphotic zones<br />
of the ocean. This new study applies a similar approach<br />
using phage genomes instead of genes. <a href="http://www.nature.com/nrmicro/journal/v13/n3/fig_tab/nrmicro3404_F2.html">Source</a></td></tr>
</tbody></table>
<div style="text-align: justify;">
Ongoing research has continued to implicate the microbiome in a variety of human diseases. We often hear about this in the context of bacterial communities. Certain bacterial communities appear to be associated with health, and disrupting these communities seems to be associated with disease. To better understand these bacterial communities, we sometimes group the shared members together as the "core bacterial community" that is associated with health or disease. In some ways these core bacteria are considered important to the system because they are found in every instance of health or disease. But what about the core phages (bacterial viruses) of these communities? A few weeks ago Manrique <i>et al.</i> published a study that began addressing this question.</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<div style="text-align: justify;">
The study by Manrique <i>et al.</i>, published in <a href="http://www.pnas.org/">PNAS</a>, looked at the core human "phageome" in health and disease. The goal was to identify the core set of phages that are part of the human gut phageome and to observe how that set changes in disease states. Ultimately, the purpose is to identify phages that are likely to play roles in maintaining health by finding those that are present in health and absent in disease. Overall I liked this paper, and I will leave you to read it yourself for the study specifics. Here I just want to briefly summarize the paper while highlighting the most important points.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The group began by assembling a small human cohort consisting of two subjects whose stool was sampled at two different time points. They purified the viruses out of the stool and sequenced the genomic DNA using whole genome shotgun sequencing. They combined the sequences from the four samples and used them to assemble approximately 4,000 <a href="https://en.wikipedia.org/wiki/Contig">contigs</a>. As was expected, they identified a core set of phages that were present in all of the samples.</div>
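<div style="text-align: justify;">
To make the idea of a "core" set concrete, here is a minimal Python sketch of how one might pull out the contigs shared across samples from presence/absence data. This is not the authors' pipeline: the sample names, contig IDs, the <b>core_contigs</b> function, and the <b>min_fraction</b> cutoff are all hypothetical and only there for illustration.</div>

```python
from collections import Counter


def core_contigs(samples, min_fraction=1.0):
    """Return the contigs detected in at least `min_fraction` of the samples.

    `samples` maps a sample name to the set of contig IDs detected in it.
    With the default of 1.0, a contig must be present in every sample.
    """
    n_samples = len(samples)
    # Count in how many samples each contig was detected.
    prevalence = Counter(
        contig for contigs in samples.values() for contig in set(contigs)
    )
    return {
        contig
        for contig, count in prevalence.items()
        if count / n_samples >= min_fraction
    }


# Toy data (made up): two subjects sampled at two time points each.
samples = {
    "subjectA_t1": {"contig_1", "contig_2", "contig_3"},
    "subjectA_t2": {"contig_1", "contig_2"},
    "subjectB_t1": {"contig_1", "contig_4"},
    "subjectB_t2": {"contig_1", "contig_2", "contig_4"},
}

strict_core = core_contigs(samples)                      # present in all samples
relaxed_core = core_contigs(samples, min_fraction=0.75)  # present in 3 of 4 or more
```

<div style="text-align: justify;">
Relaxing the cutoff is one simple way to separate contigs that are shared by everyone from those that are merely common, which matters more as the number of subjects grows.</div>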
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
This was interesting, but what really made the paper cool was the expansion of their methods to a more robust, disease-associated virome dataset. The group performed their analysis on the <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4312520/">Norman <i>et al</i> virome dataset</a>, which includes purified virus (mostly phage) genomic DNA from the stool of healthy subjects, as well as subjects suffering from <a href="https://en.wikipedia.org/wiki/Inflammatory_bowel_disease">inflammatory bowel disease</a> (IBD) conditions. This dataset allowed the group to investigate how the core phage communities differed between healthy and diseased (IBD) states. The geographic diversity of the sampling also allowed them to account for geographic variation in the core virome.</div>
<div style="text-align: justify;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.pnas.org/content/113/37/10400/F3.large.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://www.pnas.org/content/113/37/10400/F3.large.jpg" height="306" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Heatmap of the core, common, and unique phage<br />
genomes found in the Manrique <i>et al</i> study.</td></tr>
</tbody></table>
<div style="text-align: justify;">
The <b>takeaway points</b> were as follows:</div>
<div>
<ul>
<li style="text-align: left;">A core gut virome exists.</li>
<li style="text-align: left;">The core gut virome is conserved across geographically distant populations.</li>
<li style="text-align: left;">The core gut virome signatures change in disease states.</li>
<li style="text-align: left;">Sequence homology clustering reduces core virome dimensionality while preserving population signatures.</li>
</ul>
</div>
<div style="text-align: justify;">
In the end, what does this all mean? I think the biggest strength of this paper is that they are laying important groundwork for future studies of the human virome in the context of the "core virome". By identifying those phages present in all healthy states, the group has identified targets for future study that are likely to be important for a healthy system. This also establishes a new way for other researchers to start thinking about the viromes in their systems of interest.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So <b>what's next</b>? Here are my predictions for the future directions of this study:</div>
<div>
<ul>
<li style="text-align: justify;">The group will likely expand to additional body sites and disease states.</li>
<li style="text-align: justify;">The group may go on to define the functional and predatory implications of the core virome.</li>
<li style="text-align: justify;">They or others will begin establishing an understanding of the associations between the core virome and the core bacterial communities.</li>
</ul>
</div>
<div style="text-align: justify;">
Again, this is a cool paper and I suggest you check it out. I also presented this paper for our lab journal club a couple of weeks ago, and I made my <a href="http://microbiology.github.io/PDFs/SchlossHanniganJournalClub2016-09-08.pdf">slide deck available here</a> if you want to take a look. Finally, and as always, feel free to reach out either in the comments below, on Twitter, or by email. I am always excited to hear from my readers!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0px;" /></a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<h3>
Works Cited</h3>
</div>
<div>
<br />
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Proceedings+of+the+National+Academy+of+Sciences+of+the+United+States+of+America&rft_id=info%3Apmid%2F27573828&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Healthy+human+gut+phageome.&rft.issn=0027-8424&rft.date=2016&rft.volume=113&rft.issue=37&rft.spage=10400&rft.epage=5&rft.artnum=&rft.au=Manrique+P&rft.au=Bolduc+B&rft.au=Walk+ST&rft.au=van+der+Oost+J&rft.au=de+Vos+WM&rft.au=Young+MJ&rfe_dat=bpr3.included=1;bpr3.tags=Biology">Manrique P, Bolduc B, Walk ST, van der Oost J, de Vos WM, & Young MJ (2016). Healthy human gut phageome. <span style="font-style: italic;">Proceedings of the National Academy of Sciences of the United States of America, 113</span> (37), 10400-5 PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/27573828" rev="review">27573828</a></span>
<br />
<br />
<br /></div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com4tag:blogger.com,1999:blog-8971027081051936989.post-35104675268326509702016-09-18T15:59:00.000-04:002016-09-19T08:19:17.722-04:00How to Detect Circular Virus Genomes from Metagenomes<div style="text-align: right;">
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://phagehuntnz.files.wordpress.com/2015/10/olympics.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="147" src="https://phagehuntnz.files.wordpress.com/2015/10/olympics.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><br />
Many virus genomes are circular, like the Olympic rings.</td></tr>
</tbody></table>
<div style="text-align: justify;">
When analyzing virus metagenomic data, we often find it helpful to identify contigs that represent complete circular genomes [1,2]. In addition to offering biological information, this is used as a quality control technique to evaluate whether the sequencing efforts were robust enough to allow for complete genome assembly. This approach has the advantage of reference independence because it does not require aligning reads to a reference genome to evaluate sequence completion.</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
Because a circular genome has no start or end, a contig assembled from one tends to run past its starting point, so the end of the contig repeats the sequence found at its beginning. We can use this trait to detect circular genomes by finding that repeated sequence and "closing" the contig. This can be done by aligning the contig nucleotide sequence to itself to "close the circle", as represented in the figure below.</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br /><div style="text-align: justify;">
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-ODgQfMuF5Yc/V9_XM6kbx1I/AAAAAAAABcA/vuH4vQ4CmUQSHH7BbIIf2BaQn1nawApewCLcB/s1600/CircContigExample.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="192" src="https://4.bp.blogspot.com/-ODgQfMuF5Yc/V9_XM6kbx1I/AAAAAAAABcA/vuH4vQ4CmUQSHH7BbIIf2BaQn1nawApewCLcB/s320/CircContigExample.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A linear contig representing a circular virus can be closed by detecting<br />sequence similarity at each end.</td></tr>
</tbody></table>
<br /></div>
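<div style="text-align: justify;">
As a rough illustration of the "closing" idea, here is a minimal Python sketch that looks for an exact repeat of the start of a contig at its end. It makes some simplifying assumptions: it uses exact string matching rather than a real pairwise alignment, so it will not tolerate sequencing errors, and the function names, the <b>min_overlap</b> parameter, and the toy sequence are all hypothetical.</div>

```python
def terminal_overlap(seq, min_overlap=10):
    """Length of the longest prefix of `seq` that is repeated exactly at its end.

    Only overlaps of at least `min_overlap` bases, and at most half the contig
    length, are considered. Returns 0 if no such overlap exists.
    """
    seq = seq.upper()
    # Try the longest candidate overlap first and shrink until one matches.
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def close_contig(seq, min_overlap=10):
    """Trim the repeated end off a contig that looks circular.

    Returns the trimmed sequence, or None if no terminal repeat was found.
    """
    k = terminal_overlap(seq, min_overlap)
    if k == 0:
        return None
    return seq[:-k]


# Toy example (made-up sequence): a 26 bp "genome" whose assembled contig
# runs 12 bp past the starting point.
genome = "ATGCGTACGTTAGCCTAGGCTAACGT"
contig = genome + genome[:12]

overlap = terminal_overlap(contig)  # length of the repeated end
closed = close_contig(contig)       # contig with the repeat trimmed off
linear = close_contig(genome)       # no terminal repeat in this sequence
```

<div style="text-align: justify;">
Real contigs usually carry a few mismatches in the repeated region, so an alignment with a similarity threshold is a better fit in practice than the exact comparison used here.</div>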
<div style="text-align: justify;">
Because this approach is primarily implemented as "custom in-house scripts", it is hard to actually find good, freely available resources without hunting them down from their authors. In the interest of adding to the valuable open-source virome analysis resources available online, I wrote out a script that detects circular virus contigs. The script is titled <b>ccontigs.jl</b> and is available on the <a href="https://github.com/Microbiology/ccontigs">GitHub <b>ccontigs </b>repository</a>. See the documentation there for details.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The first question you might be asking yourself is what the <b>".jl"</b> file extension means. What language is that? The <b>.jl</b> extension means that this script was written in the <a href="http://julialang.org/">Julia programming language</a>, which I am liking more and more for bioinformatics. I originally tried writing the program in Python using some BioPython tools, but I found the pairwise alignment tool was quite slow and resource (memory) intensive. I had good experiences with Julia <a href="http://prophage.blogspot.com/2016/06/the-up-and-coming-bioinformatics.html">before</a>, especially with regards to performance, so I rewrote the script in Julia and tried it out. The Julia script drastically outperformed the Python version so I stuck with it. The downside is that you need to install Julia on your computer/server, but this is pretty easy with instructions <a href="http://julialang.org/downloads/">found here</a>.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
As I was validating the script I noticed an important caveat to this approach that I hadn't really seen mentioned in the literature. Some linear virus genomes actually contain a repeat of the beginning of their genome again at the end of the genome (e.g. <a href="http://www.ncbi.nlm.nih.gov/nuccore/JX080304.2">Staphylococcus phage MSA6</a>). This means that a sequence similarity approach would "close" the contig as a circle even though it's a linear genome. Is this a problem? The answer depends on your question. </div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
If you are claiming a contig truly represents a completed circular genome, this "closing" method alone is insufficient and will need to be supplemented with a different approach. The method does, however, work well as a QC measure for showing that your sequencing effort captured a large fraction of the virome. Even if the genome is linear, "closing" it still provides strong evidence that you sequenced enough to cover the entire genome.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Moving forward from this post, we now have an efficient, open-source tool for detecting circular contigs. We are also aware of the caveat that some linear genomes may be mis-annotated as representing circular genomes, but the impact of this caveat varies with the experimental question being asked.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
As always, please leave any questions, comments, or concerns in the comment section below, or reach out through Twitter or email. I am always happy to get feedback and help out other virome researchers.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
And yes, I know the metaphor in the first figure is a bit of a stretch. :)</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<h3>
Works Cited</h3>
<br />
<br />
<br /></div>
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Genome+Research&rft_id=info%3Adoi%2F10.1101%2Fgr.122705.111&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=The+human+gut+virome%3A+Inter-individual+variation+and+dynamic+response+to+diet&rft.issn=1088-9051&rft.date=2011&rft.volume=21&rft.issue=10&rft.spage=1616&rft.epage=1625&rft.artnum=http%3A%2F%2Fgenome.cshlp.org%2Fcgi%2Fdoi%2F10.1101%2Fgr.122705.111&rft.au=Minot%2C+S.&rft.au=Sinha%2C+R.&rft.au=Chen%2C+J.&rft.au=Li%2C+H.&rft.au=Keilbaugh%2C+S.&rft.au=Wu%2C+G.&rft.au=Lewis%2C+J.&rft.au=Bushman%2C+F.&rfe_dat=bpr3.included=1;bpr3.tags=Biology">1. Minot, S., Sinha, R., Chen, J., Li, H., Keilbaugh, S., Wu, G., Lewis, J., & Bushman, F. (2011). The human gut virome: Inter-individual variation and dynamic response to diet <span style="font-style: italic;">Genome Research, 21</span> (10), 1616-1625 DOI: <a href="http://dx.doi.org/10.1101/gr.122705.111" rev="review">10.1101/gr.122705.111</a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Proceedings+of+the+National+Academy+of+Sciences&rft_id=info%3Adoi%2F10.1073%2Fpnas.1601060113&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Healthy+human+gut+phageome&rft.issn=0027-8424&rft.date=2016&rft.volume=113&rft.issue=37&rft.spage=10400&rft.epage=10405&rft.artnum=http%3A%2F%2Fwww.pnas.org%2Flookup%2Fdoi%2F10.1073%2Fpnas.1601060113&rft.au=Manrique%2C+P.&rft.au=Bolduc%2C+B.&rft.au=Walk%2C+S.&rft.au=van+der+Oost%2C+J.&rft.au=de+Vos%2C+W.&rft.au=Young%2C+M.&rfe_dat=bpr3.included=1;bpr3.tags=Biology">2. Manrique, P., Bolduc, B., Walk, S., van der Oost, J., de Vos, W., & Young, M. (2016). Healthy human gut phageome <span style="font-style: italic;">Proceedings of the National Academy of Sciences, 113</span> (37), 10400-10405 DOI: <a href="http://dx.doi.org/10.1073/pnas.1601060113" rev="review">10.1073/pnas.1601060113</a></span>
</div>
<br />Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com0tag:blogger.com,1999:blog-8971027081051936989.post-8748317396516599962016-08-27T19:44:00.000-04:002016-08-27T19:44:22.556-04:00Improving Human Virome Studies: Updates to Virus Classification<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://jb.asm.org/content/184/16/4529/F2.large.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://jb.asm.org/content/184/16/4529/F2.large.jpg" height="300" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The proposed phage proteomic tree by <a href="http://jb.asm.org/content/184/16/4529.abstract">Rohwer and Edwards</a>.</td></tr>
</tbody></table>
<div style="text-align: justify;">
Taxonomy is an important aspect of microbiome research. Whether we are studying communities of bacteria, viruses, or other microbes, there are benefits to labeling microbes. Taxonomic names immediately give us information about their relationships to each other, such as similar bacteria being grouped into the same <a href="https://en.wikipedia.org/wiki/Genus">genus</a>. Taxonomic identities also provide some information about an organism's functionality and/or clinical pathology. For example, if I mention that a bacterium is a member of the genus <a href="https://en.wikipedia.org/wiki/Staphylococcus">Staphylococcus</a>, you might infer that it is a round, Gram-positive bacterium that might inhabit the skin and is otherwise related to other members of that genus (including genomic relationships). In the end, the practice does what it aims to do, which is classify organisms in an informative way.</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<div style="text-align: justify;">
Although it might seem like a simple practice at first, it is actually a very complicated field that continues to improve due to the effort of many talented scientists. This is especially true for virus taxonomy. Although improving, phage taxonomy has continued to suffer from issues of ambiguity and inconsistency. In this post I want to go over the recently proposed improvements to phage taxonomic conventions. I feel this is particularly <b>important</b> to go over because it will impact the analyses done by human virome researchers, as well as virome researchers in general.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The manuscript outlining the changes is actually a very nice, short, and easy read, so I will direct you to it for details if you are interested. <b>Overall, the changes reduce ambiguity and foster greater consistency in naming phages.</b> Here is a list of the proposed changes, which are listed with greater detail in the manuscript itself.</div>
<div style="text-align: justify;">
<br /></div>
<h3>
1. Replace "phage" with "virus" in bacteriophage taxonomy names.</h3>
<div style="text-align: justify;">
<b>Example</b>: "Escherichia phage T4" will become "Escherichia virus T4".</div>
<div style="text-align: justify;">
<br /></div>
<h3>
2. Removal of "like" from phage genus names.</h3>
<div style="text-align: justify;">
<b>Example</b>: "Lambdalikevirus" will become "Lambdavirus".</div>
<div style="text-align: justify;">
<br /></div>
<h3>
3. Discontinuation of "phi" and other transliterated Greek letters.</h3>
<div style="text-align: justify;">
Greek letters will be discouraged in names going forward.</div>
<div style="text-align: justify;">
<br /></div>
<h3>
4. Elimination of hyphens from taxon names.</h3>
<div style="text-align: justify;">
<b>Example</b>: "Yersinia phage L-413C" will become "Yersinia virus L413C".</div>
<div style="text-align: justify;">
<br /></div>
<h3>
5. Specificity of isolation host in taxon name.</h3>
<div style="text-align: justify;">
<b>Example</b>: "Enterobacteria phage T7" will become "Escherichia virus T7".</div>
<div style="text-align: justify;">
<br /></div>
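<div style="text-align: justify;">
If you keep your own reference tables of phage names, several of the rules above are mechanical enough to script. Here is a minimal Python sketch that applies rules 1, 2, 4, and (with a manually supplied host) 5. The function names and the <b>isolation_host</b> parameter are hypothetical, this is not an official ICTV tool, and real databases will have exceptions that need checking by hand.</div>

```python
import re


def update_species_name(name, isolation_host=None):
    """Apply the species-level naming rules to a single phage name."""
    # Rule 1: replace the word "phage" with "virus".
    name = re.sub(r"\bphage\b", "virus", name)
    # Rule 4: remove hyphens from the name.
    name = name.replace("-", "")
    # Rule 5: use the specific isolation host. This cannot be inferred from
    # the old name, so it has to be supplied by the caller.
    if isolation_host is not None:
        _, rest = name.split(" ", 1)
        name = f"{isolation_host} {rest}"
    return name


def update_genus_name(genus):
    """Rule 2: drop "like" from genus names ending in "likevirus"."""
    return re.sub(r"like(?=virus$)", "", genus)


# The examples from the list above.
t4 = update_species_name("Escherichia phage T4")
l413c = update_species_name("Yersinia phage L-413C")
t7 = update_species_name("Enterobacteria phage T7", isolation_host="Escherichia")
lambda_genus = update_genus_name("Lambdalikevirus")
```

<div style="text-align: justify;">
Rule 3 (Greek letters) is left out on purpose because the post only says they are discouraged going forward, so there is no simple rewrite to apply.</div>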
<div style="text-align: justify;">
The group also discusses the ongoing efforts to use genomic similarity for defining virus taxonomic groups. For example, viruses with greater than 40% amino acid sequence similarity have been categorized as being in the same genus. As is perhaps expected, this can result in rather uninformative categories that group together fairly dissimilar viruses. This is an area of development that we will have to continue watching.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So what can we take away from this? Honestly this is an important paper for those of us interested in virus ecology, and especially the human virome. In a lot of ways, our understanding of the human virome is only as good as our reference databases. By clearing up ambiguities and inconsistencies in these databases, we can improve our ability to discuss the communities we observe and better equip ourselves with an understanding of the phage relationships.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Thanks for hanging in there to the end. I know taxonomy can be a bit of a dry topic for people, but it really is important and something we all need to try to stay current with, particularly if we think a lot about microbiology. Be sure to check out the manuscript itself for the whole story. For even further reading, check out the paper by Thompson <i>et al</i>. Finally, if you have any questions, comments, or concerns, please feel free to reach out either through the comment section below, Twitter, or email. I always love hearing from readers!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<h3>
Works Cited</h3>
<br />
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Archives+of+Virology&rft_id=info%3Adoi%2F10.1007%2Fs00705-015-2728-0&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Taxonomy+of+prokaryotic+viruses%3A+update+from+the+ICTV+bacterial+and+archaeal+viruses+subcommittee&rft.issn=0304-8608&rft.date=2016&rft.volume=161&rft.issue=4&rft.spage=1095&rft.epage=1099&rft.artnum=http%3A%2F%2Flink.springer.com%2F10.1007%2Fs00705-015-2728-0&rft.au=Krupovic%2C+M.&rft.au=Dutilh%2C+B.&rft.au=Adriaenssens%2C+E.&rft.au=Wittmann%2C+J.&rft.au=Vogensen%2C+F.&rft.au=Sullivan%2C+M.&rft.au=Rumnieks%2C+J.&rft.au=Prangishvili%2C+D.&rft.au=Lavigne%2C+R.&rft.au=Kropinski%2C+A.&rft.au=Klumpp%2C+J.&rft.au=Gillis%2C+A.&rft.au=Enault%2C+F.&rft.au=Edwards%2C+R.&rft.au=Duffy%2C+S.&rft.au=Clokie%2C+M.&rft.au=Barylski%2C+J.&rft.au=Ackermann%2C+H.&rft.au=Kuhn%2C+J.&rfe_dat=bpr3.included=1;bpr3.tags=Biology">Krupovic, M., Dutilh, B., Adriaenssens, E., Wittmann, J., Vogensen, F., Sullivan, M., Rumnieks, J., Prangishvili, D., Lavigne, R., Kropinski, A., Klumpp, J., Gillis, A., Enault, F., Edwards, R., Duffy, S., Clokie, M., Barylski, J., Ackermann, H., & Kuhn, J. (2016). Taxonomy of prokaryotic viruses: update from the ICTV bacterial and archaeal viruses subcommittee <span style="font-style: italic;">Archives of Virology, 161</span> (4), 1095-1099 DOI: <a href="http://dx.doi.org/10.1007/s00705-015-2728-0" rev="review">10.1007/s00705-015-2728-0</a></span>
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Archives+of+Microbiology&rft_id=info%3Adoi%2F10.1007%2Fs00203-014-1071-2&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Microbial+taxonomy+in+the+post-genomic+era%3A+Rebuilding+from+scratch%3F&rft.issn=0302-8933&rft.date=2014&rft.volume=197&rft.issue=3&rft.spage=359&rft.epage=370&rft.artnum=http%3A%2F%2Flink.springer.com%2F10.1007%2Fs00203-014-1071-2&rft.au=Thompson%2C+C.&rft.au=Amaral%2C+G.&rft.au=Campe%C3%A3o%2C+M.&rft.au=Edwards%2C+R.&rft.au=Polz%2C+M.&rft.au=Dutilh%2C+B.&rft.au=Ussery%2C+D.&rft.au=Sawabe%2C+T.&rft.au=Swings%2C+J.&rft.au=Thompson%2C+F.&rfe_dat=bpr3.included=1;bpr3.tags=Biology">Thompson, C., Amaral, G., Campeão, M., Edwards, R., Polz, M., Dutilh, B., Ussery, D., Sawabe, T., Swings, J., & Thompson, F. (2014). Microbial taxonomy in the post-genomic era: Rebuilding from scratch? <span style="font-style: italic;">Archives of Microbiology, 197</span> (3), 359-370 DOI: <a href="http://dx.doi.org/10.1007/s00203-014-1071-2" rev="review">10.1007/s00203-014-1071-2</a></span>
</div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com2tag:blogger.com,1999:blog-8971027081051936989.post-47688911592675894232016-07-31T21:46:00.000-04:002016-07-31T21:46:17.449-04:00Antibiotics, Birth, and the Microbiome: A Personal Experience<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-oFOJlbd5ipI/V56nR0OSP9I/AAAAAAAABas/e3brmwTXDn8NsY-q4nCxxdOlraocLAfZgCLcB/s1600/IMG_5811.JPG" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="240" src="https://1.bp.blogspot.com/-oFOJlbd5ipI/V56nR0OSP9I/AAAAAAAABas/e3brmwTXDn8NsY-q4nCxxdOlraocLAfZgCLcB/s320/IMG_5811.JPG" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The new addition to our family!</td></tr>
</tbody></table>
<div style="text-align: justify;">
Well July has shaped up to be an incredible month. In addition to working on some cool projects whose results you will be seeing in the near future, my wife delivered our first child. Her name is Clara and we are very excited to be welcoming her into our family. Unfortunately the road to delivery was a little bumpy (although not nearly as bad as it could have been). One aspect of the process that stood out to me was the use of antibiotics during delivery. I thought this was interesting because we hear so much about the microbiome differences between vaginal and c-section births, but not much about antibiotic treatment. This week I wanted to share my experience with you, both to shed some light on what can happen during delivery, and to provide my own thoughts on the subject.</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
To jump right in, the delivery process started with my wife's water breaking, just as it does with many women. The only problem here was that labor didn't start after. As you might guess, this is particularly troubling because the open, moist, and incubated amniotic sac is an ideal environment for an infection. So once the water broke, the infection clock started ticking. The standard practice for this situation dictates that labor <b>needs</b> to start within 24 hours of water breaking, whether it be natural, augmented, or induced.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
For us the 24 hours came and went, and despite our efforts to get labor started (walking, positions, etc), the contractions were only weakly progressing. This meant the team needed to augment my wife's labor, which involved providing a hormone (<a href="https://en.wikipedia.org/wiki/Oxytocin">pitocin</a> to be exact) to ramp up the contractions and start working the baby out. To cut a long story short, this was a <b>very</b> long process that my wife went through. And remember, this whole time we were racing to avoid an infection.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Because of the infection risk, my wife's temperature was taken every half hour to an hour. A fever is one immediate indication of a potential infection. After about another day (roughly 48 hours after her water broke), her temperature started to spike, which suggested the bacteria had finally caught up to us and were starting to cause an infection. Because the goal was to avoid infection (a pretty important goal for both the mom and baby), my wife was immediately administered broad-spectrum antibiotics to kill off the infecting bacteria. Luckily it seemed to work and her temperature went back to normal for the remainder of labor.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
As a microbiologist and microbiome researcher, I thought this experience was pretty interesting. We are constantly worried about the detrimental effects of antibiotics, and it's certainly true that antibiotics are misused. But I think we also need to talk about situations where a somewhat liberal use of broad-spectrum antibiotics really is the best course of action. If we think about the situation I described, we never actually knew that my wife had an infection. We only knew that she had developed a fever (there was no time for culturing at that point, although they did follow up with that). All we knew was that there was a chance of an infection, and the benefits of avoiding such an infection outweighed the risks associated with those antibiotics. An altered microbiome might be bad, but an infected newborn baby is likely to be much worse.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: right;">
</div>
<div style="text-align: justify;">
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://8b7a91801591cac4b290-abbac3ca2ecec271a197a4cd05b43329.r61.cf3.rackcdn.com/probiotics-vs-antibiotics.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://8b7a91801591cac4b290-abbac3ca2ecec271a197a4cd05b43329.r61.cf3.rackcdn.com/probiotics-vs-antibiotics.jpg" height="231" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">There is a time and place for antibiotics, although<br />they are still misused.</td></tr>
</tbody></table>
So what am I trying to say? Should we keep throwing around antibiotics at every sign of a cough? Certainly not. Antibiotics are misused in many ways, and it is clear that we can benefit from more targeted treatment approaches (such as phage therapy of course!). On the other hand I think it is worth pointing out that there are still situations that necessitate the use of broad antibiotics. Antibiotics can cause problems, but they are still a miracle of modern medicine and will have a place in medical practice for a very long time.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
In the end, both the mom and the baby left the hospital happy and healthy, and that is what I am grateful for. I am also very happy with the care we received at the University of Michigan hospital. They did a brilliant job!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So that was my recent experience with antibiotics. Thanks for bearing with a different kind of post this week, but I thought it might be interesting to share a personal story that relates to the research I write so much about. As always, feel free to reach out and let me know what you thought, or if you have any questions. Finally, I will wrap things up with a disclaimer that this was our experience, and every experience is different, so be sure to talk with your doctor if you find yourself in similar circumstances.</div>
<div style="text-align: justify;">
<br /></div>
<br />
The Up-And-Coming Bioinformatics Language: A First Look At Julia<br />
Published 2016-06-26 (updated 2016-06-28)<br />
<br />
<a href="https://4.bp.blogspot.com/-8_KC2298dEI/V3CCk1E8hyI/AAAAAAAABY8/AQFuBMQnpPkhLrjw61HZGwuHKao4rZx2ACLcB/s1600/Julia_prog_language.svg.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="216" src="https://4.bp.blogspot.com/-8_KC2298dEI/V3CCk1E8hyI/AAAAAAAABY8/AQFuBMQnpPkhLrjw61HZGwuHKao4rZx2ACLcB/s320/Julia_prog_language.svg.png" width="320" /></a>Programming is a dynamic field in which the popular languages shift over the years. A classic example is the rise of Perl, which later gave way to Python. The R language has also exploded in recent years, and all of these languages are used heavily in bioinformatics. Instead of focusing on the current state of bioinformatics, I want to focus this post on where we could be going in the future. More specifically, I want to discuss an up-and-coming programming language named Julia, which has potential for use in bioinformatics.<br />
<br />
<a name='more'></a><br />
<a href="http://julialang.org/">Julia</a> is a new language that first appeared in 2012 and has been gaining attention ever since. The creators have focused on creating an efficient and fast language that is also relatively easy to use. Because people are talking more about it each day, and because I think it shows exceptional promise, I wanted to try it out for myself.<br />
<br />
<h2>
The Benchmarking</h2>
I was a little bummed when I saw that the benchmarks on the Julia homepage failed to include Perl, my go-to language for a lot of the data munging associated with bioinformatics. Perl is also lightning fast for a scripting language, which makes it handy. I decided I would familiarize myself with the Julia language by setting up some basic benchmarking.<br />
<br />
To get a feel for Julia's speed, I decided to recreate a Perl script that I use to calculate the median length of sequences in a fasta file. I downloaded Julia from the <a href="http://julialang.org/downloads/">Julia website</a>, installed it on my computer, and rewrote the Perl script in Julia. In total this took me about 1-1.5 hours, which highlights the ease of writing in Julia. It really took no time at all before I was writing a decent Julia script. I had never used the language before, but it is familiar to any Python or R user.<br />
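<br />
For readers who have not seen this kind of script, the calculation itself can be sketched in a few lines. The example below is written in Python purely for illustration; it is not the Perl or Julia code used in the benchmark, and the function names and the tiny in-memory fasta example are assumptions made up for this sketch.<br />

```python
import io
from statistics import median


def sequence_lengths(handle):
    """Yield the length of each record in a fasta-formatted file handle."""
    length = None  # None until the first header line has been seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            # Sequence lines can wrap, so add up every line of the record.
            length += len(line)
    if length is not None:
        yield length


def median_sequence_length(handle):
    """Return the median record length for a fasta file handle."""
    return median(sequence_lengths(handle))


# Hypothetical three-record example with lengths 6, 3, and 10.
example = io.StringIO(">seq1\nACGT\nAC\n>seq2\nACG\n>seq3\nACGTACGTAC\n")
result = median_sequence_length(example)
print(result)
```

For a real file you would pass an opened file handle instead of the in-memory example. The median still needs every length to be collected and sorted, so this kind of script is dominated by reading and parsing the file, which is why it works as a simple data munging benchmark.<br />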
<br />
Once I had the two scripts, I ran them on the same example fasta file and compared the execution time required for both. I got the following results.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-x0n446TtlFk/V3B9cIv8S1I/AAAAAAAABYc/4FRha4MHjj8SrfNwSA3H0YwVRbHdWXjYwCLcB/s1600/BenchmarkingResults.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://4.bp.blogspot.com/-x0n446TtlFk/V3B9cIv8S1I/AAAAAAAABYc/4FRha4MHjj8SrfNwSA3H0YwVRbHdWXjYwCLcB/s400/BenchmarkingResults.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Comparison of Perl and Julia speeds for calculating the median sequence length in<br />
increasingly large fasta files.<a href="https://github.com/Microbiology/JuliaPerlBenchmark"> Code is found here.</a></td></tr>
</tbody></table>
<br />
So the Perl script clearly ran faster than the Julia script, and both increased in time at about the same rate as I added sequences. So what can we say from these results? I would conclude that although Julia is fast, it still can't beat Perl for parsing data and making quick calculations. Of course this comes with the caveat that I have very little experience writing in Julia and could have written it poorly (I did try to make it efficient to give it a fair chance though). I also only tested the two on relatively small files, and the results may be different for very large files. Regardless, I still think this is informative.<br />
<br />
<u><i>Check out the associated data and code</i></u> on the <a href="https://github.com/Microbiology/JuliaPerlBenchmark">JuliaPerlBenchmark GitHub page</a>.<br />
<br />
<h2>
Julia Pros</h2>
<br />
<ul>
<li>After spending some time with the Julia language, I really liked the familiarity of the syntax and data structures. Anybody with exposure to Python, R, or any similar high-level scripting/programming language will easily pick up Julia in about an hour or two.</li>
</ul>
<ul>
<li>I like that Julia seems to be a bit of a hybrid between R and Python. It seems like it could be really good for bioinformatics by allowing easy data formatting, analysis, and presentation in one cohesive and fast language environment.</li>
</ul>
<ul>
<li>Although it was a little slower than Perl for parsing sequencing data files, Julia is still a fast language and I think this will draw more and more bioinformaticians to use it.</li>
</ul>
<ul>
<li>Finally, Julia allows for easy integration with C, which I think will help with future development.</li>
</ul>
<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-5ho6A0-zhjw/V3CC2mPvOyI/AAAAAAAABZE/kfq1GYhs7u42o4mW9qsQBCrjAP8L8i_rgCLcB/s1600/Screen%2BShot%2B2016-06-26%2Bat%2B9.34.52%2BPM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="110" src="https://1.bp.blogspot.com/-5ho6A0-zhjw/V3CC2mPvOyI/AAAAAAAABZE/kfq1GYhs7u42o4mW9qsQBCrjAP8L8i_rgCLcB/s400/Screen%2BShot%2B2016-06-26%2Bat%2B9.34.52%2BPM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Benchmarking results provided on the Julia homepage.</td></tr>
</tbody></table>
<br />
<h2>
Julia Cons</h2>
<br />
<ul>
<li>Although I like Julia, there are certainly some problems that will prevent me from switching over right now. The biggest issue is that it simply does not have the support and infrastructure that a language like Python or R has. Julia is still up-and-coming and the community is not at the same level as the R, Python, or Perl communities. I expect it will pick up in the coming years, but for now it just makes sense (for me) to work in the more developed communities of R, Python, and Perl.</li>
<li>Although Julia is fast, it still can't beat my simple and fast Perl scripting. Until it beats Perl performance in data formatting and management, I honestly won't have a strong incentive to make the move over to Julia-heavy scripting.</li>
</ul>
<br />
<br />
<h2>
Final Thoughts</h2>
Julia is a promising and exciting new programming language that I think we will hear more about in the next few years. The community is small and there is less support compared to Python and R, but that could (and probably will) change over time. The general feeling I got for Julia was that it was a combination of Python and R that offered me the best of each in one language. That, in addition to the speed advantages over R and Python, could allow Julia to replace Python and R as major programming languages in the near future. I really do think it is reasonable to expect Julia to be the bioinformatics language-of-choice in the next ten to fifteen years. Ultimately though only time will tell.<br />
<br />
Any thoughts, comments, or concerns? Any bugs in my code or errors in my interpretations? Let me know in the comments below. You are also always welcome to reach out on Twitter or by email. I always love to hear from Prophage readers.<br />
<br />
<h2>
Update</h2>
I have been getting incredible feedback on this blog post and I wanted to update the readers with what I have learned, and how the data has improved. Thanks to the readers in the comments below, as well as on the GitHub repository, we have addressed two issues with the benchmark.<br />
<br />
<ol>
<li>The script I wrote needed to be written more efficiently. Ismael rewrote the script to run more efficiently, and also provided a solid explanation of what they did.</li>
<li>As you can see in the comments, the problem with this test is that Julia is taking time to start and compile the code. The time required to get started is considerably greater for Julia, which is the biggest reason why Perl appears to perform better. Given this information, you might predict that Julia could outperform Perl on larger file sizes where the startup time becomes negligible. I quickly bolstered the size of my file to about 500MB (from 30MB) and reran the benchmark. Wouldn't you know it, Julia begins to outperform Perl at larger file sizes, which is awesome. The updated results are below.</li>
</ol>
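<div>
To see why startup overhead can flip a benchmark like this, it may help to think of run time as a fixed startup cost plus a cost per megabyte processed. The sketch below is a toy model in Python under that assumption. The startup and per-megabyte numbers are invented for illustration and are not measurements from this benchmark; only the 30MB and 500MB file sizes come from the text above.</div>

```python
def run_time(startup_seconds, seconds_per_mb, file_size_mb):
    """Toy cost model: fixed startup cost plus a linear processing cost."""
    return startup_seconds + seconds_per_mb * file_size_mb


def crossover_size_mb(slow_start, fast_start):
    """File size at which the slow-starting runtime catches up.

    Each argument is a (startup_seconds, seconds_per_mb) pair. Returns None
    when the slow starter is not faster per megabyte and so never catches up.
    """
    startup_gap = slow_start[0] - fast_start[0]
    rate_gap = fast_start[1] - slow_start[1]
    if rate_gap <= 0:
        return None
    return startup_gap / rate_gap


# Hypothetical numbers, chosen only to show the shape of the trade-off.
compiled_like = (2.0, 0.010)      # slow to start, fast per megabyte
interpreted_like = (0.05, 0.020)  # quick to start, slower per megabyte

crossover = crossover_size_mb(compiled_like, interpreted_like)
for size_mb in (30, 500):
    print(size_mb,
          run_time(*compiled_like, size_mb),
          run_time(*interpreted_like, size_mb))
print("crossover (MB):", crossover)
```

<div>
With these made-up numbers the quick-starting runtime wins on the small file and loses on the large one, which is the same qualitative pattern as the updated benchmark. A fairer comparison of raw processing speed would time the work separately from startup and compilation.</div>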
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-YauF8DDXTLI/V3Jc-oMBAZI/AAAAAAAABZs/soCEEKy_y2EceAZHQQlrTcpmvNWAIDNzwCLcB/s1600/BenchmarkingResults.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://1.bp.blogspot.com/-YauF8DDXTLI/V3Jc-oMBAZI/AAAAAAAABZs/soCEEKy_y2EceAZHQQlrTcpmvNWAIDNzwCLcB/s400/BenchmarkingResults.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Updated comparison of Perl and Julia speeds for calculating the median sequence length in<br />
increasingly large fasta files. This run used a larger file than the figure above.<a href="https://github.com/Microbiology/JuliaPerlBenchmark"> Code is found here.</a></td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
So what can we take away from this? It turns out that while Julia startup takes longer, it is blazing fast and actually outperforms Perl when using larger but reasonable files. With this new and more correct knowledge, I am happy to say that I am even more excited about Julia and think that it has a place in bioinformatics. Speed for me is a big thing, so I can see incorporating this into my own work.</div>
<div>
<br /></div>
<div>
I finally want to thank all of the readers who contributed to this blog post. I love that people were able to help make this little piece of data accurate and fair, and I feel like we all benefitted from the improved results. Thank you so much and please feel free to continue commenting.</div>
<br />
<br />
<br />
Tips For Getting The Optimal Postdoc<br />
Published 2016-06-12<br />
<div style="text-align: justify;">
<a href="http://www.rochester.edu/commencement/2011/doctoral/doctoral3.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://www.rochester.edu/commencement/2011/doctoral/doctoral3.jpg" height="211" width="320" /></a>So you've been in grad school for a while, you've published some cool papers, and you are ready to graduate with your PhD and take the next step in your career. For many, this means pursuing a <a href="https://en.wikipedia.org/wiki/Postdoctoral_researcher">postdoc</a>. But how do you get started, and what should you be thinking about? Since I was in this position only a short time ago, I felt I would share my thoughts on the process, hoping that it helps any readers getting ready for that same next step.</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
Before I go any further though, I want to get everyone on the same page (this is not just a blog for grad students). A postdoc (short for postdoctoral research fellow) is someone who has graduated with their PhD and is conducting supervised research but in a more independent capacity than during their thesis.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
One of the first steps in preparing to embark on a postdoctoral research fellowship is figuring out which labs you should be considering, and then finally choosing one. But how are you supposed to decide? And even after you interview, how are you supposed to choose between the many excellent labs out there? Here are some points I considered during the process.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Define Your Next Step</h2>
<div style="text-align: justify;">
Before you do anything, be sure you have a very clear idea of what you want out of your postdoc. Do you ultimately want an academic position? Are you aiming for an industry position? Are you unsure and want to keep your options open? All of these are wonderful choices, but each impacts the process in a different way. Without a clear idea of where you are going, you are going to have a difficult time deciding on the best option. I suggest actually writing out what you want your post-postdoc step to be, and then figure out which next step best prepares you for that.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Look For a Great Mentor</h2>
<div style="text-align: right;">
</div>
<div style="text-align: justify;">
One of the most important aspects of a postdoc, and scientific training in general, is taking a position under an excellent mentor. I see a great mentor as someone who advocates for you, challenges you to do better, and supports your career goals. I could go on, but defining a great mentor warrants its own dedicated blog post. As you start the process of looking for labs, write down some qualities that your ideal mentor would have. When you start considering labs, think about whether the PI and other leadership meet those criteria. And be careful of the "prestige pitfall". I have seen many people take positions with less-than-ideal mentors (based on their individual criteria) because they are "famous" or "prestigious". Maybe that can work for you, but I have seen many people enter difficult situations in this way, so at least be aware of it.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Look For a Lab With Great Resources</h2>
<div style="text-align: justify;">
Chances are that you want to ramp up your research once you start your postdoc. Chances are that you want some confidence in your ability to stay in the lab as well. Both of these come with solid lab resources. Having a well-funded PI means you are more likely to have your position next year. It also means that you can ramp up your research program, get data and papers, and be more competitive in grant applications. You can certainly be successful without a lot of lab money and other resources, but being in a well-funded lab means there is one less (big) limiting factor that you are going to have to worry about.</div>
<div style="text-align: justify;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://tripsaroundthailand.com/wp-content/uploads/2014/11/Tropical-paradise.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://tripsaroundthailand.com/wp-content/uploads/2014/11/Tropical-paradise.png" height="199" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Location, location, location!</td></tr>
</tbody></table>
<h2 style="text-align: justify;">
Look For a Lab in a Great Location</h2>
<div style="text-align: justify;">
You are a scientist <b>AND</b> a human being. That means you likely want to be happy in your life and enjoy your environment. To this end, I encourage you to think about the location of each lab you are considering. For example, if you love skiing, Florida might be a less ideal fit for you. Conversely, if you absolutely hate the snow, Minnesota would be a poor fit.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Look At The Lab Track Record</h2>
<div style="text-align: justify;">
Talk is cheap. Don't just ask if a lab is good, but look at whether they are producing (or capable of producing) the type of scientist you want to be. If you want to get into industry, you might want to take a second look at a lab that has had a few members go on to industry positions. Of course this is more difficult for newer labs that simply don't have any track record, but I still believe this is an important process to go through.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Make The Most Of The Interview</h2>
<div style="text-align: justify;">
Once you have narrowed your list down to a few labs, you are going to travel to the lab and interview. Remember that this is as much about you interviewing them, as it is them interviewing you. Be sure to prepare for the interview with a list of questions to ask, and a set of goals you want to achieve. And actually write it out! This includes questions to ask the PI, as well as the other lab and department members. This is the best time to get a feel for the lab and figure out if it is a good fit.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Go With Your Gut</h2>
<div style="text-align: justify;">
You have been around a lot of labs at this point, so you have a good idea of what you are looking for. Even if you have a hard time articulating the exact feeling you have for different labs, you probably have a good "gut feeling" for what will work best for you. Trust your instincts.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Choose A Lab</h2>
<div style="text-align: justify;">
The final and most difficult step of the process is choosing a lab. This is especially hard because you have already narrowed down your options to great labs, and they are honestly all probably good choices (I know they were in my experience). In the end you have to talk to your loved ones, go for a walk to think, and go with your gut on what option you want to commit to. It's a near impossible decision to make for most people, but take comfort in knowing that all of the choices are probably great.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Final Thoughts</h2>
<div style="text-align: justify;">
So there you have it: some general thoughts I have on the whole postdoc hunting process. Of course these are just my opinions and musings, and the process is different for each person. But hopefully this will be a good starting place for thinking about finding that perfect postdoc position. And if you are not a professional scientist and are reading this, I hope this gave you some insight into what we think about in our scientific careers. Additionally, high-five for making it to the end of a very long post!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Do you have any thoughts about the postdoc search process? Did I miss any crucial pointers? Do you have questions as you start the process? Feel free to let me know in the comments below, in an email, or on Twitter. I always love it when people reach out.</div>
<br />
<br />
Piggybacking Instead of Killing: New Insights Into Virus Community Dynamics<br />
Published 2016-05-22<br />
<div style="text-align: justify;">
<a href="http://www.popsci.com/sites/popsci.com/files/styles/large_1x_/public/import/2013/images/2009/03/inphasion485.jpg?itok=d019MI9A" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://www.popsci.com/sites/popsci.com/files/styles/large_1x_/public/import/2013/images/2009/03/inphasion485.jpg?itok=d019MI9A" height="320" width="303" /></a>The <a href="https://en.wikipedia.org/wiki/Human_microbiota">human microbiome</a> is an important component of human health and disease. It is an ecosystem of microbes that exists in and on humans, and can affect disease states through disturbances in composition, diversity, metabolism, etc. Understanding the human microbiome will not only allow us to better understand human health, but it will also allow us to treat medical conditions in new and effective ways (e.g. Fecal Microbiota Transplants).</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<div style="text-align: justify;">
Most studies to date have focused on understanding the bacterial component of the human microbiome. While this route has proven beneficial, it fails to consider the more complex system at large. Bacteria interact with communities of other microbes, including viruses (among them bacteriophages, which are viruses that infect only bacteria), and understanding these phage-bacteria dynamics is crucial for understanding the true human microbiome system. Our paper this week provides such insights into the dynamics of virus communities and their interactions with their bacterial hosts.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
This paper by Knowles <i>et al</i> builds off of <b>two observations</b>. The <b>first</b> is that many phage-bacteria communities have been modeled to follow the "kill-the-winner" model of predation. This model states that lytic phages target and kill the most successful bacteria (the "winners"), thus preventing dominance of a single successful bacterium and maintaining relatively even bacterial distributions. The <b>second</b> observation is that many community phages are in fact temperate (they can exist while silently integrated in their bacterial host genome) and are poorly incorporated into the existing kill-the-winner model. To reconcile this disagreement, Knowles <i>et al</i> developed an extended model called "piggyback-the-winner".<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.pnas.org/content/111/20/7486/F5.large.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://www.pnas.org/content/111/20/7486/F5.large.jpg" height="313" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Examples of cyclical predator/prey relationships which<br />
are observed in phage-bacteria systems. <a href="http://www.pnas.org/content/111/20/7486/F5.large.jpg">SOURCE</a></td></tr>
</tbody></table>
The proposed <b>"piggyback-the-winner"</b> model states that when bacterial density increases, lytic activity is suppressed rather than "killing the winner", and an increased proportion of phages enter a dormant state integrated in the host genome (lysogeny). This model is based largely on the observation that the virus-to-microbe ratio often decreases as microbial density increases. The group provides a variety of sources of evidence to support their model in viral communities at large (please read the paper for details).<br />
<br />
One point of concern with this paper is that the group relies heavily on linear relationships between bacteria and phages, when <a href="http://www.pnas.org/content/111/20/7486.abstract">we know that these predator-prey dynamics often follow cyclical patterns</a>. This is not to say that the study is flawed or less valuable, but it would have been nice to hear more about the implications of the more accurate cyclical models over the linear models that were used. This is especially relevant because some of the scatter plots seem to be approaching more of a cyclical pattern than linear.<br />
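<br />
To make "cyclical" concrete, here is a minimal sketch of the textbook Lotka-Volterra predator-prey equations, with bacteria as prey and phages as predators, written in Python. This is a swapped-in classic model for illustration only. It is not the model from Knowles <i>et al</i>, nor the one from the linked paper, and every parameter value and starting density below is an arbitrary assumption.<br />

```python
def derivatives(bacteria, phage, growth, predation, conversion, death):
    """Lotka-Volterra rates of change for prey (bacteria) and predator (phage)."""
    d_bacteria = growth * bacteria - predation * bacteria * phage
    d_phage = conversion * bacteria * phage - death * phage
    return d_bacteria, d_phage


def simulate(bacteria, phage, dt=0.01, steps=3000,
             growth=1.0, predation=0.1, conversion=0.075, death=1.5):
    """Integrate the equations with a fixed-step fourth-order Runge-Kutta scheme.

    All default parameter values are arbitrary, chosen only to give clear cycles.
    """
    params = (growth, predation, conversion, death)
    trajectory = [(bacteria, phage)]
    for _ in range(steps):
        k1 = derivatives(bacteria, phage, *params)
        k2 = derivatives(bacteria + 0.5 * dt * k1[0], phage + 0.5 * dt * k1[1], *params)
        k3 = derivatives(bacteria + 0.5 * dt * k2[0], phage + 0.5 * dt * k2[1], *params)
        k4 = derivatives(bacteria + dt * k3[0], phage + dt * k3[1], *params)
        bacteria += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        phage += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        trajectory.append((bacteria, phage))
    return trajectory


def count_peaks(values):
    """Count local maxima, a rough way to see repeated cycles."""
    return sum(1 for i in range(1, len(values) - 1)
               if values[i - 1] < values[i] >= values[i + 1])


trajectory = simulate(bacteria=10.0, phage=5.0)
bacteria_values = [b for b, _ in trajectory]
print("bacterial peaks over the run:", count_peaks(bacteria_values))
```

In a system like this, both densities rise and fall repeatedly, so a snapshot of phage abundance against bacterial abundance traces a loop rather than a straight line. That is why a single linear fit across samples can hide the underlying dynamics.<br />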
<br />
<h2>
tl;dr</h2>
<br />
<b>So what can we take away from this paper?</b> Knowles <i>et al</i> propose a new predator-prey model called the "piggyback-the-winner" model, which essentially states that more microbes means relatively fewer viruses (a lower virus-to-microbe ratio). The group primarily supports their model with linear abundance modeling from a variety of microbiomes, spanning from oceans to humans. This is a valuable step toward our understanding of the entire microbiome (bacteria, viruses, etc) and will inform future studies, both environmental and medical. We are also likely to see this model develop as more sophisticated techniques are used.<br />
<br />
If you enjoyed our discussion, go ahead and check out the full paper in Nature. There you can find all of the details that we skimmed over here in our brief discussion. It is actually a relatively short read so it is worth checking out. And of course if you have any comments to add or questions to ask, speak out in the comments below, reach out on Twitter, or send an email!<br />
<br />
<br />
<br />
<br />
<br />
<h3>
Works Cited</h3>
<br />
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Nature&rft_id=info%3Apmid%2F26982729&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Lytic+to+temperate+switching+of+viral+communities.&rft.issn=0028-0836&rft.date=2016&rft.volume=531&rft.issue=7595&rft.spage=466&rft.epage=70&rft.artnum=&rft.au=Knowles+B&rft.au=Silveira+CB&rft.au=Bailey+BA&rft.au=Barott+K&rft.au=Cantu+VA&rft.au=Cobi%C3%A1n-G%C3%BCemes+AG&rft.au=Coutinho+FH&rft.au=Dinsdale+EA&rft.au=Felts+B&rft.au=Furby+KA&rft.au=George+EE&rft.au=Green+KT&rft.au=Gregoracci+GB&rft.au=Haas+AF&rft.au=Haggerty+JM&rft.au=Hester+ER&rft.au=Hisakawa+N&rft.au=Kelly+LW&rft.au=Lim+YW&rft.au=Little+M&rft.au=Luque+A&rft.au=McDole-Somera+T&rft.au=McNair+K&rft.au=de+Oliveira+LS&rft.au=Quistad+SD&rft.au=Robinett+NL&rft.au=Sala+E&rft.au=Salamon+P&rft.au=Sanchez+SE&rft.au=Sandin+S&rft.au=Silva+GG&rft.au=Smith+J&rft.au=Sullivan+C&rft.au=Thompson+C&rft.au=Vermeij+MJ&rft.au=Youle+M&rft.au=Young+C&rft.au=Zgliczynski+B&rft.au=Brainard+R&rft.au=Edwards+RA&rft.au=Nulton+J&rft.au=Thompson+F&rft.au=Rohwer+F&rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CEcology+%2F+Conservation">Knowles B, Silveira CB, Bailey BA, Barott K, Cantu VA, Cobián-Güemes AG, Coutinho FH, Dinsdale EA, Felts B, Furby KA, George EE, Green KT, Gregoracci GB, Haas AF, Haggerty JM, Hester ER, Hisakawa N, Kelly LW, Lim YW, Little M, Luque A, McDole-Somera T, McNair K, de Oliveira LS, Quistad SD, Robinett NL, Sala E, Salamon P, Sanchez SE, Sandin S, Silva GG, Smith J, Sullivan C, Thompson C, Vermeij MJ, Youle M, Young C, Zgliczynski B, Brainard R, Edwards RA, Nulton J, Thompson F, & Rohwer F (2016). Lytic to temperate switching of viral communities. <span style="font-style: italic;">Nature, 531</span> (7595), 466-70 PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/26982729" rev="review">26982729</a></span>
<br />
<br />
<br /></div>
<br />
Real Time Code Correction with Linters<br />
Published 2016-05-15<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div style="text-align: justify;">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://1.bp.blogspot.com/-15_oRI0T2w0/VzkxzRDrj5I/AAAAAAAABXI/uJ2X1BaeNFEeM1xEcbIv2PWs5EzCiIgdgCLcB/s1600/first-programming-job.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="183" src="https://1.bp.blogspot.com/-15_oRI0T2w0/VzkxzRDrj5I/AAAAAAAABXI/uJ2X1BaeNFEeM1xEcbIv2PWs5EzCiIgdgCLcB/s320/first-programming-job.jpg" width="320" /></a></div>
If you are a regular around here, or if you even took a look at the date since the last post, you may have noticed a gap. As tends to happen in the blog world, I took a hiatus to focus on other projects and research. This was actually very productive and I am excited for you to see the fruits of those labors in the coming months. So thanks for sticking with us and joining in the return of Prophage activity.</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<div style="text-align: justify;">
This week I want to talk about improving our everyday programming using something called a linter. A linter is simply a program that runs with a text editor and checks for stylistic and programming errors (if you already know about these, I apologize as this is a simplified explanation). So put another way, it is like the spellcheck and grammar check functions we have seen in word processors (e.g. Microsoft Word) except it works with programming languages. Now I have only started using linters in the past couple of months, but I have totally fallen in love and wish I had been using them a long time ago. In this post we are going to familiarize ourselves with linters and hopefully end up downloading them to use in our own programming.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The first question we might ask is <b>why should we use a linter?</b> It sounds like just another complicated program to have to deal with. Well, just like when we type emails or text messages, we make mistakes like typos and we rely on spell check and grammar check to alert us when a potential mistake has occurred. In the end we get a much cleaner, clearer, and more professional document. A linter does this with code. It will alert us when it appears we have made a syntax error, or may have written something unstable that could behave unexpectedly. An example of such a correction is changing working directories in a bash script.</div>
<br />
<!-- HTML generated using hilite.me --><br />
<div style="background: #ffffff; border-width: 0.1em 0.1em 0.1em 0.8em; border: solid gray; overflow: auto; padding: 0.2em 0.6em; width: auto;">
<pre style="line-height: 125%; margin: 0;"><span style="color: #888888;"># If I write this</span>
<span style="color: #007020;">cd</span> ~/documents
<span style="color: #888888;"># My linter tells me "Use cd ... || exit in case cd fails".</span>
<span style="color: #888888;"># So I change it to this safer line</span>
<span style="color: #007020;">cd</span> ~/documents <span style="color: #333333;">||</span> <span style="color: #007020;">exit</span>
</pre>
</div>
<br />
<div style="text-align: justify;">
Not only did the linter save me from an unsafe directory change, but if I was unaware of that danger in the past, I now know and can avoid it in the future. So in the end it can save you from errors, as well as teach you along the way.</div>
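<div style="text-align: justify;">
To make the danger concrete, here is a minimal sketch of what can go wrong when a failed <code>cd</code> is ignored. The directory and file names (<code>scratch</code>, <code>keep_me.txt</code>) are made up for illustration, and everything happens inside a throwaway directory created with <code>mktemp</code>.</div>

```shell
#!/usr/bin/env bash
# Sketch: why "cd ... || exit" matters. All names here are hypothetical.

workdir=$(mktemp -d)

# Unsafe: the second cd fails, but the function carries on and runs rm
# in the directory it was already in.
unsafe_cleanup() (
  cd "$workdir" || exit 1
  cd scratch 2>/dev/null        # fails: scratch does not exist
  rm -f -- *.txt                # still runs, in the wrong directory
)

# Safe: stop as soon as the directory change fails.
safe_cleanup() (
  cd "$workdir" || exit 1
  cd scratch 2>/dev/null || exit 1
  rm -f -- *.txt                # never reached when scratch is missing
)

touch "$workdir/keep_me.txt"

safe_status=0
safe_cleanup || safe_status=$?
if [ -e "$workdir/keep_me.txt" ]; then safe_result="kept"; else safe_result="deleted"; fi

unsafe_cleanup
if [ -e "$workdir/keep_me.txt" ]; then unsafe_result="kept"; else unsafe_result="deleted"; fi

echo "safe version:   file $safe_result (exit status $safe_status)"
echo "unsafe version: file $unsafe_result"

rm -rf "$workdir"
```

<div style="text-align: justify;">
Both functions run in a subshell (note the parentheses around the function bodies), so the <code>cd</code> and <code>exit</code> calls do not affect the shell that calls them. The safe version stops before <code>rm</code> is reached, while the unsafe version deletes a file it was never meant to touch.</div>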
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So now that we have seen that linters can be helpful, how do we set them up on our computers? There is a linter for almost every language, and these linters can be used in almost every text editor. I spend a lot of time programming in Bash, R, and Perl, and I have found the associated linters to be incredibly beneficial.</div>
<br />
<h2>
<u>R</u></h2>
<br />
<a href="https://github.com/jimhester/lintr">Lintr</a><br />
<br />
<h2>
<u>Bash</u></h2>
<br />
<a href="https://github.com/SublimeLinter/SublimeLinter-shellcheck">ShellCheck</a><br />
<br />
<h2>
<u>Perl</u></h2>
<br />
<a href="https://github.com/oschwald/SublimeLinter-perl">Perl Linter</a><br />
<br />
<a href="https://github.com/oschwald/SublimeLinter-perlcritic">Perl Critic</a><br />
<br />
<div style="text-align: justify;">
I use all of these in <a href="https://www.sublimetext.com/">Sublime Text</a>, but you can also use them in most other text editors. <a href="http://www.sublimelinter.com/en/latest/index.html">Check this out for help getting started with Sublime Linter.</a> Please note that downloading Sublime Linter does not include all of the linters for various languages, so you have to install those additionally. However this installation is easy and they walk you through the process.</div>
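<div style="text-align: justify;">
The editor plugins above are wrappers around tools that can usually also be run straight from a terminal, which is handy for a quick check outside the editor. The sketch below is only an illustration: the function name <code>lint_file</code> is my own, it assumes <code>shellcheck</code>, <code>perlcritic</code>, and R with the <code>lintr</code> package are installed separately, and it skips any tool it cannot find.</div>

```shell
# Sketch: pick a command-line linter based on the file extension.
# lint_file is a hypothetical helper, not part of any of the tools above.
lint_file() {
  file=$1

  case "$file" in
    *.sh|*.bash) tool=shellcheck ;;
    *.pl|*.pm)   tool=perlcritic ;;
    *.R|*.r)     tool=Rscript ;;
    *)
      echo "no linter configured for $file"
      return 0
      ;;
  esac

  # Skip quietly if the tool is not installed on this machine.
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "$tool is not installed; skipping $file"
    return 0
  fi

  case "$tool" in
    shellcheck) shellcheck "$file" ;;
    perlcritic) perlcritic "$file" ;;
    # The file name is passed to R as a trailing command-line argument.
    Rscript)    Rscript -e 'print(lintr::lint(commandArgs(trailingOnly = TRUE)[1]))' "$file" ;;
  esac
}

# Example calls (file names are placeholders):
#   lint_file my_analysis.sh
#   lint_file my_plots.R
```

<div style="text-align: justify;">
Running the linters inside the editor is still the more convenient day-to-day setup, since the warnings show up next to the offending line as you type.</div>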
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
And there you have it. Everything you need to get started with linters in your own programming practices. As always, if you have any questions, comments, or concerns about this post, please leave me a message in the comments. You can also reach out to me by email or Twitter.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Happy coding!</div>
<br />
*<a href="http://www.sololearn.com/Uploads/first-programming-job.jpg">Image Source</a><br />
<br />Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com0tag:blogger.com,1999:blog-8971027081051936989.post-62007702525814627902016-04-03T21:20:00.000-04:002016-04-03T21:20:44.244-04:00The Illumina Error Profile for Metagenomic Sequencing<div style="text-align: justify;">
<a href="http://i.telegraph.co.uk/multimedia/archive/01822/prostate_1822470b.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://i.telegraph.co.uk/multimedia/archive/01822/prostate_1822470b.jpg" height="200" width="320" /></a>Microbiology, and especially microbial ecology, has become increasingly dependent on advanced DNA and RNA sequencing technologies. This is most evident with the increasing popularity of the human microbiome and its various impacts on human health. While using DNA sequencing sometimes appears relatively simple (a result of the great efforts made to simplify the user experience), it is actually still a very complicated technique that requires a lot of thought and skill. One aspect that genomic scientists (whether focusing on human or microbial DNA) must always consider is the bias introduced by the sequencing platform itself. This week I want to focus on a recently published manuscript that describes the sequencing error profile associated with some of the most popular <a href="http://www.illumina.com/">Illumina</a> platforms.</div>
<div style="text-align: justify;">
<br />
<a name='more'></a></div>
<div style="text-align: justify;">
We know that sequencing platforms introduce systematic biases. Last year a group showed this to be true when performing 16S rRNA amplicon sequencing on Illumina platforms [1]. This year Schirmer <i>et al.</i> (from the same lab) expanded on that work by characterizing the errors associated with <a href="https://en.wikipedia.org/wiki/Metagenomics">metagenomic sequencing techniques</a> (i.e. random shotgun sequencing) [2].</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The paper aims to address <b>four points</b>:</div>
<div style="text-align: justify;">
<ol>
<li>Define error rates of substitutions and indels between platforms.</li>
<li>Identify sequence motifs associated with errors.</li>
<li>Evaluate the ability of quality scores to predict different error types.</li>
<li>Compare error removal approaches across platforms.</li>
</ol>
In the end they come to the following <b>conclusions</b>:</div>
<div style="text-align: justify;">
<ol>
<li>Substitutions are more frequent than indels and their frequency varies by platform.</li>
<li>Errors are associated with trimer motifs that are consistent across sequencing platforms.</li>
<li>Base errors are associated with low quality scores.</li>
<li>Quality trimming and BayesHammer are most effective for reducing errors when used together.</li>
</ol>
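<div style="text-align: justify;">
For anyone new to quality scores, it may help to see what they encode. A Phred quality score Q is defined from the estimated probability P that a base call is wrong, as Q = -10 log10(P), so P = 10^(-Q/10). The short sketch below just evaluates that formula for a few scores; the function name is my own, and it is not taken from the paper.</div>

```shell
# Sketch: convert a Phred quality score into the error probability it encodes.
# P = 10^(-Q/10); phred_to_error_prob is a hypothetical helper name.
phred_to_error_prob() {
  awk -v q="$1" 'BEGIN { printf "%.4f\n", 10 ^ (-q / 10) }'
}

for q in 10 20 30 40; do
  printf 'Q%s -> %s\n' "$q" "$(phred_to_error_prob "$q")"
done
```

<div style="text-align: justify;">
So a base with Q20 is expected to be wrong about 1 time in 100, and a base with Q30 about 1 time in 1,000, which is why low scores are a natural place to look for errors.</div>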
<div style="text-align: justify;">
There was one additional point that I thought was worth noting since Schirmer <i>et al.</i> didn't really get into it in the paper. The group talks about nucleotide motifs associated with errors, and makes a note of error-associated adenine and thymine residues. This is interesting because adenines are added to the end of a sequence after the sequencer has read through the DNA fragment. Said another way, when a DNA fragment is shorter than what the sequencing platform is reading, it will read through the DNA and, once it falls off the end of the fragment, start inserting a string of A's as a placeholder. As far as I can tell, the research group did not perform the quality control step of trimming these A (and T for the reverse complement) strings, meaning their analysis could be picking these up. This would mean that the A's could be throwing off their other analyses such as motif identification and sequence alignments. Because there were other error-associated motifs, it seems unlikely that this point ruins the paper, but it is important to note when interpreting their results.</div>
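<div style="text-align: justify;">
To show what I mean by trimming these strings, here is a rough sketch that strips a trailing run of A's from each read in a FASTQ file. This is my own illustration and not anything from the paper. It assumes standard four-line FASTQ records, the minimum run length of 10 is an arbitrary choice, and it only handles the A case (the T case at the start of reverse-complemented reads would need the mirror-image logic).</div>

```shell
# Sketch: trim a trailing poly-A run from every read in a FASTQ stream.
# Assumes four-line records; the threshold (default 10) is an arbitrary choice.
trim_poly_a() {
  awk -v min="${1:-10}" '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 {
      seq = $0
      keep = length(seq)
      # Walk back from the end of the read while the base is an A.
      while (keep > 0 && substr(seq, keep, 1) == "A") keep--
      # Leave the read alone if the A run is shorter than the threshold.
      if (length(seq) - keep < min) keep = length(seq)
    }
    NR % 4 == 3 { plus = $0 }
    NR % 4 == 0 {
      # Reads made up entirely of A are dropped; quality is trimmed to match.
      if (keep > 0) {
        print header
        print substr(seq, 1, keep)
        print plus
        print substr($0, 1, keep)
      }
    }
  '
}

# Two made-up reads: the first ends in a long A run, the second does not.
trimmed=$(printf '@read1\nACGTACGTAAAAAAAAAAAA\n+\nIIIIIIIIIIIIIIIIIIII\n@read2\nACGTACGA\n+\nIIIIIIII\n' | trim_poly_a 10)
echo "$trimmed"
```

<div style="text-align: justify;">
The first read is cut back to its first eight bases, while the second read keeps its single trailing A because one A is far more likely to be real sequence than a placeholder. For real data, a dedicated read-trimming tool is the better choice.</div>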
<div style="text-align: justify;">
<br /></div>
Overall this is a really cool paper filled with a lot of important information for anybody interested in doing microbial metagenomics. I definitely suggest reading it, as well as keeping it around as a reference. Additionally, since I presented this paper in our lab journal club, I have my <b>slide deck</b> freely available for you to download. <a href="http://microbiology.github.io/PDFs/SchlossHanniganJournalClub2016-03-23.pdf">Check it out here</a>.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<h3>
Works Cited</h3>
</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Nucleic+Acids+Research&rft_id=info%3Adoi%2F10.1093%2Fnar%2Fgku1341&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Insight+into+biases+and+sequencing+errors+for+amplicon+sequencing+with+the+Illumina+MiSeq+platform&rft.issn=0305-1048&rft.date=2015&rft.volume=43&rft.issue=6&rft.spage=0&rft.epage=0&rft.artnum=http%3A%2F%2Fnar.oxfordjournals.org%2Flookup%2Fdoi%2F10.1093%2Fnar%2Fgku1341&rft.au=Schirmer%2C+M.&rft.au=Ijaz%2C+U.&rft.au=D%27Amore%2C+R.&rft.au=Hall%2C+N.&rft.au=Sloan%2C+W.&rft.au=Quince%2C+C.&rfe_dat=bpr3.included=1;bpr3.tags=Biology"><b>1.</b> Schirmer, M., Ijaz, U., D'Amore, R., Hall, N., Sloan, W., & Quince, C. (2015). Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform <span style="font-style: italic;">Nucleic Acids Research, 43</span> (6) DOI: <a href="http://dx.doi.org/10.1093/nar/gku1341" rev="review">10.1093/nar/gku1341</a></span>
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=BMC+Bioinformatics&rft_id=info%3Adoi%2F10.1186%2Fs12859-016-0976-y&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Illumina+error+profiles%3A+resolving+fine-scale+variation+in+metagenomic+sequencing+data&rft.issn=1471-2105&rft.date=2016&rft.volume=17&rft.issue=1&rft.spage=&rft.epage=&rft.artnum=http%3A%2F%2Fwww.biomedcentral.com%2F1471-2105%2F17%2F125&rft.au=Schirmer%2C+M.&rft.au=D%E2%80%99Amore%2C+R.&rft.au=Ijaz%2C+U.&rft.au=Hall%2C+N.&rft.au=Quince%2C+C.&rfe_dat=bpr3.included=1;bpr3.tags=Biology"><b>2. </b>Schirmer, M., D'Amore, R., Ijaz, U., Hall, N., & Quince, C. (2016). Illumina error profiles: resolving fine-scale variation in metagenomic sequencing data <span style="font-style: italic;">BMC Bioinformatics, 17</span> (1) DOI: <a href="http://dx.doi.org/10.1186/s12859-016-0976-y" rev="review">10.1186/s12859-016-0976-y</a></span>
<br />
<br />
<br /></div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com2tag:blogger.com,1999:blog-8971027081051936989.post-17643489076523878082016-03-20T19:13:00.000-04:002016-03-20T19:13:48.038-04:00My Experience Sharing Protocols in the New "protocols.io" Environment<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://3.bp.blogspot.com/-B82hXBrJzW0/Vu8t-8qj_WI/AAAAAAAABWM/A9mo6xDVJokew4G5lXZauBH_ewBnK_-eg/s1600/HiResCollaborationImage.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="319" src="https://3.bp.blogspot.com/-B82hXBrJzW0/Vu8t-8qj_WI/AAAAAAAABWM/A9mo6xDVJokew4G5lXZauBH_ewBnK_-eg/s320/HiResCollaborationImage.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><a href="http://cornerstonemag.net/wp-content/uploads/2014/01/HiResCollaborationImage.jpg">&lt;Source&gt;</a></td></tr>
</tbody></table>
<div style="text-align: justify;">
Scientists publish methods in their manuscripts, but these summaries can fail to capture the technical details involved in the described processes. Many scientists get around this by making the actual step-by-step protocols freely available to the public. There are a variety of avenues for accomplishing this. Some scientists publish their protocols with their manuscripts, some post them in public archives, and others publish them on their lab websites. There are advantages and disadvantages to these approaches, and most of us are always learning about new and improved resources to facilitate the sharing process. I recently learned about the online resource <a href="https://www.protocols.io/about">protocols.io</a>, which is a surprisingly robust and free resource for sharing experimental protocols.</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
I originally learned about protocols.io when their group <a href="http://www.hurwitzlab.org/verve-net/">VERVE Net (of the Hurwitz lab)</a> graciously transcribed our group's published computational protocols over to the protocols.io environment. <a href="https://www.protocols.io/g/club-grice">Our computational protocols</a> were originally archived on <a href="https://figshare.com/articles/The_Human_Skin_dsDNA_Virome_Topographical_and_Temporal_Diversity_Genetic_Enrichment_and_Dynamic_Associations_with_the_Host_Microbiome/1281248">FigShare</a>. Although the protocols were only recently uploaded to protocols.io, I have been impressed with what it offers.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The most compelling <b>benefit</b> I can see with using protocols.io is that it offers a new degree of <b>visibility</b> to your research. By being part of their environment, your protocols are searched for, and viewed by, a wider scientific audience. This means that more people will learn about your research, use the approaches you developed, and cite your work as a beneficial contribution to the field.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The other <b>benefit</b> I can see from using protocols.io is the user-friendly interface optimized for scientific use. Not only can you search for and view protocols, but you can follow along with them in their app step-by-step, check off completed tasks, use integrated timers, etc. You can also <a href="https://help.github.com/articles/fork-a-repo/">fork</a> protocols (i.e. make your own copy) that you can update to meet your own experimental needs. It definitely feels like it was influenced by common software version control resources such as <a href="https://en.wikipedia.org/wiki/Git_(software)">Git</a>.</div>
<div style="text-align: justify;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.researchtrends.com/wp-content/uploads/2012/09/cartoon1.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://www.researchtrends.com/wp-content/uploads/2012/09/cartoon1.jpg" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><a href="http://www.researchtrends.com/wp-content/uploads/2012/09/cartoon1.jpg">&lt;Source&gt;</a></td></tr>
</tbody></table>
<div style="text-align: justify;">
Now this is all fine, but wouldn't a system like this only be good for "wet lab" protocols and not computational workflows? It certainly seems to have been built for "wet lab" protocols, but it works surprisingly well with computational workflows too. I honestly don't see it replacing source code repositories such as GitHub, but I do think it has a place for publishing widely-used standard operating procedures (SOPs) for various bioinformatics tasks. As an example, the <a href="http://www.mothur.org/wiki/MiSeq_SOP">Mothur SOP</a> for processing 16S rRNA sequencing data is available on the Mothur Wiki as a step-by-step workflow. I could see a reference workflow like this being published in the protocols.io environment.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So in the end, I would suggest checking protocols.io out. It is a cool effort toward promotion of scientific transparency and collaboration, and I think you could benefit from using it.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Did I miss something or fail to elaborate on a point you want to hear more about? As always, I invite you to let me know in the comments below. I would love to hear from you!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com2tag:blogger.com,1999:blog-8971027081051936989.post-33094550288647407142016-03-06T21:46:00.001-05:002016-03-06T21:46:54.955-05:00Helping Both Humans and Dogs: A Recent Study of Canine Atopic Dermatitis<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="https://2.bp.blogspot.com/-h61PlItSwOw/Vtznar-M_pI/AAAAAAAABVw/XD5Hpp2MWHM/s1600/CanineAD.png" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://2.bp.blogspot.com/-h61PlItSwOw/Vtznar-M_pI/AAAAAAAABVw/XD5Hpp2MWHM/s320/CanineAD.png" width="248" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Example of canine Atopic Dermatitis, as seen<br />
in the manuscript we are discussing.</td></tr>
</tbody></table>
<div style="text-align: justify;">
<a href="https://en.wikipedia.org/wiki/Atopic_dermatitis" target="_blank">Atopic dermatitis (AD)</a>, which is also referred to as eczema, is a very common dermatological disease, especially in children. It is estimated that AD affects 10% of children. The disease presents as dry, scaly, itchy skin. Atopic dermatitis can be especially problematic when the patient (often a child) scratches the skin extensively, thereby increasing susceptibility to skin infections. Treatment of the disease ranges from controlling the itchy skin with soothing topical medication to bathing the patient in dilute bleach (the <a href="https://nationaleczema.org/eczema/treatment/alternative-therapies/bleach-baths/" target="_blank">bleach bath technique</a>).</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
In addition to genetics, there is evidence that AD has a microbial component. More specifically, research has linked the disease to <i><a href="https://en.wikipedia.org/wiki/Staphylococcus" target="_blank">Staphylococcus</a></i> bacteria colonization that may play a role in flares and disease control. The bacteria and human genetics are thought to be linked in part by the impaired skin barrier function (e.g. control of water loss, acidity, etc) that results in an altered environment for the bacteria, and especially <i>Staphylococcus</i>, to grow. </div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
What makes this week's study by Bradley <i>et al.</i> particularly interesting compared to existing AD microbiome studies is that they investigate both the altered bacterial communities and the altered barrier function of the diseased skin itself. Their study focused on canine AD, and so was conducted entirely in dogs. Canine AD affects approximately 10% of dogs, and perhaps more importantly, it closely resembles the human disease, thus providing information relevant to human medicine.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The group conducted their study with a cohort of 32 dogs, 15 of which were diagnosed with canine AD. Each dog had various skin sites swabbed for microbiome analysis by 16S rRNA gene sequencing (the standard approach for studying the microbiome). Sampling was done over time, so the temporal dynamics of the disease could also be visualized. Like previous studies, the group found that flaring skin was associated with an increased dominance of <i>Staphylococcus</i> in the microbiome (measured as relative abundance). They also found the diseased skin was associated with altered bacterial diversity, and that antimicrobial therapy restored the microbiome to a healthier state.</div>
<div style="text-align: justify;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://eea.spaceflight.esa.int/attachments/spacestations/ID51c34f26ab2e5.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://eea.spaceflight.esa.int/attachments/spacestations/ID51c34f26ab2e5.jpg" height="212" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Example of a non-invasive device used to measure skin<br />
barrier function.</td></tr>
</tbody></table>
<div style="text-align: justify;">
The study really got cool when they evaluated the barrier function of the diseased skin and linked that data to their microbiome data. In the end, they found some links between microbiome diversity and some aspects of impaired barrier function. I emphasize <i>some</i> because the correlations were only between certain microbiome and barrier signatures. Overall this may suggest the AD microbiome signatures are the result of an altered skin environment due to impaired barrier function. Perhaps the presence of the bacteria is feeding into the progression of the skin flare? There are a lot of interesting directions in which this research could go, and it will be exciting to watch where the group takes it next.</div>
<br />
In the end, this is a cool study and it is worth reading. The group provides valuable insight into a common disease both for humans and dogs. Moving forward, I would be very interested in seeing the group look more into the links between skin barrier function and <i>Staphylococcus</i> colonization. This might include a heavier immunological study that further investigates the molecular response of the AD skin to the microbes. It will be interesting to see what they come up with.<br />
<br />
So to totally wrap things up, I want to thank you for reading. This blog is not possible without you the reader. I would also love to hear from you about any questions, comments, or concerns. Feel free to leave a comment below, email me, or Tweet me.<br />
<br />
<br />
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
<br />
<br />
<h3>
Works Cited</h3>
<br />
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Journal+of+Investigative+Dermatology&rft_id=info%3Adoi%2F10.1016%2Fj.jid.2016.01.023&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Longitudinal+evaluation+of+the+skin+microbiome+and+association+with+microenvironment+and+treatment+in+canine+atopic+dermatitis&rft.issn=0022202X&rft.date=2016&rft.volume=&rft.issue=&rft.spage=&rft.epage=&rft.artnum=http%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0022202X16004553&rft.au=Bradley%2C+C.&rft.au=Morris%2C+D.&rft.au=Rankin%2C+S.&rft.au=Cain%2C+C.&rft.au=Misic%2C+A.&rft.au=Houser%2C+T.&rft.au=Mauldin%2C+E.&rft.au=Grice%2C+E.&rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CMedicine%2CHealth%2CEcology+%2F+Conservation">Bradley, C., Morris, D., Rankin, S., Cain, C., Misic, A., Houser, T., Mauldin, E., & Grice, E. (2016). Longitudinal evaluation of the skin microbiome and association with microenvironment and treatment in canine atopic dermatitis <span style="font-style: italic;">Journal of Investigative Dermatology</span> DOI: <a href="http://dx.doi.org/10.1016/j.jid.2016.01.023" rev="review">10.1016/j.jid.2016.01.023</a></span>
<br />
<br />
<br />Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com23tag:blogger.com,1999:blog-8971027081051936989.post-23822692390281367862016-02-14T16:00:00.003-05:002016-02-14T16:00:58.008-05:00Methods Matter: Getting Started with the Skin Microbiome<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://assets.illumina.com/content/dam/illumina-marketing/images/techniques/web-graphic-intro-dna-sequencing.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://assets.illumina.com/content/dam/illumina-marketing/images/techniques/web-graphic-intro-dna-sequencing.jpg" height="206" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Your choice of sequencing approach matters. Think<br />
about your goals and the methodological caveats<br />
before starting your experiments.</td></tr>
</tbody></table>
<div style="text-align: justify;">
The field of microbiome research has been hugely popular in the past few years. It has forced us to rethink our approaches to various medical practices, and has captured the imaginations of both amateur and professional scientists. With this popularity has come an influx of scientists trying to incorporate the microbiome into their own research. It is of course great that people want to get into the field, but unfortunately it is deceptively difficult for newcomers who are not always aware of how best to get started. This has led to the execution of poorly designed studies that could have been improved by more methodological resources in the literature. To this end, my colleague (and lab mate) led a research project to evaluate the differences between sequencing methods of the skin microbiome, a consideration that is often overlooked by newcomers to the field. This week I want to briefly hit the highlights of the paper and suggest that you read it if you are interested in starting any skin microbiome work.</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
The study was led by <a href="http://www.med.upenn.edu/gricelab/people/" target="_blank">Jacquelyn Meisel in Elizabeth Grice's laboratory</a>, and was published in the <a href="http://www.jidonline.org/" target="_blank">Journal of Investigative Dermatology</a> (the premier dermatology research journal). In their study, Meisel <i>et al</i> evaluated the effects of <b>three</b> different sequencing methods for studying the skin microbiome. </div>
<ol>
<li style="text-align: justify;"><b>Whole metagenome shotgun (WMS) sequencing</b>, which means the entire genomes (or genome fragments called contigs) of the skin bacteria were sequenced instead of a specific region (e.g. 16S rRNA). This method is costly and more difficult to analyze, but can provide answers to many questions regarding the genomic structure of the communities that cannot be answered using techniques involving marker genes.</li>
<li style="text-align: justify;"><b>16S rRNA V4 region gene sequencing</b>, which means the fourth variable region (V4) of the 16S rRNA gene is sequenced from all bacteria within the community and used to provide taxonomic/phylogenetic information. Variable regions are used because the high throughput sequencing technologies cannot span the entire length of the gene, and the variable regions allow for the greatest differentiation between different bacteria (if we used a conserved region, they would all look the same). This method is great because it is cheaper, provides strong taxonomic/phylogenetic information about the community, and is sufficient to answer many research questions. It does not, however, provide sequences for entire genomes.</li>
<li style="text-align: justify;"><b>16S rRNA V1-3 region gene sequencing</b>, which is the same approach as V4, although it is covering variable regions 1-3 instead of four. Different variable regions provide different resolution between members of the community because they are differentially variable between groups of bacteria. This region in particular is longer than V4, which means it can provide more information at the expense of being more difficult to sequence.</li>
</ol>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://image.slidesharecdn.com/molecularmethods-bioaerosols-peccia-140324090829-phpapp01/95/dnabased-methods-for-bioaerosol-analysis-18-638.jpg?cb=1395652387" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://image.slidesharecdn.com/molecularmethods-bioaerosols-peccia-140324090829-phpapp01/95/dnabased-methods-for-bioaerosol-analysis-18-638.jpg?cb=1395652387" height="308" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Illustration of the variable regions within the 16S rRNA<br />
gene. The valleys are regions of low conservation, and<br />
are labeled as variable regions 1-9. <a href="http://image.slidesharecdn.com/molecularmethods-bioaerosols-peccia-140324090829-phpapp01/95/dnabased-methods-for-bioaerosol-analysis-18-638.jpg?cb=1395652387" target="_blank">&lt;Source&gt;</a></td></tr>
</tbody></table>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So what did the group find? The highlight was that the V4 region poorly characterized the skin community, while the V1-3 and metagenomic approaches were much more accurate (accuracy was determined by sequencing a known community and comparing the results to the known composition). The most striking limitation to sequencing the V4 region was its inability to capture <i>Propionibacteria</i>.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The usual argument for choosing metagenomic approaches over 16S sequencing is that the metagenomic data allows for an understanding of the functional potential of the community. Meisel <i>et al.</i> found that the functional predictions made using 16S data were similar to those found in the metagenome samples, meaning you are getting comparable results but paying considerably more for the metagenomic data.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The group also evaluated the effects of these methods on the resulting diversity calculated for the communities. They found that the resulting diversity was in fact impacted by the sequencing approach, highlighting a danger in comparing results from different studies that used different sequencing methods.</div>
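<div style="text-align: justify;">
If you have not calculated diversity before, here is a small sketch of one common metric, the Shannon index, H = -sum(p_i * ln p_i), where p_i is the relative abundance of taxon i. I am not claiming this is the exact metric or code used in the paper, and the counts below are invented purely to show how the number behaves.</div>

```shell
# Sketch: Shannon diversity from "taxon<TAB>count" lines on standard input.
# shannon_diversity is a hypothetical helper; the counts below are made up.
shannon_diversity() {
  awk -F '\t' '
    $2 > 0 { count[$1] += $2; total += $2 }
    END {
      for (taxon in count) {
        p = count[taxon] / total      # relative abundance of this taxon
        h -= p * log(p)               # log() is the natural logarithm in awk
      }
      printf "%.3f\n", h
    }
  '
}

# An even community of four taxa versus one dominated by a single taxon.
even=$(printf 'Staphylococcus\t25\nPropionibacterium\t25\nCorynebacterium\t25\nStreptococcus\t25\n' | shannon_diversity)
skewed=$(printf 'Staphylococcus\t97\nPropionibacterium\t1\nCorynebacterium\t1\nStreptococcus\t1\n' | shannon_diversity)

echo "even community:   H = $even"
echo "skewed community: H = $skewed"
```

<div style="text-align: justify;">
The even community scores much higher than the skewed one. If a sequencing method fails to detect a major member of the community, the relative abundances of everything else shift, and the diversity estimate shifts with them. That is the intuition behind being careful when comparing diversity across studies that used different methods.</div>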
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Now I know I have an obvious bias since I was a part of this research, but Jackie (Jacquelyn) led an excellent study that provides an important resource to the field. If you are curious about the importance of sequencing methods, or if you yourself want to incorporate this type of study into your research, I suggest checking this paper out. It can help you to interpret other skin microbiome studies, and could prevent you from making costly mistakes in your own research.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
As always, I would love to hear your questions, comments, and concerns in the comment section below, or through email/Twitter. You can find my information to the right.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br />
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
<br />
<br />
<h3>
Works Cited</h3>
<br />
<br />
<br /></div>
<div style="text-align: justify;">
<br /></div>
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=The+Journal+of+investigative+dermatology&rft_id=info%3Apmid%2F26829039&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Skin+microbiome+surveys+are+strongly+influenced+by+experimental+design.&rft.issn=0022-202X&rft.date=2016&rft.volume=&rft.issue=&rft.spage=&rft.epage=&rft.artnum=&rft.au=Meisel+JS&rft.au=Hannigan+GD&rft.au=Tyldsley+AS&rft.au=SanMiguel+AJ&rft.au=Hodkinson+BP&rft.au=Zheng+Q&rft.au=Grice+EA&rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CMedicine%2CHealth%2CEcology+%2F+Conservation">Meisel JS, Hannigan GD, Tyldsley AS, SanMiguel AJ, Hodkinson BP, Zheng Q, & Grice EA (2016). Skin microbiome surveys are strongly influenced by experimental design. <span style="font-style: italic;">The Journal of investigative dermatology</span> PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/26829039" rev="review">26829039</a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com26tag:blogger.com,1999:blog-8971027081051936989.post-90491559539376897692016-02-07T21:20:00.001-05:002016-02-07T21:20:54.916-05:00The Open Metagenome Toolkit Project<div style="text-align: justify;">
<a href="http://ccbgm.illinois.edu/files/2014/10/DNA-strand-wity-binary-code_web.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://ccbgm.illinois.edu/files/2014/10/DNA-strand-wity-binary-code_web.jpg" height="266" width="320" /></a>Almost two years ago I started collecting some scripts that I wrote for my own microbial metagenomic analyses. These are relatively simple <a href="https://www.perl.org/" target="_blank">Perl</a> and <a href="https://www.python.org/" target="_blank">Python</a> scripts that handle common tasks required when studying bacterial or viral metagenomes. This collection of scripts is called the <a href="https://github.com/Microbiology/OpenMetagenomeToolkit" target="_blank">Open Metagenome Toolkit</a>. I recently added a few more scripts that I think are helpful, including a script to translate nucleotide sequences and a script to calculate the average length of reads. This week our post is about the Open Metagenome Toolkit because it is a cool opportunity for collaborative programming in our microbiome community.</div>
<div>
<div style="text-align: justify;">
</div>
<a name='more'></a></div>
<div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Of course I hope you will use this toolkit in your own research because I think it will make your life a little easier. But even more so, I hope you will head over to <a href="https://github.com/Microbiology/OpenMetagenomeToolkit" target="_blank">the GitHub repository</a> and show off some of your coding skills by contributing new scripts or improving the scripts that are already there. If you are just getting started with coding, you can use this as a learning opportunity by adding to the existing scripts and getting some feedback.</div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
The point of this project is to facilitate collaboration. With that comes proper credit to every contributor. Therefore the least we can do is include the names of the contributors on the project homepage, along with a link to their homepage. So go ahead and contribute, and actually be a part of the project no matter what skill level you are at.</div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
In addition to its focus on collaboration, this toolkit focuses on portability. It relies only on Perl and Python, which are so common that they come pre-installed on many operating systems. The scripts do not require any additional programs or modules, not even BioPerl or Biopython. This is a major <b>strength</b> because the user can run the scripts without installing any dependencies. Writing the scripts this way is also a nice exercise in programming, and I think it offers a high degree of control to the programmer.</div>
</div>
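<div style="text-align: justify;">
To give a feel for what a dependency-free script can look like, here is a minimal Python sketch of the two tasks mentioned earlier (translating nucleotide sequences and averaging read lengths) using only the standard library. This is not the toolkit's actual code: the function names, the frame-1-only translation, and the example reads are all assumptions made for illustration.</div>

```python
"""Dependency-free sketch of two common metagenome tasks (standard library only)."""
import io

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def translate(sequence):
    """Translate a nucleotide sequence in frame 1; unknown codons become 'X'."""
    sequence = sequence.upper().replace("U", "T")
    usable = len(sequence) - len(sequence) % 3  # drop a trailing partial codon
    return "".join(
        CODON_TABLE.get(sequence[i:i + 3], "X") for i in range(0, usable, 3)
    )


def average_length(records):
    """Return the mean sequence length, or 0.0 when there are no records."""
    lengths = [len(seq) for _, seq in records]
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    # Made-up reads, used only to exercise the functions (hypothetical data).
    example = io.StringIO(">read1\nATGGCC\nTAA\n>read2\nATGAAA\n")
    records = list(read_fasta(example))
    for name, seq in records:
        print(name, translate(seq))
    print("average length:", average_length(records))
```

<div style="text-align: justify;">
Building the codon table from two short strings keeps the script self-contained, which is the trade-off the toolkit makes: a little more code to write in exchange for nothing to install.</div>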
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
So now that you have read the intro, go over and <a href="https://github.com/Microbiology/OpenMetagenomeToolkit" target="_blank">check out the toolkit</a>. It is easy to download, easy to use, and easy to get involved with. If you have any ideas for functions that should be added, go ahead and add them in <a href="https://github.com/Microbiology/OpenMetagenomeToolkit/issues" target="_blank">the issue section</a>. Otherwise you can directly add to the scripts.</div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
Any questions or comments? Let me know in the comments below, on <a href="https://twitter.com/iprophage" target="_blank">Twitter</a>, or by email. I would love to hear from you!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
</div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com0tag:blogger.com,1999:blog-8971027081051936989.post-89663784848519481452016-01-24T20:46:00.000-05:002016-01-24T20:46:08.142-05:00Recent Study Reveals Role for Bacterial Viruses in Microbiome Evolution<div style="text-align: right;">
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://images.encyclopedia.com/utility/image.aspx?id=5893946&imagetype=Manual&height=300&width=300" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://images.encyclopedia.com/utility/image.aspx?id=5893946&imagetype=Manual&height=300&width=300" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The microbiome is a complex community of bacteria,<br />
viruses, and other microbes.</td></tr>
</tbody></table>
<div style="text-align: justify;">
Microbial communities are fierce battlegrounds between bacteria and other microbes competing for limited resources. One method some bacteria use to kill their competitors is the production of <a href="https://en.wikipedia.org/wiki/Bacteriocin" target="_blank">bacteriocins</a>. Bacteriocins are protein toxins produced by bacteria to limit the growth of related bacteria, thereby providing a competitive advantage to the bacteriocin-producing bacteria. This dynamic is important to our health because it can impact bacterial infections and overall microbiome composition. The group of Nedialkova <i>et al</i> recently added a whole new level of insight into bacteriocins and microbial ecology by linking bacteriocin production to the presence of <a href="https://en.wikipedia.org/wiki/Bacteriophage" target="_blank">bacteriophages</a> (bacterial viruses).</div>
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
Overall this was a pretty straightforward study and a nice read. The research group recognized that <i>Salmonella enterica</i> genomes contain a myriad of prophages, which are virus genomes that are integrated into the bacterial genome and are waiting to enter an infectious cycle when the bacterium is stressed (a process called <a href="http://www.archaealviruses.org/terms/induction.html" target="_blank">phage induction</a>). This is medically relevant because many antibiotics can induce bacteriophages.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
The group provided evidence that phages play an important role in the release of colicin (the <i>Salmonella</i> bacteriocin) by removing the prophages from the genomes of cultured bacteria and observing that the bacteria became less able to release their bacteriocin. They attempted to pinpoint the phage genes involved in bacteriocin release from the bacteria, but this ultimately served to highlight the complex cell signaling involved in bacteriocin production and release. The group wrapped up their study by showing that, by affecting colicin "use", the phages impact the evolution of <i>S. enterica</i> by altering its competitive advantage. This was tested by competing the <i>Salmonella</i> against <i>E. coli</i> bacteria that are commonly found in the human gut.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
I really like this paper because it provides even more evidence on how important phages are for bacterial functionality and evolution. This role for phages is relevant to isolated bacterial systems, but is also very important for the human microbiome. Phages are important for the structure and function of the human microbiome, and thereby impact human health in a big way. Overall this really shows how complex the human microbiome is, and how important it is to study the phages in these communities, instead of focusing only on the bacteria.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
So now that we have previewed the paper, I suggest looking it up and reading the real thing. It is a well written and straightforward paper that is worth reading. And finally, if you noticed I left anything out or missed a point you think is worth bringing up, shoot me a comment below. You should also always feel free to reach out on Twitter or by email.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<span style="float: left; padding: 5px;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
<br />
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<h3>
Works Cited</h3>
<br />
<br />
<br />
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Environmental+microbiology&rft_id=info%3Apmid%2F26439675&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Temperate+phages+promote+colicin-dependent+fitness+of+Salmonella+enterica+serovar+Typhimurium.&rft.issn=1462-2912&rft.date=2015&rft.volume=&rft.issue=&rft.spage=&rft.epage=&rft.artnum=&rft.au=Nedialkova+LP&rft.au=Sidstedt+M&rft.au=Koeppel+MB&rft.au=Spriewald+S&rft.au=Ring+D&rft.au=Gerlach+RG&rft.au=Bossi+L&rft.au=Stecher+B&rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CMedicine%2CHealth">Nedialkova LP, Sidstedt M, Koeppel MB, Spriewald S, Ring D, Gerlach RG, Bossi L, & Stecher B (2015). Temperate phages promote colicin-dependent fitness of Salmonella enterica serovar Typhimurium. <span style="font-style: italic;">Environmental microbiology</span> PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/26439675" rev="review">26439675</a></span><br />
<br /></div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com0tag:blogger.com,1999:blog-8971027081051936989.post-28179059045490676912016-01-17T22:11:00.000-05:002016-01-17T22:11:06.507-05:00A Primer on Linear Regression and its Associated Misconceptions<br />
<div style="text-align: justify;">
</div>
<div style="text-align: justify;">
<a href="http://media.al.com/breaking/photo/11128652-large.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em; text-align: justify;"><img border="0" src="http://media.al.com/breaking/photo/11128652-large.jpg" height="228" width="320" /></a>Welcome to the new year and the first Prophage blog post for 2016! This is already looking like it will be a great year for science and blogging. But enough with the pleasantries, let's dive into some science.</div>
<br />
<div style="text-align: justify;">
</div>
<a name='more'></a><br />
<div style="text-align: justify;">
I wanted to start the year off with a post about math. I know, I know, math is an intimidating way to start the year, but don't run off yet! I swear that this will be painless and we will even learn something new! We are going to keep things simple and focus on an elegant paper that presents some misconceptions about a complicated topic. This topic is multiple linear regression. My goal is to introduce you to the topic of linear regression and prepare you to read this week's paper.</div>
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
What is Linear Regression & When Should I Use It?</h2>
<div style="text-align: justify;">
Before we talk about multiple linear regression, let's cover simple linear regression. In its most simplified form, linear regression is a method for modeling the relationship between an independent (i.e. explanatory) variable and a dependent variable. This is often shown as a scatter plot with the dependent variable on the y axis, the independent variable on the x axis, and the linear regression model drawn as a line (see figure below).</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
We commonly use this approach when we want to predict a dependent value given an independent value. An example of this (in the plot below) is tree age vs diameter. We know that tree diameter depends on age, but what if we want to predict the diameter (dependent variable) of a tree at a given age (independent explanatory variable)? We can perform a linear regression to create a simple predictive model (shown as the line) to tell us what the diameter is likely to be at a given age. In our example, at age 30 it looks like the tree diameter will be 5 inches. The slope of the line is a coefficient that represents the relationship between the explanatory (age) and dependent (diameter) variables.</div>
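<div style="text-align: justify;">
If you want to see the mechanics, here is a minimal Python sketch of fitting that line by ordinary least squares and using it for prediction. The tree measurements are invented for illustration (they are not the data in the figure), and the function names are my own.</div>

```python
def fit_line(x_values, y_values):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(x_values)
    if n != len(y_values) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted value of the dependent variable at x."""
    return intercept + slope * x


if __name__ == "__main__":
    # Invented tree measurements: age in years, diameter in inches.
    ages = [10, 20, 30, 40, 50]
    diameters = [2.0, 3.4, 5.1, 6.3, 8.2]
    slope, intercept = fit_line(ages, diameters)
    print(f"slope={slope:.3f} intercept={intercept:.3f}")
    print(f"predicted diameter at age 30: {predict(slope, intercept, 30):.2f}")
```

<div style="text-align: justify;">
The slope returned by the fit is the coefficient described above: the expected change in diameter for each additional year of age.</div>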
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
What is Multiple Linear Regression & When Should I Use It?</h2>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: justify;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.physics.csbsju.edu/stats/oak.dbh.age.gif" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://www.physics.csbsju.edu/stats/oak.dbh.age.gif" height="256" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A simple example of linear regression modeling.<br />
Here we are modeling the relationship between<br />
tree age and diameter. <a href="http://www.physics.csbsju.edu/stats/least_squares.html" target="_blank">SOURCE</a></td></tr>
</tbody></table>
<div style="text-align: justify;">
Now what if we want a better model that includes more than one explanatory variable? For instance, what if we want to predict tree diameter given its age and the average summer temperature of the climate the tree lives in? We might expect a tree in a colder climate to have a smaller diameter than a tree in a warm climate. Once we start considering more than one explanatory variable, we are doing a multiple linear regression. It's that simple. Much like in a simple linear regression, each explanatory variable (age and temperature) has its own coefficient that represents the relationship between that explanatory variable and the dependent variable. Think of this relationship coefficient as the slope for each explanatory variable.</div>
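<div style="text-align: justify;">
Written out, the two-variable model described above would look something like the following. The symbols are chosen here for illustration and are not taken from the paper: the betas are the coefficients and epsilon is the leftover error.</div>

```latex
\[
\text{diameter} = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{temperature} + \varepsilon
\]
```

<div style="text-align: justify;">
Here \(\beta_0\) is the intercept, \(\beta_1\) is the slope for age, and \(\beta_2\) is the slope for temperature.</div>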
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
What is the Misconception?</h2>
<div style="text-align: justify;">
As Frasier expertly points out, there is a lot of confusion around interpreting these relationship coefficients, particularly when the model also includes an interaction term between the explanatory variables. People often interpret the coefficients as the independent relationships between each explanatory variable (age and temperature) and the dependent variable (diameter) across the full range of values of the other explanatory variable. This is unfortunately not true. In such a model, each coefficient only represents the relationship (i.e. slope) between its associated independent variable and the dependent variable when the other independent variable is zero. So to use our example, the coefficient associated with age only represents the relationship (slope) between age and diameter when the temperature is zero. Frasier outlines why this is actually a nontrivial point that has likely led to many erroneous scientific conclusions. His explanation is incredibly well done, so I will direct you to follow up on this post by reading the paper and seeing his examples for why this distinction is important.</div>
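<div style="text-align: justify;">
This point is easiest to see in a model that includes an interaction term (age multiplied by temperature). The short Python sketch below uses hypothetical coefficients that I made up for illustration, not values from the paper, and shows that the slope for age changes with temperature and only equals the age coefficient when temperature is zero.</div>

```python
def predicted_diameter(age, temperature, coef):
    """Diameter predicted by a model with an age-by-temperature interaction."""
    return (
        coef["intercept"]
        + coef["age"] * age
        + coef["temperature"] * temperature
        + coef["age_x_temperature"] * age * temperature
    )


def age_slope(temperature, coef):
    """Change in predicted diameter per extra year of age at a fixed temperature."""
    return coef["age"] + coef["age_x_temperature"] * temperature


if __name__ == "__main__":
    # Hypothetical coefficients, invented purely for illustration.
    coef = {
        "intercept": 1.0,
        "age": 0.05,
        "temperature": 0.02,
        "age_x_temperature": 0.004,
    }
    for temperature in (0, 10, 25):
        print(f"temperature={temperature}: age slope={age_slope(temperature, coef):.3f}")
```

<div style="text-align: justify;">
At a temperature of zero the age slope is just the age coefficient. At any other temperature the interaction term shifts it, which is why reading the age coefficient as "the effect of age" can be misleading.</div>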
<div style="text-align: justify;">
<br /></div>
<h2 style="text-align: justify;">
Wrapping It Up</h2>
<div style="text-align: justify;">
I know this was a math-heavy post, but I hope you enjoyed it and even learned a little. After reading these brief paragraphs, you should have a general feel for what linear regression is and why it is useful. This will prepare you to dive into the Frasier paper, which is absolutely worth a read. And of course, I want to end by pointing out that this is a complicated topic that you can read entire books about. We did not even scratch the surface in this post, but at least we took the first step toward a better understanding of math and how it can be used for prediction.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
Questions, comments, or concerns? Want to discuss any of these points? Add a comment below. I would love to hear what you think.</div>
<div style="text-align: justify;">
<br />
<br /></div>
<div style="text-align: justify;">
<h2>
Works Cited</h2>
</div>
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Molecular+ecology+resources&rft_id=info%3Apmid%2F26650184&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=A+note+on+the+use+of+multiple+linear+regression+in+molecular+ecology.&rft.issn=1755-098X&rft.date=2015&rft.volume=&rft.issue=&rft.spage=&rft.epage=&rft.artnum=&rft.au=Frasier+TR&rfe_dat=bpr3.included=1;bpr3.tags=Biology%2CEcology+%2F+Conservation">Frasier TR (2015). A note on the use of multiple linear regression in molecular ecology. <span style="font-style: italic;">Molecular ecology resources</span> PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/26650184" rev="review">26650184</a></span>
<br />
<br />
<br />
<br />Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com6tag:blogger.com,1999:blog-8971027081051936989.post-39813017986744513892015-12-27T23:19:00.000-05:002015-12-27T23:19:53.700-05:00Understanding How Silent Phages Can Prevent Detection of Potentially Deadly Food Contaminants<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: justify;"><tbody>
<tr><td style="text-align: center;"><a href="http://media.tumblr.com/tumblr_lthp9u6Ib31qkc18n.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://media.tumblr.com/tumblr_lthp9u6Ib31qkc18n.jpg" height="230" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Many bacteria are detected by culturing, or growing them<br />
out on plates of artificial media.</td></tr>
</tbody></table>
<div style="text-align: justify;">
Contamination of food with bacteria is a huge issue that can sometimes cause life-threatening illness. The bacterial culprits can include <i><a href="https://en.wikipedia.org/wiki/Escherichia_coli" target="_blank">E. coli</a></i>, as well as <i><a href="https://en.wikipedia.org/wiki/Listeria_monocytogenes#Epidemiology" target="_blank">Listeria monocytogenes</a></i>. <i>L. monocytogenes</i> is a potent bacterium that can very effectively infect its human host. This bacterium is especially problematic for pregnant women, whose newborn children can develop meningitis that can lead to complications as severe as death. Because this is a serious infectious agent, there have been a lot of quality control efforts towards detecting this bacterium in food before it is sold. In this week's post, we are going to discuss a relatively recent study that highlights the role of phages in these efforts. The study does this by showing that nutrients used in the tests can activate silent phage infections and prevent bacterial detection.</div>
<div>
<br />
<a name='more'></a></div>
<div>
<div style="text-align: justify;">
<i>L. monocytogenes</i> and other bacteria are often detected using <a href="https://en.wikipedia.org/wiki/Microbiological_culture" target="_blank">culturing techniques</a>. This means that the bacteria are actually grown on special media in a petri dish. Simply put, if we streak our sample across the plate and see the dangerous bacteria growing, we conclude that the bacterium is present and can potentially cause an infection. Like most tests, these culture techniques are not 100% accurate and can give erroneous results. Of these incorrect results, false negatives (failure to detect bacteria that are present) are of particular concern because they allow the contaminated food to be sold.</div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
In a recent report, Lemaître <i>et al</i> investigated a potential cause of false negative results by evaluating the role of phages in the culturing process. <a href="https://en.wikipedia.org/wiki/Prophage" target="_blank">We know</a> that bacteria can be silently infected by phages that can switch to an active infection and kill the host bacterium when it is stressed. These stresses can include nutrient conditions. In their paper, Lemaître <i>et al</i> found that this mechanism may be responsible for false negative results in <i>L. monocytogenes</i> tests.</div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: justify;"><tbody>
<tr><td style="text-align: center;"><a href="http://aem.asm.org/content/81/6/2117/F1.large.jpg" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="http://aem.asm.org/content/81/6/2117/F1.large.jpg" height="241" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Phages and tails that were detected in the study.</td></tr>
</tbody></table>
<div>
<div style="text-align: justify;">
The study is overall fairly straightforward and a good read. The group tested a variety of components from standard test media that are widely used in the detection of <i>L. monocytogenes</i> contaminants. They found that many components are in fact capable of inducing bacteriophages, which means the compounds in the media can kill the bacteria by activating silent phage infections. When the bacteria are killed through phage induction, the contaminating bacteria are not detectable by the culture test, and the final result incorrectly indicates a lack of <i>L. monocytogenes</i>. In the end, this highlights the importance of understanding how phages impact the quality control tests that are being used, and also suggests that a different medium should be considered for quality control detection of <i>L. monocytogenes</i>.</div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
As I mentioned above, this is a fairly straightforward read and I would suggest checking it out, especially if you are interested in the details. Any questions, comments, or concerns? Let me know in the comments below, or feel free to shoot me an email anytime. And remember to always consider the phage component.</div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<span style="float: left; padding: 5px; text-align: justify;"><a href="http://www.researchblogging.org/"><img alt="ResearchBlogging.org" src="http://www.researchblogging.org/public/citation_icons/rb2_large_gray.png" style="border: 0;" /></a></span>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<div style="text-align: justify;">
<br /></div>
</div>
<div>
<h3 style="text-align: justify;">
Works Cited</h3>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.jtitle=Applied+and+environmental+microbiology&rft_id=info%3Apmid%2F25595760&rfr_id=info%3Asid%2Fresearchblogging.org&rft.atitle=Listeria+phage+and+phage+tail+induction+triggered+by+components+of+bacterial+growth+media+%28phosphate%2C+LiCl%2C+nalidixic+acid%2C+and+acriflavine%29.&rft.issn=0099-2240&rft.date=2015&rft.volume=81&rft.issue=6&rft.spage=2117&rft.epage=24&rft.artnum=&rft.au=Lema%C3%AEtre+JP&rft.au=Duroux+A&rft.au=Pimpie+R&rft.au=Duez+JM&rft.au=Milat+ML&rfe_dat=bpr3.included=1;bpr3.tags=Biology">Lemaître JP, Duroux A, Pimpie R, Duez JM, & Milat ML (2015). Listeria phage and phage tail induction triggered by components of bacterial growth media (phosphate, LiCl, nalidixic acid, and acriflavine). <span style="font-style: italic;">Applied and environmental microbiology, 81</span> (6), 2117-24 PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/25595760" rev="review">25595760</a></span>
</div>
<div style="text-align: justify;">
<br /></div>
<br /></div>
Anonymoushttp://www.blogger.com/profile/00538876632791983007noreply@blogger.com17