Background: Improvements in sequencing technology now allow easy acquisition of large datasets; however, analyzing these data for phylogenetics can be challenging. We have developed a novel method to rapidly obtain homologous genomic data for phylogenetics directly from next-generation sequencing reads without the use of a reference genome. This software, called SISRS, avoids the time-consuming steps of de novo whole genome assembly, multiple genome alignment, and annotation.

Results: In simulations, SISRS is able to identify large numbers of loci containing variable sites with phylogenetic signal. For genomic data from apes, SISRS identified thousands of variable sites, from which we produced an accurate phylogeny. Finally, we used SISRS to identify phylogenetic markers that we used to estimate the phylogeny of placental mammals. We recovered eight phylogenies that resolved the basal relationships among mammals using datasets with different levels of missing data. The three alternate resolutions of the basal relationships are consistent with the major hypotheses for the relationships among mammals, all of which have been supported previously by different molecular datasets.

Conclusions: SISRS has the potential to transform phylogenetic research. This method eliminates the need for expensive marker development in many studies by using whole genome shotgun sequence data directly. SISRS is open source and freely available at https://github.com/rachelss/SISRS/releases.