Data and Software


Twin trees © Julien Y. Dutheil.

Scripts, data, and supplementary material of the group are available under Open Access on our GitLab repository.

Software

The Bio++ libraries

Bio++: A C++ library for sequence analysis, phylogenetics, molecular evolution and population genetics. The goal of Bio++ is to provide "bricks" for building efficient software for molecular evolution (including phylogenetics, population genetics, genomics, etc.). We therefore conduct research on both the design of the code (object orientation, ontology) and its implementation (efficient algorithms and data structures).

The SGED Tools

The Site/Group Extended Data format and tools, to handle sequence/alignment/3D structure annotations. The SGED format is natively supported by the Bio++ libraries and derivative programs. The package provides standalone Python tools to manipulate SGED files and perform statistical analyses such as (conditional) randomization.

iSMC

The iSMC framework. iSMC is developed and maintained by Gustavo Barroso.

CoalHMM

The CoalHMM software implements the CoalHMM model described in Hobolth et al (2007), Dutheil et al (2009), and Mailund et al (2011).

MafFilter

MafFilter is a program allowing the design of advanced filtering and processing pipelines for genome alignments.

CoMap

CoMap is a package dedicated to coevolution analysis by Cosubstitution Mapping. It contains the comap program and the MICA program (Mutual Information Coevolution Analysis), which implements several MI-based methods from the literature (Dutheil 2012, Briefings in Bioinformatics). CoMap was developed using the Bio++ libraries.
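
As a rough illustration of the quantity that MI-based coevolution methods build on, the sketch below computes the plain mutual information between two alignment columns. It is not taken from MICA and does not reproduce any of its corrections or significance tests; the function name `column_mutual_information` and the toy columns are hypothetical, and gaps or ambiguous characters are treated as ordinary states.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Return the mutual information (in bits) between two alignment columns.

    Each column is a sequence of states, one per aligned sequence,
    given in the same sequence order for both columns.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")

    # Observed counts of single states and of state pairs.
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p_ab * log2(p_ab / (p_a * p_b)), written with raw counts.
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy columns (made up for illustration): the first pair covaries
# perfectly, the second pair is independent.
covarying = column_mutual_information("AAGG", "CCTT")
independent = column_mutual_information("AAGG", "CTCT")
```

Raw mutual information of this kind is sensitive to column entropy and to the shared phylogenetic history of the sequences, which is one reason several corrected MI-based methods exist in the literature.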

PhySamp

PhySamp is a package dedicated to phylogenetic sampling, that is, the filtering of sequence alignments and corresponding phylogenetic trees. It contains two programs: bppAlnOptim, which optimizes the size of an alignment while minimizing the occurrence of missing data, and bppPhySamp, which samples an alignment to minimize redundancy based on a phylogeny.

TestNH

The TestNH package contains programs for testing and fitting non-homogeneous models of sequence evolution. The testnh program is described in Dutheil and Boussau (2008), the mapnh program in Romiguier et al (2012), and the partnh program in Dutheil et al (2012).

ConTest

ConTest is another program that uses substitution mapping procedures, in this case to detect positions in a molecule under biochemical constraint (Dutheil 2007).