Bioinformatics

In the Allen lab we work to make our code freely available. We actively support and update a few pipelines and software packages, listed below. We principally aim to develop software that helps answer questions in biodiversity and solve problems in biology.

Label_Reconciliations: This code reconciles three different transcripts into one ‘best guess’ transcript, relying mostly on majority rule and fuzzy matching. It is most commonly used on volunteer-generated transcripts of museum specimens from Notes from Nature.
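To make the idea of majority rule plus fuzzy matching concrete, here is a minimal sketch in Python. It is not the Label_Reconciliations code itself: the function names, the 0.8 threshold, and the use of the standard library's `difflib` for fuzzy matching are all assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def normalize(text):
    """Collapse whitespace and lowercase so trivial differences do not count."""
    return " ".join(text.split()).lower()


def similarity(a, b):
    """Fuzzy similarity between two strings, from 0.0 (different) to 1.0 (same)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def reconcile(transcripts, fuzzy_threshold=0.8):
    """Return (best_guess, reason) for a list of transcripts of one field.

    Hypothetical helper: the threshold and the reason strings are illustrative.
    """
    filled = [t.strip() for t in transcripts if t and t.strip()]
    if not filled:
        return "", "all blank"
    if len(filled) == 1:
        return filled[0], "only one transcript"

    # Step 1: majority rule on the normalized text.
    counts = Counter(normalize(t) for t in filled)
    top, votes = counts.most_common(1)[0]
    if votes > 1:
        # Keep the first original spelling that matches the winning form.
        winner = next(t for t in filled if normalize(t) == top)
        return winner, "majority"

    # Step 2: no exact agreement, so pick the transcript that is, on
    # average, most similar to all of the others.
    def mean_similarity(i):
        others = [t for j, t in enumerate(filled) if j != i]
        return sum(similarity(filled[i], o) for o in others) / len(others)

    best = max(range(len(filled)), key=mean_similarity)
    agreed = mean_similarity(best) >= fuzzy_threshold
    return filled[best], "fuzzy match" if agreed else "no agreement"


print(reconcile(["Quercus alba", "quercus  alba ", "Quercus rubra"]))
print(reconcile(["Smith, J. 1902", "Smith J. 1902", "Smith, J 1902"]))
```

In this sketch, exact agreement after normalization wins outright, and fuzzy matching is only a fallback for cases where every volunteer typed something slightly different.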

aTRAM: automated Target Restricted Assembly Method. All code and menus can be found here. aTRAM is an iterative assembler that performs reference-guided local de novo assemblies using a variety of available methods. It is well suited to tasks where Next-Generation Sequencing (NGS) data need to be queried for gene sequences, such as phylogenomics.

Overall aTRAM workflow, which includes a preparation process followed by assembly steps. The preparation process includes sharding raw data into an aTRAM library, including construction of a whole-dataset SQLite database from raw reads. The assembly process uses a bait sequence or set of sequences to perform an iterative assembly. The whole process generates assembled reads that are often longer than target baits. Figure recreated from “aTRAM 2.0: An Improved, Flexible Locus Assembler for NGS Data” (Evolutionary Bioinformatics, Sage Journals).
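The two stages in the workflow above can be illustrated with a toy sketch in Python. This is not aTRAM's code or interface: the table layout, the function names, and the shard assignment are hypothetical, and a simple shared k-mer test stands in for the external search and assembly tools that a real run relies on. It only shows the shape of the process, namely building a sharded SQLite library of reads and then iteratively recruiting reads starting from a bait.

```python
import sqlite3
import zlib


def build_library(reads, shard_count=2):
    """Preparation: store reads in SQLite, assigning each one to a shard."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE reads (name TEXT PRIMARY KEY, shard INTEGER, seq TEXT)")
    rows = [
        (name, zlib.crc32(name.encode()) % shard_count, seq)
        for name, seq in reads.items()
    ]
    db.executemany("INSERT INTO reads VALUES (?, ?, ?)", rows)
    db.commit()
    return db


def kmers(seq, k):
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit(db, baits, shard_count=2, k=8, max_iterations=5):
    """Assembly stand-in: iteratively collect reads sharing a k-mer with the baits."""
    bait_kmers = set()
    for bait in baits:
        bait_kmers |= kmers(bait, k)

    recruited = {}
    for _ in range(max_iterations):
        new = {}
        # Query each shard in turn, as a sharded library would allow.
        for shard in range(shard_count):
            query = "SELECT name, seq FROM reads WHERE shard = ?"
            for name, seq in db.execute(query, (shard,)):
                if name not in recruited and kmers(seq, k) & bait_kmers:
                    new[name] = seq
        if not new:
            break  # nothing new was found, so the iteration has converged
        recruited.update(new)
        # Newly recruited reads become bait for the next iteration.
        for seq in new.values():
            bait_kmers |= kmers(seq, k)
    return recruited


# Made-up data: three overlapping reads from one 40-base locus plus one
# unrelated read. The bait covers only the first 10 bases of the locus.
reads = {
    "r1": "ATGGCGTACGTTAGCCTAGG",
    "r2": "TTAGCCTAGGATCCGATTAC",
    "r3": "ATCCGATTACAGGCTTAACG",
    "r4": "CCCCCCCCCCTTTTTTTTTT",
}
db = build_library(reads)
found = recruit(db, ["ATGGCGTACG"])
print(sorted(found))
```

Each iteration extends the set of recruited reads further from the bait, which is why the final result can span more sequence than the bait itself.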

Other pipelines can be found on my GitHub.
