Alphabetical list of Software

*Note: Most of the tutorials are based on Lewis. However, it should be straightforward to run the same jobs on Clark by following the tutorials together with How to Submit Jobs on Lewis and How to Submit Jobs on Clark, as both Lewis and Clark are Linux/Unix systems.

 Name   Description   Usage* 
 ABySS  Assembly By Short Sequences. A de novo sequence assembler that is designed for short reads.  Tutorial 
 AMBER  (Assisted Model Building with Energy Refinement) is a package of biomolecular simulation programs.  Tutorial 
 APBS  APBS is a software package for modeling biomolecular solvation through solution of the Poisson-Boltzmann equation (PBE).  Tutorial 
 ATSAS  A program suite for small-angle scattering data analysis from biological macromolecules  Tutorial 
 BEDTools  A flexible suite of utilities for comparing genomic features.  Tutorial 
 Biopython  "A set of freely available tools for biological computation written in Python. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics."  Tutorial 
 BLAST  (Basic Local Alignment Search Tool) aligns biological sequences locally and rapidly using heuristic techniques to speed the database search.  Tutorial 
 BLAT  "(BLAST-Like Alignment Tool) (on Lewis) commonly used to look up the location of a sequence in the genome or determine the exon structure of an mRNA."  Tutorial 
 Blender  "A 3D content creation suite including Modeling, Shading, Animation, Rendering, Compositing, and Interactive 3D."  Tutorial 
 Boost  "Free peer-reviewed portable C++ source libraries."  Tutorial 
 Bowtie  "A fast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes."  Tutorial 
 BreakDancer  "BreakDancer is a Perl/C++ package that provides genome-wide detection of structural variants from next generation paired-end sequencing reads."  Tutorial 
 BSmapper  A sequence mapper for bisulfite sequencing reads for DNA methylation studies. It can handle Sanger and 454 reads for mapping to whole genomes or target regions.  Tutorial 
 CAP3  "A DNA sequence assembly program that uses base quality values in computation of overlaps between reads, construction of multiple sequence alignments of reads, and generation of consensus sequences."  Tutorial 
 Cdbtools  CDB (Constant DataBase) indexing and retrieval tools for multi-FASTA files  Tutorial 
 CD-HIT  "a very fast program for clustering and comparing protein or nucleotide sequences"  Tutorial 
 CASAVA  Illumina's Consensus Assessment of Sequence and Variation (CASAVA) software, which captures summary information for resequencing and counting studies and places the data in a compact structure for visualization within GenomeStudio Software.  Tutorial 
 ClustalW  A general purpose multiple sequence alignment program for DNA or proteins.  (Also available in the DeCypher package.)  Tutorial 
 CPMD  "The CPMD code is a parallelized plane wave / pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics."  Tutorial 
 CRUX  "a software toolkit for tandem mass spectrometry analysis, with a focus on peptide identification."  Tutorial 
 Cufflinks  "A program that assembles aligned RNA-Seq reads into transcripts, estimates their abundances, and tests for differential expression and regulation transcriptome-wide."  Tutorial 
 DeCypher  "A suite of programs for biological sequence alignment that uses a combination of optimized algorithms and customized hardware to allow very rapid searches of public and custom databases.  It includes, among other things, all flavors of BLAST, HMM-based profile searches, the Smith-Waterman procedure, and frameshift tolerant alignments to find more distantly related sequences.  Access to DeCypher is through a web page to Discovery."  Tutorial 
 DReAMM  "A software to design, render, and animate MCell models"  Tutorial 
 DX  "Open Visualization Data Explorer (OpenDX), a visualization framework that gives users the ability to apply advanced visualization and analysis techniques to their data."  Tutorial 
 Edena  "(Exact DE Novo Assembler) is an assembler dedicated to process the millions of very short reads produced by the Illumina Genome Analyzer.  Edena is based on the traditional overlap layout paradigm."  Tutorial 
 EMBOSS  "The 'European Molecular Biology Open Software Suite' contains many tools for sequence alignment and other biological tasks.  To use an EMBOSS program, or the related EMBASSY programs, you need to have emboss in your path or use the path to the command, e.g., '/usr/local/emboss/bin/palindrome' will return a prompt for you to input information to find palindromes in a sequence."  Tutorial 
 FASTX_Toolkit  A collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.  Tutorial 
 FFTW  "A C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST)."  Tutorial 
 GAMESS  "GAMESS (General Atomic and Molecular Electronic Structure System) is a program for ab initio molecular quantum chemistry. GAMESS can perform a number of general computational chemistry calculations, including Hartree-Fock, density functional theory (DFT), generalized valence bond (GVB), and Multi-configurational self-consistent field (MCSCF)."  Tutorial 
 GAUSSIAN  "Gaussian predicts the energies, structures, and vibrational frequencies of molecules. The structures and reactions of even unstable molecules can be studied under a wide range of conditions."  Tutorial 
 GDIS  "A scientific visualization program for the display, manipulation, and analysis of isolated molecules and periodic structures."  Tutorial 
 GENESIS  "(short for GEneral NEural SImulation System) is a general purpose simulation platform that was developed to support the simulation of neural systems ranging from subcellular components and biochemical reactions to complex models of single neurons, simulations of large networks, and systems-level models."  Tutorial 
 Genome  "A software package (phred, phrap, consed, etc.) for automated DNA sequencer traces, assembling shotgun DNA sequence data, and viewing, editing, and finishing sequence assemblies, etc."  Tutorial 
 GenomeSTRiP  "Genome STRiP (Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals, but can also process single genomes."  Tutorial 
 gnuplot  "A portable command-line driven graphing utility. Gnuplot supports many different types of output: interactive screen terminals (with mouse and hotkey input), direct output to pen plotters or modern printers, and output to many file formats (eps, fig, jpeg, LaTeX, metafont, pbm, pdf, png, postscript, svg, ...). "  Tutorial 
 GROMACS  GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics simulation package for biochemical molecules like proteins, lipids, nucleic acids, polymers, etc.  Tutorial 
 HDF5  "A data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data."  Tutorial 
 HMMER  "A new generation of sequence homology search software for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models."  Tutorial 
 Intel_Compilers  "Intel® compiler suites offer industry-leading C++ and Fortran compilers with optimization features and multithreading capabilities, highly optimized performance libraries, and error-checking, security, and profiling tools, allowing developers to create multithreaded applications and maximize application performance, security, and reliability."  Tutorial 
 iprscan  InterProScan is a tool that combines the different protein signature recognition methods native to the InterPro member databases into one resource, with lookup of the corresponding InterPro and GO (Gene Ontology) annotation.  Link 
 LAMMPS  "LAMMPS is a classical molecular dynamics code that models an ensemble of particles in a liquid, solid, or gaseous state. It can model atomic, polymeric, biological, metallic, granular, and coarse-grained systems using a variety of force fields and boundary conditions. "  Tutorial 
 MAFFT  "A multiple alignment program for amino acid or nucleotide sequences. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <~200 sequences), FFT-NS-2 (fast; for alignment of <~10,000 sequences), etc. "  Tutorial 
 Mapsembler  A targeted assembly software that takes as input a set of NGS raw reads and a set of input sequences (starters).  Tutorial 
 Maq  (Mapping and Assembly with Qualities) builds assemblies by mapping short reads to reference sequences. Maq version 0.7.1 is installed.  Tutorial 
 MATLAB  stands for matrix laboratory and is an interactive programming language and numerical computing environment.  Tutorial 
 MCell  A modeling tool for realistic simulation of cellular signaling in the complex 3-D subcellular microenvironment in and around living cells  Tutorial 
 MEME  "A software package to discover motifs (highly conserved regions) in groups of related DNA or protein sequences and, search sequence databases using motifs."  Tutorial 
 miRExpress  A program for analyzing high-throughput sequencing data for profiling microRNA expression  Tutorial 
 MODELLER  A program for homology or comparative modeling of protein three-dimensional structures. The user provides an alignment of a sequence to be modeled with known related structures and MODELLER automatically calculates a model containing all non-hydrogen atoms  Tutorial 
 MOTHUR  "A single piece of open-source, expandable software for the bioinformatics needs of the microbial ecology community"  Tutorial 
 mpiBLAST  "A freely available, open-source, parallel implementation of NCBI BLAST"  Tutorial 
 mpiHmmer  An open source MPI implementation of the HMMER protein sequence analysis suite.  Tutorial 
 MUSCLE  "A new multiple sequence alignment algorithm. It is claimed to create alignments with average accuracy comparable with or superior to the best current methods such as ClustalW2 or T-Coffee, depending on the chosen options."  Tutorial 
 NAMD  (NAnoscale Molecular Dynamics) allows high-performance simulation of large biomolecular systems written using the Charm++ parallel programming model.  Tutorial 
 NCBI_Tools  "A collection of the NCBI software tools for building bioinformatics resources including sequence search, alignment, and format conversion, etc."  Tutorial 
 OpenFOAM  "OpenFOAM (Open Source Field Operation and Manipulation) is a C++ toolbox for the development of customized numerical solvers, and pre-/post-processing utilities for the solution of continuum mechanics problems, including computational fluid dynamics (CFD)."  Tutorial 
 OpenMPI  An open source implementation of the Message Passing Interface (MPI) standard, used to write message-passing parallel programs in C/C++ and Fortran that run across multiple processes and nodes.  Link 
 OpenCV  "(Open Source Computer Vision) is a library of programming functions for real time computer vision."  Tutorial 
 PHYLIP  A package of programs for inferring phylogenies (evolutionary trees).  Tutorial 
 Primer3  A program for designing PCR (Polymerase Chain Reaction) primers.  Tutorial 
 ProbCons  A protein multiple sequence alignment program using a combination of probabilistic modeling and consistency-based alignment techniques  Tutorial 
 QIIME  "QIIME (canonically pronounced 'Chime') is a pipeline for performing microbial community analysis that integrates many third party tools which have become standard in the field."  Tutorial 
 R  "R is a language and environment for statistical computing and graphics. ...R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible."  Tutorial 
 R2R  "A software that is designed to speed the drawing of RNA secondary structure consensus diagrams, which show the conserved features within a set of related RNAs."  Tutorial 
 RAxML  A fast implementation of maximum-likelihood (ML) phylogeny estimation that operates on both nucleotide and protein sequence alignments.  Tutorial 
 RepeatMasker  A program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.  Tutorial 
 SAMtools  "SAM Tools provide various utilities for manipulating alignments in the SAM (Sequence Alignment/Map) format, including sorting, merging, indexing and generating alignments in a per-position format."  Tutorial 
 SAS  A Statistical Analysis Software (SAS).  Tutorial 
 SOAP  "The Short Oligo Analysis Package can be used to align short reads to reference sequences, and is also useful for trimming adaptors from the 3' end prior to alignment and counting the numbers of hits. "  Tutorial 
 SRILM  "A toolkit for building and applying statistical language models (LMs), primarily for use in speech recognition, statistical tagging and segmentation, and machine translation."  Tutorial 
 SSAHA2  Sequence Search and Alignment by Hashing Algorithm (SSAHA2) is a pairwise sequence alignment program designed for the efficient mapping of sequencing reads onto genomic reference sequences.  Tutorial 
 SSAKE  "The Short Sequence Assembly by K-mer search and 3' read Extension (SSAKE) is a genomics application for aggressively assembling millions of short nucleotide sequences by progressively searching for perfect 3'-most k-mers using a DNA prefix tree.  SSAKE is designed to help leverage the information from short sequences reads by stringently clustering them into contigs that can be used to characterize novel sequencing targets."  Tutorial 
 SWIG  A software development tool that connects programs written in C and C++ with a variety of high-level programming languages such as Perl, PHP, Python, Tcl, and Ruby.  Tutorial 
 T-Coffee  "A multiple sequence alignment package that is also able to combine sequence information with protein structural information (3D-Coffee/Expresso), profile information (PSI-Coffee) or RNA secondary structures (R-Coffee)."  Tutorial 
 TGICL  A software system for fast clustering of large Expressed Sequence Tags (EST) and mRNA databases.  Tutorial 
 Tophat  "A fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons."  Tutorial 
 TRF  (Tandem Repeats Finder) is a program to locate and display tandem repeats in DNA sequences.  Tutorial 
 VCAKE  A genetic sequence assembler capable of assembling millions of small nucleotide reads even in the presence of sequencing error. This software is currently geared towards de novo assembly of Illumina's Solexa Sequencing data  Tutorial 
 Velvet  A sequence assembler for short reads but also allows the inclusion of longer length sequences.  Tutorial 
 VelvetOptimiser  "VelvetOptimiser is a multi-threaded Perl script for automatically optimising the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet de novo sequence assembler."  Tutorial 
 WUBLAST  (Washington University BLAST) aligns biological sequences locally and rapidly using heuristic techniques to speed the search. WU-BLAST allows more tweaking of parameters than does (NCBI) BLAST.  Tutorial 
 XPLOR-NIH  "A structure determination program which builds on the X-PLOR program, including additional tools developed at the NIH."  Tutorial 
 YASRA  "YASRA (Yet Another Short Read Assembler) performs comparative assembly of short reads using a reference genome, which can differ substantially from the genome being sequenced."  Tutorial