Introduction
Prodigal is a protein-coding gene prediction software tool for bacterial and archaeal genomes. It performs the following functions:
- Predicts protein-coding genes: Prodigal provides fast, accurate protein-coding gene predictions in GFF3, GenBank, or Sequin table format.
- Handles draft genomes and metagenomes: Prodigal runs smoothly on finished genomes, draft genomes, and metagenomes.
- Runs quickly: Prodigal analyzes the E. coli K-12 genome in 10 seconds on a modern MacBook Pro.
- Runs unsupervised: Prodigal is an unsupervised machine learning algorithm. It does not need to be provided with any training data, and instead automatically learns the properties of the genome from the sequence itself, including genetic code, RBS motif usage, start codon usage, and coding statistics.
- Handles gaps, scaffolds, and partial genes: The user can specify how Prodigal deals with gaps, and Prodigal offers numerous options for allowing or forbidding genes to run into or span gaps.
- Identifies translation initiation sites: Prodigal predicts the correct translation initiation site for most genes, and can output information about every potential start site in the genome, such as its confidence score and RBS motif.
- Outputs detailed summary statistics for each genome: Prodigal makes available many statistics for each genome, including contig length, gene length, GC content, GC skew, RBS motifs used, and start and stop codon usage.
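The features above map onto a small set of command-line options. The following Python sketch assembles a Prodigal command line as an argument list. The helper name build_prodigal_command and the file names are hypothetical, and the flags (-i, -o, -f, -p, -s, -m, -c) are given as they appear in the Prodigal 2.6.3 help text to the best of our knowledge; confirm them with prodigal -h on the target system before relying on them.

```python
import shlex

# Output formats assumed to be accepted by -f in Prodigal 2.6.3.
KNOWN_FORMATS = ("gbk", "gff", "sco")


def build_prodigal_command(fasta_path, output_path, output_format="gff",
                           metagenome=False, starts_path=None,
                           mask_gaps=False, closed_ends=False):
    """Return a Prodigal command line as a list of arguments.

    This is a hypothetical helper for illustration, not part of Prodigal.
    """
    if output_format not in KNOWN_FORMATS:
        raise ValueError("unsupported output format: %s" % output_format)

    cmd = ["prodigal", "-i", fasta_path, "-o", output_path, "-f", output_format]

    # -p selects the procedure: "single" for one genome, "meta" for metagenomes.
    cmd += ["-p", "meta" if metagenome else "single"]

    # -s writes every potential start site, with scores, to a separate file.
    if starts_path:
        cmd += ["-s", starts_path]

    # -m treats runs of N as masked sequence so genes are not built across gaps.
    if mask_gaps:
        cmd.append("-m")

    # -c closes the sequence ends so genes cannot run off the edges.
    if closed_ends:
        cmd.append("-c")

    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace them with real paths.
    command = build_prodigal_command("genome.fna", "genes.gff", mask_gaps=True)
    print(" ".join(shlex.quote(arg) for arg in command))
```

The returned list can be passed to subprocess.run(command, check=True) once Prodigal is installed and the input file exists. Building the command as a list rather than a single string avoids shell quoting problems with unusual file names.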
For more information, visit the Prodigal page on GitHub.
Programming language: C
Brief description: A protein-coding gene prediction software tool for bacterial and archaeal genomes.
Open source license: GPL 3.0
Recommended Software Version
Prodigal 2.6.3
Parent topic: Prodigal 2.6.3 Porting Guide (Kylin V10)