Introduction

CPAT (Coding-Potential Assessment Tool) is a bioinformatics tool that predicts the coding probability of an RNA from its sequence characteristics. To do so, CPAT calculates scores for the following four linguistic features from a set of known protein-coding genes and a set of non-coding genes:

  • ORF size
  • ORF coverage
  • Fickett TESTCODE
  • Hexamer usage bias
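
As a rough illustration of the first two features, the sketch below finds the longest open reading frame (ORF) in a sequence and reports its size and the fraction of the transcript it covers. It is a minimal sketch under stated assumptions, not CPAT's own code: the function names are hypothetical, only the forward strand is scanned, an ORF is assumed to start at ATG and to end at the first in-frame stop codon, and the stop codon is counted in the ORF size.

```python
# Illustrative only: a simplified ORF scan, not CPAT's implementation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG..stop ORF on the forward strand.

    `end` is exclusive and includes the stop codon. Returns (0, 0) if no
    complete ORF is found.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best


def orf_features(seq):
    """Return (ORF size in nucleotides, ORF coverage as a fraction)."""
    start, end = longest_orf(seq)
    size = end - start
    coverage = size / len(seq) if seq else 0.0
    return size, coverage


print(orf_features("CCATGAAATTTTAGGG"))  # (12, 0.75)
```

In the example, the ORF ATG AAA TTT TAG spans 12 of the 16 nucleotides, so the ORF size is 12 and the ORF coverage is 0.75.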

CPAT then builds a logistic regression model that uses these four features as predictor variables and the "protein-coding status" as the response variable. After the model's performance has been evaluated and a probability cutoff has been determined, the model can be used to predict the coding probability of new RNA sequences.
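
The prediction step of a logistic regression model can be sketched as follows. This is a minimal sketch, not CPAT's implementation: the function names are hypothetical, and the feature values, coefficients, intercept, and cutoff in the usage lines are placeholders invented for illustration, not values from a trained CPAT model.

```python
import math


def coding_probability(features, coefficients, intercept):
    """Logistic regression: P(coding) = 1 / (1 + exp(-z)),
    where z = intercept + sum(coefficient_i * feature_i)."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    # Evaluate the logistic function in a numerically stable way.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(probability, cutoff):
    """Label a sequence by comparing its probability with the cutoff."""
    return "coding" if probability >= cutoff else "noncoding"


# Placeholder inputs (assumptions for illustration only), in the order:
# ORF size, ORF coverage, Fickett TESTCODE score, hexamer score.
example_features = [900.0, 0.6, 1.1, 0.3]
example_coefficients = [0.005, 2.0, 1.5, 3.0]
example_intercept = -6.0
example_cutoff = 0.5

p = coding_probability(example_features, example_coefficients, example_intercept)
print(classify(p, example_cutoff))
```

In practice the coefficients come from fitting the model on the training sets, and the cutoff is chosen from the performance evaluation rather than fixed in advance.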

For more information, visit the CPAT official website.

Programming language: Python

Brief description: CPAT is a bioinformatics tool that predicts the coding probability of an RNA from its sequence characteristics.

Open source license: GNU General Public License

Recommended Software Version

CPAT 3.0.4