GTPasePred
Identification And Classification Of Small GTPases By Protein Structure Analysis


Download GTPasePred v.1 (zip)


Tutorial:
To predict whether one or more proteins are potential novel small GTPases, copy the protein sequences into the file sequences. On a Linux/Unix system, start the classification by typing ./start in the terminal; the results are stored in the file Results.txt. On a Windows system, run start.bat to encode the sequences, then start R in the current directory and type source("program"); the results of the classification are shown on the screen.

First, each protein sequence is classified as a small GTPase or not. Sequences with a positive classification are then passed to the family-specific random forests (RFs), and the RF with the highest probability output for a positive classification is selected. If this output is greater than or equal to 0.5, the protein sequence is assigned to the family of that RF; otherwise it is assigned as a GTPase in general and classified as "Ras or not further specified small GTPase".
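
The decision rule described above can be illustrated with a minimal Python sketch. It is not the code shipped with GTPasePred: the function name assign_family, the dictionary of per-family probabilities, and the family names in the usage note are assumptions made for illustration only. Only the 0.5 threshold and the fallback label are taken from the description.

```python
from typing import Dict, Optional

# Fallback label used when no family RF reaches the threshold (taken from the text).
FALLBACK_LABEL = "Ras or not further specified small GTPase"


def assign_family(
    is_small_gtpase: bool,
    family_probs: Dict[str, float],
    threshold: float = 0.5,
) -> Optional[str]:
    """Hypothetical helper that mirrors the two-stage decision rule.

    is_small_gtpase: result of the first (GTPase vs. non-GTPase) classification.
    family_probs:    assumed mapping of family name -> positive-class probability
                     from the corresponding family RF.
    Returns None for non-GTPases, otherwise a family name or the fallback label.
    """
    if not is_small_gtpase:
        return None  # negative in the first stage: no family assignment
    if not family_probs:
        return FALLBACK_LABEL
    # Select the family RF with the highest probability output.
    best_family = max(family_probs, key=family_probs.get)
    # Assign the family only if that output is greater than or equal to the threshold.
    if family_probs[best_family] >= threshold:
        return best_family
    return FALLBACK_LABEL
```

For example, with the made-up outputs {"Rab": 0.7, "Rho": 0.2} the sketch returns "Rab", whereas {"Rab": 0.4, "Rho": 0.3} falls back to "Ras or not further specified small GTPase".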

Application to newly sequenced genomes
The algorithm can also be applied to newly sequenced genomes. The workflow is as follows:
1. Identify the correct open reading frames (ORFs), e.g. with ORF Finder, for all genes within the newly sequenced genome.
2. Save the translated protein sequences in the file sequences, which is then used as the input for GTPasePred.
3. All proteins are then encoded with the descriptor used by GTPasePred and classified as small GTPases or not.

Thus, the combination of ORF Finder and GTPasePred can be used to identify potential novel GTPases in newly sequenced genomes.
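
As a rough illustration of steps 1 and 2, the following Python sketch scans a DNA sequence in all six reading frames, translates each ATG-to-stop ORF with the standard genetic code, and formats the resulting proteins as text. It is a naive stand-in and not ORF Finder; the function names, the min_aa length cut-off, and the FASTA-style output are assumptions, since the format expected in the file sequences is not specified here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna: str, min_aa: int = 50) -> list:
    """Naive six-frame ORF scan (hypothetical helper, not ORF Finder).

    Each ORF starts at the first ATG after the previous stop codon and ends
    at the next in-frame stop codon. ORFs shorter than min_aa residues
    (an assumed cut-off) are discarded. Unknown codons are translated as "X".
    """
    dna = dna.upper()
    proteins = []
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            protein = None  # residues collected since the current start codon
            for i in range(frame, len(strand) - 2, 3):
                aa = CODON_TABLE.get(strand[i:i + 3], "X")
                if protein is None:
                    if aa == "M":
                        protein = ["M"]  # open a new ORF at the start codon
                elif aa == "*":
                    if len(protein) >= min_aa:
                        proteins.append("".join(protein))
                    protein = None  # close the ORF at the stop codon
                else:
                    protein.append(aa)
    return proteins


def to_fasta(proteins: list, prefix: str = "orf") -> str:
    """Format proteins as FASTA text (assumed input format for the file sequences)."""
    return "".join(f">{prefix}_{n}\n{seq}\n" for n, seq in enumerate(proteins, 1))
```

The text returned by to_fasta(find_orfs(genome)) could then be saved to the file sequences, provided that GTPasePred accepts this format. For real genomes, a dedicated ORF prediction tool is preferable to this simplified scan.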

Please cite:
Dominik Heider, Sascha Hauke, Martin Pyka and Daniel Kessler. Insights into the classification of small GTPases. Advances and Applications in Bioinformatics and Chemistry, in press.