DiANNA Help Page

How to run DiANNA?
DiANNA provides two services:
  1. Cysteine classification
  2. Disulfide connectivity prediction
Both services require only a protein sequence as input, which must be pasted into the text box. The sequence must be in FASTA format (click here for an example).
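As a rough illustration of the expected input, the sketch below reads one FASTA record and lists its cysteine positions, which are the residues both services act on. The sequence, the header text and the helper names `parse_fasta` and `cysteine_positions` are hypothetical and are not part of DiANNA. Whether the server accepts more than one record per submission is not stated here, so the sketch reads only the first.

```python
def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA string."""
    header = None
    residues = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; only the first is read
            header = line[1:].strip()
        else:
            if header is None:
                raise ValueError("FASTA input must start with a '>' header line")
            residues.append(line.upper())
    if header is None:
        raise ValueError("no FASTA record found")
    return header, "".join(residues)


def cysteine_positions(sequence):
    """Return the 1-based positions of every cysteine (C) in the sequence."""
    return [i for i, residue in enumerate(sequence, start=1) if residue == "C"]


# Hypothetical input, for illustration only.
example = """>example_protein made-up sequence
MKCAVLGCTT
PQCNGKCW
"""

header, seq = parse_fasta(example)
positions = cysteine_positions(seq)
print(header)
print(seq)
print(positions)
```

A sequence split over several lines is joined into a single string before the cysteines are located, so line breaks inside the record do not affect the positions.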
Output interpretation
Cysteine classification:
The user can choose either a ternary prediction (for each cysteine in the sequence, we predict whether it is a half-cystine, a free cysteine or ligand-bound) or one of three binary predictions (half-cystine vs. free cysteine, ligand-bound vs. half-cystine, ligand-bound vs. free cysteine).
In the ternary classification, we report for each cysteine the probability of being ligand-bound, half-cystine or free cysteine, as computed by a three-class support vector machine. Then, for each cysteine predicted as ligand-bound, we predict which of four possible ligands (Fe, Cd, C, Zn) it is bound to, using a winner-takes-all decision: four support vector machines, each trained to recognize cysteines bound to one specific ligand, are evaluated, and the prediction goes to the ligand whose machine produces the highest score.
Similarly, in the binary classification, we report the probability of the cysteine being in each of the two mutually exclusive states. For more details about the method and the results of the binary classification experiments, see the web supplement.
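The winner-takes-all step can be sketched as follows. This is a minimal illustration and not the server's code: the probabilities and SVM scores are made-up numbers, the function names are hypothetical, and taking the predicted state to be the one with the highest reported probability is an assumption.

```python
LIGANDS = ("Fe", "Cd", "C", "Zn")


def winner_takes_all(ligand_scores):
    """Return the ligand whose one-vs-rest SVM gave the highest score."""
    return max(LIGANDS, key=lambda ligand: ligand_scores[ligand])


def classify_cysteine(state_probabilities, ligand_scores):
    """Return (state, ligand); ligand is None unless the state is ligand-bound.

    Assumption: the predicted state is the one with the highest probability.
    """
    state = max(state_probabilities, key=state_probabilities.get)
    ligand = winner_takes_all(ligand_scores) if state == "ligand-bound" else None
    return state, ligand


# Made-up values for a single cysteine, for illustration only.
probabilities = {"ligand-bound": 0.7, "half-cystine": 0.2, "free cysteine": 0.1}
svm_scores = {"Fe": -0.4, "Cd": -1.1, "C": -0.9, "Zn": 0.8}

state, ligand = classify_cysteine(probabilities, svm_scores)
print(state, ligand)
```

The four ligand scores are only compared with each other, so they do not need to be probabilities or to sum to one.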
Disulfide connectivity:
For each pair of cysteines in the input sequence, a neural network trained to recognize disulfide bonds produces a score ranging from 0 to 1 (the higher the score, the more reliable the prediction). These scores are then used to obtain the final prediction by maximum weight matching. As a consequence, a bond with a high score may or may not appear in the final prediction. For example, consider a protein with 4 cysteines A, B, C and D, and assume that the neural network predicts the following scores:
Pair   Score
A-B    0.9
A-C    0.1
A-D    0.8
B-C    0.8
B-D    0.1
C-D    0.5
Here, the bond A-B has the highest score (0.9). However, if the bond A-B is accepted, the only possible second bond is C-D (score 0.5), for a total of 1.4. The pair of bonds that maximizes the sum of the scores is instead A-D (0.8) and B-C (0.8), for a total of 1.6. Therefore, in this case the highest-scoring bond is not part of the optimal solution.
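The example can be checked with a short brute-force sketch that enumerates every way of pairing up the cysteines and keeps the pairing with the highest total score. This is for illustration only: it assumes an even number of cysteines that are all bonded, its function names are hypothetical, and it is not the matching algorithm the server runs, since exhaustive enumeration is practical only for a handful of cysteines.

```python
def perfect_matchings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def total_score(matching, scores):
    """Sum the pair scores of all bonds in a matching."""
    return sum(scores[frozenset(pair)] for pair in matching)


def best_matching(cysteines, scores):
    """Return the pairing with the maximum total score."""
    return max(perfect_matchings(cysteines), key=lambda m: total_score(m, scores))


# Pair scores from the example above.
scores = {
    frozenset("AB"): 0.9,
    frozenset("AC"): 0.1,
    frozenset("AD"): 0.8,
    frozenset("BC"): 0.8,
    frozenset("BD"): 0.1,
    frozenset("CD"): 0.5,
}

best = best_matching(["A", "B", "C", "D"], scores)
print(best)  # the bonds A-D and B-C
```

Four cysteines give only three possible pairings, so the comparison is easy to follow by hand: A-B with C-D, A-C with B-D, and A-D with B-C.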
© Boston College

