Evaluation

Partitioning a structure into domains may yield domains that each consist of a single contiguous stretch of the polypeptide chain (contiguous domains). Frequently, however, regions of the polypeptide chain that are distant in sequence are close together in the 3D structure, so a domain may consist of two or more segments of the chain that are non-contiguous in sequence (non-contiguous domains). We note that the average fragmentation of domains correlates with the average number of domains assigned by each method: an automatic method that assigns more domains on average also assigns more fragments per domain on average. The fragmentation of domains is presented from three perspectives: the fraction of correctly/incorrectly fragmented domains, the partitioning of domains into discontinuous fragments by each method, and the correlation between the partitioning of proteins into domains and the fragmentation of domains.
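The distinction between contiguous and non-contiguous domains can be illustrated with a minimal sketch. It assumes a domain is given as a list of inclusive (start, end) residue ranges; this representation and the function names `count_fragments` and `is_contiguous` are hypothetical and are not taken from any of the evaluated methods.

```python
from typing import List, Tuple

# Hypothetical representation: one inclusive (start, end) residue range per segment.
Segment = Tuple[int, int]


def count_fragments(segments: List[Segment]) -> int:
    """Count the sequence-contiguous fragments that make up one domain.

    Ranges that overlap or directly follow each other (e.g. 1-50 and 51-90)
    are merged and counted as a single fragment.
    """
    if not segments:
        return 0
    ordered = sorted(segments)
    fragments = 1
    current_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > current_end + 1:
            # A gap in the sequence starts a new fragment.
            fragments += 1
        current_end = max(current_end, end)
    return fragments


def is_contiguous(segments: List[Segment]) -> bool:
    """A contiguous domain consists of exactly one fragment."""
    return count_fragments(segments) == 1


# Made-up residue ranges, for illustration only.
contiguous_domain = [(1, 120)]
non_contiguous_domain = [(1, 50), (130, 200)]
print(count_fragments(contiguous_domain), is_contiguous(contiguous_domain))
print(count_fragments(non_contiguous_domain), is_contiguous(non_contiguous_domain))
```

Merging adjacent ranges before counting is a design choice of this sketch: it keeps a domain annotated as 1-50 plus 51-90 from being counted as fragmented.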

1. Fraction of correctly/incorrectly fragmented domains

Tabular data for the figure
Number of fragments per domain   Expert consensus   PDP    DomainParser   PUU    NCBI
1                                91.7               82.4   91.3           72.2   85
2                                7.6                14.9   8              18.1   13.1
3                                0.7                2.7    0.7            7.2    1.9
4+                               0                  0      0              2      0

Evaluation of domain assignment methods using the fragmentation of domains. The evaluation is performed using Balanced_Domain_Benchmark_2.

2. Partitioning of domains into discontinuous fragments by each method

Tabular data for the figure
Number of fragments   PDP    DomainParser   PUU    NCBI
1 fragment            -9.3   -0.4           -19    -6.7
2 fragments           7.3    0.4            10.5   5.5
3 fragments           2      0              6.5    1.2
4+ fragments          0      0              2      0

Fraction of continuous and discontinuous domains mis-assigned by each method and binned by number of fragments per domain. The evaluation is performed using Balanced_Domain_Benchmark_2.
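The values in this table appear to equal each method's percentage from the first table minus the expert consensus percentage for the same number of fragments. That reading is an inference from the numbers, not something the source states. A minimal sketch under that assumption, with the hypothetical helper `deviation_from_consensus`:

```python
# Percentages copied from the first table, ordered by fragments per domain.
BINS = ["1", "2", "3", "4+"]
DISTRIBUTION = {
    "Expert consensus": [91.7, 7.6, 0.7, 0.0],
    "PDP": [82.4, 14.9, 2.7, 0.0],
    "DomainParser": [91.3, 8.0, 0.7, 0.0],
    "PUU": [72.2, 18.1, 7.2, 2.0],
    "NCBI": [85.0, 13.1, 1.9, 0.0],
}


def deviation_from_consensus(distribution, reference="Expert consensus"):
    """Return, per method, the difference from the reference for each bin.

    Assumption: the mis-assigned fraction is the method's percentage minus
    the reference percentage, rounded to one decimal place.
    """
    ref = distribution[reference]
    return {
        method: [round(value - ref_value, 1) for value, ref_value in zip(values, ref)]
        for method, values in distribution.items()
        if method != reference
    }


for method, diffs in deviation_from_consensus(DISTRIBUTION).items():
    print(method, dict(zip(BINS, diffs)))
```

Under this assumption the subtraction gives -19.5 for PUU in the one-fragment bin, whereas the table lists -19; the remaining entries agree with the table.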

3. Comparing tendency toward correct assignment on the level of protein vs. the level of domain

Tabular data for the figure
Measure                             PDP    DomainParser   PUU    NCBI   Expert consensus
average # of domains per chain      2.03   1.78           2.11   2.03   1.95
average # of fragments per domain   1.20   1.09           1.49   1.30   1.09

Side-by-side comparison of the average number of fragments per domain and average number of domains per chain as assigned by each method. The left Y-axis scale refers to the average number of fragments per domain and the right Y-axis scale refers to the average number of domains per chain. The average number of fragments per domain is calculated using Y / X, where Y is the sum of all fragments assigned for each domain and X is the total number of domains assigned. The average number of domains per chain is calculated using A / B, where A is the sum of all domains assigned for each chain and B is the total number of chains. The proportion between average fragments and average number of domains is 1:1.65. The evaluation is performed using Balanced_Domain_Benchmark_2.
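The two averages defined in the caption can be sketched as follows. The input format (a mapping from chain identifier to the list of fragment counts of its assigned domains) and the example values are hypothetical; they are not benchmark data.

```python
from typing import Dict, List

# Hypothetical input: chain identifier -> fragment count of each assigned domain.
Assignments = Dict[str, List[int]]


def average_fragments_per_domain(assignments: Assignments) -> float:
    """Y / X: all fragments summed over domains, divided by the number of domains."""
    total_fragments = sum(sum(domains) for domains in assignments.values())  # Y
    total_domains = sum(len(domains) for domains in assignments.values())    # X
    if total_domains == 0:
        raise ValueError("no domains assigned")
    return total_fragments / total_domains


def average_domains_per_chain(assignments: Assignments) -> float:
    """A / B: all domains summed over chains, divided by the number of chains."""
    total_domains = sum(len(domains) for domains in assignments.values())    # A
    total_chains = len(assignments)                                          # B
    if total_chains == 0:
        raise ValueError("no chains given")
    return total_domains / total_chains


# Made-up example: three chains with 2, 1 and 3 domains.
example = {"chainA": [1, 2], "chainB": [1], "chainC": [1, 1, 3]}
print(average_fragments_per_domain(example))  # 9 fragments / 6 domains
print(average_domains_per_chain(example))     # 6 domains / 3 chains
```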


This work is sponsored by the National Institutes of Health (NIH) Grant Number GM63208 (NIH/NIGMS)