Evaluation

The simplest criterion is the number of domains assigned by a given method. Errors in domain assignment can be further classified as over-cuts (assigning more domains than the benchmark) or under-cuts (assigning fewer domains than the benchmark). Evaluation using three of the above criteria is performed for the entire Benchmark_2 dataset. This is a somewhat rough estimate of a method's correctness, because the correspondence between the domains assigned by an algorithmic method and those assigned by expert consensus is not evaluated here (for a more precise evaluation, see Domain boundaries). This type of evaluation is presented in three different ways: the fraction of correctly/incorrectly assigned domains overall, the fraction of correctly/incorrectly assigned domains grouped by number of domains, and the assignment of domains by each method.
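The classification described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce the figures: the function names, the input format (pairs of predicted and benchmark domain counts), and the example values are all assumptions made for illustration.

```python
from collections import Counter

LABELS = ("correct", "over-cut", "under-cut")


def classify(predicted: int, benchmark: int) -> str:
    """Label one assignment by comparing domain counts only."""
    if predicted == benchmark:
        return "correct"
    # More domains than the benchmark is an over-cut, fewer is an under-cut.
    return "over-cut" if predicted > benchmark else "under-cut"


def overall_fractions(pairs):
    """Return the percentage of each label over (predicted, benchmark) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (predicted, benchmark) pair is required")
    counts = Counter(classify(p, b) for p, b in pairs)
    return {label: 100.0 * counts[label] / len(pairs) for label in LABELS}


# Hypothetical domain counts for four chains: (method, expert consensus)
example = [(2, 2), (3, 2), (1, 2), (1, 1)]
result = overall_fractions(example)
print(result)
```

Because only the counts are compared, a method can be labelled "correct" here even when its domain boundaries disagree with the expert consensus, which is why this criterion is only a rough estimate.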

1. Fraction of correct/incorrect domain assignments overall

Tabular data for the figure
Methods              PDP     NCBI    DomainParser   PUU
correct assignment   85.03   82.48   77.07          74.2
over-cut             11.15    9.87    4.46          18.47
under-cut             3.82    7.64   18.48           7.33

Evaluation of domain assignment methods using the number of domains as the sole criterion. The evaluation is performed using Balanced_Domain_Benchmark_2.

2. Fraction of correct/incorrect assignments (grouped by number of domains)

Tabular data for the figure


Performance of automatic methods using the multi-domain performance criterion. For each number-of-domains subset, the rates of correct assignment, over-cutting, and under-cutting are shown in green, red, and blue, respectively. The evaluation is performed using Balanced_Domain_Benchmark_2.
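A grouped breakdown of this kind could be computed roughly as follows. This is a hedged sketch only: it assumes the subsets are defined by the benchmark (expert consensus) domain count, which the text does not state explicitly, and the function name and example data are hypothetical.

```python
from collections import Counter, defaultdict

LABELS = ("correct", "over-cut", "under-cut")


def grouped_fractions(pairs):
    """Percentage of each label, grouped by the benchmark domain count.

    `pairs` is an iterable of (predicted, benchmark) domain counts.
    Grouping by the benchmark count is an assumption of this sketch.
    """
    groups = defaultdict(Counter)
    for predicted, benchmark in pairs:
        if predicted == benchmark:
            label = "correct"
        elif predicted > benchmark:
            label = "over-cut"
        else:
            label = "under-cut"
        groups[benchmark][label] += 1

    fractions = {}
    for n_domains in sorted(groups):
        total = sum(groups[n_domains].values())
        fractions[n_domains] = {
            label: 100.0 * groups[n_domains][label] / total for label in LABELS
        }
    return fractions


# Hypothetical data: two single-domain chains and four two-domain chains
example = [(1, 1), (2, 1), (2, 2), (1, 2), (2, 2), (3, 2)]
by_group = grouped_fractions(example)
for n_domains, rates in by_group.items():
    print(n_domains, rates)
```

Note that under this grouping a single-domain chain can never be under-cut, so the under-cut rate for that subset is always zero.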

3. Assignment of domains by each method (breakdown by number of domains)

Tabular data for the figure
Number of domains   Expert consensus   PDP    DomainParser   PUU    NCBI
1 domain            33.7               35.2   44.8           38.4   37.1
2 domains           43.8               39.1   37.5           32.7   36.5
3 domains           17.5               17.8   14.3           14.6   18.4
4 domains            2.5                3.8    1.6            9.8    4.4
5 domains            1.6                3.8    1.9            2.2    1.9
6 domains            0.6                0.3    0              1.9    1
7 domains            0                  0      0              0.3    0
8 domains            0                  0      0              0      0.6

Comparison of the overall number of domains assigned by each automatic method and by expert consensus (in percent). The evaluation is performed using Balanced_Domain_Benchmark_2.


This work is sponsored by the National Institutes of Health (NIH) Grant Number GM63208 (NIH/NIGMS).