
NCBI method description

No publication is dedicated to the method itself; rather, it is used as a step in a process of defining protein cores, and the available information on it is sparse. NCBI uses an approach similar to that of PUU. A partition of the structure into two domains is considered satisfactory when the intra-domain contact density is at least twice the inter-domain contact density. In addition, domains are not permitted to contain isolated secondary structure elements, that is, elements forming no contacts with the rest of the domain. Domain boundaries may not pass through a secondary structure element; they must be placed in the loops between elements. The smallest domain must contain at least 25 residues within secondary structures.
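The acceptance criteria above can be illustrated with a minimal sketch. This is not the NCBI implementation, which is not publicly described in detail. The function name, the data layout, and in particular the definition of contact density (contacts divided by the number of possible residue pairs) are assumptions made for illustration only.

```python
def partition_is_acceptable(contacts, domain_of, sse_of,
                            min_ratio=2.0, min_sse_residues=25):
    """Check a candidate two-domain partition against the four stated rules.

    contacts  : iterable of (i, j) residue pairs that are in contact
    domain_of : dict mapping residue -> domain label (exactly two labels)
    sse_of    : dict mapping residue -> secondary structure element id,
                or None for loop residues
    All names and the density definition are hypothetical.
    """
    domains = sorted(set(domain_of.values()))
    if len(domains) != 2:
        return False

    # Rule: a boundary may not pass through a secondary structure element,
    # so every element must lie entirely within one domain.
    sse_domain = {}
    for res, sse in sse_of.items():
        if sse is None:
            continue
        if sse_domain.setdefault(sse, domain_of[res]) != domain_of[res]:
            return False

    # Rule: each domain (hence the smallest one) needs enough residues
    # located within secondary structures.
    for d in domains:
        n_sse = sum(1 for r, s in sse_of.items()
                    if s is not None and domain_of[r] == d)
        if n_sse < min_sse_residues:
            return False

    # Count contacts, and record which elements touch the rest of their domain.
    intra = inter = 0
    connected = set()
    for i, j in contacts:
        if domain_of[i] != domain_of[j]:
            inter += 1
            continue
        intra += 1
        si, sj = sse_of.get(i), sse_of.get(j)
        if si != sj:  # contact leaves the element but stays in the domain
            connected.update(s for s in (si, sj) if s is not None)

    # Rule: no isolated secondary structure elements.
    if set(sse_domain) - connected:
        return False

    # Rule: intra-domain density at least min_ratio times inter-domain density.
    # ASSUMPTION: density = contacts / number of possible residue pairs.
    sizes = [sum(1 for v in domain_of.values() if v == d) for d in domains]
    intra_pairs = sum(n * (n - 1) // 2 for n in sizes)
    inter_pairs = sizes[0] * sizes[1]
    if intra_pairs == 0 or inter_pairs == 0:
        return False
    return intra / intra_pairs >= min_ratio * (inter / inter_pairs)
```

The ratio is tested by multiplication rather than division so that a partition with no inter-domain contacts is handled without a special case. Any real use would need the actual contact definition, which the source does not specify.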



NCBI is the most balanced method in its approach, as it produces a similar number of undercut and overcut errors. When only the number of domains is considered, its overall performance trails just behind that of PDP, the best-performing method benchmarked. Some of NCBI’s undercut errors are due to the method's inability to cut through secondary structure, an essential feature of the algorithm. Where domains are connected within a secondary structure element rather than by a loop, NCBI fails to separate them, for example in 1c1za (Figure 2A) and 1ds6a (Figure 2B). In addition, the rule that places domain boundaries in the middle of the loop differs from the expert methods (1wgta; Figure 2C). This feature is detrimental to boundary consistency: NCBI performs worst in terms of correct placement of domain boundaries. Another undercutting situation involves structures with many domains, as in the case of 1d0na (Figure 2D), where, in addition to the need to cut through beta strands, there is the further complication of a convoluted interface.
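As a rough illustration of how undercut and overcut errors can be tallied when only the number of domains is considered, the sketch below compares predicted and reference domain counts per chain. It assumes that "overcut" means more predicted domains than the reference and "undercut" means fewer. The function names and the example chain labels are hypothetical and do not reproduce benchmark data.

```python
from collections import Counter


def classify_cut(predicted, reference):
    """Label one chain by comparing its predicted and reference domain counts."""
    if predicted > reference:
        return "overcut"
    if predicted < reference:
        return "undercut"
    return "correct"


def tally_cuts(predicted_counts, reference_counts):
    """Count correct, overcut and undercut chains over a benchmark set.

    Both arguments map a chain identifier to a number of domains.
    """
    return Counter(
        classify_cut(predicted_counts[chain], reference_counts[chain])
        for chain in reference_counts
    )


# Hypothetical toy data, for illustration only.
reference = {"chainA": 2, "chainB": 3, "chainC": 1, "chainD": 2}
predicted = {"chainA": 2, "chainB": 2, "chainC": 2, "chainD": 2}
tally = tally_cuts(predicted, reference)
```

A balanced method in this sense is one for which the "overcut" and "undercut" totals are of similar size. This tally says nothing about boundary placement, which is assessed separately.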

While NCBI uses an approach similar to PUU (according to the authors of the method), its performance differs from that of PUU both qualitatively and quantitatively. PUU nearly always errs in the direction of overcutting, while NCBI is balanced, and NCBI's performance is far superior to that of PUU. The heuristics used by NCBI are surprisingly simple compared to those used by PUU. This may indicate that it is not the main principle of domain decomposition that matters, but rather the set of heuristics implemented in the post-processing step. Moreover, it appears that the simpler the rules, the higher the success. For example, PUU easily sacrifices the integrity of secondary structures to achieve compactness of the domain, whereas in NCBI compactness and integrity of secondary structure are better balanced. Yet this balance is not optimal, as the method sometimes overcuts large α-structures (1a6da, Figure 2E; 1bc5a, Figure 2F) as well as large β-structures (1mdah, Figure 2G; 5nn9, Figure 2H), while it undercuts smaller α-structures (1b1ba, Figure 2I). One of NCBI's rules does not allow a domain to include a secondary structure element that forms insufficient contacts with the rest of the domain. This too causes overcutting, as in 1ghok (Figure 2J).



Figure 2. Domain assignment by the NCBI method.

This work is sponsored by the National Institutes of Health (NIH) Grant Number GM63208 (NIH/NIGMS).