3.6 Permanent domains structurally prefer similar folds
Different types of domain interactions within a protein chain might
influence the anatomy of the protein structure landscape. Hence, we next
explored a few structural aspects that might differ between permanent
and transient domain interactions within a protein. Firstly, we examined
their propensity to contain repeats.
Repeats can be of two types, viz., sequence repeats and structural
repeats. Structural repeats can be further classified into different
classes. Using different databases (see Methods) to map the proportion
of proteins in our dataset that contain repeats, we found only a few
proteins for both repeat types; among these, proteins having permanent
domain interactions showed a slightly higher preference for sequence and
structural repeats. However, this observation cannot be relied upon
because of the small number of proteins. Among the structural repeats,
the repeating units (domains)
of bead-on-string repeats (class IV) are thought to interact either
loosely or not at all, which would have made them interesting examples
of transient domains in multi-domain proteins. However, among the
proteins having at least one domain containing a structural repeat, none
belonged to this class. This could be due to the limited amount of
information in the database or to the limited number of domains
representing multi-domain proteins in our study. Secondly, to
overcome this limitation, we defined homodomains as domain pairs in
which both domains have the same class, fold, and superfamily according
to SCOPe. These domains therefore have similar architecture and are
evolutionarily related to each other, and are presumed to have
originated by duplication. Using this definition, we observed that a
comparatively higher proportion of proteins containing permanent domains
have homodomains: 37.3%, compared with 28.6% homodomains in the dataset.
Although these
homodomains may not be true tandem repeats, such domains can provide
functional and structural advantages to the proteins having permanent
domains due to evolutionary pressure and topological constraints,
respectively. Thirdly, to investigate their structural constraints, we
explored the fold distribution of homodomains. We found that proteins
having permanent homodomains have comparatively fewer unique folds than
those having transient homodomains, which could signify a capacity to
re-use folds. This suggests that if domains interact permanently in a
protein, there is a greater chance of finding another interacting domain
of common ancestry and similar structural topology. This observation is
similar to those on PPIs, where obligate PPIs tend to have more
homo-DDIs. When we considered the whole dataset to look into the
number of unique folds, both permanent and transient domain pairs showed
a similar count of unique folds quantitatively. However, qualitatively,
we observed a few biases of folds toward permanent and transient domain
interactions (Table 1). Superfolds such as TIM beta/alpha-barrel, OB
fold, and beta-grasp showed an inclination towards transient domains. On
the other hand, 7-bladed beta-propeller, Ribonuclease H-like motif fold,
and a few others showed inclinations towards permanent domains. Apart
from that, superfolds such as Immunoglobulin-like beta-sandwich and
DNA/RNA-binding 3-helical bundle showed preferences for both permanent
and transient domains. Other sparsely occurring folds (frequency less
than 5) showed little or no bias (Supplementary Tables S3 and S4). These
observations show the structural preferences of different domain
interaction types and also illustrate how a limited number of folds is
re-used to sample various protein structural landscapes in DDIs,
following a power law. This will shed light on the basic principles of
domain interaction type prediction, given that we know the interacting
domains in a protein, their topology, and their evolutionary
information.
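
The homodomain definition and the fold counts discussed in this section
can be illustrated with a minimal Python sketch. It is not the pipeline
used in this study. It assumes each domain carries a SCOPe sccs
identifier of the form class.fold.superfamily.family (for example
"b.1.1.1"), and it works at the level of domain pairs rather than
proteins for simplicity. The record type, the function names, and the
toy pairs are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class DomainPair(NamedTuple):
    """One interacting domain pair (hypothetical record layout)."""
    sccs_a: str       # SCOPe sccs of the first domain, e.g. "b.1.1.1"
    sccs_b: str       # SCOPe sccs of the second domain
    interaction: str  # assumed label: "permanent" or "transient"


def superfamily_key(sccs):
    """Return the (class, fold, superfamily) part of an sccs string."""
    parts = sccs.split(".")
    if len(parts) < 3:
        raise ValueError(f"malformed sccs: {sccs!r}")
    return tuple(parts[:3])


def fold_key(sccs):
    """Return the class.fold prefix of an sccs string, e.g. "b.1"."""
    return ".".join(superfamily_key(sccs)[:2])


def is_homodomain(pair):
    """True if both domains share class, fold, and superfamily."""
    return superfamily_key(pair.sccs_a) == superfamily_key(pair.sccs_b)


def homodomain_fraction(pairs):
    """Fraction of the given pairs that are homodomains (0.0 if empty)."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(is_homodomain(p) for p in pairs) / len(pairs)


def unique_homodomain_folds(pairs):
    """Map each interaction type to the set of folds seen in its homodomains."""
    folds = defaultdict(set)
    for pair in pairs:
        if is_homodomain(pair):
            folds[pair.interaction].add(fold_key(pair.sccs_a))
    return dict(folds)


# Toy pairs for illustration only; they are not taken from the dataset.
TOY_PAIRS = [
    DomainPair("b.1.1.1", "b.1.1.4", "permanent"),   # same superfamily
    DomainPair("b.1.1.2", "b.1.1.2", "permanent"),   # same superfamily
    DomainPair("c.1.8.1", "b.71.1.1", "permanent"),  # different class
    DomainPair("a.4.5.1", "a.4.5.28", "transient"),  # same superfamily
    DomainPair("b.40.4.5", "d.15.1.1", "transient"), # different class
]

if __name__ == "__main__":
    permanent = [p for p in TOY_PAIRS if p.interaction == "permanent"]
    print("permanent homodomain fraction:", homodomain_fraction(permanent))
    print("overall homodomain fraction:", homodomain_fraction(TOY_PAIRS))
    print("unique homodomain folds:", unique_homodomain_folds(TOY_PAIRS))
```

Comparing the first three sccs fields mirrors the class, fold, and
superfamily criterion stated above. Collecting folds into sets per
interaction type gives the unique fold counts that the comparison
between permanent and transient homodomains relies on.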