2.7 Repeats analysis
UniProt was used to identify proteins in the dataset that contain
sequence repeats. RepeatsDB was used to identify proteins in the dataset
in which structural repeats populate at least one domain. We used the
SCOPe “sccs” identifier, truncated at the superfamily level, to define
homodomain-containing proteins, and used the fold information to assign
a fold to each domain.
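
As an illustration of the sccs-based step, the following is a minimal Python sketch and not the exact procedure used here. It assumes the usual dot-separated sccs layout (class.fold.superfamily.family), and it assumes that a protein counts as homodomain-containing when at least two of its domains share the same superfamily-level prefix. The function names and the example identifiers are hypothetical.

```python
from collections import Counter


def sccs_prefix(sccs: str, levels: int) -> str:
    """Return the first `levels` dot-separated fields of an sccs string."""
    parts = sccs.strip().split(".")
    if len(parts) < levels:
        raise ValueError(f"sccs {sccs!r} has fewer than {levels} levels")
    return ".".join(parts[:levels])


def superfamily(sccs: str) -> str:
    """Truncate an sccs string at the superfamily level (class.fold.superfamily)."""
    return sccs_prefix(sccs, 3)


def fold(sccs: str) -> str:
    """Truncate an sccs string at the fold level (class.fold)."""
    return sccs_prefix(sccs, 2)


def has_homodomains(domain_sccs: list[str]) -> bool:
    """Assumed criterion: two or more domains share one superfamily prefix."""
    counts = Counter(superfamily(s) for s in domain_sccs)
    return any(n >= 2 for n in counts.values())


# Illustrative input: sccs strings for the domains of one hypothetical protein.
domains = ["b.1.1.1", "b.1.1.4", "d.58.7.1"]
print(has_homodomains(domains))    # two domains share the prefix b.1.1
print([fold(s) for s in domains])  # fold-level prefix of each domain
```

If a different criterion is intended, for example requiring every domain of a protein to belong to a single superfamily, only `has_homodomains` would need to change.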