Figure 6 Superposition of the experimental (green) and closest calculated (blue) structures for two protein complexes. T1113 is a phage shell homodimer with intertwined polypeptide chains at the interface. The left-hand subunit is shown in grey to clarify the intertwined region. The interface contact score (ICS) is 93%. H1140 is a nanobody-antigen complex, with the nanobody on the right and the antigen on the left (ICS 81%).
Figure 6 shows two examples of high agreement with the corresponding experimental structures, each representing a class of assembly challenge that was problematic for older methods. T1113 is a small bacterial homodimer with no homologous structures available. The two polypeptide chains intertwine across the interface in a domain-swap-like manner, a feature that defeats classical docking methods. Deep learning methods treat the whole complex as a single entity, circumventing that difficulty. H1140 is a nanobody/protein antigen complex. Earlier benchmarking (29) had shown that immune complexes defeat at least the standard AlphaFold-Multimer procedure. CASP15 included a total of eight immune complex targets (five nanobody complexes and three antibody complexes). Of these, two had homologous experimental complexes available, and so were easy targets. Three others have the lowest-accuracy interfaces in this CASP (ICS 0.12, 0.30, and 0.45). For the remaining three, however, high-quality models (ICS 0.74, 0.80, and 0.81) were produced by a small number of participating groups. Standard AlphaFold-Multimer with default parameters was not effective on any of these three, in accordance with the general observation that for many targets, enhanced sampling is necessary to obtain the best results.
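The ICS values quoted above can be read as an F1 measure (the harmonic mean of precision and recall) over inter-chain residue contacts. The following is a minimal sketch under stated assumptions, not the assessors' implementation: contacts are assumed to have been extracted already as residue pairs (for example, with a heavy-atom distance cutoff of about 5 Å, which is an assumption here and is not computed), and the function name `interface_contact_score` and the toy contact sets are hypothetical.

```python
from typing import Set, Tuple

# A contact is a pair of residue labels, one from each chain (hypothetical labels).
Contact = Tuple[str, str]


def interface_contact_score(model: Set[Contact], target: Set[Contact]) -> float:
    """Return the F1 of precision and recall over inter-chain contacts."""
    if not model or not target:
        return 0.0
    shared = len(model & target)       # contacts present in both structures
    precision = shared / len(model)    # fraction of model contacts that are correct
    recall = shared / len(target)      # fraction of target contacts recovered
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: three of four contacts agree between model and target.
target = {("A10", "B5"), ("A11", "B5"), ("A12", "B7"), ("A15", "B9")}
model = {("A10", "B5"), ("A11", "B5"), ("A12", "B7"), ("A20", "B9")}
score = interface_contact_score(model, target)
print(f"ICS = {score:.2f}")
```

On this reading, an ICS of 0.81 means that most, but not all, of the experimental interface contacts are reproduced without many spurious ones, whereas an ICS of 0.12 indicates an essentially incorrect interface.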
As with the single protein structure category, the most effective methods in assembly modeling are based on AlphaFold2, usually the newer AlphaFold-Multimer (31), a version of AlphaFold whose training included data for protein complexes. Three of the methods are also among the top five performers in the single protein category. Again, as with the single protein category, the most successful groups used modifications of standard AlphaFold procedures, including much more extensive sampling through variations on MSA construction, the use of multiple seeds, an increased number of recycles, and extensive network dropout. In addition, one group (32) devised a machine learning/Voronoi polyhedral interface scoring function that evidently aided the selection of accurate models. Details of the methods can be found in the CASP15 assemblies assessment paper (11) and in papers by some of the best-performing groups.
Although the improvement in accuracy is enormous, there is still a substantial fraction of poorly scoring interfaces. There are multiple possible reasons for the lag in performance. This is the first time deep learning methods have been extensively used for protein complexes, whereas this was the third CASP in which deep learning has been used for single proteins. Thus, we may see substantial improvement next time as lessons are learned from CASP15. More fundamentally, there are far fewer structures of complexes in the PDB than of single proteins, so the training set is inherently smaller. It may be possible to use the current methods to generate additional synthetic training data (33). Analogously to single proteins, interface accuracy probably falls off as the depth of the multiple sequence alignment spanning the interface decreases (although one leading group reported omitting these data (34)). That may explain the generally weak performance for immune complexes, so it is encouraging to see partial success there.