3.1 QSPR model and validation
A QSPR model was developed with 29 NI descriptors to quantify the relationship between the molecular structure characteristics and the T_g values of 695 polyesters, as shown in Eq. (4). The norm descriptors (I_i) and the corresponding coefficients (b_i) in the model are listed in Table S2 of the Supplementary Information.
Figure 4 shows the predictive performance of the QSPR model. With 556 data points used as the training set and 139 as the testing set, the predicted T_g values are plotted against the experimental values in Figure 4a. Most of the data points lie close to the diagonal line, which indicates good prediction accuracy, with AAE < 20 ℃ and R^2 > 0.90. The calculated and experimental T_g values for the 695 polyesters are given in Sheet S1 of the Supplementary Information (data.xlsx). Moreover, R^2_training = 0.9054 and R^2_testing = 0.9077 are both well above 0.6, indicating good predictive performance. The two R^2 values are also very close, which suggests that the model generalizes well and captures the relationship between the chemical structure of polyesters and their T_g.
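As a minimal sketch of how the training and testing statistics quoted above can be computed, the following assumes that R^2 is defined as 1 - SS_res/SS_tot and that AAE is the mean absolute deviation between predicted and experimental values. The text does not state either definition explicitly, nor how the 556/139 split was drawn, so the random split and the helper names (`r_squared`, `aae`, `split_indices`) are illustrative assumptions only.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def aae(y_true, y_pred):
    """Average absolute error, in the same units as the inputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def split_indices(n, n_test, seed=0):
    """Shuffle indices 0..n-1 and hold out n_test of them (hypothetical random split)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[n_test:], idx[:n_test]


# 695 polyesters divided into 556 training and 139 testing points, as in the text.
train_idx, test_idx = split_indices(695, 139)
```

Evaluating `r_squared` and `aae` separately on the two index sets gives the pair of values that the text compares when judging generalizability.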
n = 695; R^2 = 0.9060; Q^2_LOO-CV = 0.8889; AAE = 17.7197 ℃
where n_A is the number of atoms and n_nH is the number of non-hydrogen atoms; the MSF are calculated from the H-suppressed polyester structures; b_i are the model parameters and I_i are the norm descriptors.
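The leave-one-out statistic Q^2_LOO-CV reported with the model can be illustrated with a small sketch. It assumes the common definition Q^2 = 1 - PRESS/SS_tot, where PRESS sums the squared errors of predictions made for each sample by a model refitted without that sample; the text does not give the definition, so this is an assumption. For brevity the sketch refits a one-descriptor least-squares line on toy data rather than the 29-descriptor model, and the names `fit_line` and `q2_loo` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(xs, ys):
    """Leave-one-out Q^2, assumed here to be 1 - PRESS / SS_tot."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit the model on every sample except the i-th.
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(x_rest, y_rest)
        # Accumulate the squared error of the prediction for the held-out sample.
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Toy data (not from the paper): one descriptor value and one response per sample.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_y = [0.1, 0.9, 2.2, 2.8, 4.1]
toy_q2 = q2_loo(toy_x, toy_y)
```

For a least-squares model, Q^2 computed this way cannot exceed the fitted R^2, which is consistent with the reported Q^2_LOO-CV = 0.8889 being slightly below R^2 = 0.9060.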