3.1 QSPR model and validation
A QSPR model was developed with 29 NI descriptors to quantify the
relationship between molecular structure characteristics and the Tg
values of 695 polyesters, as shown in Eq. (4). The norm descriptors (Ii)
and the corresponding coefficients (bi) in the model are listed in
Table S2 of the Supplementary Information.
Figure 4 shows the predictive performance of the QSPR model. The data
set was split into a training set of 556 data points and a testing set
of 139 data points, and Figure 4a shows the scatter plot of predicted
versus experimental Tg values. Most data points lie close to the
diagonal line, indicating good prediction accuracy, with AAE < 20 ℃ and
R² > 0.90. The calculated and experimental Tg values for the 695
polyesters are given in Sheet S1 of the Supporting Information
(data.xlsx). Moreover, R²(training) = 0.9054 and R²(testing) = 0.9077
are both well above 0.6, which supports the good predictive performance
of the model. The two R² values are also very close, suggesting that
the model generalizes well and has captured the relationship between
the chemical structure of polyesters and their Tg.
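The two accuracy measures quoted above can be computed as in the following minimal sketch. It assumes that R² is the coefficient of determination and that AAE is the mean of the absolute deviations between predicted and experimental Tg; the text does not give the formulas, so both definitions, as well as the function names, are assumptions made for illustration.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition of R²)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)


def average_absolute_error(y_true, y_pred):
    """Mean absolute deviation between predictions and experiment (assumed definition of AAE)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))
```

Applying both functions separately to the training and testing subsets would give the kind of side-by-side comparison reported for R²(training) and R²(testing).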
n = 695; R² = 0.9060; Q²(LOO-CV) = 0.8889; AAE = 17.7197 ℃
where nA is the number of atoms and nnH is the number of non-hydrogen
atoms; the MSF are calculated from the H-suppressed polyester
structures; bi are the parameters and Ii are the norm descriptors.
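The leave-one-out cross-validated Q² can be illustrated with the sketch below. It assumes an ordinary least-squares model that is linear in the descriptors with an intercept, and it takes Q² to be 1 - PRESS / SS_tot; neither the fitting procedure nor the Q² formula is stated in the text, so this is an illustration under those assumptions and not the authors' implementation. The function name `q2_loo_linear` is hypothetical.

```python
import numpy as np


def q2_loo_linear(X, y):
    """Leave-one-out Q² for a least-squares linear model (assumed model form).

    X: (n_samples, n_descriptors) descriptor matrix; y: (n_samples,) target values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    A = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column

    press = 0.0  # predictive residual sum of squares
    indices = np.arange(len(y))
    for i in indices:
        keep = indices != i
        # Refit on all samples except i, then predict the held-out sample.
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2

    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - press / ss_tot)
```

Because every sample is predicted by a model that never saw it, Q² is typically somewhat lower than the fitted R², which is consistent with the reported values of 0.8889 and 0.9060.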