Polynomial Neural Network (PNN) algorithm description
Let us consider the basic principles of the Polynomial Neural Network (PNN)
algorithm. The proposed method can be used to analyze complex data sets in
order to determine internal data relationships and to present knowledge about
these relationships in the form of a mathematical description (polynomial
regression).
Possible application areas of the PNN algorithm include any field where sets
of observation data must be analyzed and models of data relationships must be
built: for example, chemistry (QSAR), analysis of economic systems, stocks and
financial market instruments, insurance risk studies, medical diagnostics, etc.
To work with the algorithm, one should formalize the data domain as a number
of observed quantitative parameters X = {x_{1}, x_{2}, ..., x_{m}} (the set of
input variables). One also needs a set of observations with "already known"
results, that is, n observations of the vector X (input) and of y (output).
The purpose of the algorithm is to find a subset of variables {x_{i1}, x_{i2},
..., x_{ik}} and a model y = f_{i}(x_{i1}, x_{i2}, ..., x_{ik}) in the set of
polynomials of power s that minimizes some criterion value (CR). After the
model is built, it can be used to calculate the "unknown" result y for new
observed vectors X.
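As a minimal sketch of this setup (the data shapes and the RMSE criterion below are illustrative assumptions; the text leaves CR unspecified):

```python
import numpy as np

# Hypothetical shapes: n observations of m quantitative input parameters.
n, m = 14, 5
X = np.random.rand(n, m)   # observation matrix; rows are input vectors
y = np.random.rand(n)      # the "already known" results

def criterion(y_true, y_pred):
    """One possible criterion CR: root mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))
```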
The algorithm is implemented as a GMDH-type Neural Network. The first layer
generates models y = g(x_{i}, x_{j}, x_{k}), where x_{i}, x_{j}, x_{k} are
input variables. The next layers generate models y = g(w_{i}, w_{j}, w_{k}),
where w_{i}, w_{j}, w_{k} are models from the previous layers.
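A schematic sketch of this layer-wise construction (our own simplification: pairs of models instead of triples, and a plain linear transform; the actual transforms are described among the features below):

```python
from itertools import combinations

import numpy as np

def build_layer(W_train, W_valid, y_train, y_valid, n_best=10):
    """One GMDH-type layer. The columns of W are the outputs of the
    previous layer (or the raw input variables for the first layer).
    Each pair of columns is combined by a transform fitted on the
    training part; the n_best models by validation error survive."""
    models = []
    for i, j in combinations(range(W_train.shape[1]), 2):
        A = W_train[:, [i, j]]
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        pred = W_valid[:, [i, j]] @ coef
        err = np.sqrt(np.mean((y_valid - pred) ** 2))
        models.append((err, (i, j), coef))
    models.sort(key=lambda t: t[0])
    return models[:n_best]
```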
The main objective is to create a stable algorithm for nonlinear model
reconstruction such that its results are easily interpreted by users.
The main features of the algorithm can be summarized as
follows:

Fast learning. Transforms with only two coefficients are used, for example
g(w_{i}, w_{j}) = a·w_{i} + b·w_{j} in the linear case. Irrespective of the
power of the resulting model and the number of terms, only second-order
matrices are inverted. This provides fast learning of the algorithm.
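For instance, fitting the two coefficients reduces to solving a 2x2 system, as in this sketch (the function name is ours):

```python
import numpy as np

def fit_two_coefficients(w_i, w_j, y):
    """Least-squares fit of g(w_i, w_j) = a*w_i + b*w_j. Only the 2x2
    normal (Gram) matrix is inverted, whatever the power of the
    polynomials that w_i and w_j already represent."""
    G = np.array([[w_i @ w_i, w_i @ w_j],
                  [w_i @ w_j, w_j @ w_j]])   # 2x2 normal matrix
    r = np.array([w_i @ y, w_j @ y])
    a, b = np.linalg.solve(G, r)
    return a, b
```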

Results in parametric form. The polynomial structures are coded using a vector
of prime numbers [1,2], which provides presentation of the results in the
parametric form of a nonlinear equation.
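The text does not spell out the coding scheme; for illustration we assume the classical prime-number coding (each input variable is assigned a prime, and a monomial is coded by the product of the primes of its factors, so factorizing the code recovers the term uniquely):

```python
# Hypothetical prime coding of monomials: each input variable gets a prime,
# and a monomial is coded by the product of the primes of its factors.
PRIMES = {"x1": 2, "x2": 3, "x3": 5, "x4": 7, "x5": 11}

def encode(monomial):
    """E.g. x1*x5^3 -> 2 * 11**3 = 2662."""
    code = 1
    for var, power in monomial.items():
        code *= PRIMES[var] ** power
    return code

def decode(code):
    """Recover the monomial from its code by trial division."""
    monomial = {}
    for var, p in PRIMES.items():
        while code % p == 0:
            monomial[var] = monomial.get(var, 0) + 1
            code //= p
    return monomial

assert decode(encode({"x1": 1, "x5": 3})) == {"x1": 1, "x5": 3}
```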

Complexity control. Let us denote the vector (power, c)^{T} as the complexity,
where power is the power of the polynomial and c is the number of terms. The
power of a new model is controlled by the condition that if, for example,
g(w_{i}, w_{j}, w_{l}) = a·w_{i} + b·w_{j}·w_{l}, then
power(g(w_{i}, w_{j}, w_{l})) = max(power(w_{i}), power(w_{j}) + power(w_{l})),
where power() designates the power of the polynomial. This gives us the
possibility to restrict the class of models under consideration by
power(w_{i}) < p and to search for models among the polynomials with power
less than p. The maximum complexity is defined by the user or can be selected
automatically using a full cross-validation method.
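In code, the power rule and the restriction power < p amount to the following (a direct transcription of the example above):

```python
def power_of_sum(*term_powers):
    """power(u + v) = max(power(u), power(v))"""
    return max(term_powers)

def power_of_product(*factor_powers):
    """power(u * v) = power(u) + power(v)"""
    return sum(factor_powers)

# The example from the text: g(w_i, w_j, w_l) = a*w_i + b*w_j*w_l
p_i, p_j, p_l = 2, 1, 2
p_g = power_of_sum(p_i, power_of_product(p_j, p_l))  # max(2, 1 + 2) = 3

# Restricting the search to polynomials of power less than p:
p = 4
admissible = p_g < p   # True: the model may enter the next layer
```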

Twice-hierarchical neural net structure. The twice-hierarchical neural net
structure is an important feature of PNN. One problem is that the power of the
polynomials increases too fast in the traditional GMDH algorithm: at step r of
the iteration procedure one can have models of power r+1, W_{r} ∈ P^{r+1}.
The control of complexity gives us an opportunity to implement the iteration
procedure without an increase of the power of the polynomials and/or the
number of terms. An external iterative procedure controls the complexity,
i.e. the number of terms and the power of the polynomials in the intermediate
models. The best models form the initial set for the next iterative procedure.
This procedure realizes a wide search without increasing the complexity.
Besides that, the twice-hierarchical neural net structure provides convergence
of the coefficients: the models that are calculated as a result of several
transformations have coefficients that are close to the appropriate regression
coefficients.
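A sketch of our reading of this two-level iteration (not the authors' code; the transform g(u, v) = a·u + b·u·v and all parameter values are illustrative):

```python
from itertools import combinations

import numpy as np

def two_level_search(W, y, max_power=3, n_best=5, n_outer=3, n_inner=2):
    """Schematic twice-hierarchical search. Models are (power, values)
    pairs; with g(u, v) = a*u + b*u*v the power rule above gives
    power(g) = max(power(u), power(u) + power(v)). The inner loop is an
    ordinary GMDH pass; the outer loop re-seeds it with the best models
    found so far, widening the search under a fixed complexity bound."""
    inputs = [(1, W[:, k]) for k in range(W.shape[1])]
    seeds = inputs
    for _ in range(n_outer):                    # external procedure
        layer = seeds
        for _ in range(n_inner):                # internal GMDH pass
            cand = []
            for (p_u, u), (p_v, v) in combinations(layer + inputs, 2):
                p_g = max(p_u, p_u + p_v)       # power rule
                if p_g > max_power:             # complexity control
                    continue
                A = np.stack([u, u * v], axis=1)
                coef, *_ = np.linalg.lstsq(A, y, rcond=None)
                err = np.sqrt(np.mean((y - A @ coef) ** 2))
                cand.append((err, (p_g, A @ coef)))
            cand.sort(key=lambda t: t[0])
            layer = [m for _, m in cand[:n_best]] or layer
        seeds = layer                           # the best models re-seed
    return seeds
```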

Robust estimation. To use the algorithm in the presence of large errors
(outliers), we have developed the PNN algorithm for robust nonlinear
(M-regression) model identification. This made it possible to improve the
stability of the PNN algorithm.
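The text does not fix the M-estimator; one standard choice is Huber weights fitted by iteratively reweighted least squares, sketched here:

```python
import numpy as np

def huber_fit(A, y, k=1.345, n_iter=20):
    """Robust M-regression via iteratively reweighted least squares with
    Huber weights (an assumed, standard M-estimator; the text does not
    specify which one PNN uses). Large residuals are down-weighted so a
    few outliers no longer dominate the coefficient estimates."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)      # plain LS start
    for _ in range(n_iter):
        r = y - A @ coef
        s = np.median(np.abs(r)) / 0.6745 + 1e-12     # robust scale (MAD)
        w = np.minimum(1.0, k * s / (np.abs(r) + 1e-12))  # Huber weights
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], sw * y, rcond=None)
    return coef
```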
The performance of PNN is demonstrated using examples of artificial and real
data sets. The first set was generated from a nonlinear model of fourth power,
y = x_{1}·x_{5}^{3} + 10. Random noise and three further random variables
x_{2} ... x_{4} were added, and 14 observations were generated (n=14, m=5)
(Example 1).
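A data set of this kind can be reproduced along these lines (the input range and noise level are our guesses; only the model, n, m and the 12/2 split are given in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 14, 5                               # as stated: n=14, m=5
X = rng.uniform(0.0, 4.0, size=(n, m))     # x2..x4 act as pure noise variables
y = X[:, 0] * X[:, 4] ** 3 + 10            # the true model y = x1*x5^3 + 10
y = y + rng.normal(scale=0.5, size=n)      # random noise on the output

X_train, y_train = X[:12], y[:12]          # 12 observations for training
X_test,  y_test  = X[12:], y[12:]          # 2 observations for the test set
```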
The RPNN algorithm was compared with GN (Yurachkovsky, 1981), one of the best
GMDH-type Neural Network algorithms.
Table 1. Values Calculated by GMDH(GN) and RPNN Methods for the Training and
Test Sets

                without errors        with 2 errors         with 3 errors
                Y      GN     RPNN    Y      GN     RPNN    Y      GN     RPNN
Training set
                38     36     37      38     5      34      38     73     27
                26     27     26      26     141    24      26     29     22
                520    516    521     1520   1653   515     1520   1457   519
                200    102    201     200    249    193     200    272    202
                139    138    138     139    227    131     139    169    129
                2571   2568   2571    2571   2510   2572    571    2810   2571
                1725   1724   1725    1725   1775   1729    1725   1629   1723
                2754   2755   2753    2754   2708   2762    2754   2210   2762
                1522   1522   1521    1522   1534   1523    522    1054   1524
                1457   1462   1468    2457   2319   1465    2457   2540   1466
                1723   1724   1725    1723   1775   1723    1723   1888   1723
Test set
                2929   2923   2927    2929   2930   2929    2929   3477   2929
                658    656    657     658    931    657     658    460    655
Twelve and two observations were used as the training and test sets,
respectively. A model with excellent approximation and low prediction error
was calculated by GMDH(GN): y = 11.6 - 0.13·x_{5}^{2} + 1.0·x_{1}·x_{5}^{3},
with RMSE=6.14 for the training set and RMSE=3.25 for the test set (Table 1).
Nevertheless, RPNN improved the results: it found the exact model structure
y = 10.16 - 0.05·x_{4}·x_{5} + 1.0·x_{1}·x_{5}^{3} and provided lower
prediction errors, RMSE=1.65 for the training set and RMSE=1.21 for the test
set. In order to study the stability of the GMDH(GN) and RPNN algorithms, the
values of two and three data cases from the training set were changed into
large errors of the initial model. The results of the GN algorithm were
affected by the outliers, and the method provided low generalization ability
for the test set (Tables 1, 2).
Table 2. RMSE for GMDH(GN) and RPNN Methods for Test Sets

        no errors    1 error    2 errors    3 errors
GN      3.25         139.3      136.5       291.4
RPNN    1.2          1.7        0.61        1.42
Application in environmental and toxicological studies
Prediction of the sublimation enthalpy
To study the stability and robustness of RPNN, we developed QSAR models for
the sublimation enthalpy of a series of 18 polychlorinated biphenyls (PCBs).
This work was elaborated in collaboration with Prof. William J. Welsh,
Department of Pharmacology, Robert Wood Johnson Medical School, GenChemics,
and Dr. Igor Tetko.
What is a Quantitative Structure-Activity Relationship (QSAR) study? QSAR
studies represent an important part of the drug or new materials design
process and are used to reveal relationships between the chemical structure
of compounds and their biological activity, toxicity or other properties.
Various molecular characteristics are used as input variables
(electron-topological indices, molecular shape, molecular size, charge
distribution, molecular fragments, CoMFA descriptors); the biological activity
or toxicity is the output.
The atom- and bond-type electrotopological state (E-state) indices (8
parameters) were used as input parameters for Multiple Linear Regression
Analysis (MLRA), GMDH(GN) and RPNN. The E-state indices, introduced by Kier
and Hall (1990), combine electronic and topological characteristics of the
molecules and describe atoms (atom-type indices) and bonds (bond-type
indices).
The first analysis included 16 PCBs in the training set and 2 molecules in the
test set (Example 2).
Statistically significant models with low prediction errors for the test
molecules, RMSE=1.06 and 0.93, were calculated by the MLRA and RPNN methods,
respectively. GN calculated a worse result, RMSE=3.1. However, GN and RPNN
selected nonlinear models of the same structure:
y = 0.036·x_{7}^{2} + 133·x_{4} - 5.9·x_{1} and
y = 0.023·x_{7}^{2} + 132·x_{4} - 5.9·x_{1}. MLRA's prediction ability
decreased significantly when the number of molecules in the training set
dropped below 12. On the contrary, the RPNN results were not affected by the
number of molecules in the training set. Even when the number of molecules in
this set decreased from 16 to 7, RPNN still calculated the same regression
equations with only slight variations of the regression coefficients. For
example, the equation y = 0.041·x_{7}^{2} + 134·x_{4} - 5.9·x_{1} was
calculated with 7 molecules in the training set. This model provided RMSE=4.3
for 11 molecules in the test set (Figures 1-3). These results indicate the
stability of the RPNN method. The results calculated by the GN method were
worse than those of RPNN but much better than those of MLRA.
Figures 1-3. Prediction of the sublimation enthalpy of molecules by the RPNN
and MLRA methods: 16 molecules in the training set and 2 in the test set
(Figure 1); 14 molecules in the training set and 4 in the test set (Figure 2);
7 molecules in the training set and 11 in the test set (Figure 3).
Figure 4. Root mean squared error calculated for the test set molecules as a
function of the number of molecules in the training set. MLRA failed to
provide significant models if fewer than nine molecules were used in the
training set.
The same data set of 16 PCBs was used to study the robustness of RPNN to
outliers. The values of two data cases from the training set were changed into
large errors: line 3, 182.12 instead of 82.12, and line 16, 18.33 instead of
108.33. The RPNN results for both the training and test sets were practically
the same as for the noiseless data (Figures 5, 6).
Figures 5, 6. RPNN predictions for the training and test sets with two
outliers in the training data.
Because of the nonlinearity of the dependencies between parameters and
responses in QSAR studies and the presence of outliers in "dirty" data sets,
RPNN represents a promising tool for applications in drug design and in
environmental and toxicological studies.
References:

1. Yurachkovsky, Y.P. 1981. "Restoration of Polynomial Dependencies Using
Self-Organization." Soviet Automatic Control, 14, 17-22.

2. Aksenova, T.I., Tetko, I.V., and Volkovich, V.V. 2001. "GMDH-type Neural
Network in Quantitative Structure-Activity Relationship Studies on the
Internet." In Modelling and Simulation 2001 (ESM2001), 15th European
Simulation Multiconference, Prague, pp. 685-689.