Polynomial Neural Network (PNN) algorithm description
Let us consider basic principles of Polynomial Neural Network (PNN)
algorithm organization. The proposed method can be used to analyze complex data
sets with the aim to determine internal data relationships and to present
knowledge about these relationships in the form of mathematical description
(polynomial regression).
As an examples of possible application area of PNN algorithm one can
consider any sphere where sets of observation data should be analyzed and data
relationships models should be build. There are, for example, chemistry (QSAR),
economical systems analyses, stocks and financial market instruments, insurance
risks study, medical diagnostics, etc.
To work with the algorithm one should formalize the data domain to
have number of observed quantitative parameters X={x1, x2,
..., xm} (the set of input variables). Then it
should be set of observations with "already known" result, that is n
observations of vector X (input) values and y
(output). The purpose of the algorithms is to find a subset of variables {xi1,
xi2, ..., xik}
and a model y = fi (xi1, xi2,
..., xik) in the set of polynomials
of s-power that minimizes some criterion value (CR).
Afterbuilding the model it can be used to calculate "unknown" result
y on new observed vectors X.
The algorithms are implemented as a GMDH-type Neural Network. First
layer generates the models y = g(xi , xj , xk),
where where xi, xj, xk
input variables. Next layers generate models as y = g(wi , wj
,wk), where wi , wj ,wk
are the models of previous layers.
The main objective is to create a stable algorithm
for nonlinear model reconstruction such as that its results are easily
interpreted by users.
The main features of the algorithm can be summarized as
follows:
-
Fast learning. The transforms with two coefficients only are
used, for example g(wi, wj) = a·wi+b·wj
in the linear case. Irrespectively of the power of resulting model and the
number of terms the second order matrices are only inverted. This provides fast
learning of the algorithm.
-
Results in the parametric form. The polynomial structures are
coded using vector of simple numbers [1,2] that provides the presentation of
the results in the parametric form of nonlinear equation.
-
Complexity control. Let us denote vector (power, c)T
as a complexity, power is the power of the polynomial
and c is the number of terms. The power of the new
model is controlled by the condition that if, for example, g(wi,
wj, wl)=a·wi+b·wj·wl,
then power(g(wi,wj,wl)) = max(power(wi),
power(wj) + power(wl)), where power()
designates the power of the polynomial. It gives us the possibility to restrict
the class of the models under consideration by power(wi) < p
and to search models among the polynomials with power less than p.
The maximum complexity is defined by the user or can be automatically selected
using a full cross-validation method.
-
Twice-hierarchical neural net structure.Twice-hierarchical
neural net structure is important feature of PNN. One of the problem is that
power of polynomials increases too fast in the traditional GMDH algorithm. At
the step r of iteration procedure one can have models of power r+1, Wr€Pr+1.
The control of complexity gives us an opportunity to implement the iteration
procedure without an increase of the power of polynomials or/and the number of
terms. External iterative procedure controls the complexity, i.e. the number of
the terms and the power of the polynomials in the intermediate models. The best
models form initial set for the next iterative procedure. This procedure
realizes a wide search without the complexity increase. Besides that the
twice-hierarchical neural net structure provides the convergence of the
coefficients. The models that are calculated as a result of several
transformations have the coefficients that are close to the appropriate
regression coefficients.
-
Robust estimation. To use algorithm in the presence of large
errors (outliers) we have developed the PNN algorithm for robust nonlinear
(M-regression) model identification. This made possible to improve stability of
PNN algorithm.
The performance of PNN is demonstrated using examples
of artificial and real data sets. A first set was generated from a nonlinear
model of fourth power y = x1·x53+10
Random noise and three further random variables x2 -
x4 were added and 14 observations were generated (n=14;
m=5). (Example 1)
The comparison of RPNN algorithm was done with the GN (Yu.P. Yurachkovsky,
1981) - one of the best GMDH-type Neural Network algorithm. .
Table 1. Values Calculated by GMDH(GN) and RPNN Methods for
the Training and Test Sets
| without errors |
with 2 errors |
with 3 errors |
| Y |
GN |
RPNN |
Y |
GN |
RPNN |
Y |
GN |
RPNN |
| Training set |
| 38 |
36 |
37 |
38 |
5 |
34 |
38 |
-73 |
27 |
| 26 |
27 |
26 |
26 |
-141 |
24 |
26 |
29 |
22 |
| 520 |
516 |
521 |
1520 |
1653 |
515 |
1520 |
1457 |
519 |
| 200 |
102 |
201 |
200 |
249 |
193 |
200 |
272 |
202 |
| 139 |
138 |
138 |
139 |
227 |
131 |
139 |
169 |
129 |
| 2571 |
2568 |
2571 |
2571 |
2510 |
2572 |
571 |
2810 |
2571 |
| 1725 |
1724 |
1725 |
1725 |
1775 |
1729 |
1725 |
1629 |
1723 |
| 2754 |
2755 |
2753 |
2754 |
2708 |
2762 |
2754 |
2210 |
2762 |
| 1522 |
1522 |
1521 |
1522 |
1534 |
1523 |
522 |
1054 |
1524 |
| 1457 |
1462 |
1468 |
2457 |
2319 |
1465 |
2457 |
2540 |
1466 |
| 1723 |
1724 |
1725 |
1723 |
1775 |
1723 |
1723 |
1888 |
1723 |
| Test set |
| 2929 |
2923 |
2927 |
2929 |
2930 |
2929 |
2929 |
3477 |
2929 |
| 658 |
656 |
657 |
658 |
931 |
657 |
658 |
460 |
655 |
Twelve and two observations were used as training and test sets,
respectively. The model with excellent approximation and low prediction error
were calculated by GMDH(GN): y = 11.6 - 0.13·x52
+ 1.0·x1·x53, RMSE=6.14
for training set and RMSE=3.25 for the test set
(Table 1). Nevertheless RPNN improved the results: found the exact model
structure y = 10.16 - 0.05·x4·x5 +
1.0·x1·x53 and
provided lower prediction error RMSE=1.65 for
training set RMSE=1.21 for the test set. In order to
study stability of GMDH(GN) and RPNN algorithms, the values of two and three
data cases from the training set were changed to be large errors of the initial
model. Results of the GN algorithm were affected by the outliers (Table 1, 2).
This method provided low generalization ability for the test set. c (Table 1,
2).
Table 2. RMSE for GMDH(GN) and RPNN Methods for Test
Sets
| |
no errors |
1 error |
2 errors |
3 errors |
| GN |
3.25 |
139.3 |
136.5 |
291.4 |
| RPNN |
1.2 |
1.7 |
0.61 |
1.42 |
Application in environmental and toxicological studies
Prediction of the sublimation enthalpy
To study the stability and robustness of RPNN, we developed QSAR
models for the sublimation enthalpy of a series of 18 polychlorinated
hydrocarbons (PCBs). This work was elaborated in collaboration with Prof.
William J. Welsh, Department of Pharmacology Robert Wood Johnson Medical Scool,
GenChemics, and Dr. Igor Tetko.
What is the Quantative-Structure Activity Relationship Study? QSAR
studies represent an important part of the drug or new materials design process
and are used to reveal relationships between chemical structure of compounds
and their biological activities, toxicity or others. Various molecules
characteristics are used as input variables (Electron-Topological Indices,
Molecular Shape, Molecular Size, Charge Distribution, Molecular Fragments,
CoMFA Descriptors) the biological activity or toxicity is an output.
The atom and bond-type electrotopological state (E-state) indices (8
parameters) were used as inputs parameters for the Multiple Linear Regression
Analysis (MLRA), GMDH(GN) and RPNN. E-state indices introduced by Kier and Hall
(1990) combine electronic and topological characteristics of the molecules and
describe atoms (atom-type indices) and bonds (bond-type indices).
The first analysis included 16 PCBs in the training and 2 molecules in
the test set (Example 2).
Statistically significant models with low prediction error of the tests
molecules RMSE=1.06 and 0.93
were calculated by the MLRA and the RPNN methods, respectively. GN calculated
lower result RMSE=3.1. However, PNN and RPNN selected
non-linear models of the same structure y = -0.036·x72
+ 133·x4 - 5.9·x1 and
y = -0.023·x72 + 132·x4 -
5.9·x1. MLRA significantly decreased
its prediction ability when number of molecules in the training set dropped
below 12. On the contrary, the RPNN results were not affected by the number of
molecules in the training set. Even if the number of molecules in this set
decreased from 16 to 7 molecules, the RPNN still calculated the same regression
equations with only slight variations of the regression coefficients. For
example, the equation y = -0.041·x72 +
134·x4 - 5.9·x1 was
calculated with 7 molecules in the training set. This model provided RMSE=4.3
for 11 molecules in the test set (Figure 1,2,3). These results indicated a
stability of RPNN method. The results calculated by GN method were lower
compared to RPNN but much better compared to MLRA method.
| RPNN |
MLRA |
 |
 |
| 16 molecules in the training and 2 molecules in the test sets |
| |
|
| RPNN |
MLRA |
 |
 |
| 14 molecules for training sets and 4 molecules for test |
| |
|
| RPNN |
MLRA |
 |
|
| 7 molecules for training sets and 11 molecules for test |
Figure 1,2,3. Prediction of the sublimation enthalpy of molecules by the MLRA
and RPNN methods.
Figure
4. Root Mean Squared error calculated for the test set molecules as a function
of the number of molecules in the training set. MLRA failed to provide
significant models if less than nine molecules were used in the training set.
The same data set included 16 PCBs was used to study the robustness of
RPNN to outliers. The values of two data cases from the training set were
changed to be large errors - line 3: 182.12 instead of 82.12 and line 16: 18.33
instead of 108.33. RPNN results for both train and test sets were practically
the same as for the noiseless data (Figure 5,6).

Figure
5,6.
Because nonlinearity of dependencies between parameters and responses
in QSAR study and "dirty" data sets by presence of outliers RPNN represents a
promising tools for applications in drug design, environmental and
toxicological studies.
Reference:
-
Yurachkovsky, Y.P. 1981. "Restoration of Polynomial Dependencies Using
Self-Organization." Soviet Automatic Control, 14, 17-22.
-
T.I Aksenova, I.V. Tetko, V.V. Volkovich, GMDH-type Neural Network in
Quantative-Structure Activity Relationship Studis on the Internet, Modelling
and simulation 2001(ESM2001), 15th European Simulation Multiconference, Prague,
pp.685-689
|