         Articles ::

# Polynomial Neural Network (PNN) algorithm description

Let us consider basic principles of Polynomial Neural Network (PNN) algorithm organization. The proposed method can be used to analyze complex data sets with the aim to determine internal data relationships and to present knowledge about these relationships in the form of mathematical description (polynomial regression).

As an examples of possible application area of PNN algorithm one can consider any sphere where sets of observation data should be analyzed and data relationships models should be build. There are, for example, chemistry (QSAR), economical systems analyses, stocks and financial market instruments, insurance risks study, medical diagnostics, etc.

To work with the algorithm one should formalize the data domain to have number of observed quantitative parameters X={x1, x2, ..., xm} (the set of input variables). Then it should be set of observations with "already known" result, that is n observations of vector X (input) values and y (output). The purpose of the algorithms is to find a subset of variables {xi1, xi2, ..., xik} and a model y = fi (xi1, xi2, ..., xik) in the set of polynomials of s-power that minimizes some criterion value (CR).

Afterbuilding the model it can be used to calculate "unknown" result y on new observed vectors X. The algorithms are implemented as a GMDH-type Neural Network. First layer generates the models y = g(xi , xj , xk), where where xi, xj, xk input variables. Next layers generate models as y = g(wi , wj ,wk), where wi , wj ,wk are the models of previous layers.

The main objective is to create a stable algorithm for nonlinear model reconstruction such as that its results are easily interpreted by users.

The main features of the algorithm can be summarized as follows:
1. Fast learning. The transforms with two coefficients only are used, for example g(wi, wj) = a·wi+b·wj in the linear case. Irrespectively of the power of resulting model and the number of terms the second order matrices are only inverted. This provides fast learning of the algorithm.
2. Results in the parametric form. The polynomial structures are coded using vector of simple numbers [1,2] that provides the presentation of the results in the parametric form of nonlinear equation.
3. Complexity control. Let us denote vector (power, c)T as a complexity, power is the power of the polynomial and c is the number of terms. The power of the new model is controlled by the condition that if, for example, g(wi, wj, wl)=a·wi+b·wj·wl, then power(g(wi,wj,wl)) = max(power(wi), power(wj) + power(wl)), where power() designates the power of the polynomial. It gives us the possibility to restrict the class of the models under consideration by power(wi) < p and to search models among the polynomials with power less than p.
The maximum complexity is defined by the user or can be automatically selected using a full cross-validation method.
4. Twice-hierarchical neural net structure.Twice-hierarchical neural net structure is important feature of PNN. One of the problem is that power of polynomials increases too fast in the traditional GMDH algorithm. At the step r of iteration procedure one can have models of power r+1, Wr€Pr+1. The control of complexity gives us an opportunity to implement the iteration procedure without an increase of the power of polynomials or/and the number of terms. External iterative procedure controls the complexity, i.e. the number of the terms and the power of the polynomials in the intermediate models. The best models form initial set for the next iterative procedure. This procedure realizes a wide search without the complexity increase. Besides that the twice-hierarchical neural net structure provides the convergence of the coefficients. The models that are calculated as a result of several transformations have the coefficients that are close to the appropriate regression coefficients.
5. Robust estimation. To use algorithm in the presence of large errors (outliers) we have developed the PNN algorithm for robust nonlinear (M-regression) model identification. This made possible to improve stability of PNN algorithm.

The performance of PNN is demonstrated using examples of artificial and real data sets. A first set was generated from a nonlinear model of fourth power y = x1·x53+10 Random noise and three further random variables x2 - x4 were added and 14 observations were generated (n=14; m=5). (Example 1)
The comparison of RPNN algorithm was done with the GN (Yu.P. Yurachkovsky, 1981) - one of the best GMDH-type Neural Network algorithm. .

Table 1. Values Calculated by GMDH(GN) and RPNN Methods for the Training and Test Sets

 without errors with 2 errors with 3 errors Y GN RPNN Y GN RPNN Y GN RPNN Training set 38 36 37 38 5 34 38 -73 27 26 27 26 26 -141 24 26 29 22 520 516 521 1520 1653 515 1520 1457 519 200 102 201 200 249 193 200 272 202 139 138 138 139 227 131 139 169 129 2571 2568 2571 2571 2510 2572 571 2810 2571 1725 1724 1725 1725 1775 1729 1725 1629 1723 2754 2755 2753 2754 2708 2762 2754 2210 2762 1522 1522 1521 1522 1534 1523 522 1054 1524 1457 1462 1468 2457 2319 1465 2457 2540 1466 1723 1724 1725 1723 1775 1723 1723 1888 1723 Test set 2929 2923 2927 2929 2930 2929 2929 3477 2929 658 656 657 658 931 657 658 460 655

Twelve and two observations were used as training and test sets, respectively. The model with excellent approximation and low prediction error were calculated by GMDH(GN): y = 11.6 - 0.13·x52 + 1.0·x1·x53, RMSE=6.14 for training set and RMSE=3.25 for the test set (Table 1). Nevertheless RPNN improved the results: found the exact model structure y = 10.16 - 0.05·x4·x5 + 1.0·x1·x53 and provided lower prediction error RMSE=1.65 for training set RMSE=1.21 for the test set. In order to study stability of GMDH(GN) and RPNN algorithms, the values of two and three data cases from the training set were changed to be large errors of the initial model. Results of the GN algorithm were affected by the outliers (Table 1, 2). This method provided low generalization ability for the test set. c (Table 1, 2).

Table 2. RMSE for GMDH(GN) and RPNN Methods for Test Sets

 no errors 1 error 2 errors 3 errors GN 3.25 139.3 136.5 291.4 RPNN 1.2 1.7 0.61 1.42

Application in environmental and toxicological studies

Prediction of the sublimation enthalpy

To study the stability and robustness of RPNN, we developed QSAR models for the sublimation enthalpy of a series of 18 polychlorinated hydrocarbons (PCBs). This work was elaborated in collaboration with Prof. William J. Welsh, Department of Pharmacology Robert Wood Johnson Medical Scool, GenChemics, and Dr. Igor Tetko.

What is the Quantative-Structure Activity Relationship Study? QSAR studies represent an important part of the drug or new materials design process and are used to reveal relationships between chemical structure of compounds and their biological activities, toxicity or others. Various molecules characteristics are used as input variables (Electron-Topological Indices, Molecular Shape, Molecular Size, Charge Distribution, Molecular Fragments, CoMFA Descriptors) the biological activity or toxicity is an output.

The atom and bond-type electrotopological state (E-state) indices (8 parameters) were used as inputs parameters for the Multiple Linear Regression Analysis (MLRA), GMDH(GN) and RPNN. E-state indices introduced by Kier and Hall (1990) combine electronic and topological characteristics of the molecules and describe atoms (atom-type indices) and bonds (bond-type indices).

The first analysis included 16 PCBs in the training and 2 molecules in the test set (Example 2). Statistically significant models with low prediction error of the tests molecules RMSE=1.06 and 0.93 were calculated by the MLRA and the RPNN methods, respectively. GN calculated lower result RMSE=3.1. However, PNN and RPNN selected non-linear models of the same structure y = -0.036·x72 + 133·x4 - 5.9·x1 and y = -0.023·x72 + 132·x4 - 5.9·x1. MLRA significantly decreased its prediction ability when number of molecules in the training set dropped below 12. On the contrary, the RPNN results were not affected by the number of molecules in the training set. Even if the number of molecules in this set decreased from 16 to 7 molecules, the RPNN still calculated the same regression equations with only slight variations of the regression coefficients. For example, the equation y = -0.041·x72 + 134·x4 - 5.9·x1 was calculated with 7 molecules in the training set. This model provided RMSE=4.3 for 11 molecules in the test set (Figure 1,2,3). These results indicated a stability of RPNN method. The results calculated by GN method were lower compared to RPNN but much better compared to MLRA method.

 RPNN MLRA  16 molecules in the training and 2 molecules in the test sets RPNN MLRA  14 molecules for training sets and 4 molecules for test RPNN MLRA 7 molecules for training sets and 11 molecules for test
Figure 1,2,3. Prediction of the sublimation enthalpy of molecules by the MLRA and RPNN methods. Figure 4. Root Mean Squared error calculated for the test set molecules as a function of the number of molecules in the training set. MLRA failed to provide significant models if less than nine molecules were used in the training set.

The same data set included 16 PCBs was used to study the robustness of RPNN to outliers. The values of two data cases from the training set were changed to be large errors - line 3: 182.12 instead of 82.12 and line 16: 18.33 instead of 108.33. RPNN results for both train and test sets were practically the same as for the noiseless data (Figure 5,6).  Figure 5,6.

Because nonlinearity of dependencies between parameters and responses in QSAR study and "dirty" data sets by presence of outliers RPNN represents a promising tools for applications in drug design, environmental and toxicological studies.

Reference:

1. Yurachkovsky, Y.P. 1981. "Restoration of Polynomial Dependencies Using Self-Organization." Soviet Automatic Control, 14, 17-22.
2. T.I Aksenova, I.V. Tetko, V.V. Volkovich, GMDH-type Neural Network in Quantative-Structure Activity Relationship Studis on the Internet, Modelling and simulation 2001(ESM2001), 15th European Simulation Multiconference, Prague, pp.685-689    Copyright © 2002-2009, PNN Ltd. All rights reserved.