GMDH-T-ANN - GMDH Type Artificial Neural Network
Introduction
This page describes how to use the gmdh-t-ann program available in the openGMDH repositories. The implemented algorithm is based on the paper “Modeling and prediction using GMDH networks of Adalines with nonlinear preprocessors” [1]. It was later extended to support the Combinatorial algorithm, as described in “Self-Organization of Neural Networks with Active Neurons” [2].
The program consists of a set of functions for creating and training the networks. Some of the functions are called by users, while the majority form the internals of the program. This page describes how to apply the algorithms, so it focuses on the functions that users actually call.
The gmdh-t-ann program was developed in the MATLAB [3] environment. The code was also found to work in GNU Octave, although it has not been extensively tested there. The examples presented on this page and the ones available in the repositories were tested in Octave.
This program is being developed as part of a doctoral project and is still a work in progress. Suggestions for improvements and bug reports are welcome and should be sent to the author.
Styles of networks available
It is possible to create two main styles of networks, named here as “default” and “combi”.
Default ANNs have two inputs per active neuron. Each layer is initially built with one neuron for every combination of its inputs taken two at a time, that is, n(n-1)/2 neurons for n inputs. Thus, at least two inputs are needed to form one neuron.
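As a quick check of this neuron count, the following sketch enumerates the input pairs of a hypothetical four-input layer with the built-in nchoosek function. The variable names are illustrative and are not part of gmdh-t-ann.

```octave
% Illustrative only: candidate neurons of one "default" layer with 4 inputs.
inputCount = 4;                       % hypothetical number of layer inputs
pairs = nchoosek(1:inputCount, 2);    % each row: the two input indexes of one neuron
neuronCount = size(pairs, 1);         % equals inputCount*(inputCount-1)/2

disp(pairs);
fprintf('%d candidate neurons\n', neuronCount);
```

Each row of pairs corresponds to one candidate neuron before any selection takes place.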
Combi networks have neurons with one input in the first layer, two inputs in the second layer, three in the third layer and so on. The number of neurons initially created in each layer during the training process is given by the number of combinations of the layer inputs taken in groups of l, where l is the number of the layer. For example, if the second layer of a network yields six outputs after training, the third layer will have C(6,3) = 20 neurons before training. To work around the problems caused by large network structures during training, it is possible to define a limit for the number of inputs allowed per neuron. Once this limit is reached in one layer, the neurons in all subsequently created layers will have that number of inputs.
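The effect of the input limit on the layer size can be illustrated with a few lines of arithmetic. This is only a sketch: maxInputs is a hypothetical name for the limit, since this page does not state how the limit is configured.

```octave
% Illustrative only: candidate-neuron counts for one "combi" layer.
layerInputs = 6;   % six outputs kept from the previous layer
layerNumber = 3;   % third layer, so three inputs per neuron
maxInputs   = 2;   % hypothetical limit on the inputs per neuron

% Without a limit: combinations of the inputs taken layerNumber at a time.
unlimited = nchoosek(layerInputs, layerNumber);

% With a limit: the group size stops growing once the limit is reached.
limited = nchoosek(layerInputs, min(layerNumber, maxInputs));

fprintf('without limit: %d neurons, with limit: %d neurons\n', unlimited, limited);
```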
Training process
The training process currently implemented is based on [1]. Nevertheless, variations on this process occur depending on the parameters provided by the user.
An overview of the training process is presented below:
1. Create the neurons in the first layer according to the number of inputs of the network and to the chosen algorithm (“default” or “combi”).
2. Define the weights of the neurons using part of the input/output data, called the training set.
3. Calculate the performance values for the neurons using another part of the data, the selection set.
4. Remove the worst-performing neurons, according to the selection criterion defined.
5. If it is possible, create another layer using the outputs of the current layer as inputs and go to step 2.
6. If it is not possible to add layers, interrupt the process and trim the network, keeping the best-performing neuron in the last layer and removing any neuron that does not contribute to its output.
The exact behavior of the training process depends on the configuration parameters given to the functions described below.
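The fitting and selection steps above can be sketched for a single "default" layer as follows. This is not the gmdh-t-ann implementation: it assumes that each neuron is the usual GMDH quadratic polynomial of two inputs, fits the weights by ordinary least squares, and uses made-up data and variable names.

```octave
% Sketch of one fit/select pass for a single layer (not the gmdh-t-ann code).
% Assumed neuron: y = w1 + w2*a + w3*b + w4*a*b + w5*a^2 + w6*b^2
N = 40;
k = (1:N)';
X = [mod(7*k, 11)/11, mod(5*k, 13)/13, mod(3*k, 17)/17];  % made-up inputs
t = X(:, 1).*X(:, 2) + 0.5*X(:, 3);                       % made-up targets

% Split the data into a training set and a selection set.
Xtr = X(1:2:N, :);  ttr = t(1:2:N);
Xse = X(2:2:N, :);  tse = t(2:2:N);

terms = @(a, b) [ones(size(a)), a, b, a.*b, a.^2, b.^2];

% Create one neuron per pair of inputs.
pairs = nchoosek(1:size(X, 2), 2);
neuronCount = size(pairs, 1);
W = zeros(6, neuronCount);
mse = zeros(1, neuronCount);

for q = 1:neuronCount
  a = pairs(q, 1);
  b = pairs(q, 2);
  % Weights from the training set (least squares).
  W(:, q) = terms(Xtr(:, a), Xtr(:, b)) \ ttr;
  % Performance from the selection set.
  residual = terms(Xse(:, a), Xse(:, b)) * W(:, q) - tse;
  mse(q) = mean(residual.^2);
end

% Keep the neurons whose MSE is not above the median.
keep = mse <= median(mse);
survivors = pairs(keep, :);
disp(survivors);
```

The outputs of the surviving neurons would then serve as the inputs of the next layer, and the pass would be repeated.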
Functions reference
The following functions are used to create, train, simulate and visualize the GMDH networks:
gmdhNew
function gmdhNet = gmdhNew(inputCount, ...)
Creates a new GMDH network structure. With no parameters other than the input count, the returned network follows the default PNN structure: each layer has n!/(2(n-2)!) elements with two inputs each, where n is the number of system inputs for the input layer, or the number of elements of the previous layer for hidden layers.
With the option 'networkStyle' set to the value 'Combi', the Combi algorithm is used to model and train the network. In this style, the elements in the first layer have one input and the number of elements is equal to the number of inputs to the network; also, the elements are defined by 1st order expressions. Starting from the 2nd layer, all neurons will have 2nd order expressions. Nevertheless, in the 2nd layer, the elements will have 2 inputs, in the 3rd layer, 3 inputs, and so on. The parameters accepted by the function are:
parameter           description
------------------  -------------------------------------------------
forwardInputs       If set to true, the training algorithm will
                    forward the network inputs and the outputs of all
                    previous layers as inputs to the new layers. After
                    the training is finished, the selection process
                    will remove neurons and forwarders which do not
                    contribute to the output of the network.
layerTrainFunction  Specifies the layer training algorithm. Currently,
                    the available value is 'gmdhTrainMeanSquareLayer'.
networkStyle        Determines whether the network follows the
                    'Default' or the 'Combi' algorithm. If 'Combi' is
                    chosen, the parameter forwardInputs is
                    automatically set to true.
selectMethod        Defines the method for excluding the poorly
                    performing neurons in each layer during
                    training. Two methods are available:
                    - 'selectMedian' (default): exclude neurons which
                      produced an MSE above the median of all MSEs in
                      the layer.
                    - 'selectBest': keep the N best neurons, where N
                      is defined by the 'maxLayerNeurons' train
                      parameter.
                    Both methods are influenced by the train parameter
                    property 'gmdhNet.trainParams.maxLayerNeurons',
                    which determines the maximum number of neurons
                    after the selection process. With 'selectMedian',
                    if the number of selected neurons is greater than
                    maxLayerNeurons and maxLayerNeurons is greater
                    than zero, the worst-performing neurons are
                    removed. With 'selectBest', the only criterion
                    applied is the order of performance.
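The difference between the two selection methods can be seen on a small set of made-up MSE values. The sketch below only mirrors the rules described in the table. It is not the code used by gmdh-t-ann, and the variable names are illustrative.

```octave
% Sketch of the two selection rules on made-up per-neuron MSE values.
mse = [0.40 0.10 0.25 0.90 0.15 0.60];
maxLayerNeurons = 4;   % treated as "no limit" for the median rule when zero

% 'selectMedian': exclude the neurons whose MSE is above the median ...
medianKeep = find(mse <= median(mse));
% ... and, if too many remain, keep only the best maxLayerNeurons of them.
if maxLayerNeurons > 0 && numel(medianKeep) > maxLayerNeurons
  [~, order] = sort(mse(medianKeep));
  medianKeep = medianKeep(order(1:maxLayerNeurons));
end

% 'selectBest': keep the maxLayerNeurons best neurons, by order of performance.
[~, order] = sort(mse);
bestKeep = order(1:min(maxLayerNeurons, numel(mse)));

disp(medianKeep);   % neurons kept by the median rule
disp(bestKeep);     % neurons kept by the best-N rule
```

With these values the median rule keeps three neurons, fewer than the limit allows, while the best-N rule always fills the limit.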
gmdhTrain
function outval = gmdhTrain(gmdhNet, trainSamples, trainTargets, selectSamples, selectTargets)
Trains the networks created with gmdhNew. The parameters are:
gmdhNet        The network created with gmdhNew.
trainSamples   The input samples used to define the weights of the neurons being trained. An m-by-n matrix where each row is one sample with n inputs.
trainTargets   The desired training targets for the network. An m-by-1 vector, where each value is the target for the corresponding sample in trainSamples.
selectSamples  The input samples used to select among the trained neurons. A q-by-n matrix where each row is one sample with n inputs.
selectTargets  The desired targets for the selection step. A q-by-1 vector, where each value is the target for the corresponding sample in selectSamples.
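The four data arguments can be prepared by splitting one data set by rows, for example as below. The data, the 70/30 split ratio and the names X, t and m are assumptions made for illustration. Only the matrix shapes follow the parameter descriptions above.

```octave
% Illustrative data split into a training set and a selection set.
total = 50;
k = (1:total)';
X = [mod(7*k, 11)/11, mod(5*k, 13)/13, mod(3*k, 17)/17];  % total-by-n, n = 3
t = X(:, 1) + X(:, 2).*X(:, 3);                           % total-by-1

m = round(0.7*total);              % number of training samples

trainSamples  = X(1:m, :);         % m-by-n
trainTargets  = t(1:m);            % m-by-1
selectSamples = X(m+1:end, :);     % q-by-n
selectTargets = t(m+1:end);        % q-by-1

fprintf('m = %d, q = %d, n = %d\n', m, size(selectSamples, 1), size(X, 2));
```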
gmdhSimNet
function netOutputs = gmdhSimNet(gmdhNet, samples)
Simulates a trained network. The function returns the outputs of the network simulated with the samples parameter. The parameters are:
gmdhNet  The network created with gmdhNew and trained with gmdhTrain.
samples  The input samples for the simulation. An s-by-n matrix where each row is one input sample with n inputs.
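A possible end-to-end session combining gmdhNew, gmdhTrain and gmdhSimNet is sketched below. It rests on two assumptions that this page does not confirm: that the value returned by gmdhTrain is the trained network, and that gmdhNew takes its options as name/value pairs, as gmdhPrintNet does. The data are made up, and the calls are skipped when gmdh-t-ann is not on the path.

```octave
% End-to-end sketch with made-up data: 30 training, 10 selection, 10 test rows.
k = (1:50)';
X = [mod(7*k, 11)/11, mod(5*k, 13)/13, mod(3*k, 17)/17];
t = X(:, 1) + X(:, 2).*X(:, 3);

if exist('gmdhNew', 'file') == 2
  net = gmdhNew(size(X, 2));
  % Assumed option syntax for the other style:
  % net = gmdhNew(size(X, 2), 'networkStyle', 'Combi');

  % ASSUMPTION: the returned value is the trained network.
  net = gmdhTrain(net, X(1:30, :), t(1:30), X(31:40, :), t(31:40));

  % Simulate on rows that were used neither for training nor for selection.
  y = gmdhSimNet(net, X(41:50, :));
  fprintf('test MSE: %g\n', mean((y(:) - t(41:50)).^2));
else
  disp('gmdh-t-ann is not on the path; skipping the example.');
end
```

Keeping a third block of rows out of both training and selection gives an error estimate that did not influence the network structure.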
gmdhPrintNet
function gmdhPrintNet(gmdhNet, options)
Prints a GMDH network structure to the screen or to a file.
Called with only one argument, e.g., gmdhPrintNet(net_structure), the function directs the output to the screen. To print to a file, a file name must be provided using the 'fileName' option.
Each layer in the network is printed as a column. The neurons in each layer are aligned at the top of the printout. By default, the function prints just the input indexes.
Example:
>> net = gmdhNew(4);
>> net = gmdhAddLayer(net);
>> gmdhPrintNet(net);
L1   L2
(04) (01 02)
(01) (01 03)
(02) (01 04)
(03) (02 03)
     (02 04)
     (03 04)
The output can be modified with the following options:
option          value             example
--------------  ----------------  --------------------------------------------
'printWeights'  true | false      gmdhPrintNet(net, 'printWeights', true);
'inputFormat'   format string     gmdhPrintNet(net, 'inputFormat', '%03d');
'weightFormat'  format string     gmdhPrintNet(net, 'weightFormat', '%08.2f');
'fileName'      file name string  gmdhPrintNet(net, 'fileName', 'network.txt');
Option 'printWeights' includes the weights of the neurons in the layer columns.
Options 'inputFormat' and 'weightFormat' change the default masks used for printing the input indexes and the weights of the networks. In most cases, the default values produce an organized output. If the trained network has too many inputs or, for any other reason, gmdhPrintNet produces misaligned output, the masks can be changed to realign it.
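The masks in the table are printf-style format strings. Assuming gmdhPrintNet passes them to a printf-style formatter, their effect can be previewed with sprintf before printing a whole network. The values below are made up.

```octave
% Preview of the two example masks on made-up values.
inputIndex = 7;
weight = 3.14159;

indexText  = sprintf('%03d', inputIndex);   % zero-padded to 3 digits
weightText = sprintf('%08.2f', weight);     % width 8, 2 decimals, zero-padded

disp(indexText);    % 007
disp(weightText);   % 00003.14
```

A wider field in the mask leaves more room per column, which helps when the indexes or weights have more digits than the default masks expect.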
Option 'fileName' routes the output of the function to the specified file. The content of the file will be an ASCII representation of the network. If the file already exists, all its previous contents will be lost.
Download and Examples
The program is available in the gmdh-t-ann directory of the openGMDH repository.
Usage examples are available in the examples directory under the gmdh-t-ann folder in the repository.
