Neural Network

ECLT5810 E-Commerce Data Mining Techniques
SAS Enterprise Miner – Neural Network

A Neural Network is a set of connected input/output units where each connection has a weight associated with it. During the learning phase, the network learns by adjusting the weights so as to be able to predict the correct class label of the input samples.
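As a rough illustration of that learning phase (outside SAS, with made-up numbers), the following Python sketch performs a single gradient-style weight adjustment for one neuron on one training sample; the sigmoid output and learning rate are assumptions for the example, not the exact rule Enterprise Miner applies.

import numpy as np

x = np.array([0.5, -1.0, 2.0])      # one input sample
target = 1.0                        # its correct class label (coded 0/1)
w = np.zeros(3)                     # connection weights
learning_rate = 0.1

prediction = 1.0 / (1.0 + np.exp(-w @ x))            # sigmoid output of the unit
w = w + learning_rate * (target - prediction) * x    # adjust weights toward the target
print(w)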

A. Neural Network Node

By default, the Neural Network node automatically constructs a multilayer feed-forward network that has one hidden layer consisting of three neurons. In general, each input is fully connected to the first hidden layer, each hidden layer is fully connected to the next hidden layer, and the last hidden layer is fully connected to the output. The Neural Network node supports many variations of this general form.
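As a rough illustration of this default topology (outside SAS; the layer sizes and the tanh hidden activation are assumptions, not Enterprise Miner's exact functions), a fully connected feed-forward network with one hidden layer of three neurons can be sketched in a few lines of Python:

import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 5, 3, 1      # three hidden neurons, as in the default

W1 = rng.normal(size=(n_inputs, n_hidden))   # every input connected to every hidden unit
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_outputs))  # every hidden unit connected to the output
b2 = np.zeros(n_outputs)

def forward(x):
    # One forward pass: nonlinear hidden layer, then the output layer.
    hidden = np.tanh(x @ W1 + b1)
    return hidden @ W2 + b2

x = rng.normal(size=n_inputs)                # one example input row
print(forward(x))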

Basically, the Neural Network node has the following tabs:
o Data Tab
o Variables Tab
o General Tab
o Basic Tab
o Advanced Tab
o Output Tab

I. Data Tab
The Data tab consists of two subtabs:
o General -- sets the predecessor data sets for training, validation, testing, and scoring. The different data sets are used as follows (see the sketch after this list):
  1. The Training data set is used to train (estimate) the network weights.
  2. The Validation data set is used to choose the network that produces the lowest unbiased average error.
  3. The Test data set is used to obtain an unbiased estimate of generalization error after you have chosen a network based on the validation error.
o Options -- specifies whether or not to use a sample of the DMDB for preliminary training, and specifies the maximum number of rows in the training data set to permit interactive training.
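To see the roles of the three data sets side by side, here is a hedged, non-SAS sketch in Python; the 60/20/20 split, the sine data, and the use of polynomial fits as stand-ins for candidate networks are all assumptions made purely for illustration.

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 300)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=300)

idx = rng.permutation(300)
tr, va, te = idx[:180], idx[180:240], idx[240:]      # hypothetical 60/20/20 split

def fit_candidate(degree):
    # Stand-in for "training a network": estimate parameters on the training set.
    return np.polyfit(x[tr], y[tr], degree)

def mse(coefs, rows):
    return np.mean((np.polyval(coefs, x[rows]) - y[rows]) ** 2)

candidates = {d: fit_candidate(d) for d in (1, 3, 5, 9)}
best = min(candidates, key=lambda d: mse(candidates[d], va))            # validation chooses
print("chosen model:", best, "test error:", mse(candidates[best], te))  # test reports

The training set estimates the parameters, the validation set picks among the fitted candidates, and the test set is touched only once, to report the error of the final choice.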

II. Variables Tab
Similar to other nodes, the Variables tab can be used to view the data set variables, their model roles, measurements, types, formats, and labels, and also to set their status and edit the target profile.

III. General Tab

In the General tab, we can
o specify the model selection criteria
o set the user interface
o specify whether or not to accumulate training history
o specify how you want to monitor the training process.

I. Model Selection Criteria (a small numeric sketch of the first two criteria follows this list)
o Average Error -- chooses the model that has the smallest average error for the validation data set.
o Misclassification Rate -- chooses the model that has the smallest misclassification rate for the validation data set.
o Profit/Loss (Default) -- If you defined two or more decisions in the profit or loss matrix, then the node chooses the model that maximizes the profit or minimizes the loss for the cases in the validation data set. If a validation data set is not available, then the node uses the training data set. If the decision matrix contains fewer than two decisions, then the profit or loss information is not considered in the model selection process; in that case, the average error criterion is actually used for model selection. To use the profit/loss criterion, you must define a profit or loss matrix in the target profile for the target. The profit or loss values are adjusted for the prior probabilities that you specify in the prior vector of the target profile. To learn more about defining a target profile, read the Target Profiler chapter.
II. Advanced user interface
III. Accumulating Training History
IV. Monitoring the Training Process
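To make the first two criteria concrete, here is a hedged, non-SAS Python sketch with made-up validation values; it treats "average error" as the average squared difference between the posterior probability and the 0/1 target, which is an assumption for illustration rather than the exact Enterprise Miner formula.

import numpy as np

y_valid = np.array([1, 0, 1, 1, 0, 0, 1, 0])                   # actual class labels
p_valid = np.array([0.8, 0.3, 0.4, 0.9, 0.2, 0.6, 0.7, 0.1])   # model posteriors

average_error = np.mean((p_valid - y_valid) ** 2)              # smaller is better
misclassification_rate = np.mean((p_valid >= 0.5) != y_valid)  # smaller is better
print(average_error, misclassification_rate)

The node computes these statistics on the validation data set for each candidate network and keeps the model that minimizes the selected criterion.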

IV. Basic Tab

Configurations of the Basic tab include:
o Network architecture -- the default is multilayer perceptron with no direct connections, and the number of hidden neurons is data dependent.
o Preliminary runs -- the default is none.
o Training technique -- the default depends on the number of weights applied during execution.
o Runtime limit -- the default is 2 hours.

Set Network architecture

o Hidden neurons -- If you select the number of hidden neurons based on the noise in the data (any of the first four items below), the number of neurons is determined at run time, based on the total number of input levels, the total number of target levels, and the number of training data rows, in addition to the noise level.
  • High noise data
  • Moderate noise data
  • Low noise data
  • Noiseless data
  • Set number

o Direct connections -- By default, the network does not include direct connections. In this case, each input unit is connected to each hidden unit and each hidden unit is connected to each output unit. If you set the Direct connections value to Yes, then each input unit is also connected to each output unit. Direct connections define linear layers, whereas hidden neurons define nonlinear layers (see the sketch after this list).

o Network architecture
  • Generalized Linear Model
  • Multilayer Perceptron -- the default.
  • Ordinary RBF-Eq. Widths -- ordinary radial basis function with equal widths.
  • Ordinary RBF-Uneq. Widths -- ordinary radial basis function with unequal widths.
  • Norm. RBF-Eq. Heights -- normalized radial basis function with equal heights.
  • Norm. RBF-Eq. Volumes -- normalized radial basis function with equal volumes.
  • Norm. RBF-Eq. Widths -- normalized radial basis function with equal widths.
  • Norm. RBF-Eq. Widths and Heights -- normalized radial basis function with equal widths and heights.
  • Norm. RBF-Uneq. Widths and Heights -- normalized radial basis function with unequal widths and heights.
  All of the normalized radial basis functions are grayed out if the number of hidden neurons is set to 1.

Depending on the complexity of the data and the underlying network architecture, neural networks can take a long time to converge.
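The contrast between direct connections (linear layers) and hidden neurons (nonlinear layers) can be sketched outside SAS as follows; the layer sizes, tanh activation, and weight initialization are illustrative assumptions, not the exact functions Enterprise Miner uses.

import numpy as np

rng = np.random.default_rng(2)
n_inputs, n_hidden = 4, 3

W_in_hid = rng.normal(size=(n_inputs, n_hidden))   # input -> hidden weights
W_hid_out = rng.normal(size=n_hidden)              # hidden -> output weights
W_direct = rng.normal(size=n_inputs)               # input -> output (direct connections)

def forward(x, direct_connections=True):
    nonlinear_part = np.tanh(x @ W_in_hid) @ W_hid_out          # nonlinear hidden layer
    linear_part = x @ W_direct if direct_connections else 0.0   # linear (skip) layer
    return nonlinear_part + linear_part

x = rng.normal(size=n_inputs)
print(forward(x, direct_connections=False), forward(x, direct_connections=True))

With Direct connections set to Yes, the output is the sum of the nonlinear hidden-layer path and a purely linear path straight from the inputs.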

V. Advanced Tab

In the Advanced tab, you specify advanced network settings in the following subtabs:
o Network Subtab
o Initialization Subtab
o Optimization Subtab
o Train Subtab
o Prelim Subtab
To activate the Advanced tab, you must first select the Advanced user interface check box in the General tab. When you select the Advanced tab, the Network subtab is displayed.

1. Network Subtab
The Network subtab enables you to:
o create the network, using the same set of templates that are provided in the basic user interface of the Basic tab
o diagram the network, making connections as required
o add hidden layers and control the number of units
o view and set connection properties, specifying distribution, scale parameter, location parameter, and random seed
o view and set node properties and group variables into nodes as needed.

Diagramming the Network
o Left: inputs
o Middle: the hidden layers
o Right: outputs (correspond to the targets, e.g. GOOD_BAD)

By default, the input nodes are grouped by type, and only the input-node types are shown in the diagram. The default groupings of the nodes are explained here:
• All interval input variables are grouped into one interval-type node: Interval.
• All nominal input variables are grouped into one nominal-type input node: Nominal.
• All ordinal input variables are grouped into one ordinal-type input node: Ordinal.
• All interval target variables are grouped into one interval-type target node: T_intrvl.
• Each nominal target variable is given its own separate node.
• Each ordinal target variable is given its own separate node.

Grouped variables can be placed into separate nodes, and separate nodes of the same type can be combined to form a grouped node.

Modifying the network diagram:
o Select nodes and connections by left-clicking.
o Right-click to display a menu of actions: Properties, Add hidden layer, etc.

After adding the hidden layer, you may want to examine or change its properties. For example, you may want to change the number of neurons in the hidden layer by:
o viewing and setting connection properties:
  o select a connection arrow in the diagram
  o right-click to display a menu of actions
  o select Properties
o viewing and setting node properties:
  o select a node in the diagram
  o right-click to display a menu of actions
  o select Properties

Note: Variables can be moved as desired among compatible nodes. For example, interval input variables can be separated or merged with other interval input variables, but they cannot be merged with nominal input variables.

Beware that there are relationships among the neural network option settings. Some combinations of option settings are incompatible, and the network cannot be trained. The default settings are compatible and are highly recommended. The default settings are shown in green, and they have been carefully selected. If you change a default setting, then the new setting is shown in black. When you reset one or more nodes, the default settings are restored. Option settings that are incompatible with previous selections are grayed in subsequent menus.

2. Initialization Subtab

• Generate a random seed by clicking the Generate New Seed button. The random seed affects the starting point for training the network. If the starting point is close to the final settings, then the training time can be dramatically reduced. Conversely, if the starting point is not close to the final settings, then the training time tends to increase. You may want to first accept the default setting, and then, in later runs, specify other random seeds.
• Specify Randomize: scale estimates, target weights, and target bias weights.
• Select Starting values for training:
  o None -- the default.
  o Current Estimates -- uses the most recent training weights.
  o Selected data set.
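A quick, non-SAS illustration of what the random seed controls: the same seed reproduces the same starting weights, while a different seed gives a different starting point (and therefore possibly a different training time or solution). The weight shape and scale below are arbitrary assumptions for the example.

import numpy as np

def initial_weights(seed, shape=(4, 3)):
    # Draw a reproducible set of starting weights from the given seed.
    return np.random.default_rng(seed).normal(scale=0.1, size=shape)

print(np.allclose(initial_weights(12345), initial_weights(12345)))  # True: same seed, same start
print(np.allclose(initial_weights(12345), initial_weights(54321)))  # False: new seed, new start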

3. Optimization Subtab

o Optimization Step
  • Train (default) -- the model is trained when selected.
  • Preliminary -- performs preliminary optimization.
  • Train and Preliminary -- performs preliminary optimization, and then, for training, uses the weights that yield the best objective function value for the preliminary optimizations.
  • Early stopping -- defines a set of training parameters to hasten convergence for complex models.
  • Evaluate Function -- computes the objective function using the current weights.

o Objective Functions
  • Default -- the node chooses an objective function that is compatible with all of the specified error functions.
  • Deviance -- the difference between the likelihood for the actual network and the likelihood for a saturated model in which there is one weight for every case in the training data set.
  • Maximum Likelihood -- the negative log likelihood is minimized.
  • M Estimation -- primarily used for robust estimation when the target values may contain outliers.
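For a binary target, the Maximum Likelihood objective above corresponds to minimizing a sum of negative log probabilities. The following hedged Python sketch uses made-up posteriors to show the computation; it is the generic negative log likelihood for a Bernoulli target, not Enterprise Miner's internal code.

import numpy as np

y = np.array([1, 0, 1, 1, 0])            # actual binary target values
p = np.array([0.9, 0.2, 0.6, 0.8, 0.4])  # network posteriors P(y = 1)

# Negative log likelihood: -log of the probability assigned to the observed class.
negative_log_likelihood = -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
print(negative_log_likelihood)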

o Weight Decay Parameter, which is a numeric parameter: the larger the value, the greater the restriction on total weight growth. The default is zero. (A sketch of the weight decay penalty appears after this list.)

o Convergence Parameters
  • NLP -- set parameters common to nonlinear optimization processes.
  • Adjustments -- adjust the absolute convergence parameter for each of the
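The Weight Decay Parameter above can be pictured (outside SAS) as adding a penalty term to the training objective. This is a hedged sketch of the usual decay penalty, not the exact Enterprise Miner formulation; the weights and errors are made-up numbers.

import numpy as np

def penalized_objective(weights, errors, decay):
    # Error term plus decay * sum of squared weights: larger decay values
    # penalize large weights more, restricting total weight growth.
    return np.sum(errors ** 2) + decay * np.sum(weights ** 2)

weights = np.array([0.5, -1.2, 2.0])       # all network weights (illustrative)
errors = np.array([0.1, -0.3, 0.2, 0.05])  # per-case prediction errors (illustrative)

print(penalized_objective(weights, errors, decay=0.0))  # default: no restriction
print(penalized_objective(weights, errors, decay=0.1))  # larger decay, larger penalty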

4. Train Subtab

Training is finding the optimal weights for the neural network. The goal of training is to find the weights that yield a global minimum for the objective function.

• Training Technique. The most popular conventional optimization algorithms include Levenberg-Marquardt, Quasi-Newton, and Conjugate Gradient (a rough SciPy analogy appears after this list). The Levenberg-Marquardt technique is recommended for smooth least-squares objective functions and network architectures with a small number of weights (up to 100). The Quasi-Newton technique is recommended for network architectures with a moderate number of weights (up to 500). The Conjugate Gradient technique is recommended for network architectures with a large number of weights (more than 500).
  Back-propagation-based training techniques:
  o Standard backprop is the most popular back-propagation method, but it is slow, unreliable, and requires the user to tune the learning rate manually, which can be a tedious process.
  o Quickprop and RPROP tend to take more iterations than conjugate gradient methods, but each iteration is very fast.
  o Other conventional training techniques include Double Dogleg, Trust Region, and Newton-Raphson with Line Search or Ridging.
• Minimum Iterations. The default is missing, which indicates that the value will be set at run time based on the size of the training data and the network architecture.
• Maximum Iterations. The default is missing, which indicates that the value will be set at run time based on the size of the training data and the network architecture.
• Maximum CPU Time. The default is 168 hours.
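As a rough, non-SAS analogy for the technique names above, the following sketch applies SciPy optimizers to a toy least-squares problem: 'lm' is a Levenberg-Marquardt implementation that works on the residual vector, while 'BFGS' (a quasi-Newton method) and 'CG' (conjugate gradient) minimize the scalar sum of squared errors. The toy data and linear model are illustrative assumptions only.

import numpy as np
from scipy.optimize import least_squares, minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

residuals = lambda w: X @ w - y             # per-case errors for a weight vector w
sse = lambda w: np.sum(residuals(w) ** 2)   # scalar objective (sum of squared errors)
w0 = np.zeros(3)                            # starting weights

print(least_squares(residuals, w0, method="lm").x)  # Levenberg-Marquardt
print(minimize(sse, w0, method="BFGS").x)           # quasi-Newton
print(minimize(sse, w0, method="CG").x)             # conjugate gradient

All three recover roughly the same weights here; the recommendations above are about speed and memory as the number of network weights grows.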

5. Prelim Subtab
o Preliminary optimization is useful for finding good starting values for the network weights. Performing preliminary optimization may help you avoid finding weights that yield a local minimum for the objective function.
o When you perform preliminary optimization, it is advisable to do at least 10 preliminary runs.
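A hedged sketch of why several preliminary runs help: a number of short optimizations are started from different random weights, and the best of them seeds the full training run, reducing the chance of settling into a poor local minimum. The toy objective below merely stands in for a network's error surface.

import numpy as np
from scipy.optimize import minimize

# A toy non-convex objective standing in for the network's error surface.
objective = lambda w: np.sum(np.sin(3 * w) + 0.1 * w ** 2)

rng = np.random.default_rng(0)
preliminary = [
    minimize(objective, rng.uniform(-5, 5, size=2), options={"maxiter": 5})
    for _ in range(10)                                     # at least 10 preliminary runs
]
best_start = min(preliminary, key=lambda run: run.fun).x   # best preliminary weights
final = minimize(objective, best_start)                    # full training run
print(best_start, final.fun)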

VI. Output Tab
1. Select data sets for scoring (or use the Score node instead).
2. Select data sets for browsing.
3. View properties of output data sets.

B. Running the Neural Network Node
o The Neural Network node can be run in a non-interactive mode and in an interactive mode.
o When the node is running, the SAS Process Monitor graphically displays the error function after each optimization iteration (provided that you did not disable the monitor in the Basic tab).
o You can use the monitor to interrupt training, continue training, or stop training altogether. To stop processing the current iteration, select Stop Current. To stop processing all iterations, select Stop All. To close the monitor, select Close.

C. Neural Network Results Browser

After you run the node in non-interactive or interactive mode, you can view the results in the Results Browser. The Results Browser contains the following tabs:
• Model Tab
• Tables Tab
• Weights Tab
• Plot Tab
• Code Tab
• Log Tab
• Output Tab
• Notes Tab

1. Model Tab
The Model tab contains the following subtabs:
• General -- provides administrative information about the model and enables you to view target profile information.
• Network -- summarizes the network settings.

2. Tables Tab
The Tables tab displays output data sets. Use the drop-down arrow to select a data set. This example shows the following data sets:
• Current estimates -- the objective function values and the weight (parameter) values for the current network.
• Current statistics -- the goodness-of-fit statistics for the current network.
• New Plot -- the summary statistics for the iterations.

3. Weights Tab
• The Table subtab displays the current network weights.

• The Graph subtab displays a plot of the current network weights.

The toolbox enables you to interact with the grid plot:
o View Info -- click on this tool, then click on a bar (weight) to display information about that weight.
o Move -- click on this tool, and then click and drag the histogram to the desired location.
o Scroll -- click on this tool and then slide your cursor to scroll through the grid plot of weights.
o Reset -- click on this tool to reset the histogram to its initial display settings.

4. Plot Tab
By default, the Plot tab displays a plot of the average error function values as the objective function is minimized. For each iteration the parameter values are updated, and hence the value of the error function changes. In general, the error function value for the training data set decreases across iterations. The goal is to find parameter values that minimize the error function.

Note: To display a plot of the profit-or-loss values for every iteration, right-click on the plot and select either the Profit or Loss pop-up menu item. When you click on a plotted value, the vertical white line moves to that iteration, and it becomes the current network.

When you right-click in this tab, a pop-up menu appears:
• New plot -- provides a new plot of the selected variables.
• History plot -- provides a plot history of the selected variables.
• Error -- plots the error function values for the iterations.
• Profit -- plots the profit values for the iterations.
• Loss -- plots the loss values for the iterations.
• Statistics -- enables you to select statistics to be plotted.
• Weights -- enables you to select weights to be plotted. A plot of the weights is useful for studying the stability of the weights.
• Enable popup data info -- when enabled and you click on the plot, a pop-up display shows the run number, the optimization step, the iteration number, and the objective function values for the data sets. Select Disable popup data info to disable the pop-up display.
• Set network at... -- displays a set of choices: Set network at selected iteration and Set network at minimum error function (sets the network at the iteration that minimizes the error function for the validation data set).

5. Code Tab
The Code tab displays the automatically generated SAS software code.

6. Log Tab
The Log tab displays the log of the optimization process.

7. Output Tab
The Output tab displays the printed output that was generated from the submitted code:
• Preliminary iteration history
• Optimization start parameter estimates (network weights)
• Levenberg-Marquardt optimization criteria
• Iteration statistics for the Levenberg-Marquardt optimization
• Optimization results (final parameter estimates)
• Objective function value at the last iteration (not shown on the display, but listed after the parameter estimates).

D. Output Data Sets from the Neural Network Node

In the Data subtab of the Output tab of a Neural Network node, you can select the variables that are included in the scored training, validation, test, or score data set:
o S_ -- standardized variables
o H -- hidden units
o P_ -- posterior probabilities for categorical targets or predicted values for interval targets
o I_ -- normalized category that the case is classified into
o U_ -- unnormalized category that the case is classified into
o D_ -- label of the decision that is chosen by the model
o E_ -- error function
o R_ -- residuals
o EP_ or EL_ -- expected profit or loss for the decision that is chosen by the model
o IC_ -- investment cost.
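To make a few of these scored columns concrete, here is a hedged, non-SAS sketch for a binary target with made-up posteriors; the variable names only mimic the prefixes above and are hypothetical, and the residual definition (actual minus posterior) is an assumption for illustration.

import numpy as np

actual = np.array([1, 0, 1, 0])                  # actual target values
p_target1 = np.array([0.82, 0.35, 0.47, 0.10])   # P_: posterior probability of class 1

i_target = (p_target1 >= 0.5).astype(int)        # I_: category the case is classified into
r_target = actual - p_target1                    # R_: residual

print(i_target, r_target)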

E. Assessing the Models
• Add an Assessment node.
• Connect each model node to the Assessment node.
• Open the Assessment node.
• Select all three models by dragging across each model row.
• To create a lift chart, use Tools, and select Lift Chart.

• Select the model for subsequent scoring.
  o Select the Output tab.
  o Click on the Neural Network entry in the Highlight Model for Output list box.
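The lift chart produced by the Assessment node can be pictured with the following hedged, non-SAS sketch: cases are sorted by predicted probability, split into deciles, and each decile's response rate is compared with the overall rate. The simulated scores and outcomes are made up purely for illustration.

import numpy as np

rng = np.random.default_rng(3)
p = rng.uniform(size=1000)                      # predicted probability of the event
y = (rng.uniform(size=1000) < p).astype(int)    # simulated actual outcomes

order = np.argsort(-p)                          # best scores first
deciles = np.array_split(y[order], 10)          # ten equal-sized groups
baseline = y.mean()                             # overall response rate
lift = [decile.mean() / baseline for decile in deciles]
print(np.round(lift, 2))                        # lift per decile, top decile first

A good model shows lift well above 1 in the top deciles and close to or below 1 in the bottom deciles.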

References:
Han, J., and Kamber, M. (2001). Data Mining: Concepts and Techniques. Morgan Kaufmann.
SAS Enterprise Miner documentation: http://support.sas.com/documentation/onlinedoc/miner/