How to Forecast Time Series Data using Neural Network Toolbox in Matlab
June 23, 2022
- Machine Learning
Forecasting is the process of predicting future results based on current and past events.
Time series data is data gathered over time. The Neural Network Toolbox is a Matlab toolbox that helps you implement neural networks without writing code. We can use this toolbox to forecast time series data easily.
This tutorial gives a general overview of neural networks. We will also discuss how to use this toolbox to implement a neural network.
Finally, we will see how to generate the code for a trained model. You can use this code for other forecasting purposes.
For a proper understanding of this tutorial, you need:
- MATLAB installed.
- Proper understanding of MATLAB basics.
How to use the neural network toolbox
Type nnstart in the command window to open this toolbox. This command opens the window shown below:
This toolbox provides several apps that we can use to work with our data, so you can also perform other operations apart from prediction.
For example, we have the fitting app, pattern recognition app, clustering app and time series app. We can use all these apps to analyze the input data. These apps are known as wizards. Since we will perform a prediction, we select the pattern recognition app.
Note that this window also provides other resources to help you understand the neural network toolbox and neural networks, including an external source of datasets. Select the more information tab at the top of the window above to access these resources.
When you open the pattern recognition app, you get the following interface:
This app uses a two-layer feed-forward network to predict data. A feed-forward network is a biologically inspired model that is used here as a classifier.
It consists of simple neurons organized in layers, and each neuron receives its input from the neurons in the previous layer. This structure is shown below:
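The layered structure described above can be sketched in a few lines of plain Matlab. This is only an illustration: the sizes (9 inputs, 10 hidden neurons, 2 classes) and the random weights are hypothetical, and the tanh hidden layer with a softmax output is an assumption chosen to resemble a typical pattern recognition network. It is not the code that the app runs internally.

```matlab
% Minimal sketch of a two-layer feed-forward pass (illustrative only).
% All sizes and weights below are hypothetical, not taken from the app.
rng(0);                          % fix the random seed for repeatability
nInputs = 9; nHidden = 10; nClasses = 2;

x  = rand(nInputs, 1);           % one input sample as a column vector
W1 = randn(nHidden, nInputs);    % hidden-layer weights
b1 = randn(nHidden, 1);          % hidden-layer biases
W2 = randn(nClasses, nHidden);   % output-layer weights
b2 = randn(nClasses, 1);         % output-layer biases

h  = tanh(W1 * x + b1);          % layer 1: every hidden neuron sees all inputs
z  = W2 * h + b2;                % layer 2: every output neuron sees all hidden neurons
ez = exp(z - max(z));            % shift by max(z) for numerical stability
y  = ez / sum(ez);               % softmax: class scores that sum to 1

disp(y');
```

Training replaces the random weights with values that make the output scores match the targets.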
The pattern recognition app will help you select data, create and train a network, and evaluate its performance using cross-entropy and a confusion matrix.
A confusion matrix is a table that compares the classes predicted by the network against the true classes. It is mainly used to visualize the performance of supervised machine learning models.
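To make these two measures concrete, here is a small sketch that builds a confusion matrix and a cross-entropy value by hand. The labels and probabilities are made-up values for illustration. The row/column convention is a choice made for this sketch, and the cross-entropy uses the plain textbook formula, so the numbers reported by the toolbox may be arranged or normalized differently.

```matlab
% Hypothetical true and predicted class labels for 8 samples (2 classes).
trueClass = [1 1 1 2 2 2 2 1];
predClass = [1 2 1 2 2 1 2 1];

% Confusion matrix: rows index the true class, columns the predicted class
% (a convention chosen for this sketch).
C = accumarray([trueClass' predClass'], 1, [2 2]);
accuracy = sum(diag(C)) / sum(C(:));   % fraction of correct predictions

% Hypothetical probabilities that the model assigned to the true class.
pTrue = [0.9 0.4 0.8 0.7 0.95 0.3 0.6 0.85];
crossEntropy = -mean(log(pTrue));      % lower is better

disp(C);
fprintf('Accuracy: %.2f, cross-entropy: %.3f\n', accuracy, crossEntropy);
```

The diagonal of C counts correct predictions, and the off-diagonal entries count the two kinds of mistakes.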
When you click on next, you get a window that allows you to select input data, as shown below:
Click on the three dots labelled 2 to input your data. A dialog that allows you to select the data will open.
In some instances, you may not have the input data. The neural network toolbox has some sample data that you can use to create the neural network.
Click on load example data set at the bottom of the window above to get this data. It will open sample datasets that you can use to create a new neural network. Let’s begin with the sample dataset in Matlab. The dataset interface is shown below:
There is a brief description of the selected dataset. For example, we have selected breast cancer. A brief description of this dataset is on the right side of the window.
You can use this description to understand the dataset. It also shows an alternative way to create the neural network from the command window. To do this in the command window, we execute the code below:
[x,t] = cancer_dataset;
net = patternnet(10);
net = train(net,x,t);
view(net)
y = net(x);
plotconfusion(t,y)
In the code above, the cancer_dataset is loaded into Matlab. The input data is stored in the variable x, and the target is stored in the variable t. The patternnet() function creates a pattern recognition network with 10 hidden neurons. This network is trained by the train() function, which takes the network (net), the input data and the target. The view() function displays the network diagram, and y = net(x) computes the network outputs. The confusion matrix is then plotted using the plotconfusion() function.
Since we are not creating and training the network in the command window, click on import at the bottom of the window above. Once you import the dataset, both the input and target are filled, as shown below:
Our dataset is arranged column-wise, so each column holds one sample.
Note that you can import the input data and the target separately. This is done by clicking on the three dots beside each of them.
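If you want to confirm the column-wise layout before importing, a quick check from the command window could look like the sketch below. It assumes the toolbox that ships cancer_dataset is installed; the variable names are only illustrative.

```matlab
% Load the sample data and inspect its layout.
[x, t] = cancer_dataset;

[nFeatures, nSamples] = size(x);   % rows are features, columns are samples
[nClasses, nTargets]  = size(t);   % one target column per sample

fprintf('%d features, %d classes, %d samples\n', nFeatures, nClasses, nSamples);
```

The number of columns in the input and in the target must match, since each column pair describes one sample.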
When you click next, it opens a new window shown below:
In this section, the dataset is divided into three subsets, i.e. train, validation, and test data. As we can see, 70% of the dataset is for training, 15% for validation and the remaining 15% for testing.
The total number of samples in our dataset is 699. These are the default division ratios, but you can change them to your preferred ratios using the drop-down menu.
On the right side of this window, we have a brief description of the role of the samples. First, the training samples are presented to the network during training, and the network is adjusted according to its error on them.
The validation samples are used to measure how well the network generalizes and to stop the training once generalization stops improving.
The test samples are used to measure the accuracy of the network. These samples have no effect on training, so they give an independent measure of performance.
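The same split can be set from the command window. The following is a minimal sketch, assuming the network is created with patternnet() as in the earlier command-line example; the seed value is arbitrary, and the training window is switched off so that the script runs without opening a GUI.

```matlab
% Sketch: set the 70/15/15 division explicitly and count the samples.
[x, t] = cancer_dataset;
net = patternnet(10);

net.divideFcn = 'dividerand';        % split the samples at random
net.divideParam.trainRatio = 0.70;   % used to adjust the weights
net.divideParam.valRatio   = 0.15;   % used to decide when to stop
net.divideParam.testRatio  = 0.15;   % held out for an independent check

net.trainParam.showWindow = false;   % do not open the training window
rng(0);                              % arbitrary seed for a repeatable split
[net, tr] = train(net, x, t);

% The training record stores which sample indices went into each subset.
fprintf('train: %d, validation: %d, test: %d\n', ...
    numel(tr.trainInd), numel(tr.valInd), numel(tr.testInd));
```

Changing the three ratios here corresponds to changing the values in the drop-down menus of the app.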
Click next to access the following interface:
Here, you specify the number of hidden neurons that the network should use. The right number of hidden neurons depends on the dataset.
Increasing the number of hidden neurons lets the network fit the training data more closely. However, when the number of hidden neurons is too high, the model has more parameters than required and can overfit, which reduces its accuracy on new data.
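One way to explore this trade-off is to train the same network with a few different hidden layer sizes and compare the test error. The sketch below does this from the command window. The candidate sizes are arbitrary choices for illustration, and the results will vary with the seed and the random data split, so treat the numbers as a rough guide only.

```matlab
% Sketch: compare a few hidden layer sizes on the sample dataset.
[x, t] = cancer_dataset;
hiddenSizes = [2 10 50];                 % hypothetical candidates
testPerf = zeros(size(hiddenSizes));

for k = 1:numel(hiddenSizes)
    rng(0);                              % same seed for a fairer comparison
    net = patternnet(hiddenSizes(k));
    net.trainParam.showWindow = false;   % train without opening the GUI
    [net, tr] = train(net, x, t);
    testPerf(k) = tr.best_tperf;         % test error at the best validation epoch
    fprintf('%2d hidden neurons: test performance %.4f\n', ...
        hiddenSizes(k), testPerf(k));
end
```

A lower test performance value indicates a better fit on data the network did not train on.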
The default number of hidden neurons is 10. We can change this number to our preferred value for accuracy purposes. When you click next, it opens the training interface shown below:
When you click on train, the training begins. The training stops once the generalization, as measured on the validation samples, stops improving. At this point, the neural network is assumed to have completed learning.
If we train our network multiple times, we will obtain different results. This is because of different initial conditions and sampling.
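When training from the command window, this variation can be controlled by fixing the random seed before each run. The sketch below is one way to do it, assuming training runs on the CPU with default settings; the seed value 42 is arbitrary.

```matlab
% Sketch: two training runs with the same seed should give matching results.
[x, t] = cancer_dataset;
valPerf = zeros(1, 2);

for run = 1:2
    rng(42);                             % same initial weights and data split
    net = patternnet(10);
    net.trainParam.showWindow = false;   % train without opening the GUI
    [net, tr] = train(net, x, t);
    valPerf(run) = tr.best_vperf;        % best validation performance
end

fprintf('run 1: %.6f, run 2: %.6f\n', valPerf(1), valPerf(2));
```

Removing the rng() call, or changing the seed between runs, brings back the run-to-run differences described above.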
Once the training is done, the window shown below appears:
This window shows the algorithm, training progress, and output plots. The available plots include the performance, training state, error histogram and confusion matrix.
For example, this network trained for 23 iterations. The performance is 0.227. Since performance is measured here as cross-entropy, where lower values are better, this result is not very good. We did not tune our neural network, and the sample dataset is not large. When the dataset is too small, the training performance is likely to be poor.
You can visualize the plot of the training process. For example, if you want to visualize the plot for the performance, click on the performance to see the following plots:
Also, if you want to visualize the confusion matrix, click on confusion, which is shown below:
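The same plots can also be produced from the command window. The sketch below retrains the network on the sample data and then calls the toolbox plotting functions; it assumes the network is created with patternnet() as in the earlier command-line example, and the seed is arbitrary.

```matlab
% Sketch: reproduce the training plots from the command window.
[x, t] = cancer_dataset;
rng(0);                               % arbitrary seed for repeatability
net = patternnet(10);
net.trainParam.showWindow = false;    % train without opening the GUI
[net, tr] = train(net, x, t);

y = net(x);                           % network outputs for all samples
e = t - y;                            % errors between targets and outputs

figure; plotperform(tr);              % performance per epoch
figure; plottrainstate(tr);           % training state (gradient, validation checks)
figure; ploterrhist(e);               % error histogram
figure; plotconfusion(t, y);          % confusion matrix

fprintf('Stopped after %d epochs\n', tr.num_epochs);
```

Each call opens a separate figure, which matches the buttons in the training window.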
When you navigate back to the training window, it will be different. This new window allows you to re-train using certain data, import large datasets, or adjust network size.
Click next to deploy the solution. The available outputs are the Matlab code and the Simulink model for the neural network that we have created. The interface is shown below:
When you select Matlab function, code is generated for the neural model that we have created. The Simulink model for deploying our model is also highlighted below:
We can also get the diagram for our neural network:
We can use this network for prediction on any dataset that matches the data it was trained on. The network is saved as a Matlab script.
Click next to save your network in the deploy solution window:
We can save the network as a simple or advanced script. The advanced script has additional options that help you improve your neural network.
Note that you must first generate the Matlab code or the Simulink model before saving your network. Once this is done, we click the finish button to complete this whole process.
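The deployment step also has command-line counterparts. The sketch below generates a standalone Matlab function from a trained network with genFunction() and checks that it reproduces the network outputs. The function name cancerNetFcn is a hypothetical name chosen for this example, and the file is written to the current folder. The gensim() call is left commented out because it requires Simulink.

```matlab
% Sketch: generate a standalone function from a trained network.
[x, t] = cancer_dataset;
rng(0);                               % arbitrary seed for repeatability
net = patternnet(10);
net.trainParam.showWindow = false;    % train without opening the GUI
net = train(net, x, t);

% Writes cancerNetFcn.m (hypothetical name) to the current folder.
genFunction(net, 'cancerNetFcn');

y1 = net(x);                          % outputs from the network object
y2 = cancerNetFcn(x);                 % outputs from the generated function
maxDiff = max(abs(y1(:) - y2(:)));    % should be close to zero
fprintf('Largest difference: %g\n', maxDiff);

% gensim(net);                        % builds a Simulink model (needs Simulink)
```

The generated file does not depend on the network object, so it can be reused in other scripts.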
The advantage of the neural network toolbox in Matlab is that it has a user-friendly interface.
However, to use this toolbox well, you need a solid understanding of your dataset. Understanding machine learning concepts is also critical.
Also, to use this toolbox, you don’t need knowledge of the generated script. We should only ensure that the output results attained by our network are good and then generate the script.
Peer Review Contributions by: Wanja Mike