A computational learning system that understands and translates data inputs in one form into desired outputs is called a
neural network. It can also be defined as a computer’s ability to learn and improve over time by recognizing hidden patterns and correlations from the raw data.
In this tutorial, we will discuss feed-forward and recurrent neural networks. We’ll work our way up to the recurrent neural network starting with the feed-forward neural network. Both networks will be implemented in python, and their differences will be examined.
To follow along the reader should have the following:
- Have a basic knowledge of Python.
- Have the Python environment of your choice installed.
Table of contents
- Table of contents
- Neural networks
- Importance of neural networks
- Feed-forward neural network
- Feed-forward neural network implementation
- Applications of Feed-forward neural network
- Recurrent neural network
- RNN implementation
- Difference between RNN and Feed-forward neural network
Neural networks mimic the human brain’s ability to spot patterns in a dataset. With neural networks, we can teach computers to learn and interpret data in a manner inspired by the human brain.
In this tutorial, we will be discussing:
- Recurrent Neural Networks (RNN)
- Feed-forward neural network (FFN)
Importance of neural networks
Complex problems such as pattern recognition and facial recognition are solved with neural networks. Handwriting recognition for check processing, signal processing, data analysis, speech-to-text transcription, and weather forecasting are a few other examples.
You can learn more about them here.
Feed-forward neural network
A feed-forward neural network (FFN) is a single-layer perceptron in its most fundamental form. Components of this network include the hidden layer, output layer, and input layer.
In the above image, the neural network has input nodes, output nodes, and hidden layers. Due to the absence of connections, information leaving the output node cannot return to the network. As the name of the network suggests, the information goes in only one direction.
- The input layer comprises neurons that receive input.
- The hidden layer contains a large number of neurons that modify the inputs and interact with the output layer.
- The output layer contains the result of the computation.
Feed-forward neural network implementation
Let’s implement a feed-forward neural network in Python.
import numpy as num # Contains a variety of mathematical functions, including random number generators, linear algebra procedures, Fourier transforms, and more from sklearn import datasets
Create sample weights
Weights are used to describe the strength of a neural connection. It varies from 0 to 1.
num.random.seed(0) X, y = datasets.make_moons(200, noise=0.20)# Generating and plotting the dataset # nodes inputlayer_dimensionality = 4 outputlayer_dimensionality = 3 hiddenlayer_dimensionality = 6
In the above code:
- We generated and plotted the datasets.
- We defined our neural network architecture, including the three nodes.
- On each node, we gave a different dimension. The dimensions will be used later to calculate the weighted sum of neurons.
a1 = num.random.randn(inputlayer_dimensionality, hiddenlayer_dimensionality)# weights for layer 1 c1 = num.zeros((1, hiddenlayer_dimensionality))# bias for layer 1 a2= num.random.randn(hiddenlayer_dimensionality, hiddenlayer_dimensionality)# weights for layer 2 c2 = num.zeros((1, hiddenlayer_dimensionality))# bias for layer 2 a3= num.random.randn(hiddenlayer_dimensionality, outputlayer_dimensionality)# weights for layer 3 c3 = num.zeros((1, outputlayer_dimensionality))# bias for layer 3
Note that the weighted sum is the sum of weights, input signal, and bias element.
In the above code:
- All weights provided in the first, second, and third layers are used to calculate the weighted sum of neurons in the first, second, and third hidden layers.
- A softmax function is applied in the last layer. The list of numbers sent to this function is transformed into a probability list whose probabilities are proportional to the numbers in the list.
Forward propagation of the input signal
To reach the output layer, the propagation will occur over several layers which include the first, second and the third hidden layer.
#First hidden layer d1 = X.dot(a1) + c1 q1 = num.tanh(z1) #second hidden layer d2 = q1.dot(a2) + c2 q2 = num.tanh(z2) #third hidden layer d3 = q2.dot(a3) + c3 probs = num.exp(d3) / num.sum(num.exp(d3), axis=1, keepdims=True)
How the weighted sum of the input is transformed into an output from a layer of the network is defined by an activation function.
In the above code:
- Forward propagation of activation is calculated based
tanhfunction to 6 neurons in the first layer.
- Forward propagation of activation from the first layer is calculated based
tanhfunction to 6 neurons in the second layer.
- Forward propagation of activation from the second layer is calculated based
tanhfunction to 3 neurons in the output layer.
- Probability is calculated as an output using the softmax function.
Applications of Feed-forward neural network
- An illustrious network of genetic regulation is a feedforward system to detect non-temporary atmospheric modifications.
- In automation and machine management, feedforward control may be a discipline.
- Derivative parallel feedforward compensation can be used to transform an open-loop transfer system into one that operates at its minimum.
Recurrent neural network
One of the most frequent types of artificial neural networks is called a recurrent neural network. It is commonly used for automatic voice recognition and machine translation (RNN). It is possible to forecast the most likely future situation utilizing patterns in sequential data by employing recurrent neural networks.
Recurrent neural networks (RNN) make it easier to model sequence data. RNNs, which come from feedforward networks, act in a way that is similar to how human brains do. In other words, recurrent neural networks can predict sequential data in ways that other algorithms can’t.
As strong as they are, recurrent neural networks are vulnerable to gradient-related training issues. The $n$ derivatives of a network with $n$ hidden layers will be multiplied.
When these derivatives are significant, the gradient increases exponentially as it propagates backward until it bursts. This is known as the
exploding gradient problem.
A problem known as the
vanishing gradient problem occurs when the gradient propagates yet the derivatives are so small that the gradient finally vanishes.
We will minimize the issues in the following ways:
- We will limit the number of gradients when training the model to prevent them from exploding. It’s called Gradient Clipping.
- We will prevent the weights from shrinking to zero by setting their initial values to identity matrices and zero for their biases. Weight initialization is the technical term for this procedure.
We will use this dataset to train a simple RNN.
import numpy as num# contains various mathematical functions, including random number generators, linear algebra procedures, Fourier transforms, and more. from keras.models import SequentialW# A simple stack of layers with only one input, and one output tensor can be modeled using the sequential model. from keras.layers import Dense, SimpleRNN# does operations on the input and return the output. from sklearn.preprocessing import MinMaxScaler#Scale each feature to a specific range to transform it. import matplotlib.pyplot as mpl#A collection of Matplotlib's most useful functions. from sklearn.metrics import mean_squared_error#Mean squared error regression loss. import math#Mathematical functions must be applied to any functions that you employ. from pandas import read_csv#will be used to return a new DataFrame with the data and labels from the CSV file.
Create a simple RNN and define weights
Below we create a model that includes a SimpleRNN layer and a Dense layer. Afterward, they are utilized to learn sequential data.
def create_RNN(hidden_units, dense_units, input_shape, activation):#Create a recurrent neural network to compute a control policy. ourModel = Sequential()#appropriate for a plain stack of layers where each layer has exactly one input ourModel.add(SimpleRNN(hidden_units,input_shape=input_shape,activation=activation))#fully-connected RNN where the output from previous timestep is to be fed to next timestep ourModel.add(Dense(units=dense_units, activation=activation))#the regular deeply connected neural network layer. ourModel.compile(loss='mean_squared_error', optimizer='adam')#Once the model is created, you can config the model with losses and metrics return ourModel # Returns our model demo_ourModel = create_RNN(2, 1, (3,1), activation=['linear', 'linear'])# used as a builder to create RNN model x1 = demo_ourModel.get_weights() x2 = demo_ourModel.get_weights() a1 = demo_ourModel.get_weights() x3 = demo_ourModel.get_weights() a2 = demo_ourModel.get_weights() # Displaying weights print('x1 = ', x1, ' x2 = ', x2, ' a1 = ', a1, ' x3 =', x3, 'a2 = ', a2) # Returns the weights on screen
x1 = [[0.10581112 1.1404327 ]] x2 = [[-0.32814407 0.9446277 ] [ 0.9446277 0.32814407]] a1 = [0. 0.] x3 = [[-0.5538285] [-0.5600237]] a2 = [0.]
In the above code:
- The SimpleRNN layer creates two hidden units, while the Dense layer creates one dense unit, all returned in the demo model object. Both layers utilize a linear activation function with a
3*1input shape value.
- My interpretations of the data may differ from yours because we employed a randomized weighting technique in our analysis. The most critical component is figuring out how the elements work together to create the final output.
We reshape the input to the required
time_steps, and features. In this case,
time_steps indicates the number of prior time steps to use for forecasting the next value of the time-series data, and
input_shape defines the parameter.
Time-series data is data that is recorded over consistent intervals of time.
x = num.array([1, 2, 3])# returns an array, or any sequence. inputX = num.reshape(x,(1, 3, 1))#Gives a new shape to an array without changing its data. prediction_ourModel = demo_ourModel.predict(inputX)# Model groups layers into an object with training and inference features z = 2 d0 = num.zeros(z) d1 = num.dot(x, x1) + d0 + a1 d2 = num.dot(x, x1) + num.dot(d1,x2) + a1 d3 = num.dot(x, x1) + num.dot(d2,x2) + a1 c3 = num.dot(d3, x3) + a2 # Displaying vectors print('d1 = ', d1,'d2 = ', d2,'d3 = ', d3)# Prints the values of the given vectors #Displaying predictions print("Network Prediction", prediction_ourModel)# displays the Network Prediction print("Computational Prediction", c3)# displays the Computational Prediction
d1 = [[0.10581112 1.14043272]] d2 = [[1.25418528 2.75504378]] d3 = [[2.50837057 5.5100876 ]] Network Prediction [[-4.474987]] Computational Prediction [[-4.47498683]]
In the above code:
- We provided the network with an input of $x$ and let it generate output for three-time steps.
- We then figured out what the hidden units were doing at each of the first three points in time.
- The zero vector is used to set the value of $d0$. $d3$ and $x3$ are used to calculate $c3$. We don’t need an activation function because we work with linear units.
Test the network
To test the RNN, we’ll use a simple time-series dataset.
def get_train_test(url, split_percent=0.8):# Quick utility that wraps input validation diff = read_csv(url, usecols=, engine='python')#supports optionally iterating or breaking of the file into chunks. ourdata = num.array(diff.values.astype('float32'))# returns an array, or any sequence. ourscaler = MinMaxScaler(feature_range=(0, 1))# Feature transformations are accomplished by scaling each individual feature to a predetermined range. Each feature is scaled and translated separately by this estimator to fit within the specified range. ourdata = ourscaler.fit_transform(ourdata).flatten()#transformations are done on individual data pn = len(data)# returns the number of items in a data datasplit = int(pn*split_percent)# splits data into the predetermined number ourtrain_data = data[range(datasplit)]# training data ourtest_data = data[datasplit:]# testing the data return ourtrain_data, ourtest_data, ourdata# Returns our test data,trained data and our data # targets and inputs as Y and X are created here def get_XY(dat, time_steps):# Return only metrics/values that we will base our predictions inputy = num.arange(time_steps, len(dat), time_steps)#Return evenly spaced values within a given interval Y = dat[inputy]#Create and modify a dat repository. inputx = len(Y)# returns the number of items in a data X = dat[range(time_steps*inputx)]#Create and modify a dat repository and return evenly spaced values within a given interval X = num.reshape(X, (inputx, time_steps, 1)) #Gives a new shape to an array without changing its data. return X, Y # Returns the target Y and inputs X def create_RNN(hidden_units, dense_units, input_shape, activation):#Create a recurrent neural network to compute a control policy. ourModel = Sequential()#appropriate for a plain stack of layers where each layer has exactly one input ourModel.add(SimpleRNN(hidden_units,input_shape=input_shape,activation=activation))# fully-connected RNN where the output from previous timestep is to be fed to next timestep ourModel.add(Dense(units=dense_units, activation=activation))#the regular deeply connected neural network layer. ourModel.compile(loss='mean_squared_error', optimizer='adam')#Once the model is created, you can config the model with losses and metrics return ourModel # Returns our model def print_error(trainY, testY, train_predict, test_predict): # Error of predictions train_error = math.sqrt(mean_squared_error(trainY, train_predict))# computes the mean squred root of th mean squred error and trains it test_error = math.sqrt(mean_squared_error(testY, test_predict))# computes the mean squred root of th mean squred error and tests it # Displaying the Root mean squred error print('Train RMSE: %.3f RMSE' % (train_error))# Prints the trained rmse print('Test RMSE: %.3f RMSE' %(test_error)) # Prints the tested rmse # Displaying a plot of the result def plot_result(trainY, testY, train_predict, test_predict):# Plots the result actualData = num.append(trainY, testY)# adds the items to the end of the list predictions = num.append(train_predict, test_predict)# adds trained and test prediction items to the end of the list rows = len(actualData)# returns the number of items in a data mpl.figure(figsize=(15, 6), dpi=80)# Figure instance supports callbacks through a callbacks attribute mpl.plot(range(rows), actualData)# makes a plot on the actual data mpl.plot(range(rows), predictions)# makes a plot on the predicted data mpl.axvline(x=len(trainY), color='r')#Add a vertical line across the Axes. mpl.legend(['Actual data', 'Predicted data'])#Place a legend on the Axes. mpl.xlabel('Observation number ')# adds Parameters on the x axis mpl.ylabel('Dataset scaled')# adds Parameters on the y axis mpl.title('Actual and Predicted Values.')# adds a title #dataset usrl dataset_url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-sunspots.csv'# this includes a link to the dataset that we will be using time_steps = 12# rounds in sets of time train_data, test_data, data = get_train_test(dataset_url)#includes training, fetching and testing our dataset from the url trainX, trainY = get_XY(train_data, time_steps)#training the inputs and the targets testX, testY = get_XY(test_data, time_steps)#testing the inputs and the targets #initializing our Model ourModel = create_RNN(hidden_units=3, dense_units=1, input_shape=(time_steps,1), activation=['tanh', 'tanh'])# creatin RNN and its dense layers ourModel.fit(trainX, trainY, epochs=20, batch_size=1, verbose=2)#Running 20 epochs and traing the targets and the inputs # make predictions tp = ourModel.predict(trainX)# makes a prediction on trainX ts = ourModel.predict(testX)# makes a prediction on testX # Display the error print_error(trainY, testY, tp, ts)# Show error #Displays a graph plot_result(trainY, testY, tp, ts)# Shows a graph result
The red line separates the training and test examples.
In the above code:
- You read the data from an URL to get a percentage of the data for the test, and then you can divide that percentage by the percentage of train data. The train and test data are returned as single-dimensional arrays once the data has been scaled.
- We begin by creating rows of non-overlapping time steps for Keras model training. This is done in preparation for training with time-series data.
- We create the RNN model and train.
- Afterward, we calculate the mean square error, which measures how far the actual values deviate from the forecasted ones.
- In the end, we plot the result using
You can run the above code here.
Difference between RNN and Feed-forward neural network
- In contrast to feedforward networks, recurrent neural networks feature a single weight parameter across all network layers. Reinforcement learning can still be achieved by adjusting these weights using backpropagation and gradient descent.
- Unlike recurrent neural networks, which continuously feed information from input to output, feedforward neural networks constantly feed data back into the input for further processing and final output.
- Recurrent neural networks contain a feedback loop that allows data to be recycled back into the input before being forwarded again for further processing and final output. Whereas feedforward neural networks just forward data from input to output. Data can only flow in one direction in feedforward neural networks. Data from prior levels can’t be saved because of this forward traveling pattern; hence there is no internal state or memory. RNN, on the other hand, uses a loop for cycling through the data, allowing it to keep track of both old and new information.
In this tutorial, we learned about both feed-forward and recurrent neural networks. We also learned about their implementations using Python language, and in addition, we covered their difference.
Find the whole code for this tutorial here.
- Recurrent Neural Network Algorithms for Deep Learning.
- Introduction to Backpropagation Through Time.
- Feed-forward neural network.
Peer Review Contributions by: Srishilesh P S