A neural network is an algorithm that recognizes patterns and relationships in a set of data through a process that mimics the way the human brain operates. Neural networks can adapt to changing inputs, so the network generates the best possible output without the need to redesign the output criteria.
Neural networks are used in applications such as time-series forecasting, algorithmic trading, securities classification, credit risk modelling, and the construction of proprietary indicators and price derivatives.
Logical Computation with Neurons
Here we have one or more binary inputs (on or off) and one binary output. The neuron simply activates its output when more than a certain number of its inputs are active.
- The first network on the left is simply the identity function: if neuron A is activated, then neuron C gets activated as well (since it receives two input signals from neuron A), but if neuron A is off, then neuron C is off as well.
- The second network performs a logical AND: neuron C is activated only when both neurons A and B are activated (a single input signal is not enough to activate neuron C).
- The third network performs a logical OR: neuron C gets activated if either neuron A or neuron B is activated (or both).
- Finally, if we suppose that an input connection can inhibit the neuron’s activity (which is the case with biological neurons), then the fourth network computes a slightly more complex logical proposition: neuron C is activated only if neuron A is active and if neuron B is off. If neuron A is active all the time, then you get a logical NOT: neuron C is active when neuron B is off, and vice versa.
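The four networks above can be sketched with a toy threshold neuron (a McCulloch-Pitts style unit; the thresholds and wiring below are illustrative, not a unique formulation):

```python
def neuron(inputs, threshold=2):
    """A binary threshold neuron: fires (1) when the number of
    active excitatory inputs reaches the threshold."""
    return 1 if sum(inputs) >= threshold else 0

def identity(a):
    # Neuron C receives two input signals from neuron A
    return neuron([a, a])

def logical_and(a, b):
    # Both A and B must be active to reach the threshold of 2
    return neuron([a, b])

def logical_or(a, b):
    # A single active input is enough (threshold of 1)
    return neuron([a, b], threshold=1)

def a_and_not_b(a, b):
    # B's connection is inhibitory: it subtracts from the activation
    return 1 if (a - b) >= 1 else 0
```

Running `logical_and(1, 1)` gives 1 while `logical_and(1, 0)` gives 0, and `a_and_not_b(1, 1)` gives 0 because the inhibitory input from B suppresses C.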
The diagram above shows a simple multi-layer network of neurons, with the hidden layers drawn in blue and magenta. We will use this network to classify things and make predictions. It consists of an input layer, two hidden layers of neurons, and an output layer.
From the left, we have:
- The input layer of our model in orange.
- Our first hidden layer of neurons in blue.
- Our second hidden layer of neurons in magenta.
- These outputs are then passed on to the output layer shown in yellow to predict the result.
The arrows connecting the dots show how the neurons are linked and how data travels from the input layer all the way through to the output layer.
A neural network contains layers of interconnected nodes. Each node is a perceptron and is similar to a multiple linear regression: the perceptron feeds the signal produced by the linear combination of its inputs into an activation function that may be nonlinear.
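A single node can be sketched as follows (the input values, weights, bias, and the choice of a sigmoid activation are illustrative assumptions):

```python
import numpy as np

def perceptron(x, w, b):
    """One node: a weighted sum (as in multiple linear regression)
    passed through a nonlinear activation, here a sigmoid."""
    z = np.dot(w, x) + b              # linear part: w1*x1 + ... + wn*xn + b
    return 1.0 / (1.0 + np.exp(-z))   # nonlinear activation squashes to (0, 1)

x = np.array([0.5, -1.2, 3.0])   # hypothetical inputs
w = np.array([0.4, 0.1, -0.2])   # hypothetical learned weights
out = perceptron(x, w, b=0.1)
```

The sigmoid keeps the node's output between 0 and 1; other activations (tanh, ReLU) are common as well.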
In our case, a neural network helps evaluate price data and uncover trading opportunities based on that analysis. Such networks can detect subtle nonlinear interdependencies and patterns that other methods of technical analysis cannot.
Different types of Neural Networks
- Feedforward Neural Network – Artificial Neuron
It is one of the simplest types of artificial neural networks. In a feedforward neural network, data passes through the input nodes in one direction until it reaches the output node.
Each node's activation is calculated as the sum of the products of the inputs and their weights, and the result is fed forward towards the output.
- Radial Basis Function Neural Network
A radial basis function considers the distance of any point relative to a centre. Such neural networks have two layers: in the inner layer, the features are combined with the radial basis function.
The output of this layer is then taken into account when computing the output in the next step.
- Multilayer Perceptron
A multilayer perceptron has three or more layers. It is used to classify data that cannot be separated linearly. It is a type of artificial neural network that is fully connected, because every single node in a layer is connected to each node in the following layer.
- Convolutional Neural Network
A convolutional neural network (CNN) uses a variation of the multilayer perceptron. A CNN contains one or more convolutional layers. These layers can either be completely interconnected or pooled.
Before passing the result to the next layer, the convolutional layer applies a convolutional operation to the input. Thanks to this operation, the network can be much deeper while having far fewer parameters. CNNs are also used for image analysis and recognition in agriculture, where features are extracted from satellite imagery (such as Landsat) to predict the growth and yield of a piece of land.
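To illustrate the convolutional operation itself, here is a minimal "valid" (no-padding) 2-D convolution in NumPy; the image and filter values are made up:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image and take the elementwise
    product-sum at each position ('valid' mode, stride 1)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
edge = np.array([[1.0, -1.0]])                    # tiny horizontal-difference filter
result = conv2d(image, edge)                      # shape (4, 3)
```

Because the same small kernel is reused at every position, the layer needs only 2 parameters here instead of one weight per pixel, which is why deep CNNs stay compact.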
- Recurrent Neural Network (RNN) – Long Short-Term Memory
A Recurrent Neural Network is a type of artificial neural network in which the output of a particular layer is saved and fed back to the input. This helps predict the outcome of the layer.
The first layer is formed in the same way as in a feedforward network, that is, with the weighted sum of the inputs. In subsequent layers, however, the recurrent process begins.
From each time-step to the next, each node remembers some information that it had in the previous time-step. In other words, each node acts as a memory cell while computing and carrying out operations. The network begins with forward propagation as usual but remembers the information it may need to use later.
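A single recurrent step can be sketched as follows (the tanh activation, the layer sizes, and the random parameters are illustrative assumptions):

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One recurrent step: the new hidden state mixes the current
    input with the state remembered from the previous time-step."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(0)
Wx = rng.normal(size=(4, 3))   # input-to-hidden weights (3 input features)
Wh = rng.normal(size=(4, 4))   # hidden-to-hidden weights (the "memory" path)
b = np.zeros(4)

h = np.zeros(4)                          # initial hidden state
for x_t in rng.normal(size=(5, 3)):      # a sequence of 5 time-steps
    h = rnn_step(x_t, h, Wx, Wh, b)      # h carries information forward
```

The same weights `Wx` and `Wh` are applied at every time-step; only the hidden state `h` changes as the sequence is consumed.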
If the prediction is wrong, the system self-learns and works towards making the right prediction during the backpropagation. This type of neural network is very effective in text-to-speech conversion technology.
- Sequence-To-Sequence Models
A sequence to sequence model consists of two recurrent neural networks. There’s an encoder that processes the input and a decoder that processes the output. The encoder and decoder can either use the same or different parameters. This model is particularly applicable in those cases where the length of the input data is not the same as the length of the output data.
Sequence-to-sequence models are applied mainly in chatbots, machine translation, and question-answering systems.
Long Short Term Memory
Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture. Unlike standard feedforward neural networks, LSTM has feedback connections. It can not only process single data points (such as images), but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.
LSTM networks are well-suited to classifying, processing and making predictions based on time series data, since there can be lags of unknown duration between important events in a time series. LSTMs were developed to deal with the exploding and vanishing gradient problems that can be encountered when training traditional RNNs.
LSTM can be used to create large recurrent networks that in turn can be used to address difficult sequence problems in machine learning and achieve state-of-the-art results.
Instead of neurons, LSTM networks have memory blocks that are connected through layers.
A block has components that make it smarter than a classical neuron, including a memory for recent sequences. Gates within the block manage its state and output. A block operates on an input sequence, and each gate uses a sigmoid activation unit to control whether it is triggered, making the change of state and the addition of information flowing through the block conditional.
There are three types of gates within a unit:
- Forget Gate: conditionally decides what information to throw away from the block.
- Input Gate: conditionally decides which values from the input should update the memory state.
- Output Gate: conditionally decides what to output based on input and the memory of the block.
Each unit is like a mini-state machine where the gates of the units have weights that are learned during the training procedure.
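The three gates can be sketched as one LSTM step in NumPy (the parameters below are random and purely illustrative; a real network learns `W`, `U`, and `b` during training):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM block step. W, U, b hold parameters for the forget
    gate (f), input gate (i), candidate values (g) and output gate (o)."""
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # what to throw away
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # what to write
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])  # candidate values
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # what to output
    c = f * c_prev + i * g          # updated memory state
    h = o * np.tanh(c)              # block output
    return h, c

rng = np.random.default_rng(1)
n, d = 4, 3                         # 4 hidden units, 3 input features
W = {k: rng.normal(size=(n, d)) for k in "figo"}
U = {k: rng.normal(size=(n, n)) for k in "figo"}
b = {k: np.zeros(n) for k in "figo"}

h, c = np.zeros(n), np.zeros(n)
for x_t in rng.normal(size=(6, d)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

Note how the sigmoid gates produce values between 0 and 1, so each one acts as a soft switch over the memory state `c`.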
STOCK PRICE PREDICTION USING LSTM
Nowadays, one of the most significant challenges in the stock market is predicting stock prices. Stock price data is a financial time series, which is difficult to predict because of its characteristics and dynamic nature. Here we will be predicting Sensex stock prices.
Step 1: Raw Data: In this stage, the historical stock data is collected from www.bseindia.com and this historical data is used for the prediction of future stock prices.
For training our LSTM model we will be using the closing price.
Step 2: Data Pre-processing: The pre-processing stage involves
a) Data discretization: Data discretization is the process of converting continuous attribute values into a finite set of intervals with minimal loss of information. It is part of data reduction and is particularly important for numerical data.
b) Data transformation: Normalization. The goal of normalization is to rescale the values of numeric columns in the dataset to a common scale, without distorting differences in the ranges of values.
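A minimal min-max normalization might look like this (the price values are hypothetical; scikit-learn's `MinMaxScaler` performs the same rescaling):

```python
import numpy as np

prices = np.array([33500.0, 33750.0, 34100.0, 33900.0])  # hypothetical closes

lo, hi = prices.min(), prices.max()
scaled = (prices - lo) / (hi - lo)   # every value now lies in [0, 1]
```

Rescaling to [0, 1] keeps the sigmoid and tanh units inside the LSTM in their sensitive range, which usually speeds up training.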
c) Data integration: Integration of data files. After the dataset is transformed into a clean dataset, it is divided into training and testing sets for evaluation. We then create a data structure with 45 time-steps and 1 output.
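Creating that data structure can be sketched as a sliding window; a 100-point stand-in series is used here for illustration:

```python
import numpy as np

def make_windows(series, steps=45):
    """Turn a 1-D price series into (samples, steps) input windows,
    with the value right after each window as its target."""
    X, y = [], []
    for i in range(steps, len(series)):
        X.append(series[i - steps:i])   # the previous 45 time-steps
        y.append(series[i])             # the 1 output to predict
    return np.array(X), np.array(y)

series = np.arange(100, dtype=float)    # stand-in for scaled closing prices
X, y = make_windows(series)             # X: (55, 45), y: (55,)
```

Each training sample is therefore the last 45 (scaled) closing prices, and the label is the next day's price.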
Step 3: Importing the necessary packages to create our LSTM model. TensorFlow 2.0 is used to create the model.
Step 4: Training the Neural Network: In this stage, the data is fed to the neural network and trained for prediction, starting from randomly assigned weights and biases. Our LSTM model is composed of a sequential input layer followed by 3 LSTM layers, a dense layer with an activation function, and finally a dense output layer with a linear activation function.
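A minimal Keras sketch of such an architecture might look like this (the layer sizes, dropout rate, and ReLU activation are illustrative assumptions, not the exact model used):

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

model = Sequential([
    tf.keras.Input(shape=(45, 1)),       # 45 time-steps, 1 feature (close)
    LSTM(50, return_sequences=True),     # LSTM layer 1
    Dropout(0.2),
    LSTM(50, return_sequences=True),     # LSTM layer 2
    Dropout(0.2),
    LSTM(50),                            # LSTM layer 3
    Dropout(0.2),
    Dense(25, activation="relu"),        # dense layer with activation
    Dense(1, activation="linear"),       # linear output: predicted price
])
model.compile(optimizer="adam", loss="mean_squared_error")
```

The first two LSTM layers use `return_sequences=True` so that the next LSTM layer receives the full sequence of hidden states rather than only the last one.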
The type of optimizer used can greatly affect how fast the algorithm converges to the minimum value. It is also important to have some notion of randomness, to avoid getting stuck in a local minimum and never reaching the global minimum. There are several good algorithms, but here we have chosen the Adam optimizer, which combines the perks of two other optimizers: AdaGrad and RMSProp.
Dropout is a technique where randomly selected neurons are ignored during training: they are "dropped out" at random. This means their contribution to the activation of downstream neurons is temporarily removed on the forward pass, and no weight updates are applied to them on the backward pass. This method of preventing overfitting considers what happens when some of the neurons suddenly stop working: it forces the model not to become over-dependent on any group of neurons and to consider all of them. Dropout makes the network more robust, allowing it to predict the trend without relying on any single neuron. Here are the results of using dropout.
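The mechanism can be sketched in a few lines of NumPy (this is the common "inverted dropout" variant, where surviving activations are rescaled; the rate is illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
activations = np.ones(10)            # outputs of ten neurons, all 1.0 here

rate = 0.2                           # drop roughly 20% of neurons this pass
mask = rng.random(10) >= rate        # False = neuron dropped for this pass
dropped = activations * mask / (1 - rate)  # rescale survivors to keep the
                                           # expected activation unchanged
```

A different random `mask` is drawn on every forward pass during training; at prediction time dropout is switched off and all neurons participate.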
Step 5: Output Generation: In this stage, the output value generated by the output layer of the RNN is compared with the target value. The error, i.e. the difference between the target and the obtained output, is minimized using the backpropagation algorithm, which adjusts the weights and biases of the network.
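The weight-adjustment idea behind backpropagation can be sketched for a single linear neuron (the values and learning rate are hypothetical; a real network applies the same gradient rule through every layer via the chain rule):

```python
# One neuron with squared error E = (pred - target)^2 / 2:
# nudge each parameter against its gradient until the error shrinks.
w, b, lr = 0.5, 0.0, 0.1   # initial weight, bias, learning rate
x, target = 2.0, 3.0       # a single training example

for _ in range(100):
    pred = w * x + b       # forward pass
    error = pred - target  # compare output with target
    w -= lr * error * x    # dE/dw = error * x
    b -= lr * error        # dE/db = error
```

After a few dozen iterations `w * x + b` is essentially equal to the target; with many layers, the same error signal is propagated backwards through each one.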
The data pre-processing step is again repeated for the test data.
Visualizing the Prediction
The popularity of stock market trading is growing rapidly, which is encouraging researchers to find new prediction methods using new techniques. Such forecasting techniques help not only researchers but also investors and anyone else dealing with the stock market. To help predict stock indices, a forecasting model with good accuracy is required. In this work, we have used one of the most effective forecasting techniques, a Recurrent Neural Network with Long Short-Term Memory units, which gives investors, analysts, or anyone interested in investing in the stock market a better view of its future behaviour.
About Sandip Yadav:
LinkedIn ID: https://www.linkedin.com/in/sandip-yadav-16118b194
Sandip Yadav holds a B.Tech in Computer Science. He is currently working as an Analyst Intern with NikhilGuru Consulting Analytics Service LLP (Nikhil Analytics), Bangalore.