- 2020-11-23 21:14
- LSTM
- lstm prediction model

In this article, we will use neural networks, specifically an LSTM model, to predict the behavior of time series data. The problem to be solved is the classic stock market forecast. All the data and code used are available from the author.

Although this is an old problem, it remains unsolved to this day. The reason is simple: stock prices are determined by many factors, and historical prices are only a small part of them. Forecasting price behavior is therefore a very difficult problem.

Abstract

First, I will introduce the dataset with some data visualization. Then I will briefly discuss the difficulty and limitations of using the moving average algorithm to predict stock market behavior. Next come the concepts of recurrent neural networks and LSTM, followed by an example that uses an LSTM to predict the stock price of a single company. Finally, I will show an LSTM that predicts the prices of 4 companies at the same time and compare the results, to see whether the forecast improves as we use more companies at once.

Data visualization

The dataset was downloaded from Yahoo Finance in CSV format. It contains the stock prices of 4 companies for the period from 01/08/2010 to 01/07/2019. We call them companies A, B, C and D.

The basic step is to open the CSV file with Pandas. First, look at the data:

import pandas as pd
import matplotlib.pyplot as plt

# Load company A and parse the Date column
df_A = pd.read_csv('data/company_A.csv')
df_A['Date'] = pd.to_datetime(df_A['Date'])
df_A.tail()

# df_B, df_C and df_D are assumed to be loaded the same way as df_A
plt.figure(figsize=(15, 10))
plt.plot(df_A['Date'], df_A['Close'], label='Company A')
plt.plot(df_B['Date'], df_B['Close'], label='Company B')
plt.plot(df_C['Date'], df_C['Close'], label='Company C')
plt.plot(df_D['Date'], df_D['Close'], label='Company D')
plt.legend(loc='best')
plt.show()

Closing prices of all 4 companies

Moving average

A classical algorithm used for this problem is the moving average (MA). It computes the average of the past m observed days and uses that result as the prediction for the next day. To illustrate, here is an example of a moving average of the company's closing price with m set to 10 and 20 days.

# shift so the day we want to predict won't be used
df['MA_window_10'] = df['Close'].rolling(10).mean().shift()
df['MA_window_20'] = df['Close'].rolling(20).mean().shift()

One-step forecast of company A's closing price using the moving average

When we try to use the moving average to predict the closing price for the next 10 days, the result is as follows:

10-day forecast of company A's closing price using the moving average

Each red line represents a 10-day forecast based on the past 10 days. This is why the red line is discontinuous.
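The article does not show how the 10-day forecast is produced. One common approach, assumed here, is to roll the average forward recursively: each predicted value is appended to the window and used for the next prediction. The function name `ma_forecast` is hypothetical, and the prices are toy values.

```python
import numpy as np

def ma_forecast(history, window=10, steps=10):
    """Recursive moving-average forecast (an assumed approach, not the author's code)."""
    values = list(history[-window:])  # last `window` observed closing prices
    forecast = []
    for _ in range(steps):
        next_value = float(np.mean(values[-window:]))  # average of the latest window
        forecast.append(next_value)
        values.append(next_value)  # feed the prediction back into the window
    return forecast

closes = [10.0, 11.0, 12.0, 13.0]  # toy closing prices
print(ma_forecast(closes, window=2, steps=3))  # [12.5, 12.75, 12.625]
```

Because every new value is an average of earlier ones, a forecast built this way flattens out quickly, which is consistent with the limitation discussed in this section.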

Using the exponential moving average (EMA), we get a small improvement:

One-step forecast of company A's closing price using the exponential moving average
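The EMA code is not shown in the article. A minimal sketch with pandas, assuming the same `Close` column and the same one-day shift as the moving average example, could look like this. The span of 3 and the column name `EMA_window_3` are hypothetical choices for the toy data.

```python
import pandas as pd

df = pd.DataFrame({'Close': [10.0, 11.0, 12.0, 13.0, 12.0, 11.0]})  # toy prices

span = 3  # hypothetical window; the article does not state the EMA span
# adjust=False gives the recursive form:
#   ema_t = alpha * close_t + (1 - alpha) * ema_{t-1}, with alpha = 2 / (span + 1)
ema = df['Close'].ewm(span=span, adjust=False).mean()

# shift so the day we want to predict won't be used
df['EMA_window_3'] = ema.shift()
print(df)
```

Recent prices get a larger weight than older ones, so the EMA reacts faster to changes than a plain moving average of similar length.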

Comparing MA and EMA:

Comparison of one-step forecasts of company A's closing price using MA and EMA

This method is simple, but what we really want is to forecast the stock's trend n days ahead, and MA and EMA cannot accomplish that task.

Recurrent neural network (RNN)

To understand LSTM networks, we first need to understand recurrent neural networks. This kind of network is used to recognize patterns when past results influence the current result. A typical use of RNNs is time series, where the order of the data is very important.

In this network architecture, a neuron takes as input not only the regular input (the output of the previous layer) but also its own previous state.

RNN architecture

Note that H represents the neuron's state. Thus, when in state H_1, the neuron uses the parameter X_1 and H_0 (its previous state) as input. The main problem with this model is memory loss: older states are quickly forgotten. In sequences where we need to remember more than the immediate past, RNNs cannot remember.
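The recurrence described above can be sketched in a few lines of NumPy. This is a textbook-style plain RNN cell with a tanh activation, not code from the article; the weight names `W_x`, `W_h`, `b` and the sizes are illustrative assumptions.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One step of a plain RNN cell: the new state depends on the input and the previous state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

rng = np.random.default_rng(0)
input_size, hidden_size = 1, 4  # illustrative sizes
W_x = rng.normal(size=(hidden_size, input_size))
W_h = rng.normal(size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)   # H_0
for x in [0.1, 0.2, 0.3]:   # X_1, X_2, X_3
    h = rnn_step(np.array([x]), h, W_x, W_h, b)  # H_t is computed from X_t and H_{t-1}
print(h)
```

Each state is squashed through tanh and multiplied by `W_h` again at the next step, which is one way to see why the influence of early inputs fades quickly.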

LSTM network

The LSTM network derives from the RNN, but it addresses the memory loss by changing the structure of the neuron.

LSTM neuron architecture

The new neuron has 3 gates, each with a different purpose. The gates are:

* Input gate

* Output gate

* Forget gate

An LSTM neuron still receives its previous state as input:

An LSTM neuron passing its previous state as a parameter.
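To make the role of the three gates concrete, here is a minimal NumPy sketch of one step of the standard LSTM formulation. It is not the article's code: the stacked weight matrix `W`, the bias `b` and the gate ordering inside them are assumptions chosen for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4 * hidden, hidden + inputs); b has shape (4 * hidden,)."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    i = sigmoid(z[0 * hidden:1 * hidden])  # input gate: how much new information to store
    f = sigmoid(z[1 * hidden:2 * hidden])  # forget gate: how much of the old cell state to keep
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate: how much of the cell state to expose
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate values for the cell state
    c_t = f * c_prev + i * g               # updated cell state (the long-term memory)
    h_t = o * np.tanh(c_t)                 # updated hidden state (the neuron's output)
    return h_t, c_t

# With all-zero weights every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved.
h_t, c_t = lstm_step(np.array([1.0]), np.zeros(2), np.ones(2), np.zeros((8, 3)), np.zeros(8))
print(c_t)  # [0.5 0.5]
```

The cell state `c_t` is updated additively and scaled by the forget gate, which lets information survive over many more steps than in the plain RNN.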

LSTM forecast for a single company

Finally, let's use an LSTM to predict the behavior of company A.

But first, consider the following parameters. We want to predict n days into the future (forward_days) given the past m days of observations (look_back). So, given an input of the past m days, the network outputs a forecast for the next n days. We split the data into train and test sets. The test set consists of k periods (num_periods), each of which is a series of n days of forecasts.
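The article does not show how the series is cut into samples. A sketch of one way to do it, assuming a sliding window over a 1-D array of closing prices, is given below. The helper name `make_windows` and its `step` parameter are hypothetical.

```python
import numpy as np

def make_windows(series, look_back, forward_days, step=1):
    """Split a 1-D series into (input, target) pairs:
    look_back past values as input, the next forward_days values as target."""
    X, y = [], []
    last_start = len(series) - look_back - forward_days
    for start in range(0, last_start + 1, step):
        X.append(series[start:start + look_back])
        y.append(series[start + look_back:start + look_back + forward_days])
    return np.array(X), np.array(y)

series = np.arange(100, dtype=float)  # toy series standing in for closing prices
X, y = make_windows(series, look_back=40, forward_days=10)
print(X.shape, y.shape)  # (51, 40) (51, 10)
```

For a Keras LSTM layer, `X` would additionally be reshaped to (samples, look_back, 1), since the layer expects one feature dimension per time step.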

look_back = 40
forward_days = 10
num_periods = 20

Now we open the CSV file with Pandas and keep only the columns we will use: the date and the closing price. The initial closing price chart for company A is:

plt.figure(figsize=(15, 10))
plt.plot(df)
plt.show()

Closing price of company A

Next, we scale the input, split the data into train/validation and test sets, and format it to feed the model.
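The preprocessing code is not included in the article. The following sketch shows one plausible version of the scaling and the chronological test split under stated assumptions: the test set is the last num_periods * forward_days days, and min-max scaling is fitted on the training part only. The placeholder `prices` array is not real data.

```python
import numpy as np

look_back, forward_days, num_periods = 40, 10, 20

# Placeholder series standing in for company A's closing prices (not real data)
prices = np.linspace(50.0, 150.0, 1000)

# Assumption: hold out the last num_periods * forward_days days for testing,
# preceded by look_back days of context for the first test window
test_size = num_periods * forward_days
split = len(prices) - test_size
train_raw = prices[:split]
test_raw = prices[split - look_back:]

# Assumption: min-max scaling fitted on the training part only,
# so no statistics from the test period leak into training
low, high = train_raw.min(), train_raw.max()

def scale(values):
    return (values - low) / (high - low)

train_scaled = scale(train_raw)
test_scaled = scale(test_raw)

# Keras LSTM layers expect (samples, timesteps, features): here one window, one feature
first_test_window = test_scaled[:look_back].reshape(1, look_back, 1)
print(train_scaled.shape, test_scaled.shape, first_test_window.shape)
```

The validation set could then be carved out of the training windows; how the author did this is not stated.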

Now we build and train the model.

# Imports assumed; the original snippet does not show them
from keras.models import Sequential
from keras.layers import LSTM, Dense

NUM_NEURONS_FirstLayer = 128
NUM_NEURONS_SecondLayer = 64
EPOCHS = 220

# Build the model
model = Sequential()
model.add(LSTM(NUM_NEURONS_FirstLayer, input_shape=(look_back, 1), return_sequences=True))
# The second LSTM layer infers its input shape from the first layer, so input_shape is omitted
model.add(LSTM(NUM_NEURONS_SecondLayer))
model.add(Dense(forward_days))
model.compile(loss='mean_squared_error', optimizer='adam')

# Train the model
history = model.fit(X_train, y_train, epochs=EPOCHS, validation_data=(X_validate, y_validate), shuffle=True, batch_size=2, verbose=2)

The result is:

Company A's model applied to all the data

A closer look at the test set:

Company A's model applied to the test set

Note that each red line represents a 10-day forecast (forward_days) based on the past 40 days (look_back). There are 20 red lines because we chose to test on 20 periods (num_periods). This is why the red prediction line is discontinuous.

Repeating the same process for all companies, the best result on the test set was the prediction for company C:

Company C's model applied to the test set

Although this is the best model, the result is still not very good. There are many possible reasons for this. Some of them may be:

* The historical closing price alone is not enough to predict the stock price

* The model itself could still be improved

LSTM forecast for 4 companies

Lastly, we will use an LSTM model to predict the behavior of all 4 companies, A, B, C and D, together, and compare the results with those of the single-company LSTMs. The goal is to analyze whether using data from multiple companies can improve the individual forecast for each company.

It should be pointed out that all 4 CSV files cover the same dates. This way, the network does not receive future information from one company when predicting the values of another.
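If the files did not already share the same dates, one way to enforce it is an inner join on the date. This is a sketch, assuming each DataFrame has `Date` and `Close` columns as in the loading example; the helper name `align_on_dates` and the toy frames are hypothetical.

```python
import pandas as pd

def align_on_dates(frames):
    """Keep only the dates present in every company's data (inner join on Date)."""
    closes = [df.set_index('Date')['Close'].rename(name) for name, df in frames.items()]
    return pd.concat(closes, axis=1, join='inner').sort_index()

# Toy frames with partially overlapping dates (not real data)
df_A = pd.DataFrame({'Date': pd.to_datetime(['2019-01-02', '2019-01-03', '2019-01-04']),
                     'Close': [1.0, 2.0, 3.0]})
df_B = pd.DataFrame({'Date': pd.to_datetime(['2019-01-03', '2019-01-04', '2019-01-07']),
                     'Close': [5.0, 6.0, 7.0]})

aligned = align_on_dates({'A': df_A, 'B': df_B})
print(aligned)  # two rows: 2019-01-03 and 2019-01-04, columns A and B
```

Each row of `aligned` then holds the closing prices of all companies for a single day, which is the shape a multi-feature LSTM input is built from.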

The initial data is:

Closing prices of all 4 companies

After normalizing the data and formatting it for the model, we train the model:

NUM_NEURONS_FirstLayer = 100
NUM_NEURONS_SecondLayer = 50
EPOCHS = 200
num_companies = 4  # companies A, B, C and D

# Build the model: one input feature per company, one output per company and forecast day
model = Sequential()
model.add(LSTM(NUM_NEURONS_FirstLayer, input_shape=(look_back, num_companies), return_sequences=True))
# The second LSTM layer infers its input shape from the first layer, so input_shape is omitted
model.add(LSTM(NUM_NEURONS_SecondLayer))
model.add(Dense(forward_days * num_companies))
model.compile(loss='mean_squared_error', optimizer='adam')

# Train the model
history = model.fit(X_train, y_train, epochs=EPOCHS, validation_data=(X_validate, y_validate), shuffle=True, batch_size=1, verbose=2)
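The final Dense layer outputs a flat vector of forward_days * num_companies values, so the targets and predictions have to be flattened and unflattened consistently. The article does not show the layout it uses; the sketch below assumes one row per future day and one column per company, with toy values.

```python
import numpy as np

forward_days, num_companies = 10, 4

# Hypothetical target block: one row per future day, one column per company (toy values)
y_block = np.arange(forward_days * num_companies, dtype=float).reshape(forward_days, num_companies)

# Flat vector of length forward_days * num_companies, matching the Dense layer's output size
y_flat = y_block.reshape(-1)

# A prediction of the same length is unflattened with the same layout
restored = y_flat.reshape(forward_days, num_companies)
company_A_forecast = restored[:, 0]  # column 0 is company A under this assumed layout
print(y_flat.shape, company_A_forecast)
```

Whatever layout is chosen, it has to be the same when building the targets and when reading the predictions, otherwise the companies' forecasts get mixed up.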

The result is:

Results of the 4-company LSTM model

A closer look at the test set:

Test set predictions of the 4-company LSTM model

It's time to compare the results. The single-company LSTM results are shown on the left, and the 4-company LSTM results on the right. The first row shows the predictions on the test set, and the second row shows the predictions on the whole dataset.

Company A

Company B

Company C

Company D
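The comparison above is visual. One hedged way to quantify it, not used in the article, is the root mean squared error (RMSE) between actual and predicted closing prices on the test set, computed for each model. The helper `rmse` and the numbers below are illustrative placeholders, not results.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Placeholder values only, standing in for test-set prices and two models' predictions
actual = [10.0, 11.0, 12.0]
single_company_pred = [10.5, 11.5, 12.5]
four_company_pred = [11.0, 12.0, 13.0]

print(rmse(actual, single_company_pred))  # 0.5
print(rmse(actual, four_company_pred))    # 1.0
```

A lower RMSE means the forecast is closer to the actual prices on average, so the two models can be ranked per company with a single number.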

Conclusion

It is not possible to predict the behavior of the stock market from historical prices alone. The LSTM predictions are not acceptable. Even when the historical prices of several companies are used, the predictions get worse.
