A time series is a sequence of data points recorded in successive order over time. It shows how the variables in a dataset change over time.
Time series analysis extracts meaningful patterns and attributes from the historical data. It enables the model to gain knowledge and identify trends in the dataset.
Time series forecasting builds a model that predicts future values based on historical data. Such a model can forecast foreign exchange rates, stock prices, weather, and Covid-19 caseloads. In stock prediction, a time series model tracks the movement of stock prices, such as Apple stock. Accurate predictions can help investors make a profit.
In this tutorial, we will build an electricity consumption prediction model. We will use the Auto Time Series library (AutoTS) to train the model.
Table of contents
 Prerequisites
 Getting started with Auto Time Series library
 Benefits of using Auto Time Series library
 Installing Auto Time Series library
 Working with the dataset
 Loading the dataset
 Plotting the line graph
 Splitting the dataset
 Selecting the timestamp and the target columns
 Initializing the Auto Time Series model
 Selecting the best model
 Actual vs Forecast values
 Conclusion
 References
Prerequisites
To easily understand this article, a reader should:
 Understand time series
 Know how to build a time series model
 Understand time series decomposition in Python
 Know some of the different types of time series models
 Use Google Colab notebook
Getting started with Auto Time Series library
Auto Time Series (AutoTS) is an open-source Python library that automates time series analysis and forecasting. It trains high-accuracy models within a short time. AutoTS automatically runs multiple time series models on the training dataset. It then selects the best-performing model among them.
There are different types of time series models. The most common models that Auto Time Series runs are as follows:
 PyFlux Model.
 Non-Seasonal ARIMA Model.
 Seasonal SARIMAX Model.
 Basic Machine Learning Model.
 Vector Autoregressive Model.
 Facebook Prophet Model.
All the listed models above support time series analysis and forecasting. AutoTS chooses the best model based on its accuracy score and predictions made. We will then plot a line graph to show the forecast values.
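The idea of running several models and keeping the one with the lowest error can be sketched in plain Python. This is only an illustration of the selection principle, not AutoTS's internal code: the two forecasters (`last_value_forecast`, `seasonal_naive_forecast`) are hypothetical stand-ins for real models, and the numbers are made up.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Toy series (made-up numbers, not the tutorial's dataset).
history = [100, 120, 140, 130, 110, 125, 145, 135]
holdout = [115, 130, 150, 140]

# Hypothetical baseline forecasters standing in for real models.
def last_value_forecast(series, steps):
    # Repeat the most recent observation.
    return [series[-1]] * steps

def seasonal_naive_forecast(series, steps, period=4):
    # Repeat the values observed one full period earlier.
    return [series[-period + (i % period)] for i in range(steps)]

candidates = {
    "last_value": last_value_forecast(history, len(holdout)),
    "seasonal_naive": seasonal_naive_forecast(history, len(holdout)),
}

# Score every candidate on the hold-out data and keep the lowest RMSE.
scores = {name: rmse(holdout, forecast) for name, forecast in candidates.items()}
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

Scoring every candidate on the same hold-out data keeps the comparison fair, which is why the split into training and testing sets matters later in the tutorial.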
Benefits of using Auto Time Series library

It performs automated dataset preprocessing. It will automatically transform the input dataset into a format the model can use. It removes noise and unnecessary information in the dataset.

It can handle missing values and outliers. AutoTS handles the missing values to ensure we have a complete dataset. It also removes outliers, which are values that fall far outside the range of the rest of the data.

It trains high-accuracy models. AutoTS produces reliable and accurate models.

It selects the optimal time series model. AutoTS automatically runs the time series models listed above. It then selects the optimal model, which is the one that gives the most accurate results.

It performs automatic hyperparameter tuning and configuration. AutoTS automatically fine-tunes the model parameters so that the model gives the best accuracy score.
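To make the missing-value and outlier handling concrete, here is a minimal sketch of how such cleaning could be done by hand with pandas. It assumes linear interpolation for gaps and the common 1.5 × IQR rule for outliers; these are illustrative choices, not necessarily what AutoTS does internally, and the data is made up.

```python
import numpy as np
import pandas as pd

# Toy monthly series with one gap and one extreme value (made-up numbers).
values = pd.Series(
    [100.0, 110.0, np.nan, 130.0, 900.0, 120.0],
    index=pd.date_range("2019-01-01", periods=6, freq="MS"),
)

# Fill the gap by linear interpolation between neighbouring points.
filled = values.interpolate(method="linear")

# Flag points lying more than 1.5 * IQR beyond the quartiles.
q1, q3 = filled.quantile(0.25), filled.quantile(0.75)
iqr = q3 - q1
is_outlier = (filled < q1 - 1.5 * iqr) | (filled > q3 + 1.5 * iqr)

# Replace flagged points with the median of the remaining points.
cleaned = filled.mask(is_outlier, filled[~is_outlier].median())
print(cleaned)
```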
Installing Auto Time Series library
To install the Auto Time Series library, run this command:
!pip install auto_ts
We import the library using this code:
import auto_ts as AT
Let’s now start working with our dataset.
Working with the dataset
We will use the electricity consumption dataset to train the model. The dataset shows the monthly electricity consumption of an individual household from 2016-01-01 to 2020-05-01. You can download the electricity consumption dataset here.
The dataset output:
From the image above, the dataset has six columns:

Bill_Date: It shows the date on which the billing period ends.
On_peak: It is the electricity consumption during the peak season.
Off_peak: It is the electricity consumption during the off-peak season.
Usage_charge: It is the total cost of electricity consumption without the tax.
Billed_amount: It is the total cost of electricity consumption and the tax.
Billing_days: It shows the number of days within the billing period.
We need to convert the Bill_Date column to the DateTime format. The DateTime format is the format Auto Time Series understands. It also enables us to perform time-series operations on this column.
We will use the Python datetime module.
from datetime import datetime
Let’s create a Python function to convert the Bill_Date column to the DateTime format.
def parse(x):
    # Convert a 'month/day/year' string into a datetime object.
    return datetime.strptime(x, '%m/%d/%Y')
We will call the function when loading the dataset.
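As a quick check of what the format string expects, the function can be tried on a single value. The date strings below are hypothetical examples, not values taken from the dataset.

```python
from datetime import datetime

def parse(x):
    # '%m/%d/%Y' expects month/day/four-digit year, e.g. '1/31/2016'.
    return datetime.strptime(x, '%m/%d/%Y')

sample = parse('1/31/2016')  # hypothetical value
print(sample)

# A string in a different layout raises ValueError instead of being guessed.
try:
    parse('2016-01-31')
except ValueError as err:
    print('Unparseable date:', err)
```

If the dates in your own file use another layout, the format string is the only part that needs to change.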
Loading the dataset
We will load the dataset using Pandas.
import pandas as pd
To load the dataset and also convert the Bill_Date column to the DateTime format, use this code:
df = pd.read_csv('/content/electricity_consumption.csv', parse_dates = ['Bill_Date'], date_parser=parse)
To see the loaded dataset, use this code:
df
The output of the dataset:
To check the dataset information, use this code:
df.info()
The output:
From the output, the dataset has 53 entries. Also, there are no missing values.
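The date conversion can also be done after reading the file, which may be useful because recent pandas releases deprecate the `date_parser` argument of `read_csv`. The sketch below uses `pd.to_datetime` with an explicit format; the two-row CSV text and the name `df_sample` are made up for illustration and stand in for the real file.

```python
import io
import pandas as pd

# Two made-up rows standing in for electricity_consumption.csv.
csv_text = "Bill_Date,Billed_amount\n1/1/2016,100.5\n2/1/2016,98.2\n"

# Read the dates as plain strings first, then convert them explicitly.
df_sample = pd.read_csv(io.StringIO(csv_text))
df_sample['Bill_Date'] = pd.to_datetime(df_sample['Bill_Date'], format='%m/%d/%Y')
print(df_sample.dtypes)
```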
Let’s make Bill_Date the index column.
ec_df = df.set_index('Bill_Date')
To see the dataset with Bill_Date as the index column, use this code:
ec_df.head()
The dataset output:
Selecting the dependent variable
The dependent variable is the variable that the model will predict. This variable changes with time. The dependent variable is the Billed_amount.
ec_data = ec_df['Billed_amount']
Plotting the line graph
We will plot the line graph that shows the data points using Matplotlib. Let’s import Matplotlib.
import matplotlib.pyplot as plt
To plot the line graph, use this code:
ec_data.plot(grid=True)
The line graph output:
The image shows the Billed_amount against the Bill_Date from 2016 to 2020.
Let’s plot a line graph to show electricity consumption for 2019.
Line graph for 2019
To plot the line graph, use this code:
ec_df_2019=ec_df.loc['2019']
ec_data_2019=ec_df_2019['Billed_amount']
ec_data_2019.plot(grid=True)
The output:
From the image above, the highest energy consumption was in September. We can also plot a bar graph to show electricity consumption for 2019.
Bar graph for 2019
To plot the bar graph, use this code:
ec_df_2019=ec_df.loc['2019']
ec_data_2019=ec_df_2019['Billed_amount']
ec_data_2019.plot.bar()
The output:
The bar graph shows the highest energy consumption was in September.
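The peak month read off the graphs can also be found programmatically. The sketch below uses `idxmax` on a series indexed by date; the twelve values and the name `bills_2019` are made up for illustration, so with the real data you would apply the same call to `ec_data_2019`.

```python
import pandas as pd

# Made-up monthly bills for one year, indexed by billing date.
bills_2019 = pd.Series(
    [90.0, 85.0, 80.0, 95.0, 110.0, 150.0, 170.0, 180.0, 210.0, 140.0, 100.0, 95.0],
    index=pd.date_range('2019-01-01', periods=12, freq='MS'),
    name='Billed_amount',
)

# idxmax returns the index label (here a timestamp) of the largest value.
peak_date = bills_2019.idxmax()
print(peak_date.strftime('%B'), bills_2019.max())
```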
Creating a copy of the dataset
We will use this copy of the dataset to train the model.
final_df = df.copy()
final_df=final_df[['Bill_Date','On_peak','Off_peak','Billed_amount','Billing_days']]
Splitting the dataset
We will split the dataset into two sets. One set for model training and the other for model testing.
train = final_df[:50]
test = final_df[50:]
The first 50 entries/data points will train the model. The remaining entries will test the model.
Let’s print the shape of the train and test datasets.
print(train.shape, test.shape)
The output:
(50, 5) (3, 5)
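Because the order of observations matters in a time series, the split is chronological rather than random: the model trains on the past and is tested on the most recent points. A minimal sketch of that idea follows; `chronological_split` is a hypothetical helper and the 53 rows are synthetic, chosen only to match the length of the tutorial's dataset.

```python
import pandas as pd

def chronological_split(frame, test_size):
    """Keep the last `test_size` rows for testing and the rest for training."""
    return frame.iloc[:-test_size], frame.iloc[-test_size:]

# 53 synthetic monthly rows (not the real electricity data).
demo = pd.DataFrame({
    'Bill_Date': pd.date_range('2016-01-01', periods=53, freq='MS'),
    'Billed_amount': range(53),
})

demo_train, demo_test = chronological_split(demo, test_size=3)
print(demo_train.shape, demo_test.shape)
```

Expressing the split through a hold-out size rather than a fixed row number keeps it valid if more months are added to the dataset.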
Selecting the timestamp and the target columns
The Auto Time Series model expects an input dataset with timestamp and target columns. The timestamp column contains the DateTime of the time series. The target column has the time series values (data points). The model will learn from these columns.
ts_column = 'Bill_Date'
sep = ','
target = 'Billed_amount'
The Bill_Date is the timestamp column, and the Billed_amount is the target column. Also, our dataset is comma-separated.
Initializing the Auto Time Series model
We initialize the Auto Time Series model using the following code:
ml_dict = AT.Auto_Timeseries(train, ts_column,
                             target, sep, score_type='rmse', forecast_period=6,
                             time_interval='Months', non_seasonal_pdq=None, seasonality=True,
                             seasonal_period=12, seasonal_PDQ=None, model_type='best',
                             verbose=2)
The Auto Time Series model has the following parameters:

train: It contains the training set. These are the first 50 entries (data points) that train the model.
ts_column: It is the name of the timestamp column, which contains the DateTime of the time series.
target: It is the name of the column that the model will predict. In our case, it is Billed_amount.
sep: It specifies the dataset format. Our dataset is in comma-separated values (CSV) format.
score_type: It is the scoring metric for the model. We use the Root Mean Square Error (RMSE). RMSE measures the error of a model when making predictions. It indicates the absolute fit of the model to the data, that is, how close the observed data points are to the predicted values.
forecast_period: It shows the number of months the model will predict. The model will make predictions for the next six months.
time_interval: It shows the time interval of the time series. It can be in minutes, hours, days, months, or years. Our dataset has monthly intervals.
non_seasonal_pdq: It contains the parameters that train the Non-Seasonal ARIMA model.
seasonality: It handles the periodic changes in the time series that occur within a given time. Seasonality is a regular pattern that repeats at a fixed interval, such as daily, weekly, or yearly. Our dataset has monthly data points with a pattern that repeats every year: the highest energy consumption occurs in September of each year. This is the seasonality effect.
seasonal_period=12: It is the number of data points in one seasonal cycle. Our monthly data repeats every 12 months.
seasonal_PDQ=None: It contains the parameters that train the Seasonal SARIMAX Model.
model_type='best': It shows the types of models that Auto Time Series will use for training. We set the value to best so that Auto Time Series will run multiple time series models and select the best one.
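For reference, the RMSE used as the scoring metric is conventionally defined as follows, where y_i is an actual value, ŷ_i is the corresponding forecast, and n is the number of points compared. The symbols are standard notation, not names taken from the library.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}
```

As a small worked example with made-up numbers: for actual values 100 and 110 and forecasts 90 and 120, the errors are 10 and -10, the squared errors are both 100, their mean is 100, and the RMSE is 10. The score is expressed in the same units as the target column.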
When you execute the code above, Auto Time Series will run multiple time series models and produce the following outputs:
Running Facebook Prophet Model
Running PyFlux Model
Running NonSeasonal ARIMA Model
Running Seasonal SARIMAX Model
Running VAR Model
Running Machine Learning Models
Showing time series components
It shows the overall trend of the time series data and the seasonality in the dataset.
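The split into trend and seasonality can be approximated by hand to see what the components plot represents. The sketch below is a simplified additive decomposition on synthetic data, assuming a 12-month cycle; it is not the routine AutoTS uses to draw its plot.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: a steady upward trend plus a bump every September.
index = pd.date_range('2016-01-01', periods=48, freq='MS')
trend_part = np.linspace(100, 147, 48)
season_part = np.where(index.month == 9, 40.0, 0.0)
series = pd.Series(trend_part + season_part, index=index)

# Trend: a 12-month centred moving average smooths out the yearly cycle.
trend = series.rolling(window=12, center=True).mean()

# Seasonality: average deviation from the trend for each calendar month.
seasonal = (series - trend).groupby(index.month).mean()
print(seasonal.idxmax())
```

The month with the largest seasonal value is the one that sits furthest above the trend, which is how a recurring yearly peak shows up in a components plot.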
Original time series
Histogram of original time series
After Auto Time Series has run all the models, it selects the best one.
Selecting the best model
Auto Time Series selects the model with the lowest RMSE score as the best model. This is the model with the lowest error when making predictions.
The best model is:
From the image above, the best model is Facebook Prophet. It also shows an array of actual and forecast values. The model has an RMSE score of 39.91, the lowest among the models that ran, which indicates that it makes the most accurate predictions of the candidates.
Finally, AutoTS will plot a line graph to show the actual and the forecast values.
Actual vs Forecast values
The line graph output:
From the image above, the red line shows the actual values. The green line shows the forecast values. The model has made predictions for the next six months.
Conclusion
We have learned how to perform time series analysis and forecasting using the Auto Time Series library. The tutorial covered the models that Auto Time Series runs, the benefits of the library, and how to install it. We then used Auto Time Series to build an electricity consumption prediction model. It selected Facebook Prophet as the best model because it had the lowest RMSE score, and the model made predictions for six months.
To get the Python code in Google Colab, use this link.
References
 Sales Forecasting with Prophet
 Predicting Covid-19 Cases Using NeuralProphet
 Introduction to Time Series
 Time series decomposition in Python
 Auto Time Series GitHub
Peer Review Contributions by: Wilkister Mumbi