I am analyzing some of the top cryptocurrencies to see how they trend and how they move together.
Will Bitcoin reach $100,000 in 2022? Let's see how that goes lol.
I am getting the data for the analysis from the Coinbase Pro API. Coinbase is a cryptocurrency exchange, and anyone can reproduce the analysis below using the Python code and steps I am going to outline.
# importing necessary libraries
import pandas as pd
import requests
import json
import matplotlib.pyplot as plt
Data Download Section
The code section below gets the data from the Coinbase API. You can try this code in your own local environment if interested.
The function load_daily_History_crypto loads the daily history of a crypto. It takes a symbol, a start date and an end date, loads the data from the Coinbase API and appends it to the global data dataframe.
def load_daily_History_crypto(symbol, startdate, enddate):
    global data
    url = f'https://api.pro.coinbase.com/products/{symbol}/candles?start={startdate}&end={enddate}&granularity=86400'
    response = requests.get(url)
    if response.status_code == 200:  # if the response is good then perform the steps below
        df1 = pd.DataFrame(json.loads(response.text), columns=['unix', 'low', 'high', 'open', 'close', 'volume'])
        # if no rows came back, print an error
        if df1.empty:
            print("Did not return any data from Coinbase for this symbol")
        else:
            df1['symbol'] = symbol  # add the symbol as a column in the dataframe
            # DataFrame.append was removed in pandas 2.0, so pd.concat is used instead
            data = pd.concat([data, df1], ignore_index=True)
    else:
        print('failed')
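To check the parsing step without calling the API, here is a minimal sketch that builds the same dataframe from a small hand-written payload. The helper name candles_to_frame and the sample rows are made up for illustration; only the column order (unix, low, high, open, close, volume) comes from the function above.

```python
import pandas as pd

CANDLE_COLUMNS = ['unix', 'low', 'high', 'open', 'close', 'volume']

def candles_to_frame(payload, symbol):
    """Turn a list of candle rows into a dataframe tagged with its symbol."""
    frame = pd.DataFrame(payload, columns=CANDLE_COLUMNS)
    frame['symbol'] = symbol
    return frame

# Made-up rows for illustration only, these are not real Coinbase prices
sample_payload = [
    [1641081600, 46600.0, 47900.0, 47700.0, 47300.0, 1200.5],
    [1640995200, 46200.0, 47950.0, 46200.0, 47700.0, 1500.25],
]
sample = candles_to_frame(sample_payload, 'BTC-USD')
print(sample)
```

Keeping the parsing in a small function like this makes it easy to test separately from the network request.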
Let's get data for some cryptos to analyze
symbol = ['BTC-USD', 'ETC-USD', 'USDT-USD']
startdate = '2022-01-01'
enddate = '2022-08-13'
Create an empty dataframe called data with the columns unix (the timestamp in unix format), low, high, open, close, volume and symbol, which hold the different price metrics of each crypto for a day.
data = pd.DataFrame(columns=['unix', 'low', 'high', 'open', 'close', 'volume', 'symbol'])
Now invoke the function load_daily_History_crypto to load the data for each crypto symbol for the date range '2022-01-01' to '2022-08-13'
for s in symbol:
    load_daily_History_crypto(s, startdate, enddate)
data.head()
Above is a view of the downloaded data. Here is the meaning of the columns in the dataframe.
unix : the unix time (epoch time) corresponding to the trading date. This will be converted later
low : the lowest price the crypto sold at on that date
high : the highest price the crypto sold at on that date
open : the price at which the crypto opened that day
close : the price at which the crypto closed that day
volume : the volume of the crypto traded that day
symbol : the crypto ticker and the currency, for example BTC-USD means the Bitcoin / US Dollar pair
# Let's take a look at the various cryptos we have data for
data['symbol'].unique()
# Let's convert the unix column to a date so we can easily read the dates, then delete the unix column
data['date'] = pd.to_datetime(data['unix'], unit='s')
del data['unix']
data.head()
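As a quick sanity check of the conversion, the sketch below applies the same pd.to_datetime call to two unix timestamps that are one day (86400 seconds) apart. The variable names are only for illustration.

```python
import pandas as pd

# Two unix timestamps in seconds, exactly one day apart
unix_values = pd.Series([1640995200, 1640995200 + 86400])

# unit='s' tells pandas the numbers are seconds since 1970-01-01 UTC
dates = pd.to_datetime(unix_values, unit='s')
print(dates)
```

The first value corresponds to midnight UTC on 2022-01-01, the start date used for the download.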
How many days of data do we have for each crypto?
data.groupby('symbol')['date'].count().reset_index(name='NumberOfDaysOfData')
# Let's now sort the data and set the date column as the index of our data dataframe
data = data.sort_values(by=['symbol', 'date'], ascending=True)  # sort_values returns a new dataframe, so assign it back
data = data.set_index('date')
# let's remove the volume column, we don't need it
del data['volume']
Our data is a time series, that is, data recorded over some interval of time, and we want to plot it in different ways to see how it looks. For example, the daily low or the daily high for BTC is a time series. Each of those columns forms a series that can be explored, and since we have several series for each crypto, we are analyzing a multivariate time series to see how the coin is performing.
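One way to look at how the coins move together is to put the closing prices side by side and correlate their daily returns. This is only a sketch on made-up numbers: the symbols AAA-USD and BBB-USD and their prices are hypothetical, and the long layout (date, symbol, close) is assumed to mirror the data dataframe.

```python
import pandas as pd

# Made-up closing prices in a long layout: one row per date and symbol
long_df = pd.DataFrame({
    'date': pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04'] * 2),
    'symbol': ['AAA-USD'] * 4 + ['BBB-USD'] * 4,
    'close': [100.0, 110.0, 99.0, 108.9, 10.0, 11.0, 9.9, 10.89],
})

# One column per symbol, indexed by date
wide = long_df.pivot(index='date', columns='symbol', values='close')

# Daily percentage change, dropping the first row which has no previous day
returns = wide.pct_change().dropna()

# Pairwise correlation of the daily returns
corr = returns.corr()
print(corr)
```

In this toy data the second coin is always one tenth of the first, so their returns are identical and the correlation comes out at essentially 1. Real coins will give values somewhere between -1 and 1.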
Let us analyze Bitcoin in our data set
BTC = data[data['symbol'] == 'BTC-USD'][['low', 'high', 'open', 'close']]
BTC.describe()
The above gives a statistical description of the data and can be used to make general assessments, for example that the mean closing price of BTC is 33971.7 in our data set.
Let us plot the data to see how BTC is doing over time
graph = BTC.plot(linewidth=2, fontsize=12)
graph.set_xlabel('Date')
graph.legend(fontsize=12)
graph.set_title('BTC-USD Data for Jan 2022 to August 2022', fontsize=12)
We can also plot the graph for all our cryptocurrencies at once.
symbol = ['BTC-USD', 'ETC-USD', 'USDT-USD']
for s in symbol:
    df = data[data['symbol'] == s][['low', 'high', 'open', 'close']]
    graph = df.plot(linewidth=2, fontsize=12)
    graph.set_xlabel('Date')
    graph.legend(fontsize=12)
    graph.set_title(f'{s} Data for Jan 2022 to August 2022', fontsize=12)
We can see that all the cryptos have crashed recently. USDT is supposed to match the US dollar, but it also fell to about 0.95 around May.
ETC seems to be recovering faster as of this August.
Forecasting Section
This is the final section: can we tell if Bitcoin can rise to 100,000 USD in 2022? In this section we will explore time series modeling and try to predict the future price of Bitcoin.
I will use only the close price for each date to make the prediction.
BTC = data[data['symbol'] == 'BTC-USD']['close']
BTC.index = pd.DatetimeIndex(BTC.index).to_period('D')
BTC = BTC.sort_index()
BTC.head()
For the modeling, I will split the dataset into a training set and a test set. I will use data from 2022-01-01 to 2022-08-01 for the training set. The rest of the data, from 2022-08-02 to 2022-08-13, will be used to compare what the model predicted for those days with the real data.
train_data = BTC['2022-01-01':'2022-08-01']
# let's get the rest of the data for the test
test_data = BTC[~BTC.index.isin(train_data.index)]
print('Size of train Data :', train_data.size)
print('Size of test Data :', test_data.size)
I will be using the classical Holt-Winters exponential smoothing from statsmodels for the modeling
from statsmodels.tsa.holtwinters import ExponentialSmoothing
model = ExponentialSmoothing(train_data)
model_fit = model.fit()
Now let us predict the close price for the test data
yhat = model_fit.predict(test_data.index.min(),test_data.index.max())
# let's see what the prediction is for the test data
yhat
We see the model predicted 23273.9 for each of those days in the test data. Let's see how the test data looks.
It looks like the prediction is not that bad. There are statistical metrics you can use to evaluate the performance of the model, such as Mean Absolute Deviation, Tracking Signal, Running Sum of Forecast Errors, Mean Forecast Error, etc., so you can try different modeling methods and use some of the metrics above to choose which method performs better.
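The metrics named above can be computed with a few lines of plain Python. This is a minimal sketch: the function name forecast_metrics is hypothetical, the error is taken as actual minus predicted, the tracking signal is taken as the running sum of forecast errors divided by the mean absolute deviation, and the sample prices are made up rather than taken from the test data.

```python
def forecast_metrics(actual, predicted):
    """Return common forecast error metrics for two equal-length sequences."""
    errors = [a - p for a, p in zip(actual, predicted)]  # error = actual - predicted
    n = len(errors)
    rsfe = sum(errors)                       # Running Sum of Forecast Errors
    mfe = rsfe / n                           # Mean Forecast Error (bias)
    mad = sum(abs(e) for e in errors) / n    # Mean Absolute Deviation
    tracking_signal = rsfe / mad if mad else 0.0
    return {'MFE': mfe, 'MAD': mad, 'RSFE': rsfe, 'TrackingSignal': tracking_signal}

# Made-up prices for illustration: a flat forecast against three actual closes
actual = [23000.0, 23500.0, 24000.0]
predicted = [23250.0, 23250.0, 23250.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

A positive Mean Forecast Error here means the forecast was too low on average. With the real data, actual would be the test_data values and predicted would be the yhat values.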
Now let's say we want to predict the price of Bitcoin for the rest of the year 2022.
BTC_2022 = model_fit.predict('2022-08-14', '2022-12-31')
BTC_2022.tail()
We can see the model is still predicting the same 23273.9 for the rest of the year. This model uses very simple exponential smoothing, so it might not give the best prediction, but the process of analyzing and predicting the price of Bitcoin in 2022 suggests the price might not change much this year. A better model such as ARIMA might give better insight into the performance of Bitcoin, and that can be done as further exploration.
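To see why the forecast is the same number for every future day, here is a hand-written sketch of simple exponential smoothing. It is not the statsmodels implementation: the function name is hypothetical, the smoothing factor alpha is fixed by hand, and the first observation is used as the starting level, whereas statsmodels estimates such values when fitting.

```python
def simple_exp_smoothing_forecast(values, alpha, horizon):
    """Smooth the series into a single level and repeat it for every future step."""
    level = values[0]  # assumption: start the level at the first observation
    for y in values[1:]:
        # the new level is a weighted mix of the latest value and the old level
        level = alpha * y + (1 - alpha) * level
    # with no trend or seasonal term, every future step gets the same last level
    return [level] * horizon

# Made-up values for illustration
forecast = simple_exp_smoothing_forecast([10.0, 12.0, 11.0], alpha=0.5, horizon=3)
print(forecast)
```

Because the model only tracks a level, the forecast is a flat line no matter how far ahead you predict. A model with a trend or seasonal component, or an ARIMA model, can produce forecasts that change over the horizon.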