4 Application
Here we choose the Robust Adaptive MCMC algorithm (Vihola 2012) as the default method for estimating the parameters of the \(\text{CAViaR}\) family of models. The parameter vector estimate \(\boldsymbol{\hat{\theta_t}}\) is subsequently used to forecast the quantile \(q_{\alpha}\) at time \(t + 1\). In this section we explain how to set up a computing architecture that serves, and manages access to, a web application. This application obtains real-time prices from a cryptocurrency exchange through a WebSocket API and uses the shiny package (Chang et al. 2021) to visualize this data dynamically on the web. The same web application uses a standard HTTP API to estimate the \(\text{VaR}_\alpha\) of multiple countries using ETF data as a proxy.
4.1 High frequency
The WebSocket protocol allows two-way communication between a server and a client over a single, long-lived connection. It is designed to address the issues that arise when the HTTP protocol is used to poll a server repeatedly, as is otherwise required when working with high-frequency data. Instead of the client making a new call every \(k\) seconds, the server notifies the client through the same communication channel whenever new data arrives.
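To make the push model concrete, the sketch below builds a JSON subscription message and wires it to a WebSocket client. It assumes the `websocket` R package, the endpoint `wss://ws.bitstamp.net` and a `live_trades_<pair>` channel name; the helper names `subscribe_message` and `open_trade_stream` are hypothetical, and the connection helper is only defined here, not called.

```r
# Build the subscription message for an (assumed) live-trades channel.
subscribe_message <- function(pair) {
  sprintf(
    '{"event": "bts:subscribe", "data": {"channel": "live_trades_%s"}}',
    pair
  )
}

# Open a stream and run `on_trade` on every message pushed by the server.
# Requires the websocket package; nothing is polled on a timer.
open_trade_stream <- function(pair, on_trade,
                              url = "wss://ws.bitstamp.net") {
  ws <- websocket::WebSocket$new(url, autoConnect = FALSE)
  ws$onOpen(function(event) ws$send(subscribe_message(pair)))
  ws$onMessage(function(event) on_trade(event$data))  # raw JSON string
  ws$connect()
  invisible(ws)
}

msg <- subscribe_message("btcusd")
cat(msg, "\n")
```

A call such as `open_trade_stream("btcusd", print)` would then print each trade as soon as the exchange publishes it, with no timer on the client side.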
We will estimate the parameters of the T-CAViaR model (2.13) using an input vector \(\mathbf{y}\) of daily returns of the Bitcoin/USD cryptocurrency pair. However, our forecast quantile \(\hat{q}_\alpha\) will not be available in real time, since computing the estimate \(\hat{\theta}\) takes time. The Bitstamp API offers a maximum of \(1000\) points of historical data through an HTTP API, alongside the above-mentioned real-time WebSocket API. Historical data is sampled at a given interval of \(k\) seconds, the highest frequency available being one observation every \(60\) seconds. We will hence start by timing the estimation of \(\hat{\theta}\) using the microbenchmark package (Mersmann 2019).
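A minimal sketch of such a timing experiment is given below. The function `estimate_theta` is a hypothetical stand-in for the real sampler (it only runs a small optimisation), and the returns are simulated. The code uses `microbenchmark` when it is installed and falls back to base R's `system.time` otherwise.

```r
# Hypothetical stand-in for the estimation step; replace with the real sampler.
estimate_theta <- function(y) {
  stats::optim(c(0, 1), function(p) sum((y - p[1])^2) + (p[2] - 1)^2)$par
}

set.seed(1)
y <- rnorm(1000, sd = 0.03)   # simulated returns, one per available data point

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  timing <- microbenchmark::microbenchmark(estimate_theta(y), times = 10)
  seconds_per_fit <- median(timing$time) / 1e9   # times are in nanoseconds
} else {
  # Base R fallback: average elapsed time over 10 runs.
  elapsed <- system.time(for (i in 1:10) estimate_theta(y))[["elapsed"]]
  seconds_per_fit <- elapsed / 10
}

# The forecast is only usable if estimation finishes within one bar of k seconds.
k <- 60
fits_in_one_bar <- seconds_per_fit < k
fits_in_one_bar
```

If the estimation time exceeds the sampling interval \(k\), the forecast for the next bar cannot be delivered before that bar closes, which is why the timing is measured first.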
The specific alphanumeric pattern of the unique identifiers (symbols or tickers) depends on the data source. For instance, the cryptocurrency database follows a currency1currency2 pattern, while the ETF database uses additional symbols such as ^, = and . depending on the asset class. It is important to be aware of these differences when working with multiple data sources, since these identifiers are specific to each source and far from standardized.
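The difference between the two naming schemes can be illustrated with two small helpers. Both function names are hypothetical, and the prefix and suffix rules used to classify Yahoo! Finance tickers are simplifying assumptions made for this example, not an exhaustive specification.

```r
# Bitstamp-style symbol: lower-case currency1currency2, no separator.
to_bitstamp_symbol <- function(base, quote) {
  tolower(paste0(base, quote))
}

# Rough classification of Yahoo! Finance tickers by their special characters.
# These rules are assumptions for illustration only.
yahoo_asset_class <- function(symbol) {
  if (startsWith(symbol, "^")) {
    "index"
  } else if (endsWith(symbol, "=F")) {
    "future"
  } else if (endsWith(symbol, "=X")) {
    "currency"
  } else {
    "equity or ETF"
  }
}

to_bitstamp_symbol("BTC", "USD")
yahoo_asset_class("BTC=F")
yahoo_asset_class("^GSPC")
```

Keeping such mappings in one place avoids scattering source-specific string handling across the data gathering code.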
Historical OHLC data is received in JSON format through the https://www.bitstamp.net/api/v2/ endpoint. The following R code downloads a JSON file with the latest 1000 daily prices (a step of 86400 seconds) for btcusd:
api_call <- sprintf(
  "https://www.bitstamp.net/api/v2/ohlc/%s/?step=%s&limit=%s",
  "btcusd", 86400, 1000
)
out <- jsonlite::fromJSON(api_call)$data
Here the JSON data out has been parsed into an R list of 2 elements: pair (BTC/USD) and ohlc (the prices).
Element | Length | Class | Mode |
---|---|---|---|
high | 1000 | -none- | character |
timestamp | 1000 | -none- | character |
volume | 1000 | -none- | character |
low | 1000 | -none- | character |
close | 1000 | -none- | character |
open | 1000 | -none- | character |
This data gathering step has been summarized into the get_price_hist function, which returns by default the same list of two elements. We can then compute the log-returns vector \(\mathbf{y}\), which will be passed as the first argument to caviar_methods when estimating the parameters of the model.
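The log-return computation can be sketched in a few lines. The prices below are toy values rather than real data; they are stored as character strings to mirror the format returned by the API.

```r
# Prices arrive as character strings, oldest first (toy values, not real data).
close_chr <- c("100.0", "102.0", "101.0", "105.0")

close_num <- as.numeric(close_chr)
y <- diff(log(close_num))   # y_t = log(P_t) - log(P_{t-1})

# One observation is lost when differencing.
length(y) == length(close_num) - 1
```

Because log-returns are additive, `sum(y)` equals the log-return over the whole period, `log(105 / 100)` in this example.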
4.2 Low frequency
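Unlike a WebSocket, plain HTTP follows a request-response model: the client must issue a new GET request (and, without persistent connections, open a new TCP connection) every time it needs fresh data. The highest frequency available through the Yahoo! Finance endpoint is one day.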
Yahoo! Finance data can be easily obtained through their website. Alternatively, the following R code downloads the CSV file of daily data for BTC=F:
download_link <- paste0(
  "https://query1.finance.yahoo.com/v7/finance/download/",
  sprintf("BTC=F?period1=1513555200&period2=%s&", as.integer(Sys.time())),
  "interval=1d&events=history&includeAdjustedClose=true"
)
btc_futures <- read.csv(download_link)
summary.default(btc_futures)
## Length Class Mode
## Date 1100 -none- character
## Open 1100 -none- numeric
## High 1100 -none- numeric
## Low 1100 -none- numeric
## Close 1100 -none- numeric
## Adj.Close 1100 -none- numeric
## Volume 1100 -none- numeric
The resulting data.frame has a total of 1100 rows and 7 columns as of 2022-04-30 09:42:27. However, several rows need to be ignored, since they contain a null message instead of data. 12
idx <- which(btc_futures$Close != "null")
btc_futures <- btc_futures[idx, ]
dim(btc_futures)
## [1] 1100 7
After removing the lines with null data, we end up with a total of 1100 observations.
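An alternative way of handling these rows, shown here as a sketch on a small inline CSV with made-up values, is to declare the string null as a missing-value marker when reading the file. The price columns are then parsed as numbers directly, and the incomplete rows can be dropped with `is.na`.

```r
# Toy CSV with one "null" row (values are made up for illustration).
csv_text <- "Date,Open,Close
2022-04-27,39000,39500
2022-04-28,null,null
2022-04-29,39500,38700"

# Treat the literal string "null" as NA so price columns are parsed as numeric.
futures <- read.csv(text = csv_text, na.strings = "null")

# Keep only rows with a valid closing price.
futures <- futures[!is.na(futures$Close), ]
dim(futures)
```

The `na.strings` argument belongs to base R's `read.csv`, so this variant needs no additional packages.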
Figure 4.1 uses a plotly candlestick chart (Sievert et al. 2021) to help us visualize the data.
Figure 4.1: Spot and futures prices of the BTC/USD currency pair.
The output of applying the summary function to the btc_spot dataset is shown in Table 4.2. Notice that we first convert the data from character to numeric type by applying the as.numeric function to each column of btc_spot, excluding the timestamp column.
high | volume | low | close | open |
---|---|---|---|---|
Min. : 5353 | Min. : 366 | Min. : 3850 | Min. : 4842 | Min. : 4842 |
1st Qu.: 9607 | 1st Qu.: 2421 | 1st Qu.: 9216 | 1st Qu.: 9425 | 1st Qu.: 9428 |
Median :22431 | Median : 4583 | Median :20261 | Median :22049 | Median :20531 |
Mean :28401 | Mean : 6128 | Mean :26740 | Mean :27656 | Mean :27628 |
3rd Qu.:45856 | 3rd Qu.: 7727 | 3rd Qu.:43037 | 3rd Qu.:44444 | 3rd Qu.:44431 |
Max. :69000 | Max. :58513 | Max. :66250 | Max. :67559 | Max. :67547 |
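The conversion step described above can be sketched as follows. The data frame `spot` is a toy stand-in for btc_spot with made-up values; the column names follow the ones shown in the summary table.

```r
# Toy stand-in for btc_spot: every column is character, as returned by the API.
spot <- data.frame(
  timestamp = c("1651190400", "1651276800", "1651363200"),
  close = c("38600.5", "37650.0", "38480.2"),
  open = c("39750.1", "38600.5", "37650.0"),
  stringsAsFactors = FALSE
)

# Convert everything except the timestamp to numeric, keeping the data.frame shape.
price_cols <- setdiff(names(spot), "timestamp")
spot[price_cols] <- lapply(spot[price_cols], as.numeric)

summary(spot[price_cols])
```

Assigning into `spot[price_cols]` replaces the columns in place, so the timestamp column is left untouched as character data.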
We can also collect additional information such as the mean and the quantiles for different values of \(\alpha \in \{0.0, 0.1, 0.2, 0.3, 0.4, 0.5\}\), where the quantile at level \(\alpha = 0.5\) corresponds to the median of a given vector such as btc_spot$close. 13 The mean value of this vector is \(2.766 \times 10^{4}\), while the quantiles at the different levels can be found in Table 4.3.
0% | 10% | 20% | 30% | 40% | 50% |
---|---|---|---|---|---|
4842 | 7930 | 9156 | 9898 | 11370 | 22049 |
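A table of this kind can be produced with base R's `quantile` function. The sketch below uses simulated prices rather than the actual btc_spot data, so its numbers will differ from those in Table 4.3.

```r
set.seed(42)
x <- rlnorm(1000, meanlog = 10, sdlog = 0.5)   # simulated prices, not real data

alpha <- seq(0, 0.5, by = 0.1)
q <- quantile(x, probs = alpha)

# The 0% quantile is the minimum and the 50% quantile is the median.
q
```

The returned vector is named by level ("0%", "10%", and so on), which is the layout used for the table header.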
This information can also be represented graphically using a plotly histogram, as shown in Figure 4.2.
library(plotly)  # provides %>% and add_segments()
btc_close <- as.numeric(btc_spot$close)
close_hist <- plotly::plot_ly(
  x = btc_close, type = "histogram", name = "BTC/USD Spot close"
) %>%
  add_segments(
    # one short vertical marker per decile
    x = quantile(btc_close, probs = seq(0.1, 0.9, 0.1)),
    xend = quantile(btc_close, probs = seq(0.1, 0.9, 0.1)),
    y = -10, yend = 10, name = paste("BTC/USD - \U03B1:", seq(0.1, 0.9, 0.1)),
    line = list(width = 1)
  )
close_hist
Figure 4.2: Histogram of BTC/USD price quantiles.
We can also compare this histogram to the corresponding Close price of the btc_futures table, as seen in Figure 4.3.
fig <- plotly::plot_ly(alpha = 0.5) %>%
  add_histogram(x = as.numeric(btc_spot$close),
                name = "BTC/USD Spot") %>%
  add_histogram(x = as.numeric(btc_futures$Close),
                name = "BTC/USD Futures") %>%
  layout(barmode = "overlay")
fig
Figure 4.3: Histogram of BTC/USD spot prices overlaying futures prices.
4.3 Modelling
For this step we start by computing the log-returns shown in Figure 4.4 as follows:
out <- get_price_hist("btcusd", crypto = TRUE)
y <- diff(log(as.numeric(out$ohlc$CLOSE)))
Figure 4.4: Log-returns for the BTC/USD pair, prices (close) from Yahoo! Finance.
We then pass y to the MCMC sampler:
out <- caviarma::caviar(y, nsim = 1e5)
This step is the bottleneck of the data analysis since it must be repeated multiple times as explained in the Simulation Study section.
4.4 Forecasting
After getting our estimate \(\boldsymbol{\hat{\theta_t}}\), we obtain our forecast for time \(t + 1\) using Equations (2.9) to (2.13):
quantile_forecast <- caviarma::get_forecast(y, out)
Figure 4.5: VaR (0.05) forecasts using different models.
Figure 4.5 shows sample VaR forecasts obtained using different CAViaR models. We also include the corresponding tGARCH and sGARCH forecasts for reference (Ardia et al. 2019). 14
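To illustrate how a parameter estimate turns into a one-step-ahead quantile forecast, the sketch below implements the recursion of a symmetric absolute value specification in its commonly used form, \(q_t = \beta_1 + \beta_2 q_{t-1} + \beta_3 |y_{t-1}|\). This is an assumption made for illustration: the notation may differ from the equations referenced above, the function `sav_forecast` is hypothetical and is not the caviarma implementation, and both the returns and the parameter values are made up.

```r
# One-step-ahead quantile forecast under a symmetric absolute value recursion:
#   q_t = beta1 + beta2 * q_{t-1} + beta3 * |y_{t-1}|
# theta = c(beta1, beta2, beta3). The recursion is started from the empirical
# alpha-quantile of the sample (an assumption of this sketch).
sav_forecast <- function(y, theta, alpha = 0.05) {
  q <- as.numeric(quantile(y, probs = alpha))
  for (t in seq_along(y)) {
    q <- theta[1] + theta[2] * q + theta[3] * abs(y[t])
  }
  q   # forecast for time length(y) + 1
}

set.seed(123)
y <- rnorm(500, sd = 0.04)        # simulated returns
theta <- c(-0.005, 0.85, -0.20)   # illustrative values, not estimates
q_next <- sav_forecast(y, theta)
q_next
```

In practice `theta` would be a summary of the MCMC draws, for example the posterior mean, rather than a hand-picked vector.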
Finally, Table 4.4 shows the results of the VaR backtest obtained through the GAS::BacktestVaR() function.
Model | Statistic | P-value |
---|---|---|
Symmetric Absolute Value | 7.39 | 0.389 |
Asymmetric Slope | 4.10 | 0.768 |
Adaptive | 14.67 | 0.040 |
T-CAViaR | 7.40 | 0.388 |
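As an illustration of what a VaR backtest checks, the sketch below implements a Kupiec unconditional coverage test in base R. This is a swapped-in, simpler test used only as an example: it is not claimed to be the statistic reported in Table 4.4 and it does not call the GAS package. The function name `kupiec_test` is hypothetical and the data are simulated.

```r
# Kupiec unconditional coverage test: are VaR violations as frequent as alpha?
kupiec_test <- function(y, var_forecast, alpha = 0.05) {
  hits <- y < var_forecast   # TRUE when the return falls below the VaR
  n <- length(hits)
  x <- sum(hits)
  p_hat <- x / n
  # Binomial log-likelihood, with the convention 0 * log(0) = 0.
  loglik <- function(p) {
    term <- function(k, prob) if (k == 0) 0 else k * log(prob)
    term(x, p) + term(n - x, 1 - p)
  }
  lr <- -2 * (loglik(alpha) - loglik(p_hat))
  c(hit_rate = p_hat, statistic = lr,
    p_value = pchisq(lr, df = 1, lower.tail = FALSE))
}

set.seed(7)
y <- rnorm(1000, sd = 0.04)                         # simulated returns
var_forecast <- rep(qnorm(0.05, sd = 0.04), 1000)   # correctly specified 5% VaR
res <- kupiec_test(y, var_forecast)
res
```

A small p-value indicates that the observed violation frequency is incompatible with the nominal level \(\alpha\), which is the same reading applied to the p-values in the table.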
Web app
We make use of the shiny and shinyMobile R packages to build a progressive web app (PWA) 15 that allows a user to gather, analyze and visualize the data introduced in the Data section using the models and methods studied in the Forecasting Methods section. To run this application locally, one must first install the simulr package (Chaparro Sepulveda 2021) and then run the app:
remotes::install_gitlab("cacsfre/simulr")
simulr::run_simulr()
4.5 Reactivity
R is a powerful data-centred programming language offering a mathematically intuitive interactive environment, which makes data analysis a joyful experience. The shiny package offers an intuitive framework for developing web applications using the R programming language. A shiny application seamlessly connects user interface (UI) input to the back end running R. This makes it straightforward to create a UI that lets users interact with our code without requiring any programming.
This relationship between the input received from the browser (the client) and the output produced by the server is summarized in Figure 4.6. shiny handles all the css/html/javascript 16 required to create a communication channel between the browser and the server through R.
Shiny offers a reactive programming model which makes it easy to call an R function whenever the input object changes. The logic behind each element rendered in the UI is encapsulated in shiny modules. For instance, the code below 17 creates a plotly map (the UI element) whenever input$update_map changes, i.e., whenever the button is clicked.
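map_cardUI <- function(id) {
  uiOutput(NS(id, "map_card"))
}
map_cardServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    ns <- session$ns  # namespace the IDs of UI elements created inside the module
    values <- shiny::reactiveValues(map_data = NULL)
    # Re-rendered whenever values$map_data changes
    output$map_plot <- plotly::renderPlotly(get_map(values$map_data))
    # Refresh the data on each click of the button
    observeEvent(input$update_map, {values$map_data <- get_map_data()})
    output$map_card <- renderUI({
      shinyMobile::f7Card(
        # actionButton() needs a label; the text used here is illustrative
        title = shiny::actionButton(ns("update_map"), label = "Update map"),
        plotly::plotlyOutput(ns("map_plot"))
      )
    })
  })
}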
Figure 4.6: Relationship between user input, server output and reactive values.
In the example above, the server recreates the output$map_plot session object whenever the user clicks on input$update_map. This relationship is due to the observeEvent(input$update_map, {values$map_data <- get_map_data()}) expression: each click assigns a new value to values$map_data, which invalidates output$map_plot, the output that depends on it, and causes it to be rendered again.
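The same invalidation chain can be observed without a browser. The sketch below defines a hypothetical counter module (not part of simulr) with the same structure, an event observer updating a reactive value that an output depends on, and drives it with `shiny::testServer`. It assumes a version of shiny that provides `testServer`.

```r
library(shiny)

# Hypothetical module (not part of simulr): counts clicks on an "update" button.
counterServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    values <- reactiveValues(n_updates = 0)
    # Runs only when input$update changes, i.e. on each click.
    observeEvent(input$update, {
      values$n_updates <- values$n_updates + 1
    })
    # Depends on values$n_updates, so it is invalidated by the observer above.
    output$status <- renderText(paste("Updates:", values$n_updates))
  })
}

# Simulate two clicks and print the re-rendered output.
testServer(counterServer, {
  session$setInputs(update = 1)
  session$setInputs(update = 2)
  print(output$status)
})
```

Each call to `session$setInputs` plays the role of a click: it triggers the observer, which changes the reactive value and forces the text output to be recomputed.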
4.6 Deployment
Deploying a shiny application can be quite simple using the https://www.shinyapps.io/ service by RStudio. This works well for demos consuming limited resources. However, it requires an upgrade once we need to handle multiple users or greater computing resources. The two typical approaches to scaling a shiny app in an enterprise context are Shiny Server and ShinyProxy, where shinyproxy 18 tries to fill the gaps of the open-source version of shiny-server while relying exclusively on the open-source shiny package. Another advantage of ShinyProxy is its use of docker, which gives each user an independent shiny app environment and R process. The isolation obtained through docker containers is important for both security and performance. This architecture is outlined in Figure 4.7.
Figure 4.7: Outline of the shinyproxy architecture.
For comparison, the architecture of the open-source version of shiny-server is outlined in Figure 4.8.
Figure 4.8: Outline of the open-source shiny-server architecture.
12. These observations correspond to the days where the futures market is closed.
13. The original data is still saved as character.
14. The tGARCH and sGARCH estimates are obtained using the MSGARCH::FitMCMC() function.
15. This package uses Framework7 behind the scenes.
16. Still, you are likely to write some css/html/javascript in order to customize your shiny apps.
17. This is a shortened version of the simulr::map_cardUI() and simulr::map_cardServer() functions.
18. ShinyProxy is an open-source technology built with java; its source code is available at https://github.com/openanalytics/shinyproxy/tree/master/src/main/java.