4 Application
Here we choose the Robust Adaptive MCMC algorithm (Vihola 2012) as the default method for estimating the parameters of the \(\text{CAViaR}\) family of models. The parameter vector estimate \(\boldsymbol{\hat{\theta_t}}\) is subsequently used to forecast the quantile \(q_{\alpha}\) at time \(t + 1\). In this section we explain how to set up a computing architecture to serve and manage access to a web application. This application obtains real-time prices from a cryptocurrency exchange through a WebSocket API and uses the shiny package (Chang et al. 2021) to visualize this data dynamically on the web. The same web application uses a standard HTTP API to estimate the \(\text{VaR}_\alpha\) of multiple countries using ETF data as a proxy.
4.1 High frequency
The WebSocket protocol is a modern technology that allows two-way communication between a server and a client. It is designed to address the overhead that arises when the HTTP protocol is used for repeated polling, which would otherwise be required when working with high-frequency data. Instead of the client making a new call every \(k\) seconds, the server notifies the client through the same communication channel whenever new data arrives.
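As a rough illustration of this push model, the sketch below subscribes to a live trades channel with the websocket package. The endpoint `wss://ws.bitstamp.net`, the `bts:subscribe` event and the `live_trades_btcusd` channel name are assumptions based on Bitstamp's public WebSocket documentation, not code taken from this application, and all helper names are hypothetical.

```r
# Hypothetical helper: build the JSON subscription message for a trades channel.
# The "bts:subscribe" event and "live_trades_<pair>" channel are assumptions.
build_subscribe_msg <- function(pair = "btcusd") {
  msg <- jsonlite::toJSON(
    list(event = "bts:subscribe",
         data = list(channel = paste0("live_trades_", pair))),
    auto_unbox = TRUE
  )
  as.character(msg)
}

# Hypothetical helper: extract the price from an incoming trade message.
# Any other event (e.g. a subscription confirmation) yields NA.
parse_trade_price <- function(msg) {
  parsed <- jsonlite::fromJSON(msg)
  if (identical(parsed$event, "trade")) parsed$data$price else NA_real_
}

# Open a single connection and let the server push every new trade.
# Defined as a function so that nothing connects until it is called.
stream_trades <- function(pair = "btcusd", url = "wss://ws.bitstamp.net") {
  ws <- websocket::WebSocket$new(url, autoConnect = FALSE)
  ws$onOpen(function(event) ws$send(build_subscribe_msg(pair)))
  ws$onMessage(function(event) {
    price <- parse_trade_price(event$data)
    if (!is.na(price)) message("New trade at ", price)
  })
  ws$connect()
  invisible(ws)  # call ws$close() on the returned object to stop
}
```

The callbacks only fire while R's event loop is running, which is the case inside a shiny session; a polling client would instead have to issue a new HTTP request for every update.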
We will estimate the parameters of the T-CAViaR model (2.13) using an input vector \(\mathbf{y}\) corresponding to the daily returns of the Bitcoin/USD cryptocurrency pair. However, our forecast quantile \(\hat{q}_\alpha\) will not be available in real time, since it takes time to obtain the estimate vector \(\hat{\theta}\). The Bitstamp API offers a maximum of \(1000\) data points of historical data through an HTTP API alongside the above-mentioned real-time WebSocket API. Historical data is obtained at a given frequency \(k\), the highest frequency available being \(60\) seconds. We will hence start by timing the estimation of \(\hat{\theta}\) using the microbenchmark package (Mersmann 2019).
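A minimal sketch of such a timing exercise is given below. It uses a cheap toy function as a stand-in for the actual estimation step, so the names `toy_estimate` and `data_frequency` are illustrative assumptions and the measured times say nothing about the real sampler.

```r
set.seed(1)
y <- rnorm(1000)  # simulated returns, not market data

# Toy stand-in for the (much slower) parameter estimation step
toy_estimate <- function(y) quantile(y, probs = 0.05)

# Repeat the call a few times and extract the median run time in seconds
bm <- microbenchmark::microbenchmark(toy_estimate(y), times = 10L)
median_seconds <- summary(bm, unit = "s")$median

# A forecast is only timely if estimation is faster than the data frequency
data_frequency <- 60  # seconds, the highest frequency mentioned above
median_seconds < data_frequency
```

Replacing `toy_estimate` with the real estimation call would show whether the estimate can be refreshed within one sampling interval.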
The specific alphanumeric pattern of the unique identifiers (symbols or tickers) depends on the data source. For instance, the cryptocurrency database follows a currency1currency2 pattern while the ETF database uses additional symbols such as ^ = . depending on the asset class. It is important to be aware of these differences when working with multiple data sources, since these identifiers are usually specific to each source and far from standardized.
Historical OHLC data is received in JSON format through the https://www.bitstamp.net/api/v2/ endpoint. The following R code downloads a JSON file with the latest 1000 daily prices (a step of 86400 seconds) for btcusd:
api_call <- sprintf(
"https://www.bitstamp.net/api/v2/ohlc/%s/?step=%s&limit=%s",
"btcusd", 86400, 1000
)
out <- jsonlite::fromJSON(api_call)$data
Here the JSON data has been parsed into out, an R list of 2 elements: pair (BTC/USD) and ohlc (the prices).
| | Length | Class | Mode |
|---|---|---|---|
| timestamp | 1000 | -none- | character |
| open | 1000 | -none- | character |
| high | 1000 | -none- | character |
| low | 1000 | -none- | character |
| close | 1000 | -none- | character |
| volume | 1000 | -none- | character |
This data gathering step has been summarized into the get_price_hist function, which returns by default the same list of two elements. We can then compute the log-returns vector \(\mathbf{y}\) which will be passed as first argument to caviar_methods when estimating the parameters of the model.
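The log-return computation mentioned above can be sketched on a toy price vector as follows. The prices are made up for illustration and are stored as character strings to mimic the parsed API output.

```r
# Toy closing prices stored as character, as returned by the API
close_chr <- c("100", "110", "99")
prices <- as.numeric(close_chr)

# y_t = log(p_t) - log(p_{t-1}); the result has one element fewer than prices
y <- diff(log(prices))
y
```

Log-returns are additive over time, so `sum(y)` equals the log-return over the whole period, `log(99 / 100)`.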
4.2 Low frequency
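Unlike a WebSocket, an HTTP API is request-driven: the client must issue a new GET request, which opens a new TCP connection unless a persistent connection is reused, every time it needs fresh data. The highest frequency available through the Yahoo! Finance endpoint is one day.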
Yahoo! Finance data can be easily obtained through their website. Alternatively, the following R code downloads the CSV file of daily data for BTC=F:
btc_futures <- as.data.frame(tseries::get.hist.quote(
"BTC=F",
start = "2017-12-18",
end = Sys.Date(),
quote = c("Open", "High", "Low", "Close", "Volume")
))
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## Warning: BTC=F contains missing values. Some functions will not work if objects
## contain missing values in the middle of the series. Consider using na.omit(),
## na.approx(), na.fill(), etc to remove or replace them.
## time series ends 2026-01-30
# Convert row names to a 'Date' column for consistency
btc_futures$Date <- as.Date(rownames(btc_futures))
summary.default(btc_futures)
## Length Class Mode
## Open 2045 -none- numeric
## High 2045 -none- numeric
## Low 2045 -none- numeric
## Close 2045 -none- numeric
## Volume 2045 -none- numeric
## Date 2045 Date numeric
The resulting data.frame has a total of 2045 rows and 6 columns as of 2026-02-01 05:12:35.060242. However, several rows must be ignored since they contain no data, only a null message.
## [1] 2043 6
After removing the lines with null data, we end up with a total of 2043 observations.
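One possible way to drop such rows is sketched below on a toy data frame. The column names mirror those of btc_futures, but the values are invented and the filtering approach is an assumption rather than the code used to produce the figures above.

```r
# Toy data frame with one row of missing prices
toy <- data.frame(
  Open  = c(1, NA, 3, 4),
  Close = c(1.5, NA, 3.5, 4.5),
  Date  = as.Date("2024-01-01") + 0:3
)

# Keep only the rows where every price column is observed
price_cols <- c("Open", "Close")
clean <- toy[complete.cases(toy[, price_cols]), ]
dim(clean)
```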
Figure 4.1 uses a plotly candlestick chart (Sievert et al. 2021) to help us visualize the data.
Figure 4.1: Spot and futures prices of the BTC/USD currency pair.
The output of applying the summary function to the btc_spot dataset is shown in Table 4.2. Notice that we first convert the numeric data from character type by applying the as.numeric function to each column of btc_spot, excluding the timestamp column.
| open | high | low | close | volume |
|---|---|---|---|---|
| Min. : 25125 | Min. : 25729 | Min. : 24756 | Min. : 25127 | Min. : 205 |
| 1st Qu.: 43119 | 1st Qu.: 43843 | 1st Qu.: 42402 | 1st Qu.: 43170 | 1st Qu.:1060 |
| Median : 68363 | Median : 69543 | Median : 67100 | Median : 68390 | Median :1717 |
| Mean : 71455 | Mean : 72693 | Mean : 70175 | Mean : 71506 | Mean :1929 |
| 3rd Qu.: 96835 | 3rd Qu.: 98312 | 3rd Qu.: 95232 | 3rd Qu.: 96837 | 3rd Qu.:2463 |
| Max. :124728 | Max. :126272 | Max. :123148 | Max. :124728 | Max. :9294 |
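The character-to-numeric conversion described above can be sketched as follows. The data frame `btc_toy` is a made-up stand-in for btc_spot with only two rows and two price columns.

```r
# Toy stand-in for btc_spot: every column arrives as character
btc_toy <- data.frame(
  timestamp = c("1700000000", "1700086400"),
  open      = c("36000.5", "36500"),
  close     = c("36400", "37000.25"),
  stringsAsFactors = FALSE
)

# Convert every column except timestamp to numeric
num_cols <- setdiff(names(btc_toy), "timestamp")
btc_toy[num_cols] <- lapply(btc_toy[num_cols], as.numeric)

summary(btc_toy[num_cols])
```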
We can also collect additional information such as the mean and the quantiles at levels \(\alpha \in \{0.0, 0.1, 0.2, 0.3, 0.4, 0.5\}\), where the quantile at level \(\alpha = 0.5\) corresponds to the median of a given vector such as btc_spot$close. The mean value of this vector is \(7.151 \times 10^{4}\), while the quantiles at the different levels can be found in Table 4.3.
| 0% | 10% | 20% | 30% | 40% | 50% |
|---|---|---|---|---|---|
| 25127 | 27993 | 37671 | 55986 | 63316 | 68390 |
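Summary statistics of this kind can be computed with base R as sketched below. The vector `x` is a small invented example, not the btc_spot data.

```r
x <- c(2, 4, 4, 5, 7, 9, 10, 12)   # toy data
alphas <- seq(0, 0.5, by = 0.1)    # quantile levels 0%, 10%, ..., 50%

q <- quantile(x, probs = alphas)
q
mean(x)

# The quantile at level 0.5 coincides with the median,
# and the quantile at level 0 with the minimum
all.equal(q[["50%"]], median(x))
all.equal(q[["0%"]], min(x))
```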
This information can also be represented graphically using a plotly histogram as shown in Figure 4.2.
library(plotly)  # attaches add_segments() and the %>% pipe
btc_close <- as.numeric(btc_spot$close)
close_hist <- plotly::plot_ly(
  x = btc_close, type = "histogram", name = "BTC/USD Spot close"
) %>%
  # Mark the deciles of the closing price with short vertical segments
  add_segments(
    x = quantile(btc_close, probs = seq(0.1, 0.9, 0.1)),
    xend = quantile(btc_close, probs = seq(0.1, 0.9, 0.1)),
    y = -10, yend = 10, name = paste("BTC/USD - \U03B1:", seq(0.1, 0.9, 0.1)),
    line = list(width = 1)
  )
close_hist
Figure 4.2: Histogram of BTC/USD price quantiles.
We can also compare this histogram to the corresponding Close price of the btc_futures table, as seen in Figure 4.3.
library(plotly)  # attaches add_histogram(), layout() and the %>% pipe
fig <- plotly::plot_ly(alpha = 0.5) %>%
  add_histogram(x = as.numeric(btc_spot$close),
                name = "BTC/USD Spot") %>%
  add_histogram(x = as.numeric(btc_futures$Close),
                name = "BTC/USD Futures") %>%
  layout(barmode = "overlay")
fig
Figure 4.3: Histogram of BTC/USD spot prices overlaying futures prices.
4.3 Modelling
For this step we start by computing the log-returns shown in Figure 4.4 as follows:
out <- get_price_hist("btcusd", crypto = TRUE)
y <- diff(log(as.numeric(out$ohlc$CLOSE)))
Figure 4.4: Log-returns for the BTC/USD pair, prices (close) from Yahoo! Finance.
We then pass y to the MCMC sampler:
out <- caviarma::caviar(y, nsim = 1e5)
This step is the bottleneck of the data analysis since it must be repeated multiple times, as explained in the Simulation Study section.
4.4 Forecasting
After getting our estimate \(\boldsymbol{\hat{\theta_t}}\), we obtain our forecast for time \(t + 1\) using Equations (2.9) to (2.13):
quantile_forecast <- caviarma::get_forecast(y, out)
Figure 4.5: VaR (0.05) forecasts using different models.
Figure 4.5 shows sample VaR forecasts obtained using different CAViaR models. We also include the corresponding tGARCH and sGARCH forecasts for reference (Ardia et al. 2019).
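To make the forecasting step more concrete, the sketch below implements a one-step-ahead quantile recursion. It assumes the standard Symmetric Absolute Value specification \(q_t = \beta_1 + \beta_2 q_{t-1} + \beta_3 |y_{t-1}|\), whose notation may differ from Equations (2.9) to (2.13). The function name, the parameter values and the starting quantile `q0` are illustrative assumptions and not the caviarma implementation.

```r
# Hypothetical helper: Symmetric Absolute Value CAViaR recursion.
# Returns q[1], ..., q[n + 1]; the last element is the one-step-ahead forecast.
sav_forecast <- function(y, beta, q0) {
  n <- length(y)
  q <- numeric(n + 1)
  q[1] <- q0  # starting value, e.g. an empirical quantile of early returns
  for (t in seq_len(n)) {
    q[t + 1] <- beta[1] + beta[2] * q[t] + beta[3] * abs(y[t])
  }
  q
}

y_toy <- c(0.01, -0.02)           # toy returns
beta  <- c(-0.001, 0.9, -0.1)     # illustrative parameter values
q_path <- sav_forecast(y_toy, beta, q0 = -0.03)
q_path[length(q_path)]            # forecast for the next period
```

With a negative \(\beta_3\), large absolute returns push the lower-tail quantile further down, which is the volatility-clustering behaviour that the CAViaR family is meant to capture.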
Finally, Table 4.4 shows the results of the VaR backtest obtained through the GAS::BacktestVaR() function.
| Model | Statistic | P-value |
|---|---|---|
| Symmetric Absolute Value | 1.79 | 0.971 |
| Asymmetric Slope | 6.10 | 0.528 |
| GARCH | 4.79 | 0.686 |
| Adaptive | 4.00 | 0.779 |
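As a simplified illustration of what a VaR backtest checks, the sketch below counts the violations of a forecast series and applies Kupiec's unconditional coverage likelihood-ratio test. This is not necessarily the statistic reported in Table 4.4, and the function name and toy data are assumptions.

```r
# Hypothetical helper: unconditional coverage test for VaR forecasts.
# A "hit" occurs when the realized return falls below the forecast quantile.
kupiec_test <- function(y, var_forecast, alpha = 0.05) {
  hits <- y < var_forecast
  n <- length(hits)
  x <- sum(hits)
  p_hat <- x / n
  # Binomial log-likelihood, treating 0 * log(0) as 0
  loglik <- function(p) {
    ll <- 0
    if (x > 0) ll <- ll + x * log(p)
    if (n - x > 0) ll <- ll + (n - x) * log(1 - p)
    ll
  }
  stat <- -2 * (loglik(alpha) - loglik(p_hat))
  list(
    hits = x,
    hit_rate = p_hat,
    statistic = stat,
    p_value = pchisq(stat, df = 1, lower.tail = FALSE)
  )
}

set.seed(42)
y_sim <- rnorm(1000)                # simulated returns
var_sim <- rep(qnorm(0.05), 1000)   # true 5% quantile as the forecast
res <- kupiec_test(y_sim, var_sim, alpha = 0.05)
res$hit_rate
```

A large p-value indicates that the observed hit rate is compatible with the nominal level \(\alpha\), which is the same kind of reading applied to the p-values in a backtest table.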
Web app
We make use of the shiny and shinyMobile R packages to build a progressive web app (PWA) that allows a user to gather, analyze and visualize the data introduced in the Data section using the models and methods studied in the Forecasting Methods section. To run this application locally, one must first install the simulr package (C. Sepulveda 2021) and then run the app:
remotes::install_gitlab("cacsfre/simulr")
simulr::run_simulr()
4.5 Reactivity
R is a powerful data-centred programming language offering a mathematically intuitive interactive environment which makes data analysis a joyful experience. The shiny package offers an intuitive framework for developing web applications using the R programming language. A shiny application seamlessly connects user interface (UI) input to the back end running R. This makes it straightforward to create a UI that lets users interact with our code without requiring any programming.
This relationship between the input received from the browser (the client) and the output produced by the server is summarized in Figure 4.6. shiny handles all the required CSS/HTML/JavaScript to create a communication channel between the browser and the server through R.
Shiny offers a reactive programming model which makes it easy to call an R function whenever an input object changes. The logic behind each element rendered in the UI is encapsulated in shiny modules. For instance, the code below creates a plotly map (the UI element) whenever input$update_map changes, i.e., whenever the button is clicked.
library(shiny)
map_cardUI <- function(id) {
  uiOutput(NS(id, "map_card"))
}
map_cardServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    ns <- session$ns  # namespace the IDs created inside the module
    values <- shiny::reactiveValues(map_data = NULL)
    # Redraw the map whenever values$map_data changes
    output$map_plot <- plotly::renderPlotly(get_map(values$map_data))
    # Refresh the map data each time the button is clicked
    observeEvent(input$update_map, {values$map_data <- get_map_data()})
    output$map_card <- renderUI({
      shinyMobile::f7Card(
        title = shiny::actionButton(ns("update_map"), "Update map"),
        plotly::plotlyOutput(ns("map_plot"))
      )
    })
  })
}
Figure 4.6: Relationship between user input, server output and reactive values.
In the example above, the server recreates the output$map_plot session object whenever the user clicks on input$update_map. This relationship is due to the observeEvent(input$update_map, {values$map_data <- get_map_data()}) expression: each click assigns a new value to values$map_data, which invalidates output$map_plot (the output that depends on it) and causes it to be rendered again.
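The same reactive chain can be reproduced in a smaller, self-contained module that does not depend on the map helpers. The sketch below is an illustrative assumption rather than part of the application: a button click updates a reactive value, which in turn invalidates a text output.

```r
library(shiny)

# UI: a button and a text output, both namespaced with the module id
counterUI <- function(id) {
  ns <- NS(id)
  tagList(
    actionButton(ns("update"), "Update"),
    textOutput(ns("n_updates"))
  )
}

# Server: each click changes values$n, which invalidates output$n_updates
counterServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    values <- reactiveValues(n = 0)
    observeEvent(input$update, {
      values$n <- values$n + 1
    })
    output$n_updates <- renderText(paste("Updates:", values$n))
  })
}

# To try it interactively:
# shinyApp(
#   ui = fluidPage(counterUI("demo")),
#   server = function(input, output, session) counterServer("demo")
# )
```

Because the output reads values$n, shiny re-renders it after every click without any explicit call from the observer to the output.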
4.6 Deployment
Deploying a shiny application can be pretty simple using the https://www.shinyapps.io/ service by RStudio. This works fine for demos consuming limited resources. However, an upgrade is required once we need to handle multiple users or greater computing resources. The two typical approaches to scaling a shiny app in an enterprise context are Shiny Server and ShinyProxy, where ShinyProxy tries to fill the gaps of the open-source version of Shiny Server while relying exclusively on the open-source shiny package. Another advantage of ShinyProxy is its use of Docker, which gives each user an independent shiny app environment and R process. This isolation obtained through Docker containers is important for both security and performance reasons. This architecture is outlined in Figure 4.7.
Figure 4.7: Outline of the shinyproxy architecture.
For comparison, the architecture of the open-source version of shiny-server is outlined in Figure 4.8.
Figure 4.8: Outline of the open-source shiny-server architecture.