Single-Asset VaR

December 11, 2021

In mathematical finance, value at risk (VaR) is a statistic that quantifies the level of risk in a position or portfolio over a specific time frame. VaR is a measure of probable maximum loss (PML): the maximum loss that the position or portfolio would be expected to incur at a given confidence level. Specifically, we are interested in answering the question: what are the worst 1% of possible outcomes?

Suppose we have a position in a stock. How might we go about determining the worst 1% of possible outcomes for tomorrow? Well, if $P$ is a random variable that models tomorrow's closing price, then the 99% probable minimum closing price $p_0$ satisfies the integral equation

$$\int_{p_0}^\infty f_P(p) \, dp = \text{Pr}[P \geq p_0] = 0.99,$$

where $f_P$ is the probability density function (PDF) of $P$. To solve this equation for $p_0$, we need to know $f_P$; but how do we get it? To that end, let $\mathbf{p} = (p_1, \dots, p_n)$ be the sequence of daily closing prices of the stock for the past $n$ days. For example, $\mathbf{p}$ might be the sequence of daily closing prices of Tesla for the past year, plotted below.

[Figure: TSLA Price]

If we assume that tomorrow's closing price will be like past closing prices, then we can use the frequency distribution of $\mathbf{p}$ to estimate $f_P$. The frequency of a price $p$ in $\mathbf{p}$ is just the number of occurrences of $p$ in $\mathbf{p}$, given by $\sum_{i=1}^n \delta_{p_i p}$, where $\delta_{ij}$ is the Kronecker delta:

$$\delta_{ij} = \begin{cases} 0, & i \neq j, \\ 1, & i = j. \end{cases}$$

The relative frequency is just the normalized frequency, $\frac{1}{n}\sum_{i=1}^n \delta_{p_i p}$; it tells us the probability that a randomly selected $p_i$ will equal $p$. We define the relative frequency function (RFF) of $\mathbf{p}$ to be the function $f_\mathbf{p}(p)$ that gives the relative frequency of $p$ in $\mathbf{p}$. Symbolically,

$$f_\mathbf{p}(p) \equiv \frac{1}{n}\sum_{i=1}^n \delta_{p_i p}.$$

$f_\mathbf{p}$ for our example is represented below as a histogram.

[Figure: TSLA Price Frequency]
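To make the RFF concrete, here is a minimal sketch of computing it from a price series. All data and names here are illustrative (not real TSLA prices), and the prices are rounded to whole dollars so that exact matches occur:

```python
from collections import Counter

def relative_frequency(prices):
    """RFF of the sample: maps each price p to (1/n) * sum_i delta_{p_i p}."""
    n = len(prices)
    return {p: count / n for p, count in Counter(prices).items()}

# Toy sample of daily closes (illustrative, not real TSLA data).
prices = [700, 705, 700, 710, 700, 705]
rff = relative_frequency(prices)
print(rff[700])  # 700 occurs 3 times out of 6 -> 0.5
```

The values of `rff` sum to 1 by construction, which is what lets us treat it as an estimate of a probability distribution.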

Under our assumptions, $f_P \approx f_\mathbf{p}$. But this plot illustrates a clear problem with our approach to estimating $f_P$. Based on our approach, TSLA would be more likely to close around $700 (a massive drop) than around the same price as yesterday! This seems wildly wrong. The problem lies in our assumption: tomorrow's closing price will not be like past closing prices, because a certain closing price is not necessarily likely just because the stock has closed at that price many times in the past. Thus, we are forced to consider alternative approaches.

A reasonable next step might be to look at changes in price. Let $C$ be a random variable that models tomorrow's change in price, and define the sequence $\mathbf{c} = (c_1, \dots, c_{n-1})$ via

$$c_i = p_{i+1} - p_i,$$

so that $c_i$ is the change in price on day $i + 1$. $\mathbf{c}$ and $f_\mathbf{c}$ for our example are plotted below.

[Figure: TSLA Price Change]

[Figure: TSLA Price Change Frequency]
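The change sequence is a one-liner in code. A sketch with numpy, using toy prices (illustrative values, not real TSLA data):

```python
import numpy as np

# Toy closing prices p_1, ..., p_n (illustrative, not real TSLA data).
p = np.array([600.0, 612.0, 605.0, 620.0, 640.0])

# c_i = p_{i+1} - p_i, so c has length n - 1: here, [12, -7, 15, 20].
c = np.diff(p)
print(c)
```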

This looks much more promising. Our RFF looks fairly normal, so suppose that $C \sim \mathcal{N}(\mu, \sigma^2)$ for some mean $\mu$ and variance $\sigma^2$. The problem with using changes in price arises when we try to estimate $\mu$ and $\sigma^2$ by applying maximum likelihood estimation (MLE) to this distribution. If you have studied linear regression, the idea here is similar: we want to optimize the parameters of our function to best fit our data. The difference is that we are working with a Gaussian (or "bell curve") rather than a linear function.

To see the problem, consider $\mathbf{c}$ to be a sample for the sequence of random variables $C_1, \dots, C_{n-1}$ that model the change in price on days $2, \dots, n$, respectively, and suppose that each $C_i$ is normally distributed. This is reasonable, since $C$ is assumed to be normally distributed and in this context there is nothing special about tomorrow as opposed to, say, last Thursday. The problem is that we should not assume that changes in price are identically distributed; each $C_i$ requires its own parameters: $C_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$. The reason identical distribution would be a bad assumption here is that, as anyone who has traded stocks can attest, changes in price tend to scale with the price itself. (This is why stock prices often look exponential over long periods of time: the defining characteristic of an exponential function $f(x)$ is that it satisfies the differential equation

$$\frac{df}{dx} = af(x),$$

where $a$ is a constant. In this case, the derivative corresponds to the change in price and the function itself corresponds to the price. From a psychological perspective, this can be attributed to what is known as the Weber-Fechner law, which says that the perceived effect of a change depends on the starting point.) Because of this, estimating the $\mu_i$s and $\sigma_i^2$s would not get us any closer to estimating $\mu$ and $\sigma^2$. We can account for this scaling behavior, however, by converting to returns.

Let $R$ be a random variable that models tomorrow's return, and define the sequence $\mathbf{r} = (r_1, \dots, r_{n-1})$ via

$$r_i = \frac{c_i}{p_i} = \frac{p_{i+1} - p_i}{p_i},$$

so that $r_i$ is the return on day $i + 1$. $\mathbf{r}$ and $f_\mathbf{r}$ for our example are plotted below.

[Figure: TSLA Return]

[Figure: TSLA Return Frequency]
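Returns are computed the same way as changes, dividing each change by the price it starts from. A sketch with numpy (toy prices again, illustrative only):

```python
import numpy as np

# Toy closing prices (illustrative, not real TSLA data).
p = np.array([600.0, 612.0, 605.0, 620.0, 640.0])

# r_i = (p_{i+1} - p_i) / p_i: the return on day i + 1.
r = np.diff(p) / p[:-1]
print(r)  # first return: 12 / 600 = 0.02
```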

As before, our RFF appears normal, so suppose that $R \sim \mathcal{N}(\mu, \sigma^2)$ for some $\mu$ and $\sigma^2$, and consider $\mathbf{r}$ to be a sample for the sequence of random variables $\mathbf{R} = (R_1, \dots, R_{n-1})$ that model the return on days $2, \dots, n$, respectively. This time around, since we are dealing with percentage changes, we can assume identical distribution: each $R_i \sim \mathcal{N}(\mu, \sigma^2)$. Then the PDF of each $R_i$ is

$$f_{R_i}(r \mid \mu, \sigma^2) = f_R(r \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(r - \mu)^2/2\sigma^2}.$$

Furthermore, because we do not expect the return on any given day to depend too much on past returns, we assume that the $R_i$s are mutually independent. We can then apply the multiplication rule to obtain their joint cumulative distribution function (JCDF):

$$\begin{aligned} F_\mathbf{R}(\mathbf{r_0} \mid \mu, \sigma^2) &\equiv \text{Pr}[R_1 \leq r_{0,1}, \dots, R_{n-1} \leq r_{0,n-1} \mid \mu, \sigma^2] \\ &= \prod_{i=1}^{n-1} \text{Pr}[R_i \leq r_{0,i} \mid \mu, \sigma^2] \\ &= \prod_{i=1}^{n-1} \int_{-\infty}^{r_{0,i}} f_{R_i}(r \mid \mu, \sigma^2) \, dr \\ &= \prod_{i=1}^{n-1} \int_{-\infty}^{r_{0,i}} f_R(r \mid \mu, \sigma^2) \, dr, \end{aligned}$$

where $\mathbf{r_0} = (r_{0,1}, \dots, r_{0,n-1})$ is an arbitrary sample for $\mathbf{R}$. Their joint probability density function is then, by the fundamental theorem of calculus,

$$\begin{aligned} f_\mathbf{R}(\mathbf{r_0} \mid \mu, \sigma^2) &\equiv \frac{\partial^{n-1} F_\mathbf{R}}{\partial r_{0,1} \cdots \partial r_{0,n-1}} \\ &= \frac{\partial^{n-1}}{\partial r_{0,1} \cdots \partial r_{0,n-1}} \prod_{i=1}^{n-1} \int_{-\infty}^{r_{0,i}} f_R(r \mid \mu, \sigma^2) \, dr \\ &= \prod_{i=1}^{n-1} \frac{\partial}{\partial r_{0,i}} \int_{-\infty}^{r_{0,i}} f_R(r \mid \mu, \sigma^2) \, dr \\ &= \prod_{i=1}^{n-1} f_R(r_{0,i} \mid \mu, \sigma^2). \end{aligned}$$

For given parameters $\mu$ and $\sigma^2$, the relative likelihood of observing $\mathbf{r}$ is given by the likelihood function:

$$\begin{aligned} \mathcal{L}(\mu, \sigma^2) &\equiv f_\mathbf{R}(\mathbf{r} \mid \mu, \sigma^2) \\ &= \prod_{i=1}^{n-1} f_R(r_i \mid \mu, \sigma^2) \\ &= \prod_{i=1}^{n-1} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(r_i - \mu)^2/2\sigma^2} \\ &= \left(\prod_{i=1}^{n-1} \frac{1}{\sqrt{2\pi\sigma^2}}\right) \left(\prod_{i=1}^{n-1} e^{-(r_i - \mu)^2/2\sigma^2}\right) \\ &= \left(\frac{1}{2\pi\sigma^2}\right)^{(n-1)/2} \exp\left(\sum_{i=1}^{n-1} -\frac{(r_i - \mu)^2}{2\sigma^2}\right) \\ &= \left(\frac{1}{2\pi\sigma^2}\right)^{(n-1)/2} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n-1} (r_i - \mu)^2\right). \end{aligned}$$

We want to find the values of $\mu$ and $\sigma^2$ that maximize $\mathcal{L}$. That is, we want to satisfy the equations

$$\frac{\partial\mathcal{L}}{\partial\mu} = 0 \quad \text{and} \quad \frac{\partial\mathcal{L}}{\partial\sigma^2} = 0.$$

Because of the exponential in $\mathcal{L}$, it is convenient to work with the log-likelihood function:

$$\begin{aligned} \ell(\mu, \sigma^2) &\equiv \ln \mathcal{L}(\mu, \sigma^2) \\ &= \ln\left[\left(\frac{1}{2\pi\sigma^2}\right)^{(n-1)/2} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n-1} (r_i - \mu)^2\right)\right] \\ &= \ln\left(\frac{1}{2\pi\sigma^2}\right)^{(n-1)/2} + \ln \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n-1} (r_i - \mu)^2\right) \\ &= \frac{n-1}{2} \ln \frac{1}{2\pi\sigma^2} - \frac{1}{2\sigma^2} \sum_{i=1}^{n-1} (r_i - \mu)^2. \end{aligned}$$
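As an aside, $\ell$ can be transcribed directly into code and used as a numerical sanity check: the sample mean and uncorrected sample variance, which the derivation below identifies as the maximizers, should score at least as well as any nearby parameter values. A sketch on synthetic returns (all data and names illustrative):

```python
import numpy as np

def log_likelihood(r, mu, var):
    """l(mu, sigma^2) for m i.i.d. normal observations r (m = n - 1 here)."""
    m = len(r)
    return (m / 2) * np.log(1 / (2 * np.pi * var)) - np.sum((r - mu) ** 2) / (2 * var)

rng = np.random.default_rng(0)
r = rng.normal(0.001, 0.02, size=250)  # synthetic daily returns

mu_hat, var_hat = r.mean(), r.var()    # candidate maximizers
best = log_likelihood(r, mu_hat, var_hat)

# The MLE is a global maximum, so nearby parameters should do no better.
assert best >= log_likelihood(r, mu_hat + 0.005, var_hat)
assert best >= log_likelihood(r, mu_hat, var_hat * 2)
```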

Since the natural logarithm is monotonically increasing, $\ell$ is maximized by the same values of $\mu$ and $\sigma^2$ as $\mathcal{L}$. Thus, we can instead look at the equations

$$\frac{\partial\ell}{\partial\mu} = 0 \quad \text{and} \quad \frac{\partial\ell}{\partial\sigma^2} = 0.$$

For the $\mu$ equation, we have

$$\begin{aligned} & \begin{aligned} 0 &= \frac{\partial}{\partial\mu} \left(\frac{n-1}{2} \ln \frac{1}{2\pi\sigma^2} - \frac{1}{2\sigma^2} \sum_{i=1}^{n-1} (r_i - \mu)^2\right) \\ &= -\frac{1}{2\sigma^2} \sum_{i=1}^{n-1} \frac{\partial}{\partial\mu} (r_i - \mu)^2 \\ &= \frac{1}{\sigma^2} \sum_{i=1}^{n-1} (r_i - \mu) \\ &= \frac{1}{\sigma^2} \left(\sum_{i=1}^{n-1} r_i - (n-1)\mu\right) \end{aligned} \\ & \implies \sum_{i=1}^{n-1} r_i - (n-1)\mu = 0 \\ & \implies (n-1)\mu = \sum_{i=1}^{n-1} r_i \\ & \implies \mu = \frac{1}{n-1} \sum_{i=1}^{n-1} r_i, \end{aligned}$$

which is just the arithmetic mean. For the $\sigma^2$ equation, we have

$$\begin{aligned} & \begin{aligned} 0 &= \frac{\partial}{\partial\sigma^2} \left(\frac{n-1}{2} \ln \frac{1}{2\pi\sigma^2} - \frac{1}{2\sigma^2} \sum_{i=1}^{n-1} (r_i - \mu)^2\right) \\ &= -\frac{n-1}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n-1} (r_i - \mu)^2 \end{aligned} \\ & \implies -(n-1)\sigma^2 + \sum_{i=1}^{n-1} (r_i - \mu)^2 = 0 \\ & \implies (n-1)\sigma^2 = \sum_{i=1}^{n-1} (r_i - \mu)^2 \\ & \implies \sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n-1} (r_i - \mu)^2. \end{aligned}$$
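In code, these two estimators are just the sample mean and the uncorrected sample variance. Note that the divisor is the number of observations (there are $n - 1$ returns), not the Bessel-corrected $n - 2$; this matches numpy's defaults. A sketch with a toy return sample (illustrative values):

```python
import numpy as np

# Toy return sample (illustrative values).
r = np.array([0.02, -0.011, 0.025, 0.032, -0.005])

mu = r.sum() / len(r)                  # MLE mean: the arithmetic mean
var = np.sum((r - mu) ** 2) / len(r)   # MLE variance: divide by the sample size

# numpy's defaults (ddof=0) implement exactly these formulas.
assert np.isclose(mu, r.mean())
assert np.isclose(var, r.var())
```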

With $\mu$ and $\sigma^2$ in hand, we are now in a position to consider the integral equation

$$0.99 = \int_{r_0}^\infty f_R(r) \, dr = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{r_0}^\infty e^{-(r - \mu)^2/2\sigma^2} \, dr.$$

As it happens, the Gaussian has no elementary antiderivative, so we cannot compute this integral analytically; but we have, in principle, arrived at the answer to our question. Of course, principle does not help you assess real-world risk, so it is important to know that this integral can be calculated numerically. In fact, because Gaussians are so ubiquitous, they have been well studied, and numeric approximations of their integrals can be readily found in tables of the standard normal distribution. In this case, the solution is

$$r_0 = \mu - 2.33\sigma.$$

Therefore, to estimate the 99% probable minimum return for tomorrow, you can simply subtract $2.33$ historical standard deviations from the historical mean return. (Since the mean daily return is typically small compared to $\sigma$, this is often approximated as just $-2.33\sigma$.)
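Putting the whole procedure together, the estimate fits in a few lines. A sketch with numpy and toy prices (illustrative data and names; $2.33$ is the rounded 1% quantile of the standard normal):

```python
import numpy as np

def probable_min_return(prices, z=2.33):
    """99% probable minimum return: r_0 = mu - 2.33 * sigma."""
    r = np.diff(prices) / prices[:-1]   # daily returns
    mu = r.mean()                       # MLE mean
    sigma = r.std()                     # MLE standard deviation (ddof=0)
    return mu - z * sigma

# Toy closing prices (illustrative, not real TSLA data).
prices = np.array([600.0, 612.0, 605.0, 620.0, 640.0])
print(probable_min_return(prices))  # a negative number: the 1% worst-case return
```

Multiplying this return by the position's current value gives the one-day VaR in dollars.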