How to compute historical stock statistics?

N

nomail1983

(Not exactly an Excel question. But I know a lot of Excel users do
this, and I know there are a number of statistics-savvy participants
in these newsgroups.)

If I have historical daily price data (y1, y2,..., yN) for a stock or
fund, the daily return is computed by y2/y1, y3/y2,..., yN/y[N-1].
Typically, the annualized average rate of return is computed by the
geometric average, namely (yN/y1)^(252/(N-1))-1. The annualized
standard deviation of the rate of return is computed by the antilog of
the standard deviation of the log returns, e.g. exp(stdev(ln({y2/
y1,...,yN/y[N-1]})*sqrt(252))-1. (Note: 252 is used instead of 365
because there are typically 252 trading days per year.)

That method -- especially the annulaized geometric average -- provides
a "warm fuzzy" feeling because when the initial investment is
multiplied by the compounded annualized rate of return, we get the
current value of the investment. The annualized geometric mean is
comparable to the APY for other asset classes, notably cash.

I realize that we must compute the annualized statistics in that
manner when we have insufficient data, e.g. 3-12 months of daily
prices.

But I wonder: if I have sufficient data (i.e. many years of daily
prices), would it be reasonable and arguably better to simply compute
the arithmetic average and arithmetic standard deviation of the year-
over-year daily returns?
 
F

Fred Smith

Absolutely. It would be much better to ignore intermediate prices in computing
the return.

If a stock goes from $10.00 to $12.50 in a year (assuming no dividends), your
return is 25%. It doesn't matter what happened in between. My bet is that
annualizing the daily returns is not giving you correct results. Compare them
and see.
 
N

nomail1983

Absolutely. It would be much better to ignore intermediate prices
in computing the return.

If a stock goes from $10.00 to $12.50 in a year (assuming no dividends), your
return is 25%. It doesn't matter what happened in between. My bet is that
annualizing the daily returns is not giving you correct results. Compare them
and see.

In that case, annualizing the daily returns should give the same
result, I think.

Assume that there are 253 daily prices y1,y2,...,y253, where y253 is
one year after y1 (e.g 12/31 to 12/31). You are computing y253/y1-1.
The conventional way to compute the annualized geometric mean can be
expressed many ways. One way is ((y2/y1)*(y3/y2)*...*(y253/
y252))^(252/252)-1 -- or more simply: (y253/y1)^(252/252)-1.
Obviously that reduces to y253/y1-1 -- the same as your formula.

I think you missed my point: computing the arithmetic mean of every
year-over-year daily return between two dates (not just the first and
last daily return) v. computing the geometric mean. If we have 2
years of daily returns, the arithmetic mean of the year-over-year
returns would be:

average(y505/y253, y504/y252,..., y253/y1) - 1

whereas the annualized geometric mean would be:

(y505/y1)^(252/504) - 1

(Presumably you would compute (y505/y1)^(1/2)-1, which is the same
thing as the second case, of course.)

With some actual data (2833 trading days -- more than 11 years), the
difference is not that great, which surprises me. I am especially
surprised that the standard deviations are so close.

But that is anecdotal. It proves nothing about the correctness (or
not) of the statistic theory.
 
N

nomail1983

In that case, annualizing the daily returns should give the same
result, I think.

I would, however, agree with you with respect to annualizing sub-
annual results. If a stock price goes from $10.00 to $12.50 in 3
months, I think it is misleading to say that the stock price is
increasing at an annualized rate of 144% (1.25^4-1). That is a good
example of where I believe averaging the year-over-year change might
provide a more realistic picture.

I also have difficulty accepting the conventional method of
annualizing the standard deviation based on sub-annual results, for
example daily returns. Multiplying by the square root of time is
based on statistical theory that may or may not apply to specific
data. One online explanation both confirms and dismisses my concern
in the same paragraph. They write, where their definition of
"volatility" is the std dev of the log returns [1]:

"[V]olatilities for different units of time are fundamentally
different notions. There is no direct relationship between, say a
weekly volatility and an annual volatility.

"However, there is an exception to this observation. The exception is
called the square root of time rule. If fluctuations in a stochastic
process from one period to the next are independent (i.e., there are
no serial correlations or other dependencies) volatility increases
with the square root of the unit of time. Any price that follows a
random walk, Brownian motion or geometric Brownian motion satisfies
this independence condition. The square root of time rule is exact if
volatilities are based upon log returns."

Despite my relunctance to accept the square root of time rule applied
to the std dev of the log returns, my empirical experience confirms it
time and time again. That is, as I mentioned in my previous posting,
the standard deviation of the year-of-year (daily) returns is not so
different from the (antilog of) the std dev of the log returns based
on the square root of time rule.

Nevertheless, I did not want to get into a discussion about the
conventional methods of financial engineers. Instead, I want input on
the validity (or not) of the year-over-year statistics -- that is, the
average and standard deviation of many sub-annual year-over-year data
in order to estimate annual statistics.


Endnotes
 
N

nomail1983

I want input on the validity (or not) of the year-over-year statistics
-- that is, the average and standard deviation of many sub-annual
year-over-year data in order to estimate annual statistics.

Well, I am obviously wrong-minded here, at least with respect to the
std dev. Imagine an example where there is significant volatility,
even on annual anniversary dates, but the day-to-day changes almost
identically track the day-to-day changes a year earlier. I might not
be explaining that very well, but the situation arose in a 4-year
subset of the actual data that I am working with. The std dev of the
year-over-year data is deceptively small.
 
M

MartinW

Hi,

Sounds like you are losing your self in the processes and
confusing your goals. I usually need a visual representation
to sort out what I'm getting and what I actually want.

In this situation I would plot all of my data on an XY Scatter
chart, adjust my scales to something that looks usable
and then start analysing the data with dummy series and
trendlines. I'd plot dummy series showing the AVERAGE,
the GEOMEAN (even HARMEAN and TRIMMEAN sometimes
although I doubt they would be any good here). I plot them
as yearly series and monthly series of each maybe even daily
in some cases, also play around with some trendlines some
where in all this trial and error process I usually spot
a relationship in the data that is relevant to what I am
trying to achieve.

HTH
Martin
 
N

nomail1983

Sounds like you are losing your self in the processes and
confusing your goals.

Y'got that right!
I usually need a visual representation
to sort out what I'm getting and what I actually want.

Certainly! But sometimes, especially with math problems, it is
additionally helpful to create "thought experiments" and imagine the
results.

Volatility is a measure of sequential variability -- for example, year
after year after year results. I inadvertently eliminated that
sequential factor when I looked at the std dev of all year-over-year
results.

On the other hand, the std dev of year-over-year results might be good
(dare I say better?) for the purpose of modeling market performance,
if we accept the idea that market behavior is essentially Brownian
motion. (Not everyone agrees with that.)
In this situation I would plot all of my data on an XY Scatter
chart [...]. I plot them
as yearly series and monthly series of each maybe even daily
in some cases, also play around with some trendlines some
where in all this trial and error process I usually spot
a relationship in the data that is relevant to what I am
trying to achieve.

Absolutely! I do that as well. But even if we compare two trendlines
over an appropriate time horizon and we see that one is steeper upward
than the other, we still want a measure of volatility in order to
assess risk.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top