Position Sizing for Practitioners [Part 2: Dealing with Drawdown]

The Problem with Optimal f

What does “optimal” mean, anyway? In the first part of this series, we discovered that the staked fraction of capital that yields the greatest compounded returns also yields a less-than-optimal level of drawdown.  To realize the greatest return on capital, an investor in SPY since its inception should have used over 3x leverage to buy in. This would have yielded the greatest compounded rate of return, but would have induced a 97% (!!!) max drawdown along the way. Since a 20% retracement from peak equity causes most investors to start tossing in their sleep, this approach doesn’t seem very realistic. This post will help traders maximize their gains while still getting their beauty rest!

Read [Part 1: Beyond Kelly] here and feel free to follow along with the provided notebook.

Drawdown Curve

We left off with an example showing that investing in SPY at optimal f would cause some extreme discomfort along the way. Let’s see how increasing position size increases drawdown by constructing a curve for maximum drawdown, similar to the one we built for GHPR. We use maximum drawdown as a proxy for risk because it is the number that investors “feel”, and it tends to drive allocation decisions for emotional reasons.

An alternative metric is the Ulcer Index, a root-mean-square calculation based on drawdown that takes into account both drawdown severity and duration. Source code for this value has been provided as well; feel free to substitute it in.
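
To make the difference between the two metrics concrete, here is a minimal sketch on a made-up five-point equity curve (the numbers are purely illustrative and not taken from SPY). Maximum drawdown only records the single deepest drop, while the Ulcer Index averages the squared drawdown over every point, so it also grows when the account stays underwater for longer.

```python
import numpy as np

# Hypothetical five-point equity curve, for illustration only
equity = np.array([100.0, 110.0, 99.0, 104.5, 121.0])

# Drawdown at each point: distance below the running peak, as a fraction
running_peak = np.maximum.accumulate(equity)
dd = equity / running_peak - 1

max_dd_pct = np.abs(dd).max() * 100           # deepest single drop, in percent
ulcer_pct = np.sqrt(np.mean(dd ** 2)) * 100   # root-mean-square of all drawdowns

print(round(max_dd_pct, 2), round(ulcer_pct, 2))
```

Here the curve spends two of its five points below the prior peak, so the Ulcer Index comes out at half the maximum drawdown.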

Python Code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def equity_curve(returns):
    eq = (1 + returns).cumprod(axis=0)
    # normalized_eq = raw_eq / raw_eq[0]
    return eq

def drawdown(equity_curve):
    eq_series = pd.Series(equity_curve)
    _drawdown = eq_series / eq_series.cummax() - 1
    return _drawdown

def max_drawdown(equity_curve, percent=True):
    abs_drawdown = np.abs(drawdown(equity_curve))
    _max_drawdown = np.max(abs_drawdown)
    # Return the value in percent by default, otherwise as a fraction
    if percent:
        return _max_drawdown * 100
    return _max_drawdown

def ulcer_index(equity_curve):
    _drawdown = drawdown(equity_curve)
    _ulcer_index = np.sqrt(np.mean(_drawdown**2)) * 100
    return _ulcer_index

def twr(equity_curve):
    eq_arr = np.array(equity_curve)
    _twr = eq_arr[-1] / eq_arr[0]
    return _twr

def ghpr(equity_curve):
    _twr = twr(equity_curve)
    _ghpr = _twr ** (1 / len(equity_curve)) - 1
    return _ghpr

def get_f_dd(returns):
    f_values = np.linspace(0, 0.99, 100)
    # Scale f so the largest value stakes just under (1 / worst loss)
    max_loss = np.abs(np.min(returns))
    bounded_f = f_values / max_loss
    df = pd.DataFrame(columns=['ghpr', 'drawdown'])
    for f in bounded_f:
        eq = equity_curve(f * returns)
        _ghpr = ghpr(eq)  # ghpr() already subtracts 1
        _max_drawdown = max_drawdown(eq)
        # _ulcer_index = ulcer_index(eq)
        df.loc[f, 'ghpr'] = _ghpr * 100
        df.loc[f, 'drawdown'] = _max_drawdown
        # df.loc[f, 'ulcer index'] = _ulcer_index
    # Pick the fraction with the highest GHPR once the whole curve is built
    optimal_f = df['ghpr'].astype(float).idxmax()
    return {'f_curve':df, 'optimal_f':optimal_f, 'max_loss':max_loss}

def f_dd_plot(f, title=''):
    # Cast to float so pandas treats both columns as numeric when plotting
    f_curve = f['f_curve'].astype(float)
    optimal_f = f['optimal_f']
    optimal_f_ghpr = f_curve.loc[optimal_f, 'ghpr']

    fig, ax = plt.subplots(1, 1, figsize=(10, 7))
    f_curve.plot(secondary_y='drawdown', ax=ax)

    ax.set_xlim(0, optimal_f * 2)
    ax.set_ylim(0, 1.5 * optimal_f_ghpr)
    ax.set_title(title + ' GHPR vs Max Drawdown')
    ax.set_xlabel('Fraction Staked')
    ax.set_ylabel('GHPR (%)')
    ax.right_ax.set_ylabel('Drawdown (%)')
    ax.axvline(optimal_f, linestyle=':', color='red')

    plt.savefig(title + ' f Drawdown Curve.png')

def f_dd_results(f):
    f_curve = f['f_curve']
    optimal_f = f['optimal_f']
    ghpr = f_curve.loc[optimal_f, 'ghpr']
    drawdown = f_curve.loc[optimal_f, 'drawdown']

    print('Optimal f: {}'.format(np.round(optimal_f, 3)))
    print('Geometric Holding Period Return: {}%'.format(np.round(ghpr, 5)))
    print('Max Drawdown: {}%'.format(np.round(drawdown, 2)))

# SPY.csv file available in GitHub 'data' folder
spy = pd.read_csv('SPY.csv', parse_dates=True, index_col=0)
spy.sort_index(ascending=True, inplace=True)
spy_returns = spy['Adj Close'].pct_change().dropna()

spy_f = get_f_dd(spy_returns)

f_dd_plot(spy_f, 'SPY')
f_dd_results(spy_f)

Optimal f: 3.149
Geometric Holding Period Return: 0.06677%
Max Drawdown: 97.23%

We can see from the graph that as position size approaches optimal f, both rate of return and maximum drawdown increase, with drawdown at optimal f well over 90%. As we increase size further from this point, drawdown continues to approach 100%. However, returns begin to decrease. This reinforces the fact that increasing size to increase profits (*cough* 200x leverage on cryptocurrency futures) past optimal f is a fool’s errand. Doing so will only bring further pain.
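
The same effect is easy to reproduce on a toy game. The sketch below assumes a hypothetical coin toss that pays twice the stake on a win and loses the stake on a loss, with equal odds; none of these numbers come from SPY. The growth per bet peaks at a stake of 0.25, falls back to break-even at twice that stake, and turns into a steady loss beyond it.

```python
import numpy as np

# Hypothetical game: win 2x the stake or lose the stake, with equal odds
win, loss = 2.0, -1.0

def growth_per_bet(f):
    """Geometric mean growth factor per bet when staking fraction f."""
    return np.sqrt((1 + f * win) * (1 + f * loss))

f_grid = np.linspace(0, 0.99, 100)   # fractions in steps of 0.01
growth = growth_per_bet(f_grid)
best_f = f_grid[np.argmax(growth)]

print(best_f)                  # stake at the peak of the curve
print(growth_per_bet(0.25))    # growth factor at the peak (above 1)
print(growth_per_bet(0.50))    # twice the optimal stake: break-even
print(growth_per_bet(0.75))    # three times the optimal stake: below 1
```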

However, many investors’ pain tolerance is much (MUCH) lower than 97%. Most people have a psychological limit beyond which they’ll take their money off the table. Be conservative with this limit; what you can handle when looking at a backtest is probably different from what you can handle when looking at your account PnL.

The total returns aren’t relevant if one can’t handle the risk. For this reason, we seek to bound the curve at the point where drawdown exceeds our limit. Effectively, the GHPR value for all fractions that exceed the risk limit becomes zero; it’s irrelevant. The peak of this new, bounded, curve occurs at the real optimal fraction. Deploying this fraction of our capital will yield the greatest return, while protecting from the psychological pain of our maximum drawdown.

For instance, let’s say we want to maximize our compounded returns in SPY without letting our drawdown exceed 25%.

def get_f_bounded(returns, drawdown_limit):
    f_values = np.linspace(0, 0.99, 199)
    max_loss = np.abs(np.min(returns))
    bounded_f = f_values / max_loss
    df = pd.DataFrame(columns=['ghpr', 'drawdown'])
    for f in bounded_f:
        eq = equity_curve(f * returns)
        _max_drawdown = max_drawdown(eq)
        if _max_drawdown <= drawdown_limit:
            _ghpr = ghpr(eq)
        else:
            # Drawdown exceeds the limit: the return at this size is irrelevant
            _ghpr = 0
        df.loc[f, 'ghpr'] = _ghpr * 100
        df.loc[f, 'drawdown'] = _max_drawdown
    optimal_f = df['ghpr'].astype(float).idxmax()
    return {'f_curve':df, 'optimal_f':optimal_f, 'max_loss':max_loss}

spy_f_bounded = get_f_bounded(spy_returns, drawdown_limit=25)
f_dd_plot(spy_f_bounded, 'SPY Bounded')
f_dd_results(spy_f_bounded)

Optimal f: 0.356
Geometric Holding Period Return: 0.01427%
Max Drawdown: 23.04%

To protect against the possibility of a 25% drawdown, we'd need to scale back the leverage from our optimal f value by a factor of almost 9 (from 3.149 to 0.356)!

Drawdown Caveats

As noted above, drawdown is a good risk metric for practitioners because it most directly impacts their emotions. In practice, few traders decide whether to keep following their strategy based on the standard deviation of returns or similar measures of volatility. It's a much more visceral experience to watch one's account balance drop from $100,000 to $50,000; one that can derail trading strategies. However, there are limitations to using drawdown as a risk metric.

Order of Returns

Unlike compounded returns, the value for maximum drawdown depends on the order in which returns are realized. We'll demonstrate this below by shuffling the order of S&P returns and examining the resulting equity curves.

def reordered_curves(returns, n_curves):
    curves = pd.DataFrame(index=returns.index)
    for i in range(n_curves):
        reordered = np.random.permutation(returns)
        curves[i] = equity_curve(reordered)
    return curves

spy_curves_10 = reordered_curves(spy_returns, 10)

fig, ax = plt.subplots(1, 1, figsize=(10, 10))
spy_curves_10.plot(ax=ax)
ax.set_title('Equity Curves')
plt.savefig('10 Equity Curves.png')

# GHPR and max drawdown for each reordered curve (one per column)
curve_results = pd.DataFrame()
curve_results['GHPR'] = spy_curves_10.apply(ghpr)
curve_results['Max Drawdown'] = spy_curves_10.apply(max_drawdown)

We see that regardless of the order in which daily returns are realized, the GHPR for each curve is the same. Using optimal f as it's traditionally defined would not discriminate between these curves. However, the difference between the largest and smallest drawdowns experienced is over 20%! And this is after generating only 10 curves; we'll see below that the possible values can vary greatly.
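
A tiny example shows why this happens. The sketch below uses six made-up returns (not SPY data) in two different orders; the helper names `terminal_wealth` and `max_dd_pct` are only for this illustration, and the equity curve starts from an initial capital of 1.0. Because multiplication is commutative, the ending wealth is identical, but grouping the two losses together produces a deeper peak-to-trough drop.

```python
import numpy as np

def terminal_wealth(returns):
    """Ending value of 1 unit of capital compounded through the returns."""
    return np.prod(1 + np.asarray(returns))

def max_dd_pct(returns):
    """Maximum drawdown in percent, with the curve starting at 1.0."""
    eq = np.concatenate(([1.0], np.cumprod(1 + np.asarray(returns))))
    return np.max(1 - eq / np.maximum.accumulate(eq)) * 100

# Same six hypothetical returns, two different orders
alternating = np.array([0.10, -0.20, 0.05, -0.10, 0.15, 0.08])
losses_together = np.array([0.10, 0.05, -0.20, -0.10, 0.15, 0.08])

print(terminal_wealth(alternating), terminal_wealth(losses_together))
print(max_dd_pct(alternating), max_dd_pct(losses_together))
```

The two orderings finish at the same place, yet the drawdown is roughly 24% in one case and 28% in the other.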

spy_curves_5000 = reordered_curves(spy_returns, 5000)
spy_drawdown_5000 = spy_curves_5000.apply(max_drawdown)

# Histogram of the max drawdown of each reordered curve
plt.figure()
spy_drawdown_5000.hist()
plt.title('Possible Max Drawdown Values')
print('Maximum Drawdown: {}%'.format(np.round(spy_drawdown_5000.max(), 3)))

Maximum Drawdown: 73.686%

The maximum drawdown varies from less than 30% to over 70%! This is a huge range of possible values. If we want to use drawdown as a risk metric, we need to deal with its inherent uncertainty. We won't be able to totally constrain our results on expected drawdown, as the value for max drawdown could fall within a wide range. However, we can say with some level of confidence that the maximum drawdown experienced won't exceed our threshold. To do this, we calculate the percentage of scenarios in which maximum drawdown exceeds our risk threshold. If this percentage is too great, we must use a smaller position size.
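
As a rough sketch of that calculation, the snippet below shuffles a return series many times and reports the share of reorderings whose maximum drawdown exceeds a given limit. It runs on synthetic normally distributed returns as a stand-in for real data, and the helper names `shuffled_max_drawdowns` and `prob_exceeding` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic daily returns standing in for a real return series (assumption)
returns = rng.normal(0.0004, 0.01, size=2500)

def shuffled_max_drawdowns(returns, n_curves, rng):
    """Max drawdown (%) for each of n_curves random reorderings of returns."""
    shuffled = np.column_stack(
        [rng.permutation(returns) for _ in range(n_curves)])
    eq = np.cumprod(1 + shuffled, axis=0)            # one curve per column
    dd = 1 - eq / np.maximum.accumulate(eq, axis=0)  # drawdown as a fraction
    return dd.max(axis=0) * 100

def prob_exceeding(max_dds, limit_pct):
    """Share of simulated curves whose max drawdown is above limit_pct."""
    return float(np.mean(max_dds > limit_pct))

max_dds = shuffled_max_drawdowns(returns, n_curves=500, rng=rng)
for limit in (15, 25, 35):
    print(limit, prob_exceeding(max_dds, limit))
```

If the share printed for your chosen limit is larger than you are willing to accept (say, more than 5%), that is the signal to reduce the staked fraction and run the check again.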

Time Horizon

A second important characteristic of drawdown is that it is time-horizon dependent. In fact, it can be shown that for returns with no drift, expected maximum drawdown grows roughly in proportion to the square root of time (I won't prove that here, but feel free to investigate it yourself!). Essentially, the longer a given series of returns compounds on itself, the greater the maximum drawdown is likely to be. Intuitively, an account's maximum drawdown over the course of the next month can be no greater than its maximum over the next year. In order to accurately set a threshold for maximum drawdown, we'll need to define our time horizon.
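
One way to investigate the horizon effect is by simulation. The sketch below assumes zero-mean, normally distributed daily returns with 1% volatility (an idealized stand-in, not SPY) and compares the average maximum drawdown over 250 and 1000 days. If drawdown scaled exactly with the square root of time, quadrupling the horizon would double it; with compounding, the ratio should land in that neighborhood rather than exactly on 2.

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_max_drawdown(horizon, n_curves=2000, daily_vol=0.01):
    """Average max drawdown (%) over simulated zero-mean return paths."""
    r = rng.normal(0.0, daily_vol, size=(horizon, n_curves))
    eq = np.cumprod(1 + r, axis=0)                   # one curve per column
    dd = 1 - eq / np.maximum.accumulate(eq, axis=0)  # drawdown as a fraction
    return dd.max(axis=0).mean() * 100

short = mean_max_drawdown(250)
long = mean_max_drawdown(1000)
print(short, long, long / short)
```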

For example, we might specify that drawdown must not exceed 25% over the course of the next year (approx. 250 trading days). Building on our earlier findings, we'd like to state this constraint with a certain degree of certainty, or confidence. A common level for statistical significance is 5%, so we might want to be 95% sure that our drawdown over the coming year will be less than 25%.

To determine the ideal staking fraction under these conditions, we will use the following process for each f value between 0 and (1 / worst loss):

  • Generate many possible equity curves (similar to Monte Carlo)
    • Randomly sample a number of returns from the return distribution according to our desired time horizon
    • Usually pick >1000 to get an accurate picture of drawdown distribution
  • Calculate GHPR and maximum drawdown for each of these curves
  • Determine the drawdown level at our specified confidence level (i.e. 95th percentile of drawdown distribution)
  • Record the median GHPR value (you could choose to use a more favorable/conservative percentile for GHPR but we'll use the median)
  • If the drawdown value at our specified confidence level is lower than our risk threshold, assign the median GHPR as the GHPR value for that f
    • If it is not, set GHPR = 0. It's too risky to trade at this position size
  • Find the point in this new f curve that maximizes the expected value of GHPR

We'll call this point "ideal f" (I guess... please, someone suggest something better!).

import time

def ideal_f(returns, time_horizon, n_curves, drawdown_limit, certainty_level):
    """
    Calculates ideal fraction to stake on an investment with given
    return distribution

    returns: (array-like) distribution that's representative of future
    returns
    time_horizon: (integer) the number of returns to sample for each
    equity curve
    n_curves: (integer) the number of equity curves to generate on each
    iteration of f
    drawdown_limit: (real) user-specified value for drawdown (in percent)
    which must not be exceeded
    certainty_level: (real) the level of confidence (in percent) that
    drawdown limit will not be exceeded

    Returns a dictionary with:
    'f_curve': calculated drawdown and ghpr value at each value of f
    'optimal_f': the ideal fraction of one's account to stake on an
    investment
    'max_loss': the maximum loss sustained in the provided returns
    """
    print('Calculating ideal f...')
    start = time.time()

    f_values = np.linspace(0, 0.99, 200)
    max_loss = np.abs(np.min(returns))
    bounded_f = f_values / max_loss
    f_curve = pd.DataFrame(columns=['ghpr', 'drawdown'])
    for f in bounded_f:
        # Generate n_curves number of random equity curves
        reordered_returns = np.random.choice(f * np.asarray(returns),
            size=(time_horizon, n_curves))
        curves = equity_curve(reordered_returns)
        curves_df = pd.DataFrame(curves)
        # Calculate GHPR and Maximum Drawdown (%) for each equity curve
        # (one curve per column, so drawdown is computed column-wise)
        curves_drawdown = (1 - curves_df / curves_df.cummax()).max() * 100
        curves_ghpr = ghpr(curves_df)
        # Calculate drawdown at our certainty level
        drawdown_percentile = np.percentile(curves_drawdown,
            certainty_level)
        # Calculate median ghpr value
        ghpr_median = np.median(curves_ghpr)
        if drawdown_percentile <= drawdown_limit:
            _ghpr = ghpr_median
        else:
            # Too risky to trade at this position size
            _ghpr = 0
        f_curve.loc[f, 'ghpr'] = _ghpr * 100
        f_curve.loc[f, 'drawdown'] = drawdown_percentile
    optimal_f = f_curve['ghpr'].astype(float).idxmax()

    elapsed = time.time() - start
    print('Ideal f calculated in {}s'.format(elapsed))

    return {'f_curve':f_curve, 'optimal_f':optimal_f,
        'max_loss':max_loss}

spy_ideal_f = ideal_f(spy_returns, time_horizon=250, n_curves=1000, drawdown_limit=25, certainty_level=95)
f_dd_plot(spy_ideal_f, 'SPY Ideal')
f_dd_results(spy_ideal_f)

Calculating ideal f...
Ideal f calculated in 2.113638162612915s
Optimal f: 0.859
Geometric Holding Period Return: 0.03295%
Max Drawdown: 24.57%

We see that the curves are a little more jagged using this approach. This is due to the inherent randomness involved when using a Monte Carlo-type simulation. However, we observe the same general relationship between GHPR and drawdown as before. Interestingly, this approach suggests using more than twice as much leverage as the deterministic method shown previously. This is for two reasons.

First, we shortened our time horizon considerably. Whereas the first attempt used the entire price history of the SPY (nearly 25 years), the second approach simulated curves one year out into the future. The shorter the time horizon, the less drawdown we expect. Second, financial time series such as equity index returns tend to exhibit both volatility clustering and autocorrelation. This means that the individual returns aren't totally independent from previous returns, an assumption used in our modeling process. To combat this, the strategy used to input results into the position sizing algorithm should take these factors into account when making buying/selling decisions.
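
To see what the independence assumption throws away, here is a small sketch on a toy series (an assumption for illustration, not SPY data) that alternates between calm and turbulent 50-day regimes. The lag-1 autocorrelation of absolute returns is used as a rough gauge of volatility clustering: it is clearly positive for the original ordering and close to zero once the same returns are shuffled, which is effectively what random sampling does.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Toy series (assumption): 40 alternating 50-day calm and turbulent regimes
vols = np.repeat(np.tile([0.005, 0.02], 20), 50)
clustered = pd.Series(rng.normal(0.0, vols))

# Shuffling keeps the same returns but destroys their ordering
shuffled = pd.Series(rng.permutation(clustered.to_numpy()))

# Lag-1 autocorrelation of absolute returns: a rough gauge of clustering
clustered_ac = clustered.abs().autocorr(lag=1)
shuffled_ac = shuffled.abs().autocorr(lag=1)
print(round(clustered_ac, 3), round(shuffled_ac, 3))
```

Running the same check on your own return series gives a quick sense of how much clustering the simulation ignores.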

This might be a good time to mention that the returns used in these calculations must be representative of expected future returns. The ideal scenario would be to use real-money returns or results from paper trading. If one must use results from backtesting, it's recommended to use out-of-sample returns if possible. Plugging over-optimized in-sample results into these formulas will underestimate risk and overestimate gain. If your strategy has a Sharpe ratio of 8 in a backtest but in live trading it's closer to 0.5, you will be massively over-leveraged and quickly exceed your risk tolerance.

Final Thoughts / Next Steps

We now have a method to determine our ideal position size as a percentage of our total account equity. By simulating a number of possible outcomes for a series of returns, we are able to say with some level of confidence that the amount of capital we are risking will yield the greatest compounded return while remaining within our psychological limit for drawdown.

However, this method only applies to single return streams. As investors, we want to leverage the benefits of diversification. By combining multiple assets and strategies into a portfolio, we gain access to downside protection and additional upside. We'll incorporate this into our approach in the next part of this series, so check back for updates!

If you liked this post (or didn't!) leave a comment below, and/or join the Quant Talk chat at https://t.me/joinchat/GrWzrxH7Z0X_65JD3NLGMw.

Thanks for Reading