Mutual Fund Benchmark Discrepancies Can Fool Investors

Evaluating the performance of actively managed mutual funds generally involves comparing a fund’s results with a passive benchmark that follows the same investment “style” as the fund’s portfolio, such as the S&P 500 for large-cap stocks, the S&P MidCap 400 for midcap stocks and the S&P SmallCap 600 for small-cap stocks. The SEC requires that funds select a benchmark for comparison purposes.

This practice can create problems, because funds can choose benchmarks that have less exposure than the funds themselves to factors that historically have provided premiums (such as market beta, size, value, momentum, profitability and quality). Such benchmarks have lower expected returns than the funds’ actual portfolios, which makes them easier to beat.

This misleading practice matters. Berk Sensoy’s study, “Performance Evaluation and Self-Designated Benchmark Indexes in the Mutual Fund Industry,” which appeared in the April 2009 issue of the Journal of Financial Economics, found that “almost one-third of actively managed, diversified U.S. equity mutual funds specify a size and value/growth benchmark index in the fund prospectus that does not match the fund’s actual style.”

Unfortunately, the same research also shows that when allocating capital, individual investors emphasize comparisons relative to the mutual fund’s self-selected benchmark, not its true, risk-adjusted benchmark. It’s clear that fund companies strategically choose a benchmark that will drive fund flows.

Recent Research

Martijn Cremers, Jon Fulkerson and Timothy Riley contribute to the literature on mutual fund performance relative to self-selected benchmarks with their May 2018 study, “Benchmark Discrepancies and Mutual Fund Performance Evaluation.”

They used a holdings-based procedure to determine whether an actively managed fund has a “benchmark discrepancy”; that is, a benchmark other than the prospectus benchmark that better matches a fund’s actual investment strategy.

To accomplish this objective, they identified the benchmark that has the lowest active share with the fund’s holdings. They considered a fund as having a benchmark discrepancy if its benchmark mismatch was at least 60%. Their calculations revealed 26% of funds in their sample had a benchmark discrepancy. The authors’ data covers the period 1991 through 2015.
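To make the idea concrete, here is a minimal sketch of an active share comparison. Active share is conventionally measured as half the sum of the absolute differences between a fund’s portfolio weights and a benchmark’s weights. The tickers, weights and index names below are hypothetical, and the sketch only illustrates how the benchmark with the lowest active share is picked; it is not the authors’ full procedure and does not reproduce their 60% mismatch test.

```python
def active_share(fund, bench):
    """Half the sum of absolute weight differences across all securities."""
    tickers = set(fund) | set(bench)
    return 0.5 * sum(abs(fund.get(t, 0.0) - bench.get(t, 0.0)) for t in tickers)


def closest_benchmark(fund, candidates):
    """Return (name, active_share) for the candidate with the lowest active share."""
    scores = {name: active_share(fund, weights) for name, weights in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical holdings: weights in each portfolio sum to 1.
fund = {"AAA": 0.40, "BBB": 0.30, "CCC": 0.30}
candidates = {
    "large_cap_index": {"AAA": 0.50, "DDD": 0.50},
    "small_cap_index": {"BBB": 0.35, "CCC": 0.35, "EEE": 0.30},
}

best_name, best_as = closest_benchmark(fund, candidates)
print(f"Lowest active share: {best_name} ({best_as:.0%})")
```

In this made-up case, a fund that named the large-cap index in its prospectus would sit closer, by holdings, to the small-cap index.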

Following is a summary of their findings:

If the prospectus benchmark and risk-adjusted benchmark are different in month t, then the probability they will still be different in month t+12 is 86%.
Funds with a benchmark discrepancy tend to be managed more actively, having a higher active share. They also tend to be younger, have fewer assets and be more expensive, and they are more likely to be small-cap and midcap funds.
Relative to the prospectus benchmark, funds on average underperform by 0.33% per year, which is not statistically distinguishable from zero (t-stat = -1.06). However, funds underperform by 0.78% per year relative to their appropriate benchmark, which is statistically significant (t-stat = -2.87).
For funds with a benchmark discrepancy, the prospectus benchmark typically understates the fund’s factor exposures. Thus, the prospectus benchmark is easier to beat than a benchmark with the same factor exposures as the fund.
For mismatched funds, the average return of the more appropriate, risk-adjusted benchmark is 1.5% per year (t-stat = 3.20) higher than the prospectus benchmark’s average return.
Traditional factors (market beta, size, value and momentum) explain about a third of the average difference in returns between the more appropriate benchmarks and the prospectus benchmarks. Including nontraditional factors along with traditional factors explains about 87% of the average difference in returns. Among nontraditional factors, the Fama-French profitability factor (RMW) has the largest impact.
The findings were strongest among large-cap funds with a benchmark discrepancy, with the prospectus benchmark overstating risk-adjusted performance by 2.41% per year (t-stat of 2.10), all of which can be explained by exposure to traditional as well as nontraditional factors. In other words, large-cap funds with a benchmark discrepancy tend to own smaller-cap stocks than their prospectus benchmarks indicate.
The benchmark discrepancies have a significant economic impact on performance evaluation as well as capital allocation, as investors generally focus on fund performance relative to the prospectus benchmark when allocating capital, even when a fund has a benchmark discrepancy.

Cremers, Fulkerson and Riley concluded that their results “show that a substantial number of funds have a prospectus benchmark that on average is easier to outperform compared to the benchmark implied by their holdings.”

Thus, the authors explain, “investors in general could considerably improve their capital allocations by avoiding funds where the prospectus benchmark is a poor match for the fund’s portfolio. Put another way, investors could improve their capital allocations by focusing on funds where the prospectus benchmark is a good match.”

Today investors have readily available tools, such as the regression analysis tool at Portfolio Visualizer, to enable them to choose the appropriate benchmark when evaluating the performance of an actively managed fund. Such tools also enable investors to determine if the fund actually has generated alpha after adjusting for exposure to common factors.
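For readers who want to see what such a regression does, the following is a minimal sketch using ordinary least squares in NumPy. It is not Portfolio Visualizer’s implementation. The return series are synthetic, and the three factor columns (labeled market, size and value) and the "true" alpha and betas are assumptions chosen only for illustration. The intercept of the regression is the fund’s alpha after accounting for its factor exposures.

```python
import numpy as np


def factor_alpha(fund_excess, factors):
    """Regress fund excess returns on factor returns; return (alpha, betas)."""
    X = np.column_stack([np.ones(len(fund_excess)), factors])  # intercept + factors
    coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
    return coef[0], coef[1:]


# Synthetic monthly data (hypothetical): 20 years, three factors.
rng = np.random.default_rng(0)
n_months = 240
factors = rng.normal(0.005, 0.03, size=(n_months, 3))  # market, size, value
true_betas = np.array([1.0, 0.4, 0.2])
true_alpha = -0.0005  # assumed monthly alpha of -0.05%
noise = rng.normal(0.0, 0.002, n_months)
fund_excess = true_alpha + factors @ true_betas + noise

alpha, betas = factor_alpha(fund_excess, factors)
print("estimated monthly alpha:", round(float(alpha), 5))
print("estimated betas:", np.round(betas, 2))
```

A fund that looks good against a benchmark with a market beta of 1 and no size or value tilt can show a zero or negative alpha once its size and value loadings are included.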

Constructing Benchmarks From ETFs

The findings in Cremers, Fulkerson and Riley’s study are consistent with those of an October 2016 study, “Investible Benchmarks for Actively Managed Mutual Funds,” that Riley authored alone. While most of the academic literature compares active funds’ returns to risk-adjusted benchmarks using indexes and/or factor regressions, Riley instead used ETFs to build investable benchmarks for actively managed mutual funds. He did so because, while benchmark index returns don’t include implementation costs, ETF returns do; thus, we have a more real-world comparison.

Riley’s benchmarking process did not make assumptions based on a mutual fund’s stated style or benchmark. Rather, he selected the appropriate benchmark based on a fund’s exposure to common factors (market beta, size, value, momentum, investment and profitability) and return history.

He noted: “The benchmarks can be identified in advance, do not require shorting or leverage, and require only annual rebalancing.” Riley also observed that a similar approach has shown that a portfolio of ETFs can be used to replicate the performance of hedge funds. Given that ETFs are relatively new, and that he needed a sufficient number of ETFs to complete the comparison, his study covered the period 2003 through 2014.
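The "no shorting or leverage" constraint means the benchmark weights must lie between 0 and 1 and sum to 1. The sketch below shows that constraint in the simplest possible setting: a blend of two ETFs chosen by grid search to minimize tracking error against a fund. This is an illustration only, not Riley’s procedure, which also matched factor exposures. All return series, the 70/30 blend and the function names are hypothetical.

```python
def tracking_error(fund, bench):
    """Sample standard deviation of the return differences."""
    diffs = [f - b for f, b in zip(fund, bench)]
    mean = sum(diffs) / len(diffs)
    return (sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1)) ** 0.5


def best_two_etf_mix(fund, etf_a, etf_b, steps=100):
    """Search long-only weights w in [0, 1] on ETF A (1 - w on ETF B)."""
    best_w, best_te = None, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blend = [w * a + (1 - w) * b for a, b in zip(etf_a, etf_b)]
        te = tracking_error(fund, blend)
        if te < best_te:
            best_w, best_te = w, te
    return best_w, best_te


# Hypothetical periodic returns for two ETFs.
etf_a = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]
etf_b = [0.020, -0.035, 0.010, 0.012, -0.018, 0.030]
# Hypothetical fund: a 70/30 blend minus a constant 0.1% cost drag per period.
fund = [0.7 * a + 0.3 * b - 0.001 for a, b in zip(etf_a, etf_b)]

w, te = best_two_etf_mix(fund, etf_a, etf_b)
print(f"weight on ETF A: {w:.2f}, weight on ETF B: {1 - w:.2f}")
```

Because tracking error here is the volatility of the return difference, the constant cost drag does not affect the chosen weights; it shows up instead as the fund’s shortfall versus the investable benchmark.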

Following is a summary of his findings:

The ETFs used in the study had an average expense ratio of 0.31% compared to 1.21% for the actively managed mutual funds. The results using index funds would have been very similar, given their average expense ratio was just slightly higher at 0.36%.
On average, just four ETFs were needed to create a benchmark, with the majority of the benchmarks consisting of a single ETF. Among benchmarks that held six ETFs, the highest-ranking ETF had an average weight of 51.6%. The next two highest-ranking ETFs had a combined weight of 34.2%, and the last three ETFs had a combined weight of only 14.1%.
Benchmarks were effective at replicating the general return pattern and factor exposures of the actively managed mutual funds, with the average correlation between fund daily returns and benchmark daily returns being 0.97, while the average absolute difference in pricing factor exposures (e.g., market beta, size, value) ranged between 0.06 and 0.12.
While there was some variation in replication quality, the benchmarks closely tracked their designated fund even in the tails of the distribution. The 10th percentile of correlation was 0.94 and the 90th percentile of tracking error was 0.43%.
How well a benchmark replicated a fund was highly predictable.
The average actively managed fund underperformed its benchmark (produced a negative alpha versus the benchmark) by 1.03% per year (with a t-stat of 2.81, which indicates statistical significance) after accounting for expenses. The difference in expense ratios explains 90% of the underperformance.
The level of underperformance varied depending on investment style. For example, the average large-cap fund had an alpha of -1.54%, while the average midcap fund had an alpha of -0.42%.
Only 38.7% of fund-years had a positive alpha.
Actively managed mutual funds in the lowest expense ratio decile underperformed by only 0.35% per year, while those with the highest expense ratios underperformed by 1.61% per year.
There was a nearly one-for-one relationship between expense ratio and underperformance. Thus, if an investor must choose actively managed funds (which, unfortunately, is the case in many 401(k) plans), they should choose ones with the lowest costs within the desired risk profile.
There was no evidence that past performance provides valuable information. The best and worst performers in a given year had performance that differed by only 0.23% in the next year.

Riley concluded that because the actively managed funds in his sample managed about $2.5 trillion at the end of 2014, underperformance of 1% a year represented an opportunity cost of $25 billion per year for investors selecting active management. That’s the cost of the triumph of hype, hope and marketing over wisdom, experience and the evidence. Even that cost understates the true cost for taxable investors, as it’s often the case that, for such investors, the greatest cost of active management is taxes.
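The arithmetic behind these cost figures is easy to check. The short sketch below uses only numbers quoted in this article; the variable names are ours. Note that the simple gap between average expense ratios accounts for roughly 87% of the 1.03% shortfall, in the neighborhood of the 90% reported in the study, whose own estimate may be computed differently.

```python
# Figures quoted in the article (percent per year).
active_expense_ratio = 1.21
etf_expense_ratio = 0.31
underperformance = 1.03

expense_gap = active_expense_ratio - etf_expense_ratio  # percentage points
share_explained = expense_gap / underperformance        # fraction of the shortfall

# Opportunity cost: about $2.5 trillion in assets lagging by about 1% a year.
assets = 2.5e12
annual_cost = assets * 0.01

print(f"expense gap: {expense_gap:.2f} percentage points")
print(f"share of underperformance: {share_explained:.0%}")
print(f"annual opportunity cost: ${annual_cost / 1e9:.0f} billion")
```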

Conclusion

As Cremers, Fulkerson and Riley note in their study: “Risk-adjustment is central to performance evaluation. To facilitate that process, mutual funds are legally required to provide a benchmark to investors in the fund prospectus. Given that funds rarely change their prospectus benchmark and market themselves in a competitive environment to investors that often have limited sophistication, we might expect funds to respond strategically when constructing their portfolios.

“While most funds appear to have a risk-appropriate prospectus benchmark, we find that a substantial portion of funds have a prospectus benchmark that understates risk and, consequently, overstates relative performance. Further, we show that funds benefit from that overstatement, as investor flows respond to performance relative to the prospectus benchmark even when a fund has a benchmark discrepancy.”

The annual SPIVA reports—which use appropriate, not prospectus-based, benchmarks—persistently demonstrate that using actively managed funds is a loser’s game, though choosing inappropriate benchmarks can make it seem otherwise.

The research, including Eugene Fama and Kenneth French’s study, “Luck versus Skill in the Cross-Section of Mutual Fund Returns,” which was published in the October 2010 issue of The Journal of Finance, has found that far fewer active managers (about 2%) are able to outperform appropriate risk-adjusted benchmarks than would be expected by chance. That 2% figure is before considering taxes, which are typically the largest expense for taxable investors. If taxes were taken into account, the figure would be significantly lower. That makes active management a loser’s game: one that is possible to win, but with odds so low that it is simply not prudent to try.

This commentary originally appeared June 22 on ETF.com

The opinions expressed by featured authors are their own and may not accurately reflect those of the BAM ALLIANCE. This article is for general information only and is not intended to serve as specific financial, accounting or tax advice.

Admin, July 23, 2018