Section 3 Client Investment Recommendations and Strategies

Portfolio Performance Measures

42 min read · Lesson 12 of 12

Performance measurement is heavily tested because it separates investor return from manager skill and exposes where each falls short. Five sections: return calculations (total, real, geometric vs arithmetic, after-tax, yield types); time-weighted vs money-weighted returns including a hands-on calculator showing how cash-flow timing creates divergence between the manager's performance and the investor's experience; risk-adjusted measures (Sharpe, Treynor, Jensen's alpha — formulas canonical in M3.4, M3.12 covers their use), tracking error, and information ratio; benchmark selection with style-matching discipline; and attribution analysis, GIPS compliance, and survivorship bias. By the end you can read a performance report critically and explain to a client why their actual return diverges from the fund's reported return.

Section 1 of 5 · ~8 min · 3 concept checks

Return calculations

Real vs nominal returns, and geometric vs arithmetic returns

Two return calculations regularly confused:

NOMINAL vs REAL RETURNS

  • Nominal return: The actual return earned, unadjusted for inflation. The number on the statement.
  • Real return: Inflation-adjusted return, reflecting actual PURCHASING POWER growth. Approximation: real return ≈ nominal return − inflation rate. Precise (Fisher equation): (1 + nominal) ÷ (1 + inflation) − 1.
  • Example. Nominal return 8%, inflation 3%. Real return ≈ 5% (approximation); precise: 1.08/1.03 − 1 = 4.85%. Over multi-decade periods, even small inflation differences compound to large purchasing-power differences.
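
The two formulas above can be compared side by side. The following is a minimal sketch for the 8% nominal, 3% inflation example; the function names are illustrative and not taken from any library.

```python
def real_return_exact(nominal: float, inflation: float) -> float:
    """Fisher equation: (1 + nominal) / (1 + inflation) - 1."""
    return (1 + nominal) / (1 + inflation) - 1


def real_return_approx(nominal: float, inflation: float) -> float:
    """Subtraction shortcut: nominal - inflation."""
    return nominal - inflation


nominal, inflation = 0.08, 0.03
print(f"approximation: {real_return_approx(nominal, inflation):.2%}")  # 5.00%
print(f"Fisher:        {real_return_exact(nominal, inflation):.2%}")   # 4.85%
```

The shortcut is fine for quick estimates; the gap between the two figures widens as inflation rises.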

GEOMETRIC vs ARITHMETIC RETURNS

  • Arithmetic mean return. Simple average: (R1 + R2 + ... + Rn) ÷ n. Easy to calculate. OVERSTATES actual growth when returns are volatile because it doesn't account for compounding.
  • Geometric mean return (CAGR). ((1+R1) × (1+R2) × ... × (1+Rn))^(1/n) − 1. Reflects the COMPOUND growth actually experienced. The correct measure for multi-period returns.
  • Example. Returns of +50% then −50%. Arithmetic mean = 0%. Geometric: ((1.50)(0.50))^(1/2) − 1 = (0.75)^0.5 − 1 = −13.4%. The investor ended up with $75 from $100 — the geometric mean is correct; arithmetic is misleading.
  • When to use which. Geometric for HISTORICAL performance (what actually happened). Arithmetic for EXPECTED future returns in single-period contexts (academic models).
  • Volatility drag. The gap between arithmetic and geometric mean grows with VOLATILITY. Approximate relationship: geometric ≈ arithmetic − (variance ÷ 2). High-volatility strategies suffer significant volatility drag.
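
A short sketch of the +50% / −50% example, assuming equal-length periods. The helper names are hypothetical, and using the population variance in the volatility-drag shortcut is an assumption of this sketch rather than something the lesson specifies.

```python
import math
import statistics


def arithmetic_mean(returns):
    """Simple average of the period returns."""
    return sum(returns) / len(returns)


def geometric_mean(returns):
    """Compound average: nth root of the growth product, minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


returns = [0.50, -0.50]  # +50% then -50%
arith = arithmetic_mean(returns)
geo = geometric_mean(returns)
# Volatility-drag shortcut: arithmetic mean minus half the variance
approx_geo = arith - statistics.pvariance(returns) / 2

print(f"arithmetic:         {arith:.1%}")       # 0.0%
print(f"geometric:          {geo:.1%}")         # -13.4%
print(f"drag approximation: {approx_geo:.1%}")  # -12.5%
```

The shortcut lands near the true geometric figure but is not exact; it is least accurate for swings as large as these.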

After-tax returns and yield types

AFTER-TAX RETURN. The return remaining after paying federal, state, and (where applicable) local taxes on dividends, interest, and realized capital gains. Materially different from nominal returns for taxable accounts.

  • Calculation: Pre-tax return × (1 − effective tax rate). The effective tax rate depends on the character of income (ordinary, qualified dividend, LTCG, tax-exempt) and the investor's tax bracket.
  • Tax-equivalent yield (TEY). Used to compare TAX-EXEMPT munis to TAXABLE bonds. TEY = tax-exempt yield ÷ (1 − marginal tax rate). Example: 3% muni for an investor in 37% bracket has TEY = 3% / (1 − 0.37) = 4.76%. A taxable bond would need to yield 4.76% to leave the same after-tax amount.
  • After-tax return varies by ACCOUNT TYPE. Tax-deferred accounts (Traditional IRA, 401(k)): no annual tax drag; tax at withdrawal. Tax-free accounts (Roth, HSA for medical): no tax. Taxable accounts: annual tax on dividends and interest; capital gains tax on realized gains.
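
The tax-equivalent yield arithmetic can be checked in both directions. This is a minimal sketch that considers federal tax only; the function names are illustrative.

```python
def tax_equivalent_yield(tax_exempt_yield: float, marginal_tax_rate: float) -> float:
    """Taxable yield needed to match a tax-exempt yield after tax."""
    return tax_exempt_yield / (1 - marginal_tax_rate)


def after_tax_yield(taxable_yield: float, tax_rate: float) -> float:
    """Yield left after paying tax at the given rate."""
    return taxable_yield * (1 - tax_rate)


tey = tax_equivalent_yield(0.03, 0.37)  # 3% muni, 37% bracket
print(f"TEY: {tey:.2%}")                                      # 4.76%
print(f"after-tax check: {after_tax_yield(tey, 0.37):.2%}")   # 3.00%
```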

YIELD TYPES (review)

  • Current yield: Annual income ÷ current price. For a bond: coupon ÷ price. Quick measure but ignores capital appreciation/depreciation toward maturity.
  • Yield to maturity (YTM): Total return assuming the bond is held to maturity, including coupon reinvestment at the same yield. The most comprehensive bond yield measure.
  • Yield to call (YTC): Similar to YTM but assumes the bond is called at the first call date. For premium bonds, YTC is often LOWER than YTM — the conservative (lower) of the two is the “yield to worst” (YTW).
  • Dividend yield: Annual dividend ÷ current stock price. Most useful for income-focused equity investing.
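
To make the difference between current yield and YTM concrete, here is a rough sketch that solves for YTM by bisection. It assumes annual coupons and a hypothetical 10-year, 5% coupon bond bought at a premium; real bonds usually pay semiannually, and the function names are illustrative.

```python
def bond_price(face, coupon_rate, years, ytm):
    """Present value of annual coupons plus face value, discounted at ytm."""
    coupon = face * coupon_rate
    pv_coupons = sum(coupon / (1 + ytm) ** t for t in range(1, years + 1))
    return pv_coupons + face / (1 + ytm) ** years


def current_yield(face, coupon_rate, price):
    """Annual coupon income divided by the current price."""
    return face * coupon_rate / price


def yield_to_maturity(face, coupon_rate, years, price, lo=-0.5, hi=1.0, tol=1e-10):
    """Bisection search: price falls as yield rises, so narrow the bracket."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bond_price(face, coupon_rate, years, mid) > price:
            lo = mid  # model price too high, so the yield guess is too low
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical premium bond: $1,000 face, 5% coupon, 10 years, priced at $1,080
print(f"current yield: {current_yield(1000, 0.05, 1080):.2%}")  # 4.63%
print(f"YTM:           {yield_to_maturity(1000, 0.05, 10, 1080):.2%}")
```

For a premium bond the ordering is coupon rate > current yield > YTM, because YTM also counts the pull of the price back toward par at maturity.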

Worked Example — Time-Weighted vs. Dollar-Weighted

Scenario: A client invests $100,000 in a fund on Jan 1.

  • Year 1: Fund returns +20% → value = $120,000
  • Client adds $100,000 on Jan 1 of Year 2 → total = $220,000
  • Year 2: Fund returns −10% → value = $198,000

Time-weighted return: (1.20) × (0.90) − 1 = +8% (over 2 years). Ignores the timing of the $100K addition. Measures the fund manager's performance.

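Dollar-weighted return: Much lower. The client had more money invested during the losing year ($220K) than during the winning year ($100K). Solving 100,000 × (1+r)^2 + 100,000 × (1+r) = 198,000 gives a money-weighted return of approximately −0.7% per year: the client contributed $200,000 in total and ended with $198,000. This reflects the investor's actual experience.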

Key takeaway: The same fund, same manager, same returns — but the investor's experience was dramatically different from the fund's reported performance because of poor timing on the additional investment.

Concept Check

An investor's portfolio has a nominal return of 9% for the year. Inflation during the same period was 4%. The REAL (inflation-adjusted) return, using the precise Fisher equation, is approximately:

REAL RETURN reflects actual PURCHASING POWER growth. FISHER EQUATION (precise): Real return = (1 + nominal) ÷ (1 + inflation) − 1. Here: (1.09 ÷ 1.04) − 1 = 1.0481 − 1 = 0.0481 or 4.81%. APPROXIMATION (subtraction): nominal − inflation = 9% − 4% = 5%. The approximation works well at low inflation but slightly overstates the real return, and the error widens as inflation rises. Over multi-decade periods, even small differences compound. Option B (addition): wrong direction. Option C: invents a formula. Option D (division): not the relationship between nominal, real, and inflation.
Concept Check

A portfolio earned +30% in Year 1 and −20% in Year 2. The ARITHMETIC mean return is 5% per year. The GEOMETRIC mean return (CAGR) is approximately:

GEOMETRIC MEAN (CAGR) for multi-period returns: ((1+R1) × (1+R2) × ... × (1+Rn))^(1/n) − 1. Here: ((1.30) × (0.80))^(1/2) − 1 = (1.04)^0.5 − 1 = 1.98%. Verify: $100 → $130 → $104. Two years of 1.98% compound growth: $100 × (1.0198)^2 = $104. ✓ ARITHMETIC mean (5%) overstates actual growth because it ignores compounding sequence. The gap between arithmetic and geometric mean grows with VOLATILITY (“volatility drag”). For multi-period reporting, ALWAYS use geometric mean. Options A, C, D give wrong values.
Concept Check

An investor in the 37% federal tax bracket is choosing between a TAX-EXEMPT municipal bond yielding 3.5% and a corporate bond. To leave the investor with the SAME AFTER-TAX YIELD, the corporate bond must yield approximately:

TAX-EQUIVALENT YIELD (TEY) formula: TEY = Tax-exempt yield ÷ (1 − marginal tax rate). Here: 3.5% ÷ (1 − 0.37) = 3.5% ÷ 0.63 = 5.56%. A corporate bond yielding 5.56% leaves the investor with 5.56% × (1 − 0.37) = 3.5% after federal tax — matching the tax-exempt muni. State and local muni-tax exemptions further increase the comparable TEY for in-state munis. TEY analysis is fundamental for high-bracket investors deciding between muni and taxable bonds. Options A, B, D apply wrong calculations or wrong direction.
Section 2 of 5 · ~11 min · 3 concept checks

TWRR vs MWRR

Time-Weighted Return (TWRR) — the manager's measure

TWRR ELIMINATES the impact of cash flows (deposits and withdrawals) controlled by the investor. Isolates the PORTFOLIO MANAGER'S investment decisions from the investor's allocation timing.

  • Mechanics. Divide the measurement period into SUB-PERIODS bounded by external cash flows. Calculate the period return for each sub-period. Chain-link (multiply) the sub-period returns to get the total TWRR.
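  • Formula: cumulative TWRR = (1+R1) × (1+R2) × ... × (1+Rn) − 1, where each R is a SUB-PERIOD return based on starting and ending values within that sub-period (not the total period). To annualize a multi-year result, raise the chain-linked product to the power 1 ÷ (number of years) before subtracting 1.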
  • Why it's the manager's measure. The manager doesn't control client deposits/withdrawals. By isolating sub-period returns, TWRR shows what the manager's actual investment decisions produced — comparable across managers and benchmarks.
  • Used by GIPS standards. The CFA Institute's Global Investment Performance Standards (GIPS) require TWRR for performance reporting to fairly compare managers regardless of client cash-flow patterns.
  • Reported on mutual fund prospectuses. Mutual fund “total return” figures are TWRR. Different investors in the same fund can experience very different actual returns based on the timing of their purchases and sales.

Example. Fund returns +20% in Year 1, then −10% in Year 2. TWRR = (1.20)(0.90) − 1 = 8% over 2 years (annualized 3.92%). This is the fund's reported return regardless of when individual investors entered or exited.
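
The chain-linking mechanics can be sketched as follows, using the values from the earlier worked example ($100,000 growing to $120,000, then a $100,000 deposit and a fall from $220,000 to $198,000). The function names are illustrative; the key point is that each sub-period starts from the value after the cash flow.

```python
import math


def sub_period_return(start_value: float, end_value: float) -> float:
    """Return within one sub-period, measured between external cash flows."""
    return end_value / start_value - 1


def chain_link(returns) -> float:
    """Cumulative TWRR: multiply the (1 + r) factors, then subtract 1."""
    return math.prod(1 + r for r in returns) - 1


def annualize(cumulative_return: float, years: float) -> float:
    """Convert a cumulative return into a per-year compound rate."""
    return (1 + cumulative_return) ** (1 / years) - 1


r1 = sub_period_return(100_000, 120_000)  # Year 1, before the deposit
r2 = sub_period_return(220_000, 198_000)  # Year 2, starting after the deposit
twrr = chain_link([r1, r2])

print(f"cumulative TWRR: {twrr:.2%}")                # 8.00%
print(f"annualized TWRR: {annualize(twrr, 2):.2%}")  # 3.92%
```

The deposit changes the starting value of the second sub-period but not its percentage return, which is why TWRR is unaffected by the investor's timing.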

Money-Weighted Return (MWRR / IRR) — the investor's actual experience

MONEY-WEIGHTED RETURN (also called Dollar-Weighted Return or Internal Rate of Return / IRR) REFLECTS the impact of cash-flow timing on the investor's actual experience. Larger cash flows have larger weight in the calculation.

  • Mechanics. MWRR is the DISCOUNT RATE that makes the NET PRESENT VALUE of all cash flows (initial investment, additional deposits, withdrawals, ending value) equal to ZERO. It's the IRR of the investor's personal cash-flow series.
  • Mathematically: 0 = CF0 + CF1/(1+r) + CF2/(1+r)^2 + ... + CFn/(1+r)^n. Solved iteratively (financial calculator or spreadsheet).
  • Sensitive to timing. Larger cash flows during favorable periods boost MWRR; larger cash flows during unfavorable periods drag MWRR. The investor's timing skills (or luck) are captured.
  • Reflects the investor's ACTUAL experience. Two investors in the same fund can have very different MWRRs based on when they invested or withdrew. Same fund returns — different personal results.
  • Used for INDIVIDUAL investor performance reporting and PRIVATE EQUITY funds (where cash flows are concentrated and irregular). Buffett's legendary returns are TWRR-style; LP returns in PE funds use IRR.

The cardinal test rule: If the question asks about EVALUATING THE MANAGER — use TWRR. If the question asks about WHAT THE INVESTOR ACTUALLY EARNED — use MWRR.

TWRR vs MWRR calculator — see the divergence

Same fund returns, different cash-flow timing. Watch how big a deposit before a bad year creates a large MWRR-vs-TWRR gap.
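
A minimal stand-in for such a calculator, written as a sketch under stated assumptions: annual cash flows from the investor's point of view (deposits negative, ending value positive), and a simple bisection solver that expects exactly one sign change in NPV between the bounds. The scenario figures are hypothetical; in all three the fund's two yearly returns are +20% and −10% in some order, so the annualized TWRR is 3.92% each time.

```python
def npv(rate, cash_flows):
    """Net present value of annual cash flows; index 0 is today."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def mwrr(cash_flows, lo=-0.99, hi=10.0, tol=1e-10):
    """Money-weighted return (IRR) by bisection on NPV = 0."""
    f_lo = npv(lo, cash_flows)
    if f_lo * npv(hi, cash_flows) > 0:
        raise ValueError("no sign change in NPV between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = npv(mid, cash_flows)
        if f_lo * f_mid <= 0:
            hi = mid              # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return (lo + hi) / 2


scenarios = {
    # +20% then -10%; second $100K added just before the bad year
    "deposit before bad year": [-100_000, -100_000, 198_000],
    # +20% then -10%; all $200K invested from the start
    "no interim cash flow": [-200_000, 0, 216_000],
    # -10% then +20%; second $100K added just before the good year
    "deposit before good year": [-100_000, -100_000, 228_000],
}
for name, flows in scenarios.items():
    print(f"{name}: MWRR = {mwrr(flows):.2%}")
# deposit before bad year: MWRR = -0.67%
# no interim cash flow: MWRR = 3.92%
# deposit before good year: MWRR = 9.06%
```

With no interim cash flow the MWRR matches the annualized TWRR; the gap appears only when the amount invested differs between good and bad periods.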

Concept Check

A portfolio earned +10% in Q1, −5% in Q2, +8% in Q3, and +3% in Q4 of a given year. There were no external cash flows. The TIME-WEIGHTED RETURN for the full year is:

TIME-WEIGHTED RETURN CHAIN-LINKING: TWRR = (1+R1) × (1+R2) × ... × (1+Rn) − 1, where each R is a SUB-PERIOD return. Here: (1.10) × (0.95) × (1.08) × (1.03) − 1 = 1.0450 × 1.08 × 1.03 − 1 ≈ 1.1625 − 1 = 0.1625 or 16.25%. With no external cash flows, TWRR equals the cumulative compound return. With cash flows, the period would be divided into sub-periods bounded by each cash flow, then chain-linked the same way. Critical: TWRR ALWAYS uses geometric (compound) chaining, not arithmetic. Options A (sum), B (avg), C (geometric annualized) apply wrong methodologies.
Concept Check

To evaluate a PORTFOLIO MANAGER'S SKILL independently of client cash flows, the BEST measure is:

TIME-WEIGHTED RETURN (TWRR) is the standard for evaluating manager skill. By dividing the measurement period into sub-periods bounded by external cash flows and chain-linking, TWRR REMOVES the impact of client deposits/withdrawals the manager doesn't control. Required by GIPS; used in mutual fund prospectus returns. Cardinal exam distinction: TWRR for MANAGER; MWRR for INVESTOR'S ACTUAL EXPERIENCE. Option B (MWRR/IRR) reflects investor experience. Options C, D don't isolate manager performance.
Concept Check

A fund's TIME-WEIGHTED RETURN is +12% for the year, but a particular INVESTOR'S MONEY-WEIGHTED RETURN in the same fund is only +3%. The MOST LIKELY explanation is:

MWRR vs TWRR DIVERGENCE always traces to CASH-FLOW TIMING. The fund's TWRR (+12%) is the manager's return regardless of investor activity. The investor's MWRR (+3%) reflects their actual experience given when they added or withdrew. If MWRR < TWRR significantly, the investor added money before bad periods (or withdrew before good periods). Larger balances were exposed to worse returns; smaller to better. THE classic test scenario for MWRR vs TWRR. Options A, C, D mischaracterize: no differential execution; fees rarely create 9pt gaps; margin shows in other metrics.
Section 3 of 5 · ~9 min · 3 concept checks

Risk-adjusted performance measures

Risk-adjusted return measures — interpretation

The formulas for Sharpe, Treynor, and Jensen's alpha are canonical in M3.4 capital-market-theory. This section focuses on their INTERPRETATION and USE in performance evaluation.

  • SHARPE RATIO = (Portfolio return − risk-free rate) ÷ portfolio standard deviation. Measures EXCESS RETURN per unit of TOTAL RISK (volatility). Use when evaluating a STANDALONE portfolio or comparing portfolios with different risk levels. Higher is better. Doesn't require the portfolio to be diversified.
  • TREYNOR RATIO = (Portfolio return − risk-free rate) ÷ portfolio beta. Measures EXCESS RETURN per unit of SYSTEMATIC RISK (market exposure). Use when the portfolio is well-DIVERSIFIED (idiosyncratic risk already eliminated). Higher is better. Less meaningful for poorly-diversified portfolios where beta doesn't fully capture risk.
  • JENSEN'S ALPHA = Portfolio return − expected return predicted by CAPM. Measures EXCESS RETURN BEYOND what CAPM predicts given the portfolio's beta. Positive alpha = MANAGER ADDED VALUE; negative alpha = manager destroyed value; zero alpha = manager performed in line with risk taken. The most direct measure of manager skill.
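
The three measures can be computed side by side. This is a sketch with hypothetical inputs (12% portfolio return, 3% risk-free rate, 15% standard deviation, beta of 1.2, 10% market return); it assumes the standard CAPM expected return, risk-free rate + beta × (market return − risk-free rate), which this lesson references but does not restate.

```python
def sharpe_ratio(portfolio_return, risk_free, std_dev):
    """Excess return per unit of total risk."""
    return (portfolio_return - risk_free) / std_dev


def treynor_ratio(portfolio_return, risk_free, beta):
    """Excess return per unit of systematic risk."""
    return (portfolio_return - risk_free) / beta


def jensens_alpha(portfolio_return, risk_free, beta, market_return):
    """Actual return minus the CAPM-predicted return for this beta."""
    capm_expected = risk_free + beta * (market_return - risk_free)
    return portfolio_return - capm_expected


# Hypothetical inputs for illustration only
rp, rf, sd, beta, rm = 0.12, 0.03, 0.15, 1.2, 0.10
print(f"Sharpe:  {sharpe_ratio(rp, rf, sd):.2f}")         # 0.60
print(f"Treynor: {treynor_ratio(rp, rf, beta):.3f}")      # 0.075
print(f"Alpha:   {jensens_alpha(rp, rf, beta, rm):.2%}")  # 0.60%
```

Here the portfolio beat the market by 2 points, but most of that is explained by its beta of 1.2; alpha is only 0.6 points.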

Use them together. A high alpha with high Sharpe and Treynor strongly suggests genuine manager skill. High alpha with low Sharpe suggests excessive risk-taking. Inconsistency between Sharpe (total risk) and Treynor (systematic risk) suggests the portfolio is poorly diversified.

Important interpretation caveat: all three measures use HISTORICAL data. Past risk-adjusted performance doesn't guarantee future results. Statistical significance requires multiple years of data; short-term Sharpe/alpha figures are noisy.

Tracking error and information ratio

For ACTIVELY MANAGED portfolios benchmarked to an index, two additional measures evaluate the active-management quality:

  • TRACKING ERROR. The STANDARD DEVIATION of the difference between portfolio returns and benchmark returns. Low tracking error = portfolio closely follows benchmark; high tracking error = significant active bets vs benchmark. Index funds: tracking error near zero (by design). Concentrated active funds: tracking error of 5-15% common.
  • INFORMATION RATIO (IR) = (Portfolio return − benchmark return) ÷ Tracking error. Measures EXCESS RETURN per unit of ACTIVE RISK. Similar in concept to Sharpe but uses BENCHMARK (not risk-free rate) as the comparison and TRACKING ERROR (not total volatility) as the risk measure.
  • IR interpretation: IR > 0.5 considered good for an active manager; IR > 1.0 considered exceptional and rare over multi-year periods. Most active managers struggle to consistently produce positive IR after fees.
  • Why IR matters. For active management to be worthwhile, the manager must produce excess return WORTH the active risk taken. A manager who beats the benchmark by 1% with 5% tracking error has IR = 0.2 — close to luck given the active risk. A manager who beats by 1% with 1% tracking error has IR = 1.0 — meaningful skill if sustained.
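
A minimal sketch of both measures. The periodic return series is hypothetical, and using the sample standard deviation (rather than the population version, or an annualized figure) is an assumption of this sketch.

```python
import statistics


def tracking_error(portfolio_returns, benchmark_returns):
    """Standard deviation of the period-by-period active returns."""
    active = [p - b for p, b in zip(portfolio_returns, benchmark_returns)]
    return statistics.stdev(active)


def information_ratio(excess_return, te):
    """Excess return over the benchmark per unit of active risk."""
    return excess_return / te


# The two managers described above: same 1% excess return, different active risk
print(f"IR with 5% tracking error: {information_ratio(0.01, 0.05):.1f}")  # 0.2
print(f"IR with 1% tracking error: {information_ratio(0.01, 0.01):.1f}")  # 1.0

# Hypothetical periodic returns, to show where tracking error comes from
portfolio = [0.020, -0.010, 0.030, 0.010]
benchmark = [0.015, -0.012, 0.022, 0.011]
print(f"tracking error: {tracking_error(portfolio, benchmark):.4f}")
```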

Active share. A complementary measure: the percentage of portfolio holdings that DIFFER from the benchmark. Higher active share = more genuine active management. Funds with low active share but high fees are “closet indexers” — a focus of recent regulatory scrutiny.
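
One common way to quantify active share is half the sum of absolute weight differences between portfolio and benchmark. That formula is an assumption added here for illustration (the text above gives only the verbal definition), and the holdings are hypothetical.

```python
def active_share(portfolio_weights, benchmark_weights):
    """Half the sum of absolute weight differences across all holdings."""
    tickers = set(portfolio_weights) | set(benchmark_weights)
    total_diff = sum(
        abs(portfolio_weights.get(t, 0.0) - benchmark_weights.get(t, 0.0))
        for t in tickers
    )
    return total_diff / 2


# Hypothetical weights; each set sums to 1.0
portfolio = {"A": 0.50, "B": 0.50}
benchmark = {"A": 0.25, "B": 0.25, "C": 0.50}
print(f"active share: {active_share(portfolio, benchmark):.0%}")  # 50%
```

An index fund holding exactly the benchmark scores 0%; a portfolio with no overlap scores 100%.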

Concept Check

An investor compares Fund A (Sharpe ratio 1.2) and Fund B (Sharpe ratio 0.6) for inclusion as a STANDALONE investment. Both have similar return levels. The RELEVANT INSIGHT is:

SHARPE RATIO = (Portfolio return − Risk-free rate) ÷ Portfolio standard deviation. Measures EXCESS RETURN PER UNIT OF TOTAL RISK. Higher is better. APPROPRIATE for STANDALONE evaluation (doesn't require diversification). Fund A's 1.2 vs B's 0.6: A delivered twice the excess return per unit of risk. Use TREYNOR (beta-based) only when portfolio is added to an already-diversified base. JENSEN'S ALPHA measures excess vs CAPM. Options A, B, D wrong: A misreads Sharpe as risk only; B confuses with Treynor; D invents fee adjustment.
Concept Check

An ACTIVE manager has a return of 12% vs a benchmark return of 10% (excess return 2%). Their TRACKING ERROR is 1%. Their INFORMATION RATIO is:

INFORMATION RATIO (IR) = (Portfolio return − Benchmark return) ÷ Tracking error. Here: (12% − 10%) ÷ 1% = 2.0. Measures EXCESS RETURN per unit of ACTIVE RISK. IR > 0.5 considered good; IR > 1.0 exceptional; IR = 2.0 extraordinary. Similar to Sharpe but uses BENCHMARK (not risk-free rate) and TRACKING ERROR (not total volatility). A manager delivering 2% excess with 5% tracking error has IR = 0.4 — close to luck. Same excess with 1% tracking error is meaningful skill. Options A, B, C apply wrong formulas.
Section 4 of 5 · ~6 min · 2 concept checks

Benchmark selection

Benchmark selection — the SAMURAI properties

The CFA Institute teaches the SAMURAI properties of a good benchmark. An appropriate benchmark should be:

  • S — Specified in advance. Identified BEFORE the period being measured. Choosing a benchmark after the fact is cherry-picking.
  • A — Appropriate. Same asset class, market cap range, geographic exposure, and investment style as the portfolio.
  • M — Measurable. Performance can be calculated and reported on a regular and timely basis.
  • U — Unambiguous. The identity and weights of constituent securities are clearly defined.
  • R — Reflective of current investment opinion. The manager has views (positive, negative, or neutral) on the constituents.
  • A — Accountable. The investor has accepted the manager's use of this benchmark.
  • I — Investable. The investor could actually purchase the benchmark as a passive alternative. If the benchmark isn't investable, performance comparison is theoretical.

Common benchmark mismatches the exam tests:

  • SMALL-CAP fund benchmarked to S&P 500 (large-cap). Mismatched. Use Russell 2000 or S&P 600.
  • VALUE fund benchmarked to broad-market index. Mismatched. Use Russell 1000 Value or S&P 500 Value.
  • INTERNATIONAL fund benchmarked to S&P 500. Mismatched. Use MSCI EAFE or MSCI ACWI ex-US.
  • BALANCED fund benchmarked only to S&P 500. Mismatched. Use blended benchmark (e.g., 60% S&P + 40% Bloomberg Agg).
  • EMERGING MARKETS fund benchmarked to MSCI EAFE (developed). Mismatched. Use MSCI Emerging Markets.

Benchmark mismatch creates ARTIFICIAL alpha during periods when the portfolio's style is favored vs the mismatched benchmark — and DESTRUCTIVE alpha when the style is out of favor. Always confirm benchmark appropriateness when reviewing performance.
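
A small sketch of how a mismatched benchmark distorts the picture for a balanced fund. All returns are hypothetical, and the blend assumes the 60/40 weights hold for the whole period.

```python
def blended_benchmark_return(weights, returns):
    """Weighted average of component index returns for one period."""
    return sum(weights[name] * returns[name] for name in weights)


# Hypothetical one-year figures for illustration only
weights = {"stocks": 0.60, "bonds": 0.40}
index_returns = {"stocks": 0.10, "bonds": 0.02}
fund_return = 0.075

blend = blended_benchmark_return(weights, index_returns)
print(f"blended benchmark:      {blend:.2%}")                                  # 6.80%
print(f"excess vs stocks only:  {fund_return - index_returns['stocks']:+.2%}")  # -2.50%
print(f"excess vs 60/40 blend:  {fund_return - blend:+.2%}")                    # +0.70%
```

Against the stock index alone the fund appears to lag badly; against the blend that matches its mix, it is modestly ahead.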

Concept Check

An EMERGING MARKETS equity fund focused on Chinese, Indian, and Brazilian stocks should be benchmarked against:

BENCHMARK STYLE-MATCHING: emerging markets equity fund → EMERGING MARKETS EQUITY INDEX. MSCI Emerging Markets is the standard. MSCI EAFE (Option B) is DEVELOPED international — excludes emerging markets. S&P 500 (Option C) is US large-cap — wrong geography. Bond index (Option D) is wrong asset class. The SAMURAI properties require benchmark to match portfolio's style and exposure. Mismatch creates artificial alpha during favorable periods and destructive alpha during unfavorable — obscuring true manager skill.
Concept Check

A SMALL-CAP GROWTH equity fund should be benchmarked against:

BENCHMARK STYLE-MATCHING for small-cap growth fund: RUSSELL 2000 GROWTH INDEX or S&P 600 Growth. Russell 2000 is the standard small-cap universe (smallest 2,000 of the Russell 3000); the Growth sub-index isolates growth-style stocks. Matches the fund's investment universe on BOTH dimensions (cap and style). Wrong benchmarks: S&P 500 (Option A) is LARGE-CAP BLEND — wrong on cap and style. Bloomberg Agg (Option C) is bonds — wrong asset class. Dow Jones (Option D) is 30 large-cap stocks — price-weighted methodology, wrong cap. The cardinal rule: BENCHMARK MUST MATCH THE PORTFOLIO'S CAP RANGE AND STYLE.
Concept Check

A portfolio outperformed its benchmark by 4%. Attribution analysis shows: sector allocation effect +3.5%, security selection effect +0.3%, interaction effect +0.2%. The INSIGHT this provides:

PERFORMANCE ATTRIBUTION decomposes excess return into: SECTOR ALLOCATION (top-down sector bets), SECURITY SELECTION (bottom-up stock picking), INTERACTION (cross-term). Here: allocation +3.5%, selection +0.3%, interaction +0.2%, total +4%. Of the 4% excess, ALLOCATION drove 87% of value-add; selection contributed only 8%. Diagnosis: TOP-DOWN SECTOR ROTATOR, not stock picker. Future returns depend on continued correct sector calls. Same 4% excess can come from very different sources. Options A, B, D misread the decomposition.
Concept Check

A small investment advisory firm wants to advertise its performance to attract institutional clients. To claim GIPS COMPLIANCE, the firm must:

GIPS COMPLIANCE requires: (1) COMPOSITE REPORTING — group all DISCRETIONARY accounts by strategy; no cherry-picking. (2) TWRR for fair manager comparison. (3) MINIMUM 5 YEARS of GIPS-compliant history. (4) SURVIVORSHIP-BIAS AVOIDANCE — closed portfolios stay in composite returns. (5) Required DISCLOSURES (composite description, fees, currency, dispersion, returns, benchmark). GIPS is VOLUNTARY but required by most institutional clients. Options A, B, C invent compliance requirements not in the GIPS framework.
Section 5 of 5 · ~8 min · 3 concept checks

Attribution, GIPS, and pitfalls

Performance attribution — sector vs security selection

PERFORMANCE ATTRIBUTION decomposes a portfolio's excess return (vs benchmark) into the SOURCES of that excess: sector allocation, security selection, and interaction effects. Helps diagnose where a manager added or destroyed value.

  • Sector (allocation) effect. Excess return from being OVERWEIGHT or UNDERWEIGHT sectors relative to the benchmark. If the portfolio was overweight tech and tech outperformed, the allocation effect is positive. Captures top-down macro/sector bets.
  • Security selection effect. Excess return WITHIN each sector from picking individual securities that outperformed the sector index. Captures bottom-up stock-picking skill.
  • Interaction effect. The cross-term: being overweight a sector AND picking outperforming securities within it compounds the impact. Usually a small residual.

Example. Portfolio beats benchmark by 3%. Attribution: sector allocation +1.5% (the manager was overweight a winning sector), security selection +1.2% (picked above-average stocks), interaction +0.3%. Conclusion: roughly equal contribution from top-down and bottom-up — balanced manager.

Brinson-Fachler model. The standard methodology for attribution analysis. Most performance reports break down attribution by sector/region using this framework.
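
The decomposition can be sketched for a two-sector portfolio. The weights and returns are hypothetical, and the formulas are the usual single-period Brinson-Fachler terms, stated here as an assumption since the lesson does not spell them out: allocation = (wp − wb) × (sector benchmark return − total benchmark return), selection = wb × (rp − rb), interaction = (wp − wb) × (rp − rb).

```python
def brinson_fachler(sectors):
    """Return (allocation, selection, interaction) effects summed over sectors.

    Each sector is a dict with portfolio weight 'wp', benchmark weight 'wb',
    portfolio sector return 'rp', and benchmark sector return 'rb'.
    """
    rb_total = sum(s["wb"] * s["rb"] for s in sectors)
    allocation = sum((s["wp"] - s["wb"]) * (s["rb"] - rb_total) for s in sectors)
    selection = sum(s["wb"] * (s["rp"] - s["rb"]) for s in sectors)
    interaction = sum((s["wp"] - s["wb"]) * (s["rp"] - s["rb"]) for s in sectors)
    return allocation, selection, interaction


# Hypothetical two-sector example
sectors = [
    {"name": "Tech", "wp": 0.60, "wb": 0.40, "rp": 0.12, "rb": 0.10},
    {"name": "Other", "wp": 0.40, "wb": 0.60, "rp": 0.05, "rb": 0.04},
]
alloc, select, inter = brinson_fachler(sectors)
print(f"allocation:  {alloc:.2%}")                   # 1.20%
print(f"selection:   {select:.2%}")                  # 1.40%
print(f"interaction: {inter:.2%}")                   # 0.20%
print(f"total:       {alloc + select + inter:.2%}")  # 2.80%
```

The three effects add up to the portfolio's total excess return over the benchmark (9.2% versus 6.4% in this example).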

Why attribution matters. Two managers can both beat the benchmark by 3% but with very different drivers. A manager whose alpha comes entirely from sector bets (allocation effect) is a top-down strategist; a manager whose alpha comes from security selection is a bottom-up stock picker. Different styles deserve different evaluation criteria and produce different patterns of future returns.

GIPS — Global Investment Performance Standards

The Global Investment Performance Standards (GIPS) are voluntary, industry-standard guidelines for CALCULATING AND REPORTING investment performance. Maintained by the CFA Institute. Designed to ensure FAIR REPRESENTATION and FULL DISCLOSURE of performance and to enable APPLES-TO-APPLES COMPARISON between investment managers.

  • Voluntary but widely adopted. GIPS compliance is a powerful credibility signal for institutional managers. Most reputable institutional managers claim GIPS compliance; failure to comply is a competitive disadvantage in institutional mandates.
  • Composite reporting. Managers report on COMPOSITES — groupings of all DISCRETIONARY accounts with similar strategies. Cannot cherry-pick best-performing accounts. Composite returns are weighted aggregates of all accounts in the strategy.
  • TWRR required. Performance must be calculated using TIME-WEIGHTED returns to remove the impact of client cash flows. (MWRR may be reported as supplemental information.)
  • Five-year minimum. Compliance requires at least five years of GIPS-compliant performance history (or since inception if less). Managers can't selectively present favorable shorter periods.
  • Survivorship-bias avoidance. Closed or terminated portfolios must remain in composite returns for the period they were active. Cannot retroactively remove discontinued strategies that performed poorly.
  • Required disclosures. Composite description, fees, currency, dispersion of returns, three-year ex-post standard deviation, gross and net returns, benchmark.
  • Verification. Optional but valued: an independent verification firm reviews the manager's GIPS compliance. Verified compliance is a higher signal than self-claimed.

Survivorship bias and other performance pitfalls

Several systematic biases distort performance comparison if not properly handled:

  • SURVIVORSHIP BIAS. When studying historical fund performance, only the funds that SURVIVED are typically available. Funds that closed (often due to poor performance) drop out of databases. The remaining sample looks better than the original universe did, OVERSTATING historical returns. Estimated impact: 1-2% per year overstatement for actively managed equity funds.
  • BACKFILL BIAS. Hedge funds and others can choose when to start reporting to a database. They typically start reporting AFTER a period of strong returns, then have those returns BACKFILLED into the database history. Inflates apparent historical performance.
  • SELECTION BIAS. Voluntary reporting databases include only managers who CHOOSE to report. Poor-performing managers may withdraw, leaving the dataset skewed positive.
  • END-OF-PERIOD BIAS. Performance can be cherry-picked by choosing a start and end date favorable to the conclusion. Reputable analyses use multiple start dates or rolling periods.
  • FEE TREATMENT. Comparing GROSS returns (before fees) of one fund to NET returns (after fees) of another distorts comparison. Always compare like-to-like.
  • BENCHMARK SHIFTING. Managers occasionally change benchmarks after a poor period. Reputable performance reporting flags benchmark changes and shows both old and new comparison.
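
Survivorship bias is easy to see with a toy fund universe. The five funds and their returns below are made up for illustration; the size of the gap in real databases will differ.

```python
import statistics

# Hypothetical universe: (annual return, still operating?)
funds = [
    (0.10, True),
    (0.08, True),
    (0.09, True),
    (-0.02, False),  # closed after poor performance
    (0.01, False),   # closed after poor performance
]

all_funds_avg = statistics.mean(r for r, _ in funds)
survivors_avg = statistics.mean(r for r, alive in funds if alive)

print(f"full universe average:  {all_funds_avg:.2%}")                  # 5.20%
print(f"survivors-only average: {survivors_avg:.2%}")                  # 9.00%
print(f"overstatement:          {survivors_avg - all_funds_avg:.2%}")  # 3.80%
```

A database that silently drops the two closed funds reports the higher figure, even though no investor could have known in advance which funds would survive.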

The exam tests recognition of these biases and the principle that PERFORMANCE COMPARISONS REQUIRE CARE — standardized methodologies like GIPS exist to address these distortions.

Concept Check

A research study claims actively managed equity mutual funds returned 9% per year over the past 20 years — outperforming the S&P 500's 8% return. The DATABASE used includes only funds CURRENTLY OPERATING. The conclusion most likely suffers from:

SURVIVORSHIP BIAS is the systematic distortion when failed/closed funds drop out of databases. The remaining sample of currently-operating funds is biased toward survivors — which tended to perform better than the eventually-closed funds. Studies using survivor-only databases OVERSTATE returns by 1-2%/year for active equity funds. The result: any survivor-only study overstates active-management performance vs benchmark. Reputable studies use SURVIVORSHIP-BIAS-FREE databases. Options B, C, D describe different biases; survivorship is the specific one from excluding failed funds.
Concept Check

A financial adviser presents their performance: “Over the past 5 years, my portfolio recommendations returned 11% annually vs the S&P 500's 9%.” To CRITICALLY EVALUATE this claim, the prospect should ask:

CRITICAL PERFORMANCE EVALUATION addresses multiple distortion mechanisms: (1) BENCHMARK APPROPRIATENESS — was S&P 500 right for the strategy? (2) FEE TREATMENT — gross or net? (3) PERIOD SELECTION — 5 years may be cherry-picked. (4) SAMPLE SELECTION — only winning accounts shown? GIPS requires composite reporting. (5) VERIFICATION — self-reported or independently verified? Reputable performance reporting (GIPS-compliant) addresses all. Single-question evaluations (Options A, B, C) miss the multidimensional nature of performance distortion.
Summary · Cram aid & consolidated traps

Chapter summary

Types of returns — baseline overview

Understanding different return calculations is critical for evaluating portfolio performance:

  • Total return: Includes both income (dividends, interest) and capital appreciation. The most comprehensive return measure.
  • Holding period return: Total gain or loss over the period an investment is held.
  • Annualized return: Holding period return converted to a per-year basis for comparison across investments.
  • Cumulative return: Total return over the entire period (not annualized). Useful for long-term comparisons.

Time-weighted vs. dollar-weighted returns — baseline

Time-Weighted Return (TWRR)

  • Eliminates the impact of cash flows (deposits and withdrawals)
  • Best measure of a portfolio manager's performance
  • Used by the CFA Institute's GIPS standards

Dollar-Weighted Return (IRR / MWRR)

  • Reflects the impact of the timing and size of cash flows
  • Best measure of the investor's actual experience
  • Equivalent to the internal rate of return (IRR) on the cash flow series

Other performance concepts — baseline

  • Expected return: Probability-weighted average of possible outcomes.
  • Inflation-adjusted (real) return: Approximately nominal return minus inflation; precisely (1 + nominal) ÷ (1 + inflation) − 1. Reflects actual purchasing power growth.
  • After-tax return: Return after accounting for taxes on dividends, interest, and capital gains.
  • Current yield: Annual income (dividends or interest) divided by current price.
  • Yield to maturity (YTM): Total return assuming a bond is held to maturity, including reinvested coupons.
Time-weighted vs. dollar-weighted — the test rule: If asked which return is better for evaluating a fund manager, the answer is always TIME-WEIGHTED. If asked which reflects the investor's actual return, it's DOLLAR-WEIGHTED. This distinction is one of the most frequently tested concepts on the Series 66.

Returns — complete reference

Return Measure | What It Captures | When to Use
Total return | Income + capital appreciation | Most comprehensive measure; standard for reporting
Time-weighted | Manager's investment skill (removes cash-flow effects) | Evaluating fund/portfolio managers; GIPS standard
Dollar-weighted (IRR) | Investor's actual experience (includes cash-flow timing) | Evaluating client's actual return; private equity LP returns
Real return | Nominal return adjusted for inflation | Long-horizon planning; purchasing-power analysis
After-tax return | Return net of taxes on income and gains | Comparing taxable vs tax-advantaged accounts

Benchmarking — baseline

An appropriate benchmark should be:

  • Investable: A real, accessible alternative the investor could choose instead
  • Style-matched: Same asset class, market cap range, and investment style as the portfolio
  • Consistent: Applied over time without cherry-picking favorable comparisons
  • Specified in advance: Identified before the period being measured
  • Unambiguous: Clear constituent securities and weights

A SMALL-CAP GROWTH fund benchmarked to the S&P 500 (large-cap blend) creates artificial alpha when small-caps outperform large-caps regardless of manager skill — a benchmark-mismatch trap. Use the Russell 2000 Growth Index for small-cap growth funds.

Exam essentials · cram aid
  • TWRR: Manager's skill; GIPS standard; removes cash flows
  • MWRR: Investor's actual return; = IRR; cash-flow-sensitive
  • Real return: Nominal − inflation (approximation)
  • Geometric: CAGR for multi-period; less than arithmetic when volatile
  • Sharpe: Excess return ÷ total risk (StdDev)
  • Treynor: Excess return ÷ beta; use when diversified
  • Jensen alpha: Excess return over CAPM prediction
  • Tracking error: StdDev of portfolio vs benchmark returns
  • Info ratio: Active return ÷ tracking error
  • SAMURAI: Specified, Appropriate, Measurable, Unambig., Reflective, Accountable, Investable
  • GIPS: Composite reporting, TWRR, 5+ yrs, no survivor cleanup
  • Survivor bias: Failed funds drop out; inflates apparent returns
Common traps the exam plants
  • “Use TWRR to report individual investor performance.” WRONG — TWRR evaluates the manager. MWRR/IRR reflects the investor's actual experience.
  • “Arithmetic mean is the correct measure of historical multi-period return.” WRONG — geometric mean (CAGR) reflects compounding. Arithmetic overstates for volatile returns.
  • “Sharpe ratio uses beta as the risk measure.” WRONG — Sharpe uses TOTAL risk (standard deviation). Treynor uses beta.
  • “Positive Jensen's alpha is automatic if the portfolio outperformed.” WRONG — alpha measures excess return BEYOND CAPM's prediction given the portfolio's beta. A portfolio with high beta should outperform in up markets even without manager skill.
  • “Low tracking error means the manager added value.” WRONG — low tracking error means the manager closely followed the benchmark. Index funds have near-zero tracking error by design. High tracking error with positive excess return is meaningful active management.
  • “The S&P 500 is an appropriate benchmark for any US equity fund.” WRONG — benchmark must MATCH the fund's style (cap, growth/value). Small-cap funds use Russell 2000; growth funds use a growth index, etc.
  • “GIPS compliance is required by law for US investment advisers.” WRONG — GIPS is VOLUNTARY. Required only if the firm claims compliance. Institutional clients often require it.
  • “Survivorship bias inflates HEDGE FUND returns more than mutual funds.” WRONG — both are affected, hedge funds severely. But the exam tests recognition of the concept, not relative magnitude.
  • “Tax-equivalent yield is used to compare two taxable bonds.” WRONG — TEY is for comparing a TAX-EXEMPT muni to a taxable bond. Two taxable bonds compare directly on stated yields.
  • “A manager with positive sector allocation effect must also be a good stock picker.” WRONG — allocation effect and selection effect are SEPARATE. A manager can be skilled at one but not the other.
Concept Check

A fund's time-weighted return is 12% but a particular investor's dollar-weighted return in the same fund is only 3%. The MOST likely explanation is:

Dollar-weighted return reflects the timing of cash flows. If an investor adds money just before poor performance (or withdraws before strong performance), their actual return will be lower than the fund's time-weighted return. The manager did well (12%), but the investor's timing was poor (3% actual experience).