Backtest Journal
Backtested Property Predictions in Australia — What the Numbers Actually Showed
Every property hotspot list has a methodology. Most of them were never tested against real outcomes. The suburb that made the list in 2021 — did it actually outperform? The infrastructure project that was supposed to drive demand — did it move the median? Almost nobody checks.
Backtesting changes that. It takes a formula, applies it to historical data on suburbs with known outcomes, and measures accuracy against what actually happened. Simple in theory; rarely done rigorously in property. Here’s what we found when we did — across 78 suburbs, three different approaches, and 12,360 postcode-month observations.
What Backtesting Actually Means for Property
In share markets, backtesting is standard practice before any serious money follows a strategy. You define your rules, run them over historical prices, and measure what the returns would have been. The strategy either worked or it didn’t. The market is the referee.
Property makes this harder. Transactions are slow. Data is patchy at suburb level. Feedback loops take years to close — you need to know what a suburb did over the following 12–36 months before you can score a prediction right or wrong. Most “hotspot formulas” are assembled from what sounds logical — population growth, infrastructure announcements, rental yield — without ever checking whether those inputs would have picked the right suburbs in the past.
Our approach: take 78 Australian suburbs with known outcomes. 28 boomed. 50 didn’t. Apply each formula using only data available at the time — no future information — and measure how often it got the answer right. Then keep what worked and discard what didn’t, regardless of how convincing it looked in theory.
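The scoring step is simple enough to make concrete. A minimal sketch — hypothetical helper, not BoomAU’s actual code — where each suburb gets a boolean boom call, compared against its known outcome:

```python
def accuracy(predicted, actual):
    """Fraction of suburbs where the formula's call (True = boom)
    matched the known outcome."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Toy example: 4 suburbs, the formula gets 3 of 4 right.
predicted = [True, False, True, False]
actual    = [True, False, False, False]
print(accuracy(predicted, actual))  # 0.75
```

The same function, run over 78 suburbs instead of 4, produces the accuracy figures quoted throughout this piece.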
The First Approach: Prediction
The obvious starting point was the conventional playbook — the metrics that fill property investment webinars and buyer’s agent slide decks. Infrastructure spending. Population growth. Building approvals. Vendor discounts. Rental yield trends.
The goal was to predict booms before they started. Identify the suburb 12–18 months ahead, buy before the crowd arrives, capture the full run. That’s the investor dream. A leading indicator that lets you be genuinely early.
55% accuracy. Statistically no better than flipping a coin — and actually worse than the naive baseline of calling “no boom” on every suburb, which scores 64% on this dataset (50 of 78 didn’t boom). Seven inputs, years of data, and no edge over guessing.
The reasons become clear in hindsight. Infrastructure spending affects whole corridors over decades — it tells you almost nothing about which suburb within that corridor will move first, or when. Population growth is measured at LGA level, far too coarse for suburb selection, and shifts on decade-long timescales that don’t map to investment horizons. Building approvals reflect state planning trends, not individual suburb demand dynamics.
And the metrics that do exist at suburb level — population projections, announced infrastructure budgets — are already widely known and priced in by the time they become public. If a rail extension is announced, the market has already moved.
What this showed
Infrastructure, population growth, and building approvals are interesting context. They are not suburb-level timing tools. Prediction — identifying a boom before it starts — doesn’t work with publicly available data at suburb granularity.
Want to see what actually survived the backtest?
BoomAU scores 393 suburbs fortnightly using only the signals that passed 78-suburb validation. Join the wishlist.
The Pivot: Detection Instead of Prediction
Prediction failed. The next approach asked a different question entirely: instead of trying to identify booms before they start, what if we just detect them once they’ve started?
At first this feels like cheating. Isn’t the point to be early? But when you examine actual boom trajectories across Australian suburbs, the picture changes. Booms are multi-year events. Catching one 6–12 months after it starts still captures 60–85% of the total gains — with dramatically higher confidence than any prediction formula could produce. The question “is this suburb booming right now?” is answerable with current market data. The question “will it boom?” is not.
The detection formula has five components, each weighted by its contribution to boom identification:
Detection formula — 5 components
- ✓ Momentum (30%) — Is price growth accelerating or decelerating?
- ✓ Growth Strength (25%) — Annual growth rate scored directly
- ✓ Tightness (20%) — Days on market combined with rental vacancy rate
- ✓ Sustainability (15%) — Rental yield plus vacancy trend direction
- ✓ Affordability Headroom (10%) — Suburb median price relative to capital city median
Four hard filters must all be met before a suburb is scored at all: annual growth at or above 5%, days on market at or below 45 days, vacancy rate at or below 2%, and median price at or below $800K. Every single boom in the backtest dataset was in a suburb priced well below the capital city median — the $800K cap reflects that reality.
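As a sketch, the gate-then-score structure reads like this in Python. The weights and filter thresholds come from the text above; the 0–100 component sub-scores and function names are illustrative assumptions, not BoomAU’s implementation:

```python
# Component weights as stated in the article.
WEIGHTS = {
    "momentum": 0.30,
    "growth_strength": 0.25,
    "tightness": 0.20,
    "sustainability": 0.15,
    "affordability_headroom": 0.10,
}

def passes_hard_filters(annual_growth_pct, dom_days, vacancy_pct, median_price):
    """All four filters must pass before a suburb is scored at all."""
    return (annual_growth_pct >= 5.0
            and dom_days <= 45
            and vacancy_pct <= 2.0
            and median_price <= 800_000)

def detection_score(components):
    """components: dict mapping each signal name to a sub-score
    (assumed 0-100 scale). Returns the weighted composite."""
    return sum(WEIGHTS[k] * components[k] for k in WEIGHTS)
```

A suburb failing any single filter — say, 4% annual growth — never reaches the scoring step, no matter how strong its other signals are.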
85.7% accuracy. Zero false positives. The 20.2-point separation gap between genuine booms and non-booming suburbs means the formula doesn’t just get the right answer — it gets it with conviction. Borderline scores are uncommon. The formula sees a clear difference between a suburb that’s booming and one that isn’t.
What this showed
Detecting a boom in progress is tractable. Predicting one before it starts is not — at least with publicly available data. The same metrics that failed as predictors (DOM, vacancy, growth rate) work reliably as detection signals once framed around the right question. The question matters as much as the data.
The Ranking Test: Which Booming Suburb Wins?
Detection tells you which suburbs are booming. It doesn’t tell you which of those booming suburbs will outperform the others. So we built a separate ranking model — one that attempted to forecast 3-year forward capital growth and rank suburbs by predicted performance within any given period.
The model used six inputs: annual growth rate, affordability headroom, boom detection signal, 5-year price momentum, national market conditions, and growth acceleration. It was run in walk-forward format across 12,360 postcode-month observations — meaning no future information fed the predictions, only what was available at the time of each forecast.
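The walk-forward constraint — each forecast uses only data strictly before its own point in time — can be sketched as a simple loop. `fit` and `predict` are placeholders here, not the actual model:

```python
def walk_forward(periods, fit, predict):
    """At each period t, train only on periods[:t] (history),
    then score the model on period t. No future data ever
    reaches a forecast."""
    results = []
    for t in range(1, len(periods)):
        model = fit(periods[:t])        # history only
        results.append(predict(model, periods[t]))
    return results
```

The 12,360 observations quoted above are the accumulated `(forecast, outcome)` pairs this kind of loop produces across all postcodes and months.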
When rankings were measured across all periods together, the headline number looked strong. Boom years produced both higher predictions and higher actual returns. The overall correlation between predicted and actual performance appeared convincing — a Spearman rank correlation of 0.42, which is high by quantitative standards.
Then we asked the harder question: within any single month, does the suburb ranked #1 actually outperform suburb #2 in the same month? That’s the real test for an investor. You don’t buy “the property market in aggregate” — you choose one suburb, in one period. The ranking either helps you pick the right one or it doesn’t.
Within any given period, the rank correlation between predicted and actual performance was −0.08. Worse than random. A coin flip would have picked better.
The apparent 0.42 was a market-cycle illusion. Boom years drove both higher predictions and stronger real returns simultaneously — so the model looked accurate when it was actually tracking whether it was a boom year, not which suburb within that year would win. Strip out that time-period effect and the ranking told you nothing useful about relative suburb performance.
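A toy example — invented numbers, not the backtest data — shows how the illusion arises. When a boom year lifts both predictions and returns, the pooled rank correlation looks strongly positive even though the within-period ranking is actively wrong:

```python
def rank(xs):
    """Integer ranks (0 = smallest); assumes no ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for pos, i in enumerate(order):
        r[i] = pos
    return r

def spearman(xs, ys):
    """Spearman rank correlation via Pearson on the ranks."""
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Two periods: a boom year (level ~10) and a flat year (level ~0).
# Within each period, predictions are perfectly ANTI-correlated
# with outcomes.
pred_boom, ret_boom = [10.1, 10.2, 10.3], [10.3, 10.2, 10.1]
pred_flat, ret_flat = [0.1, 0.2, 0.3], [0.3, 0.2, 0.1]

pooled = spearman(pred_boom + pred_flat, ret_boom + ret_flat)  # ~ +0.54
within = spearman(pred_boom, ret_boom)                         # ~ -1.0
```

The pooled figure is driven entirely by the boom-year level shift; the within-period figure is what an investor choosing between two suburbs in the same month actually experiences.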
The confidence intervals were just as unreliable. An 80% confidence interval should contain the real outcome 80% of the time — ours hit 46%. At the highest-confidence predictions, where coverage should be tightest, it dropped further still. The model was most confident precisely where it was most wrong.
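The calibration check itself is straightforward to express. The intervals below are illustrative, not the model’s output:

```python
def coverage(intervals, actuals):
    """Fraction of realised outcomes that fall inside their
    predicted interval. For an 80% interval this should be ~0.80."""
    inside = sum(lo <= a <= hi for (lo, hi), a in zip(intervals, actuals))
    return inside / len(actuals)

# 2 of these 4 outcomes land inside their interval -> 0.5 coverage,
# i.e. badly miscalibrated for a nominal 80% interval.
intervals = [(0, 10), (2, 8), (5, 6), (1, 9)]
actuals   = [5, 9, 5.5, 12]
print(coverage(intervals, actuals))  # 0.5
```

Running this check per confidence bucket is what exposed the pattern above: the tightest (most confident) intervals had the worst coverage.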
We dropped the ranking model entirely. Not a revision — a complete removal. The underlying reason: growth phase doesn’t predict which suburb outperforms within that phase. When the tide rises, it lifts all boats. Past momentum, acceleration, and 5-year history don’t tell you which boat rises most. Mean reversion dominates — past outperformers tend to underperform going forward.
What this showed
A high overall correlation figure can be deeply misleading if it’s driven by market-cycle effects rather than genuine suburb discrimination. The only reliable test is within any single period: does the ranking actually pick the better suburb against a real peer in the same market conditions? If not, it has no investor utility regardless of what the headline number looks like.
Built on backtesting. No forecasts we couldn’t validate.
BoomAU’s tier labels come from signals that survived the full within-period test. 393 suburbs scored fortnightly. Join the wishlist.
Two Signals That Actually Survived
After the ranking model failed, we went back to the underlying data and asked: is there anything that predicts relative suburb outperformance after cancelling the market-wide tide? We tested every metric we had. Most failed. Two survived.
1. Affordability headroom
How a suburb’s median price compares to its capital city median. Suburbs priced below the city median have consistently outperformed after cancelling the market tide. Suburbs priced above 1.5× the city median have consistently underperformed. The effect is perfectly monotonic — more headroom equals better outcome, without exception across every subsample tested. This is the only cross-suburb ranking signal that survived.
2. Boom timing
Specifically, how much of the affordability gap has already been consumed since the boom started. A suburb early in its boom — less than 30% of the affordability gap closed — still has most of the runway remaining. Detection catches booms 6–12 months after they start, at which point 60–85% of total gains still lie ahead. The earlier the entry within a detected boom, the more upside remains.
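Both signals reduce to short formulas. A hedged sketch — the thresholds (1.5×, 30%) come from the text, but the exact definitions here are our assumptions, not BoomAU’s published implementation:

```python
def headroom_ratio(suburb_median, city_median):
    """Suburb median relative to the capital city median.
    < 1.0 = priced below the city median (headroom);
    > 1.5 has consistently underperformed in the walk-forward data."""
    return suburb_median / city_median

def gap_consumed(median_at_boom_start, median_now, city_median):
    """Assumed definition: fraction of the price gap to the city
    median that has closed since the boom started.
    < 0.30 = early in the boom, most runway remaining."""
    gap_start = city_median - median_at_boom_start
    return (median_now - median_at_boom_start) / gap_start

# A suburb that started its boom at $600K in a $1M city and now
# sits at $700K has consumed 25% of its gap -- still early.
print(gap_consumed(600_000, 700_000, 1_000_000))  # 0.25
```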
These two signals feed into the tier labels BoomAU produces fortnightly. The walk-forward tier discrimination results, tested across 12,360 postcode-months with no lookahead:
| Tier | Excess return | Beat market | n |
|---|---|---|---|
| Strong Buy | +7.5pp | 71% | 2,103 |
| Buy | +1.3pp | 55% | 3,349 |
| Watch | −0.7pp | 47% | 5,788 |
| Pass | −6.4pp | 28% | 1,120 |
Walk-forward backtest, 12,360 postcode-months. No lookahead. Excess return = suburb 12-month growth minus market median growth. Full methodology →
Perfectly monotonic. Strong Buy outperforms by 7.5 percentage points and beats the market 71% of the time. Pass underperforms by 6.4 percentage points — only 28% of Pass suburbs beat the market. A 13.9-point spread between the best and worst tiers, in the same asset class, over the same time period.
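Per the table footnote, excess return is a suburb’s 12-month growth minus the market median growth for the same period — simple enough to state in a few lines (illustrative code, not the production pipeline):

```python
import statistics

def excess_returns(suburb_growth):
    """suburb_growth: dict of suburb -> 12-month growth (%).
    Returns each suburb's growth minus the market median,
    so a positive value means the suburb beat the market."""
    median = statistics.median(suburb_growth.values())
    return {s: g - median for s, g in suburb_growth.items()}

print(excess_returns({"a": 10, "b": 4, "c": 7}))
# {'a': 3, 'b': -3, 'c': 0}
```

Averaging these per-tier is what produces the +7.5pp and −6.4pp figures above; by construction, the market as a whole centres near zero.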
This is what genuine backtested property predictions look like. Not a ranked list of hotspots. Not a forecast built on untested inputs. A tier system with demonstrated within-period discrimination across thousands of observations, where every layer of the structure points in the right direction.
What You Can Check Yourself
The two signals that survived are public knowledge. You don’t need a subscription to understand them — though applying them across hundreds of suburbs every fortnight is another matter.
1. Check affordability headroom
Domain publishes capital city median prices quarterly. Find your target suburb’s median on Domain or YIP. If the suburb sits below the city median, it has headroom — the strongest boom precondition the backtest found. If it’s priced above 1.5× the city median, it consistently underperformed peers in the walk-forward data.
2. Check the detection signals
Annual growth rate and days on market are available free on YIP (backed by CoreLogic). Vacancy rate by postcode is available free on SQM Research, with 16 years of monthly history. If growth is above 5%, DOM is under 45 days, and vacancy is below 2%, the suburb clears the hard filters and is worth scoring further on the detection components.
That’s the honest framework. Two signals, both free to check. The hard part is applying them consistently across every qualifying suburb in the country, every fortnight, tracking which ones are early in their boom trajectory versus which ones have already consumed most of their headroom. BoomAU currently tracks 393 suburbs — every suburb under $800K that passes the growth filters — and produces fortnightly Strong Buy / Buy / Watch / Pass labels split by budget band.
The full backtest results — 78-suburb detection validation, the walk-forward forecaster failure, and the tier discrimination across 12,360 postcode-months — are published on our proof page. No gating, no email required. Check the maths yourself before deciding whether this is the kind of analysis you want for your next suburb decision.
Join the Wishlist
We'll email you when BoomAU launches — starting with the budget range you care about.
Be first in line
- ✓ Fortnightly Strong Buy / Buy / Watch / Pass signal labels per suburb
- ✓ Filtered to your budget band
- ✓ Built on a backtest of 12,360 postcode-months