Formula Journal
Property Investment Mistakes to Avoid — What Backtesting Actually Showed
Most property investment advice sounds authoritative. Infrastructure in the pipeline. Population growth trending up. Past performance confirming the thesis. The suburb has “all the fundamentals.” Most of it is wrong — not wrong in theory, but wrong in the data. Wrong in a way that costs investors real money.
We know this because we built five versions of a suburb-scoring formula and backtested each one against real Australian outcomes across 12,360 postcode-months. Metric after metric failed. What survived was not what most investors check. Here are the five biggest mistakes the data exposed — and what actually worked instead.
Treating Infrastructure as a Buy Signal
Infrastructure announcements are the single most-cited “catalyst” in property investing circles. A new train station. A highway extension. A hospital precinct. The logic seems airtight: amenity improves, demand rises, prices follow.
We included infrastructure spending, population growth, and building approvals in our first formula. We weighted them thoughtfully. We applied them to historical suburb data. Then we checked the results.
55% accuracy. A coin flip with extra steps.
The reason isn’t that infrastructure doesn’t matter to property prices — it’s that it affects whole corridors over decades, not suburbs over investment-relevant time horizons. Population growth data is too coarse and too slow-moving to time an entry. Building approvals at the suburb level are sparse. By the time infrastructure spending becomes legible in public data, the market has already priced it in — or it hasn’t, and won’t for fifteen years.
These signals are interesting context. They are not buy signals. The backtest was unambiguous.
What the data says
Infrastructure spending, population growth, and building approvals failed as suburb-level predictors in backtesting. They had near-zero predictive power at the suburb level over investment-relevant time horizons.
Trying to Predict Instead of Detect
Every investor wants the same thing: prediction. Identify the boom before it starts, buy early, capture maximum upside. It’s the right goal. The problem is that prediction at suburb granularity, with publicly available data, doesn’t work.
Our v1 formula tried to predict booms using leading indicators. 55%. We rebuilt it with different leading indicators. Still 55%. The conclusion wasn’t that we needed better predictors. The conclusion was that suburb-level prediction with free data is a fundamentally broken approach.
The pivot changed everything. Instead of asking “will this suburb boom?” we started asking “is this suburb currently booming?” Detection instead of prediction. It feels like cheating — until you look at the data.
Booms are multi-year events. Catching one 6–12 months after it starts still captures 60–85% of total gains. With dramatically higher confidence than prediction ever achieved.
85.7% accuracy. Zero false positives. A 20.2-point gap between the scores of real booms and false signals — meaning the formula doesn’t just get the right answer, it gets it with conviction.
The mistake isn’t wanting to be early. It’s assuming that prediction is the only way to be early. Catching a boom 6 months after it starts is still early. It’s just a different kind of early — one the data can actually support.
What the data says
Prediction failed at suburb granularity with free data (55%). Detection — identifying booms already underway — achieved 85.7% accuracy. Catching a boom 6–12 months after it starts still captures 60–85% of total gains.
We track 393 suburbs using detection, not prediction.
Fortnightly Strong / Good / Fair / Weak signal labels, filtered to your budget. No prediction guesswork. Join the wishlist.
Trusting Pooled Statistics
This is the most dangerous mistake on this list, because it arrives dressed as rigour.
After our detection formula was working, we wanted more. Detection tells you whether a suburb is booming. We wanted a model that told us which booming suburb would outperform the others — a forward 3-year capital growth forecaster with ranked confidence tiers.
We built it. Six input features. Walk-forward backtest across 28,049 scored rows. Then we looked at the headline number.
For context: most quantitative equity factors celebrate a pooled IC of 0.05–0.10. We had 0.42. It looked like a genuine edge.
Then we did something most backtests never do. We split the correlation within each scoring period — asking not “how well does the model rank across all time?” but “within any given month, how well does it rank suburbs against each other?”
Negative 0.08. Worse than random.
The pooled 0.42 was a statistical illusion. Boom years had both higher model predictions and higher realised returns. The model was ranking time periods, not suburbs. Within any given month, it couldn’t tell you which suburb would outperform — at all.
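The mechanics of that illusion are easy to reproduce. This sketch uses made-up scores and returns (the suburb values and period labels are illustrative, not from our dataset) to show how a model with no within-period ranking skill can still post a strong pooled correlation, simply because boom periods have both higher scores and higher returns:

```python
from statistics import mean

def ranks(xs):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical data: two scoring periods, five suburbs each. Within each
# period the model's ranking is noise, but the boom period has higher
# scores AND higher returns across the board.
periods = {
    "flat era": {"score": [1.0, 1.2, 1.4, 1.6, 1.8],
                 "ret":   [2.0, 1.0, 3.0, 0.5, 1.5]},
    "boom era": {"score": [5.0, 5.2, 5.4, 5.6, 5.8],
                 "ret":   [17.0, 15.0, 18.0, 14.0, 16.0]},
}

pooled_score = sum((p["score"] for p in periods.values()), [])
pooled_ret = sum((p["ret"] for p in periods.values()), [])

pooled_ic = spearman(pooled_score, pooled_ret)
within_ic = mean(spearman(p["score"], p["ret"]) for p in periods.values())

print(f"pooled IC:      {pooled_ic:+.2f}")   # +0.68 — the model separates eras
print(f"within-date IC: {within_ic:+.2f}")   # -0.30 — no suburb-ranking skill
```

The pooled number is strong and the within-date number is negative on the exact same rows: the correlation lives entirely between periods, not between suburbs.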
The confidence intervals confirmed the problem. An 80% confidence interval should contain the real outcome 80% of the time. Ours covered 46% of outcomes overall — and at the rows where the model was most confident, coverage dropped to 20%. The model was most wrong precisely when it was most sure.
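A calibration check like the one described above is a few lines of code. The interval bounds and outcomes below are made up for illustration; the point is the shape of the test, not the numbers:

```python
# Hypothetical rows: (lower, upper, realised outcome) for a nominal 80%
# interval. A calibrated model should cover ~80% of outcomes, and should
# not get WORSE on its narrowest (most confident) intervals.
rows = [
    (-2.0,  6.0,  4.1),   # covered
    ( 1.0,  9.0, 12.3),   # missed high
    ( 0.5,  2.5,  7.0),   # narrow (confident) interval, missed
    ( 3.0,  5.0, -1.2),   # narrow, missed
    (-5.0, 10.0,  0.8),   # covered
]

def coverage(intervals):
    hits = sum(lo <= actual <= hi for lo, hi, actual in intervals)
    return hits / len(intervals)

overall = coverage(rows)
narrow = [r for r in rows if r[1] - r[0] <= 3]  # narrowest = most confident

print(f"overall coverage:   {overall:.0%}")          # 40% vs 80% nominal
print(f"confident coverage: {coverage(narrow):.0%}") # 0% — worst when most sure
```

If the confident subset covers less than the overall rate, the model's stated uncertainty is inverted, which is exactly the failure mode we hit.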
We deleted the entire forecaster. Every line of code. The lesson: always check within-period discrimination, not just pooled correlation. A number that looks like a breakthrough can be completely useless for the question you actually care about.
What the data says
A pooled Spearman IC of 0.42 collapsed to −0.08 within-date. Growth phase does not predict relative suburb outperformance — the tide lifts all boats. Pooled statistics that span different market conditions can be dangerously misleading.
Chasing Past Performers
The argument for chasing past performers sounds like pattern recognition: suburbs that boomed once have the fundamentals. The infrastructure is in. The demographic shift happened. The momentum is established. Buy the proven winner.
The data says the opposite. Mean reversion dominates. Past outperformers tend to underperform going forward.
In our backtesting, 5-year momentum survived pooled correlation tests — it looked like a signal. Then it failed within-date. The momentum was tracking the era, not the suburb. The acceleration ratio and repeat-boomer history failed the same way.
Boom size reinforces this. The post-2020 environment produced median boom returns of 16.2%. The pre-2015 era produced median boom returns of 1.3%. A suburb that “outperformed” post-2020 may have simply been in the right era at the right time — the underlying suburb dynamics were secondary to the macro regime.
A 12× difference in boom size between eras. This is era dependency, not suburb quality. Past outperformers carry the premium of the cycle that powered them — not a forward edge.
The implication for suburb selection: prior boom status is not a screening criterion. A suburb that boomed 2017–2020 now has a higher median price, less affordability headroom, and a mean-reversion tendency the data says is real. That’s a worse starting position, not a better one.
What the data says
Mean reversion dominates: past outperformers tend to underperform going forward. Boom size is era-dependent (1.3% pre-2015 vs 16.2% post-2020). Growth phase does not predict relative outperformance within a period.
Avoid the mean-reversion trap.
BoomAU scores current boom conditions — not past performance. Detection formula + affordability filter, updated fortnightly. Join the wishlist.
Ignoring Affordability Headroom
After all the failures — the prediction formula, the forecaster, the momentum signals — we ran a clean test. Cancel the market tide. Compute excess returns: each suburb’s growth minus the median growth across all peers in the same period. Then test every available feature against those tide-cancelled returns.
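The tide-cancellation step is simple to express. The suburb names and growth figures here are placeholders, not values from our dataset:

```python
from statistics import median

# Hypothetical 12-month growth (%) per suburb within ONE scoring period.
growth = {"Suburb A": 14.0, "Suburb B": 9.5, "Suburb C": 6.0, "Suburb D": 11.0}

tide = median(growth.values())  # the market-wide move in this period
excess = {s: g - tide for s, g in growth.items()}  # tide-cancelled return

for suburb, e in excess.items():
    print(f"{suburb}: {e:+.2f}pp vs the tide")
```

Because the tide is recomputed per period, a feature only scores well here if it separates suburbs from their peers at the same moment in time, rather than riding era-level effects.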
One signal survived.
Affordability headroom
How a suburb’s median price compares to its capital city median. Suburbs priced below the city median consistently outperform after cancelling the tide. Suburbs priced above 1.5× the city median consistently underperform. The effect is monotonic and survived every subsample we tested.
The 78-suburb detection backtest made this concrete: every boom in the dataset was led by suburbs priced well below their city median. Not some booms — every boom. The $800K hard cap on our scoring formula exists for exactly this reason.
Most investors instinctively want the premium suburb. The established area. The one with the “better” schools and cafes. The data says that premium is already in the price. High-priced suburbs have less headroom. Less headroom means the one cross-suburb ranking signal that actually survived backtesting is working against you.
This doesn’t mean buying the cheapest suburb in the country. The formula requires passing hard filters: growth above 5%, days on market below 45, vacancy below 2%. Affordability headroom is the ranking signal once a suburb passes those thresholds — not a standalone justification to buy anything cheap.
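The filter-then-rank logic can be sketched as follows. The thresholds come from the text above; the field names, the example suburb, and the city median are assumptions for illustration, not the production implementation:

```python
# Hard filters from the text: growth > 5%, days on market < 45, vacancy < 2%.
def passes_hard_filters(suburb):
    return (suburb["annual_growth_pct"] > 5.0
            and suburb["days_on_market"] < 45
            and suburb["vacancy_pct"] < 2.0)

def headroom_ratio(suburb, city_median):
    # Ranking signal applied only AFTER the filters pass:
    # below 1.0 = priced under the city median (headroom),
    # above 1.5 = the zone that consistently underperformed.
    return suburb["median_price"] / city_median

# Hypothetical candidate suburb and city median.
candidate = {"median_price": 620_000, "annual_growth_pct": 7.2,
             "days_on_market": 31, "vacancy_pct": 1.4}

if passes_hard_filters(candidate):
    ratio = headroom_ratio(candidate, city_median=1_050_000)
    print(f"price / city median: {ratio:.2f}")  # 0.59 — headroom present
```

The ordering matters: headroom ranks suburbs that are already showing boom conditions; it never promotes a cheap suburb that fails the gate.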
What the data says
Affordability headroom is the only cross-suburb ranking signal that survived tide cancellation in backtesting. Every boom in the 78-suburb dataset was led by suburbs priced well below their city median. Suburbs above 1.5× the city median consistently underperform.
What Actually Survived
Two signals. That’s the honest answer.
1. Affordability headroom (what to buy)
Suburb median price vs. capital city median. The only cross-suburb ranking signal that survived tide cancellation. Below city median outperforms. Above 1.5× city median underperforms. Monotonic. Robust.
2. Boom timing via detection (when to buy)
Is the suburb currently in a detected boom, and how early is it? Measured by how much of the affordability gap has already been consumed. Less than 30% consumed means the boom is early. Detection catches booms 6–12 months after start, still capturing 60–85% of total gains, at 85.7% accuracy.
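One plausible reading of the “gap consumed” measure can be sketched like this. The exact formula isn’t published here, so treat the function and its inputs as an assumption built from the description above:

```python
# Hedged sketch: how much of the affordability gap a boom has consumed.
# Inputs are hypothetical; the real measure may differ in detail.
def gap_consumed(price_at_boom_start, current_price, city_median):
    gap = city_median - price_at_boom_start   # headroom when the boom began
    if gap <= 0:
        return 1.0                            # no headroom: treat as fully consumed
    return min((current_price - price_at_boom_start) / gap, 1.0)

consumed = gap_consumed(price_at_boom_start=600_000,
                        current_price=660_000,
                        city_median=1_000_000)

print(f"{consumed:.0%} of the gap consumed")  # 15% — under the 30% early cutoff
```

A suburb at 15% consumed still has most of its headroom left; one at 70% has already absorbed the move the signal was pointing at.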
Combine these into a scoring system and run a walk-forward backtest across 12,360 postcode-months. The tier discrimination looks like this:
| Tier | Excess return | Beat market | n |
|---|---|---|---|
| Strong | +7.5pp | 71% | 2,103 |
| Good | +1.3pp | 55% | 3,349 |
| Fair | −0.7pp | 47% | 5,788 |
| Weak | −6.4pp | 28% | 1,120 |
Walk-forward backtest, 12,360 postcode-months, 2012–2026. No lookahead. Excess return = suburb 12-month growth minus market median growth. Full methodology →
Perfectly monotonic. Strong Signal outperforms Good Signal outperforms Fair Signal outperforms Weak Signal. The spread between Strong Signal and Weak Signal is 13.9 percentage points — on the same asset class, the same country, the same market conditions. The difference is suburb selection.
The five mistakes above all share the same root: using metrics that sound right but weren’t tested against outcomes. Infrastructure announcements feel like leading indicators. Pooled correlations feel like validation. Chasing past performers feels like following the evidence. None of them held up.
The two signals that survived aren’t secrets. Affordability headroom is free to check: look up your capital city’s median on Domain and compare it to the suburb you’re considering. Boom detection is free to approximate: check annual growth on YIP, days-on-market, and vacancy via SQM Research. The hard part is doing this for hundreds of suburbs every two weeks and filtering out the noise from thin markets. That’s what BoomAU automates.
The full backtest methodology, the 78-suburb validation dataset, and the walk-forward tier results are published on our proof page. No gating, no email required. Check the maths yourself.
The Mistake Scorecard
| Mistake | Verdict | Evidence |
|---|---|---|
| Treating infrastructure as a buy signal | FAILED | Infrastructure spend, population growth, building approvals all failed as suburb-level predictors |
| Trying to predict instead of detect | FAILED | Prediction formula hit 55% (coin flip). Detection achieved 85.7% accuracy |
| Trusting pooled statistics | FAILED | Pooled IC 0.42 collapsed to −0.08 within-date. Model was ranking time periods, not suburbs |
| Chasing past performers | FAILED | Mean reversion dominates. Past outperformers tend to underperform going forward |
| Ignoring affordability headroom | COSTLY | Every boom in the 78-suburb backtest was led by suburbs below city median. Weak Signal tier: −6.4pp excess return |
| Affordability headroom (doing it right) | WORKS | Only cross-suburb ranking signal that survived tide cancellation. Monotonic and robust |
| Boom detection timing (doing it right) | WORKS | 85.7% accuracy, 0% false positives, 20.2-point separation gap on 78 suburbs |
Join the Wishlist
We’ll email you when BoomAU launches — starting with the budget range you care about.
Be first in line
- ✓ Fortnightly Strong / Good / Fair / Weak signal labels per suburb
- ✓ Filtered to your budget band
- ✓ Built on a backtest of 12,360 postcode-months