Research

Data-Driven Property Investing — What It Actually Means

The “data-driven” label appears in almost every property research tool, buyer’s agent pitch, and hotspot list published in Australia. Infrastructure analysis. Population projections. Growth forecasts with confidence intervals. These things sound rigorous. Some of them are.

But using data is not the same as using data that predicts outcomes. The real test is harder: does this metric — when it is high or low — actually predict whether a suburb will outperform the market over the next 12 months? Most metrics fail that test. We know because we ran it.

We tested every signal investors typically rely on across 78 Australian suburbs and 12,360 postcode-months of walk-forward data. Here is what survived — and what got dropped.

The Problem

Numbers That Sound Right But Aren’t

There is a standard set of metrics that shows up in almost every property research deck. They make intuitive sense. A council approves a new hospital — property nearby should rise. Population is growing — demand should follow. Building approvals are low — a supply squeeze must be coming.

The logic is reasonable. The data is available from public sources. And yet when you test these metrics against actual Australian suburb outcomes, they produce an uncomfortable result.

Metrics tested — prediction formula

Backtest accuracy across 78 suburbs: 55%
COIN FLIP

Fifty-five percent accuracy. If you flipped a coin before every suburb selection, you’d expect 50%. These five metrics — the ones most investors consider essential due diligence — added five percentage points over a coin toss.

The reasons are consistent with how markets actually work. Infrastructure announcements affect whole corridors over decades. There is no reliable way to time which specific suburb captures the uplift, or when. Population growth data is released with a significant lag and operates at a geographic scale that is far too broad for suburb-level decisions. Building approvals tell you something about future supply — but almost nothing about which suburb is seeing demand outstrip supply right now.

Vendor discount data barely exists at suburb level from free sources. And days on market — a genuine signal inside the right kind of formula — fails completely when used as a prediction input. Knowing that a suburb had a 30-day median six months ago says nothing useful about whether it will boom in the next twelve.

Takeaway

Infrastructure spending, population growth, and building approvals are useful context for understanding a region. They are not useful for predicting which specific suburb will outperform the broader market over the next 12 months. The backtest is unambiguous on this.

We score 393 suburbs using only what survived backtesting.

Fortnightly Strong Buy / Buy / Watch / Pass labels, filtered to your budget band. Join the wishlist.

The Trap

When Sophisticated Analysis Gets the Right Answer for the Wrong Reason

Once you drop the basic prediction metrics, the next move seems obvious: build a more sophisticated model. A multi-factor forecasting approach that produces a suburb-level growth estimate with a confidence range. The kind of output that looks like genuine quantitative rigour.

We built one. Six inputs: annual growth rate, affordability headroom, boom signal, five-year momentum, national market conditions, and a growth acceleration measure. We tested it against real outcomes using only data that would have been available at each point in time — no looking ahead.

The headline number was striking. Across the full dataset, the model’s rankings correlated with actual outcomes at 0.42 — a level that would be considered a genuinely strong result in most quantitative investment contexts.

Ranking correlation across all periods (pooled): 0.42

On the surface, the model appeared to rank suburbs in line with their actual returns. A result this strong would normally be worth building a product on.

Then we asked a different question. Not “does the model rank good years higher than bad years?” but “within any single month, does the model correctly identify which suburb will beat the others in that same period?”

That is the test that actually matters for an investor. If you are choosing between two suburbs in August 2021, the model needs to tell you which one outperforms the other — not that 2021 will be a strong year overall.

Ranking correlation within any given month: −0.08
80% confidence interval coverage (should be 80%): 46%
ABANDONED

Negative 0.08. Worse than random. Within any given month, the model was actively misleading about which suburb would outperform the others in that same period.

The strong pooled result was a statistical illusion. Boom years had both higher model scores and higher actual returns across the board — so the model appeared to work when all periods were combined. What it was actually doing was detecting whether the overall market was hot. Remove that market tide, and the model had almost no signal at all.
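The pooled-versus-within-period distinction is easy to reproduce. The sketch below uses synthetic data, not the BoomAU dataset: scores and returns share only a market-wide "tide" per period, with no genuine suburb-level signal, yet the pooled rank correlation comes out strong while the within-period correlation hovers near zero.

```python
# Illustrative sketch (synthetic data): a pooled correlation can look strong
# while within-period ranking skill is zero, because both scores and returns
# ride the same market-wide tide.
import random

random.seed(0)

def rank_corr(xs, ys):
    """Spearman rank correlation via Pearson on ranks (assumes no ties)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for pos, i in enumerate(order):
            r[i] = float(pos)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

periods, suburbs = 60, 20
pooled_scores, pooled_returns, within = [], [], []
for _ in range(periods):
    tide = random.gauss(0, 5)  # market-wide boom/bust component
    # Suburb-level noise in scores and returns is independent: no real signal.
    scores = [tide + random.gauss(0, 1) for _ in range(suburbs)]
    returns = [tide + random.gauss(0, 1) for _ in range(suburbs)]
    pooled_scores += scores
    pooled_returns += returns
    within.append(rank_corr(scores, returns))

print("pooled rank correlation:", round(rank_corr(pooled_scores, pooled_returns), 2))
print("mean within-period rank correlation:", round(sum(within) / len(within), 2))
```

The pooled figure lands high and the within-period average lands near zero, even though the "model" here knows nothing about individual suburbs.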

The confidence intervals were equally broken. When a model claims an 80% confidence range, real outcomes should land inside that range 80% of the time. Ours hit 46%. At the rows where the model expressed the highest confidence, coverage dropped to 20%. The model was most certain precisely when it was most wrong.
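The coverage check itself is a one-line calculation. The intervals and outcomes below are hypothetical numbers for illustration only:

```python
# Minimal sketch of the coverage check: an "80% interval" should contain the
# realised outcome in roughly 80% of rows. All values here are made up.
intervals = [(-2.0, 6.0), (1.0, 9.0), (-5.0, 3.0), (0.0, 8.0), (-1.0, 7.0)]
outcomes = [4.5, 12.0, -6.1, 3.0, 0.2]  # hypothetical realised 12-month growth, %

hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, outcomes))
coverage = hits / len(outcomes)
print(f"claimed 80%, observed coverage: {coverage:.0%}")  # prints 60%
```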

Growth acceleration, five-year momentum, and prior growth rate — the metrics most investors focus on when comparing suburbs — did not predict which suburb would outperform within any given period. Mean reversion dominates: past outperformers tend to underperform going forward. The market tide lifts all boats. Once you remove the tide, those signals add nothing.

We dropped the forecaster entirely.

Takeaway

A model that looks accurate overall may simply be detecting whether the market is hot — not which suburb to buy. The test that matters: within any single period, can it correctly rank suburbs against each other? Growth phase, momentum, and acceleration failed this test. If a data-driven tool doesn’t disclose within-period accuracy, ask why.

Signal One

Affordability Headroom

After removing the tide effect — measuring each suburb’s growth against the median across all suburbs in the same period — one signal consistently predicted relative outperformance.

How a suburb’s median house price compares to the capital city median. Suburbs priced well below the city median consistently outperform after removing the tide. Suburbs priced above 1.5 times the city median consistently underperform. The effect is monotonic — the cheaper a suburb is relative to the city, the stronger its tendency to outperform.

The market mechanics behind this are straightforward. When affordability pressure from established suburbs pushes buyers outward, they move toward the next-cheapest alternatives. Demand spills down the price ladder, not up. A suburb priced well below the city median has a large pool of buyers who might choose it as a more accessible entry point. A suburb priced at 200% of the city median has already absorbed that spill — it has run, and the buyers who drove the move have moved on.

The finding

Every single boom in the 78-suburb backtest was led by a suburb priced well below its capital city median. Not most booms — every boom. Affordability headroom is the strongest precondition for outperformance that backtesting identified.

This is also why BoomAU applies a hard $800K median price cap before scoring any suburb. The data is clear that booms happen below the city median, not above it. Scoring expensive suburbs would add noise, not signal.

You can check affordability headroom for free right now. Domain publishes capital city median house prices quarterly. Find the city median for whichever capital you’re researching, then compare it against any suburb you’re evaluating. If the suburb is priced well below, headroom exists. If it’s already above the city median — especially above 1.5 times — the backtest says the odds are against outperformance.
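As a sketch, the check reduces to a single ratio. The function name and the medians below are ours, for illustration; they are not current Domain figures.

```python
# Hedged sketch of the free headroom check: suburb median vs city median.
# Thresholds (below median = headroom, above 1.5x = underperform) come from
# the backtest findings described in the text.
def headroom_label(suburb_median: float, city_median: float) -> str:
    ratio = suburb_median / city_median
    if ratio < 1.0:
        return f"headroom exists (ratio {ratio:.2f})"
    if ratio <= 1.5:
        return f"above city median, reduced odds (ratio {ratio:.2f})"
    return f"above 1.5x city median, backtest says underperform (ratio {ratio:.2f})"

city = 1_100_000  # hypothetical capital city median
print(headroom_label(640_000, city))
print(headroom_label(1_750_000, city))
```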

Affordability headroom, ranked and filtered to your budget.

BoomAU scores 393 suburbs fortnightly using both signals that survived backtesting. Join the wishlist.

Signal Two

Boom Detection — Not Prediction

The second signal is about timing. Not predicting when a suburb will boom — that approach delivered 55% accuracy. But detecting whether it is already booming, and whether you’re early or late to the move.

This distinction matters more than it might seem at first. Australian suburb booms are multi-year events. Entering a boom 6–12 months after it has started — once it is clearly detectable in the data — still captures 60–85% of total gains. The cost of waiting for confirmation is far lower than the cost of acting on a false signal.

Before any suburb can be scored, it must pass four hard filters. These are not scoring inputs. They are gates — a suburb that fails any one of them does not qualify and is not scored at all:

Hard filters — must pass all four

  • Annual growth rate at or above 5%
  • Days on market at or below 45
  • Vacancy rate at or below 2%
  • Median house price at or below the $800K cap

Suburbs that pass all four are then scored across five components: momentum (30%), growth strength (25%), tightness — days on market and vacancy combined (20%), sustainability — rental yield and vacancy trend (15%), and affordability headroom (10%). Scores above 80 are Boom. Above 65 is Early Boom. Above 50 is Warming. Below 50 is No Boom.
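The scoring arithmetic described above can be sketched directly. The weights and tier thresholds come from the text; the component scores fed in are hypothetical inputs:

```python
# Sketch of the weighted scoring and tier labels described in the text.
# Component scores (0-100) below are invented example inputs.
WEIGHTS = {
    "momentum": 0.30,
    "growth_strength": 0.25,
    "tightness": 0.20,       # days on market and vacancy combined
    "sustainability": 0.15,  # rental yield and vacancy trend
    "affordability": 0.10,
}

def boom_label(components: dict) -> tuple[float, str]:
    score = sum(WEIGHTS[k] * components[k] for k in WEIGHTS)
    if score > 80:
        label = "Boom"
    elif score > 65:
        label = "Early Boom"
    elif score > 50:
        label = "Warming"
    else:
        label = "No Boom"
    return score, label

example = {"momentum": 85, "growth_strength": 75, "tightness": 70,
           "sustainability": 60, "affordability": 55}
print(boom_label(example))  # weighted score 72.75 -> "Early Boom"
```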

Backtest accuracy: 85.7%
False positives: 0%
Separation gap between real booms and false signals: 20.2 points
Suburbs validated: 78 (28 boomed, 50 controls)
VALIDATED

Zero false positives matters as much as the accuracy percentage. The formula didn’t just get most booms right — it never fired on a suburb that wasn’t actually booming. The 20.2-point separation gap between real booms and false signals means the formula is not hovering near borderline scores. It gets the right answer with conviction.

One important caveat on tightness data: days-on-market figures from thin markets can mislead. In suburbs with fewer than roughly 30 annual sales, a single fast transaction can pull the median DOM to 10 days; one slow listing can push it to 150. Below about 15 annual sales, the DOM figure is not usable as a signal. Boom detection only works reliably in suburbs with enough transaction volume to produce stable medians — and volume is easy to check on YIP before drawing any conclusions from tightness numbers.
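The thin-market effect is easy to see with the standard library. The days-on-market figures below are synthetic:

```python
# Illustration of the thin-market caveat: with few annual sales, one extreme
# listing moves the median DOM substantially; with healthy volume it barely moves.
import statistics

# 35 sales with DOM spread 40-74 days: one 150-day outlier barely registers.
thick = [40 + i for i in range(35)]
print(statistics.median(thick), statistics.median(thick + [150]))  # 57 vs 57.5

# Four sales: a single fast sale or slow listing swings the median by 10 days.
thin = [25, 40, 60, 95]
print(statistics.median(thin))          # baseline: 50.0
print(statistics.median(thin + [8]))    # one fast sale added: 40
print(statistics.median(thin + [150]))  # one slow listing added: 60
```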

The Numbers

What the Tier Data Shows

The walk-forward backtest across 12,360 postcode-months produced four tiers. The spread between them is the clearest evidence that the two-signal system works.

Tier        Excess return   Beat market   n
Strong Buy  +7.5pp          71%           2,103
Buy         +1.3pp          55%           3,349
Watch       −0.7pp          47%           5,788
Pass        −6.4pp          28%           1,120

Walk-forward backtest, 12,360 postcode-months, 2012–2026. No lookahead. Excess return = suburb 12-month growth minus market median growth. Full methodology →
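The excess-return definition in that note amounts to a single subtraction per suburb. The growth figures here are invented for illustration:

```python
# Sketch of the excess-return definition: suburb 12-month growth minus the
# market median growth for the same period. All figures are hypothetical.
growth = {"Suburb A": 11.2, "Suburb B": 4.0, "Suburb C": 6.5}  # %, made up

values = sorted(growth.values())
market_median = values[len(values) // 2]  # 6.5 in this toy example

excess = {s: round(g - market_median, 1) for s, g in growth.items()}
print(excess)  # {'Suburb A': 4.7, 'Suburb B': -2.5, 'Suburb C': 0.0}
```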

Perfectly monotonic. Strong Buy outperforms Buy, which outperforms Watch, which outperforms Pass. The spread between the top and bottom tier is 13.9 percentage points of excess return — same asset class, same broader market, just different suburbs. The difference between good and poor suburb selection is not marginal.

The Pass tier warrants attention. These suburbs underperformed by 6.4 percentage points and beat the market only 28% of the time. That is not a neutral outcome. That is a consistent drag on returns. Avoiding the wrong suburbs matters as much as finding the right ones — which is why the labels include Pass, not just a ranking of positive opportunities.

One important context for the absolute numbers: boom era matters. Pre-2015, the median boom gain across the backtest dataset was 1.3%. Post-2020, it was 16.2%. The tiers always discriminated correctly against each other — the relative ordering held in both eras — but the absolute size of the gains shifted dramatically with market conditions. Data-driven investing identifies where to be. The market cycle determines how much you make when you’re positioned correctly.

Fortnightly suburb scores, filtered to your budget band.

Strong Buy through Pass labels, built on two backtested signals. Join the BoomAU wishlist.

What You Can Check for Free

The two signals that survived are not locked behind a product. You can check them yourself for any suburb you’re evaluating.

1. Affordability headroom

Find the capital city median house price on Domain (published quarterly, free). Compare it against the suburb’s median. If the suburb is priced well below the city median, headroom exists. If it is already above the city median — especially above 1.5 times — the backtest says the odds of outperformance are against you.

2. Boom detection signals

Annual growth rate from YIP (yourinvestmentpropertymag.com.au, CoreLogic-backed, free). Postcode vacancy rate from SQM Research (16 years of free monthly history). Days on market from YIP or Domain. If annual growth is at or above 5%, DOM is 45 days or below, and vacancy is 2% or below, you’re looking at the hard-filter profile of a potential boom. All four criteria must be met before any further analysis is worth doing.
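As a sketch, the three thresholds you can pull from those free sources reduce to one boolean check. The function name and example figures are ours, not YIP or SQM output:

```python
# Hedged sketch of the free-data hard-filter check: growth >= 5%, days on
# market <= 45, vacancy <= 2%. Example inputs are placeholders.
def passes_hard_filters(annual_growth_pct, dom_days, vacancy_pct):
    return (annual_growth_pct >= 5.0
            and dom_days <= 45
            and vacancy_pct <= 2.0)

print(passes_hard_filters(7.2, 32, 1.4))  # boom-profile suburb -> True
print(passes_hard_filters(7.2, 60, 1.4))  # slow days on market -> False
```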

3. Check the transaction volume first

Before trusting any tightness signal, look at annual sales volume for the suburb on YIP. Fewer than roughly 30 sales per year means the DOM median can be moved by a single transaction. In suburbs below 15 annual sales, the DOM figure is not usable at all. Thin markets require a higher evidentiary bar before acting on any signal.

The hard part is doing this consistently across the 8,417 suburbs in Australia on a fortnightly basis — catching booms within weeks of when the data crosses the threshold, filtering thin-market noise, and ranking by affordability headroom within a specific budget band.

That is what BoomAU automates. We currently score 393 suburbs that pass the growth and affordability filters, updated fortnightly across three budget bands: 35 suburbs under $400K, 149 under $600K, and 204 under $800K. The fortnightly labels — Strong Buy, Buy, Watch, Pass — are built entirely on the two signals that survived rigorous testing. Nothing else made the cut.

Full backtest methodology, the 78-suburb validation, and the walk-forward tier discrimination results are published on our proof page. No gating, no email required. The maths is open — check it yourself before deciding whether the two signals that survived are worth acting on.

Join the Wishlist

We'll email you when BoomAU launches — starting with the budget range you care about.

Be first in line

  • Fortnightly Strong Buy / Buy / Watch / Pass signal labels per suburb
  • Filtered to your budget band
  • Built on a backtest of 12,360 postcode-months