the Creative Commons Attribution 4.0 License.
The impact of the Canterbury earthquakes on household income and expenditure in the Canterbury region in New Zealand
Abstract. Using New Zealand's Integrated Data Infrastructure (IDI), we evaluate the impact of the 2010–2011 Canterbury earthquakes on household economic behaviour, focusing on changes in income and expenditure. Using nationally representative data from the Household Economic Survey (HES) linked to measures of earthquake intensity, we implement a difference-in-differences design comparing pre- and post-earthquake outcomes for earthquake-affected households with a matched comparison group. We find that, relative to matched comparison households, total household income in high-intensity areas increases by about NZD 7,600 in the post-earthquake period. Total expenditure shows no clear average DiD effect, but expenditure composition shifts markedly: receipts and refunds, which capture insurance reimbursements and related inflows, more than double, and diary-recorded day-to-day spending rises by about 14 %. Spending also increases in transportation, travel, fees and subscriptions, and social insurance contributions. Additional analysis shows that households that relocated out of Canterbury faced substantially higher housing costs (around NZD 25,000) and lower mortgage and loan repayments (around NZD 9,500) than households that remained, while average incomes are similar across the two groups. These findings provide evidence on household economic adjustment to disasters and offer policy-relevant insights into post-disaster financial support, social security design, and the management of population movements.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2026-1005', Anonymous Referee #1, 13 Mar 2026
AC1: 'Reply on RC1', Quanfu Zhang, 11 Apr 2026
We thank Referee #1 for the careful reading of our paper and for the constructive comments. We appreciate the referee’s positive assessment of the paper as competent and useful, and we are grateful for the two specific suggestions. Below we respond to each point in turn.
Comment 1: “Please add a specification with the MMI, rather than its discretization into three classes (zero, low, high).”
Our reason for using grouped intensity categories rather than a continuous MMI specification is that, in our setting, the household economic response to earthquake intensity is unlikely to be well captured by a simple linear dose-response relationship. We are interested in whether households exposed to meaningfully different levels of earthquake intensity exhibit different post-earthquake economic adjustments, and this is naturally consistent with a threshold-based or regime-based interpretation rather than a strictly linear one.
In addition, the categorisation used in the paper is not intended to be ad hoc. The thresholds are informed by the relevant literature and by the substantive interpretation of different levels of earthquake exposure. For this reason, we view the zero/low/high specification as a more interpretable way to capture meaningful exposure heterogeneity in this context.
We will clarify this motivation more explicitly in the revised manuscript, and, if needed, assess the robustness of the results to alternative reasonable threshold definitions.
Comment 2: “Please add a balancing test between waves. You match treatment and control. Are there changes between waves that could endanger identification?”
In our design, propensity score matching is conducted separately within each survey wave, so treated households are compared with contemporaneously matched controls rather than with a single pooled control group. This already helps reduce concerns about between-wave imbalance.
To further assess whether compositional changes across waves could threaten identification, we conducted additional placebo-style balance tests in the matched sample using key observed household characteristics as pseudo-outcomes in a DID framework. Specifically, we estimated the interaction term Treat×Post for the reference person’s age, household size, sex, tenure status, and education, including survey-wave fixed effects.
The estimated interaction coefficients are uniformly small and statistically insignificant. This suggests that the matched treated and control samples do not exhibit systematic differential compositional shifts across waves in these observable characteristics. We will report these results in the revised manuscript to make the identification argument clearer.
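The placebo-style check described above can be sketched in a simplified form. This is a minimal illustration, not the authors' regression: it assumes a two-period setting with no wave fixed effects, in which case the Treat×Post coefficient from a saturated regression equals the difference-in-differences of the four cell means. The function name `did_2x2`, the field names, and the toy data are all hypothetical.

```python
from statistics import mean


def did_2x2(rows, outcome):
    """Difference-in-differences of cell means for `outcome`.

    Each row is a dict with binary keys 'treat' and 'post' plus the
    pseudo-outcome (a pre-determined household characteristic).
    """
    def cell_mean(treat, post):
        values = [r[outcome] for r in rows
                  if r["treat"] == treat and r["post"] == post]
        return mean(values)

    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    return treated_change - control_change


# Hypothetical toy data: different households in each period
# (repeated cross-section), matched treated and control groups.
households = [
    {"treat": 1, "post": 0, "age": 44, "hh_size": 3},
    {"treat": 1, "post": 0, "age": 50, "hh_size": 2},
    {"treat": 1, "post": 1, "age": 46, "hh_size": 3},
    {"treat": 1, "post": 1, "age": 52, "hh_size": 3},
    {"treat": 0, "post": 0, "age": 40, "hh_size": 4},
    {"treat": 0, "post": 0, "age": 48, "hh_size": 2},
    {"treat": 0, "post": 1, "age": 43, "hh_size": 4},
    {"treat": 0, "post": 1, "age": 49, "hh_size": 2},
]

# One placebo estimate per characteristic; values near zero suggest no
# differential compositional shift between treated and control groups.
placebo = {name: did_2x2(households, name) for name in ("age", "hh_size")}
```

In practice each placebo estimate would come with a standard error from the regression, and the multi-wave version with wave fixed effects would not reduce to a simple difference of four means.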
Once again, we thank the referee for these helpful comments. We believe they improve the clarity of the paper, especially in explaining our treatment-intensity specification and in strengthening the discussion of balance and identification across survey waves.
Citation: https://doi.org/10.5194/egusphere-2026-1005-AC1
RC2: 'Comment on egusphere-2026-1005', Anonymous Referee #2, 01 Apr 2026
The authors are working on an important topic with a reasonable approach and a data set that appears to have some very important advantages. Having read many papers in this area, very few can differentiate between earned income, which can be impacted by a disaster, and insurance payouts. Many studies have reported surprisingly little financial hardship around disaster events, and those authors could only speculate that public or private insurance payments smoothed consumption. The disaggregated expenditure data is also uncommon. Considering how extensive the literature on disaster impacts on households is at this point, more of the literature should probably be cited, including papers whose limitations you are overcoming.
I have a few suggestions for improving the analysis or clarifying what was already done.
The authors have linked records, which enables them to do analysis on people who stay in place vs. those who migrate. However, I didn’t understand how persistent the sample is. The observation counts in Table A.1 vary by over 66% between high and low years for group 1 and 100% in group 4. Do the people in the lowest year persist all the way through while others come and go? Are there people with all combinations of years observable? For example, could someone be observed only in 2006, 2009 and 2016 and another person observed only in 2010, 2011, and 2012? With the modest sample sizes, it seems like people entering and exiting the sample could have substantial impacts on the estimates and precision. If this is more like a repeated cross section than a balanced individual panel, you would need to approach it with different methods.
I was puzzled by the use of covariates in the specification. As the authors discuss in lines 719-729, no one would expect age to have a linear relationship with income or expenditures. Why put it in the model as linear when you could include it in a more flexible and realistic form? Similarly with the household structure. The relationship between household size, income and expenditures will be very different moving from one to two working adults versus adding a third or fourth child. Why not have indicators for a handful of common household types (single working person, dual income no kids, dual income with kids, single income with kids, retirees, etc.)?
The event study graphs in Appendix C do not improve my confidence in the results. It would not be fully transparent to place them in an appendix that most readers wouldn’t see. My concern is that all the pre-treatment coefficients are substantially above zero. The wide confidence intervals include zero, but four of the seven post-event coefficients are at the same levels, as if the earthquake had no impact. The three periods of higher income appear 4 to 7 years after the event. Impacts persisting seven years later is plausible, but if these are driven by insurance payouts and rebuilding, shouldn’t we see something in post-years 1, 2 and 3?
Did the authors try inverse probability weighting to see if it is any better than PSM at reducing the pre-treatment coefficients or improving precision (I apologize if I missed it)? Is there any way of expanding the pool of potential controls, which would increase the quality of the matches and give us more confidence in the results?
Finally, I’ll request a few minor clarifications for readers who have never had the privilege of visiting New Zealand. What are the important differences between the North and South Islands, and why should I be comfortable with the control group being drawn entirely from a different island? Consider replacing “Contribution Schemes” with “Retirement Savings.” I didn’t know what specifically was being contributed to until you mentioned it in the text. I expect all readers would understand “retirement.”
Citation: https://doi.org/10.5194/egusphere-2026-1005-RC2
AC2: 'Reply on RC2', Quanfu Zhang, 01 Jun 2026
We thank Referee #2 for the careful and thoughtful comments. We are especially encouraged by the referee’s recognition of two features of our data that are uncommon in this literature: first, our ability to distinguish earned income from inflows related to insurance and other receipts; and second, the availability of disaggregated expenditure data. We agree that these features should be highlighted more clearly in relation to the existing household-disaster literature, and we will strengthen the literature review accordingly, particularly by citing studies whose limitations our data help overcome.
Below we respond to each of the referee’s main comments.
1. Sample persistence and the structure of the linked sample
We agree that the persistence structure of the sample needs to be described much more clearly. The HES data used in the current paper are repeated cross-sectional survey data rather than a balanced individual panel, and we will revise the manuscript to make this explicit. We will also add a descriptive table on sample persistence across waves, including how many observations appear in only one wave, in multiple waves, and across both pre- and post-earthquake periods. This should make the structure of the linked sample much more transparent.
At the same time, we would like to clarify that a DID framework remains appropriate in this setting. Difference-in-differences does not require a balanced individual panel and is commonly used with repeated cross-sectional data as well as with unbalanced panel data. This is standard in the econometrics literature and in classic empirical applications (Wooldridge, 2010; Card and Krueger, 1994). Pischke’s DID notes also state explicitly that repeated cross-sections are sufficient for DID. We will revise the text to make the empirical design and data structure align more clearly.
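As a brief illustration of why repeated cross-sections suffice, the canonical two-group, two-period DID estimator can be written purely in terms of cell means. The notation here is an assumption introduced for this sketch and is not taken from the manuscript: $\bar{Y}_{g,t}$ denotes the sample mean outcome for group $g \in \{T, C\}$ (treated, control) in period $t \in \{\mathrm{pre}, \mathrm{post}\}$.

```latex
\hat{\delta}_{\mathrm{DiD}}
  = \left(\bar{Y}_{T,\mathrm{post}} - \bar{Y}_{T,\mathrm{pre}}\right)
  - \left(\bar{Y}_{C,\mathrm{post}} - \bar{Y}_{C,\mathrm{pre}}\right)
```

Each of the four means is computed over whichever households are sampled in that group-period cell, so no household needs to be observed more than once. What the estimator does require is parallel trends and a stable composition of each group across periods, which is the concern the balance checks are meant to address.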
We will also connect this clarification more directly to our identification strategy. Matching is conducted separately within each survey wave, so treated households are compared with contemporaneously matched controls rather than with a single pooled control group. In addition, we have conducted placebo-style balance checks using key observed household characteristics as pseudo-outcomes, and these checks do not indicate systematic differential compositional shifts across waves between matched treated and control observations.
2. Functional form of covariates
We agree that age is unlikely to have a simple linear relationship with income or expenditures, and that a linear specification is restrictive. In the revision, we will therefore try a more flexible treatment of age, for example using age bins such as under 25, 25–45, 45–65, and 65+, subject to whether the sample sizes in these cells are sufficient. At a minimum, we will include both age and age squared.
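The two options mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the bins are treated as left-closed (so an age of exactly 45 falls in the 45–65 bin), the 25–45 bin is used as the omitted reference category, and the names `age_bin`, `age_bin_dummies`, and `age_quadratic` are hypothetical rather than taken from the authors' code.

```python
AGE_BINS = ["under_25", "25_45", "45_65", "65_plus"]


def age_bin(age):
    """Map an age in years to one of four bins (left-closed intervals)."""
    if age < 25:
        return "under_25"
    if age < 45:
        return "25_45"
    if age < 65:
        return "45_65"
    return "65_plus"


def age_bin_dummies(age, reference="25_45"):
    """Indicator variables for each bin except the omitted reference bin."""
    current = age_bin(age)
    return {f"age_{label}": int(label == current)
            for label in AGE_BINS if label != reference}


def age_quadratic(age):
    """Fallback specification: age and age squared."""
    return {"age": age, "age_sq": age ** 2}
```

The bin indicators let each age group have its own intercept shift, while the quadratic imposes a smooth hump shape with only two parameters, which may be preferable when some bins contain few households.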
We also agree in principle that household structure is not well captured by a single linear household-size term. The referee’s suggested household types are very sensible. To clarify the context, the current paper is the first paper in the broader project and is based on HES survey data, which provide rich household income and expenditure information but have relatively limited sample size at the level needed for finely stratified household-type estimation. A related companion paper in progress uses a much larger linked administrative data framework to build a richer household classification system and study household structure more directly. In that sense, we are already working on the broader issue the referee raises, but the current paper faces objective sample-size constraints that limit how finely we can classify households without creating very small cells and unstable estimates. We will make this limitation more explicit in the paper. At the same time, where feasible, we will move toward a more flexible but still parsimonious treatment of household composition.
3. Event-study graphs and confidence in the dynamic results
We appreciate this concern. In the original draft, the event-study graphs were placed in the appendix because they were intended as supporting technical material rather than part of the paper’s main narrative. That said, we agree that a clearer discussion of their implications would be helpful.
In the revision, we will revisit the presentation and interpretation of the event-study results. In particular, we will report and discuss the pre-treatment coefficients more explicitly, rather than relying only on the fact that the confidence intervals include zero. We also agree that the post-earthquake time profile deserves a more cautious interpretation. If the stronger effects emerge only several years after the event, then the mechanism should not be described too mechanically as an immediate insurance-payout effect. Instead, the discussion should allow for delayed adjustment through rebuilding, labour-market changes, insurance settlement timing, and related processes.
We will also explore whether the dynamic patterns can be improved by strengthening the comparison group design. In particular, we will examine whether expanding the pool of potential controls improves match quality and event-study performance. In addition, we will consider inverse probability weighting (IPW) as a robustness check to assess whether weighting-based approaches improve pre-treatment balance or precision relative to PSM.
4. Alternative balancing approaches and the control pool
Thank you for raising the possibility of inverse probability weighting. Our baseline approach uses wave-specific propensity score matching because it provides a transparent way to construct comparable controls within each survey wave. That said, we agree that IPW is a useful alternative and may be informative as a robustness check, especially if it helps improve balance or precision. We will explore this possibility in revision.
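For readers unfamiliar with the approach, a minimal sketch of inverse probability weighting for the average treatment effect on the treated (ATT) is given here. It assumes propensity scores have already been estimated elsewhere and uses the standard ATT weights: 1 for treated units and p / (1 - p) for controls. The function names and the toy inputs are hypothetical, and this is not the authors' implementation.

```python
def att_weights(treat, pscore):
    """ATT-style weights: 1 for treated units, p / (1 - p) for controls."""
    weights = []
    for d, p in zip(treat, pscore):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if d == 1 else p / (1.0 - p))
    return weights


def weighted_mean(values, weights):
    """Weighted average of `values` using `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def ipw_att(outcome, treat, pscore):
    """Difference between the treated mean and the reweighted control mean."""
    w = att_weights(treat, pscore)
    treated = [(y, wi) for y, d, wi in zip(outcome, treat, w) if d == 1]
    controls = [(y, wi) for y, d, wi in zip(outcome, treat, w) if d == 0]
    mean_treated = weighted_mean(*zip(*treated))
    mean_control = weighted_mean(*zip(*controls))
    return mean_treated - mean_control


# Hypothetical toy inputs: two treated and two control households.
example = ipw_att(outcome=[10, 12, 8, 4],
                  treat=[1, 1, 0, 0],
                  pscore=[0.5, 0.5, 0.5, 0.2])
```

Unlike matching, weighting keeps every control observation, which is one reason it may improve precision when the control pool is small. In a DID setting the same weights would be applied within the DID regression rather than to a single cross-sectional contrast.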
We also take seriously the referee’s question about whether the control pool can be expanded. We will revisit this issue and assess whether a broader set of candidate controls can be used without compromising comparability.
5. Why the control group is drawn from the North Island
This is a helpful point, especially for international readers unfamiliar with New Zealand. We will revise the paper to explain more clearly why the control group is drawn from the North Island and why we believe this is a reasonable comparison group after matching. In particular, we will better explain the rationale for avoiding potentially affected South Island comparison areas and discuss the key economic and geographic differences between the North and South Islands so that readers can more easily assess the choice of controls.
6. Minor clarification on terminology
Thank you for this suggestion. We agree that “Retirement Savings” is clearer than “Contribution Schemes,” and we will revise this terminology accordingly.
Once again, we thank the referee for these constructive comments. We believe they will help us improve the transparency of the sample description, the presentation of the identification strategy, and the interpretation of the dynamic results.
References
Card, David, and Alan B. Krueger. 1994. “Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania.” American Economic Review 84 (4): 772–793.
Pischke, Jörn-Steffen. “Difference-in-Differences.” Lecture notes, London School of Economics. (These notes explicitly state that repeated cross-sections are sufficient for DID and that a panel is not required.)
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, MA: MIT Press.
Citation: https://doi.org/10.5194/egusphere-2026-1005-AC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 661 | 391 | 74 | 1,126 | 60 | 90 |
This is a competent and useful but modest paper.
I would add two things. First, please add a specification with the MMI, rather than its discretization into three classes (zero, low, high).
Second, please add a balancing test between waves. You match treatment and control. Are there changes between waves that could endanger identification?