Boosting Ensembles for Statistics of Tails at Conditionally Optimal Advance Split Times
Abstract. Climate science needs more efficient ways to study high-impact, low-probability extreme events, which are rare by definition and costly to simulate in large numbers. Rare event sampling (RES) and ensemble boosting use small perturbations to turn moderate events into severe ones, which might otherwise not occur for many more simulation-years, and thus enhance sample size. But the viability of this approach hinges on two open questions: (1) Are boosted events representative of the yet-unrealized events? (2) How does this depend on the specific form of the perturbation, i.e., its timing and structure? Timing in particular is crucial for sudden, transient events like precipitation. In this work, we formulate a concrete optimization problem for the advance split time (AST) hyperparameter, and study it on an idealized but physically informative model system: passive tracer fluctuations in a turbulent channel flow, which captures key elements of midlatitude storm track dynamics. Three major questions guide our investigation: (1) Can RES methods, in particular "ensemble boosting" equipped with a probability estimator and "trying-early adaptive multilevel splitting", accurately and efficiently sample extreme events? (2) What is the optimal AST, and how does it depend on the event definition, in particular the target location and surrounding flow conditions? (3) Can the AST be optimized "online" while running RES?
Our answers support RES as a viable method: (1) RES can meaningfully improve tail estimation, (2) with an optimal AST of 1-3 eddy turnover timescales, depending on location. (3) A "thresholded entropy" statistic is a good proxy for AST optimality, bypassing the tedious threshold-setting that often hinders RES methods. Our work clarifies aspects of the response function of transient extreme events to perturbations, providing a guide for designing efficient, reliable sampling strategies.