the Creative Commons Attribution 4.0 License.
Human Decision-Making in Crowds in a Virtual Flood Scenario
Abstract. Flood evacuation outcomes are critically shaped by human behaviour, yet empirical data on individual decision-making remain scarce due to the dangers and logistical challenges of collecting data during real disasters. To address this gap, this study used Virtual Reality (VR) to examine how social cues, specifically crowd behaviour, interact with factors such as crowd size, clarity of the safe destination, and floodwater level to influence evacuation choices and delays. Four within-subjects VR experiments were conducted with 84 participants, systematically testing these variables in an immersive flood scenario. Results showed that crowd behaviour strongly determined both route choice and evacuation latency, often outweighing other factors. Participants tended to follow crowds into floodwater, demonstrating herding behaviour. However, this influence weakened when water levels were very high, indicating a threshold where physical danger overrides social cues. Larger crowds and unclear destination information further increased reliance on social information and pre-movement times. These findings highlight the powerful role of social dynamics in emergency decision-making and underscore the need to integrate realistic human behaviour, particularly social influence, into flood risk models, public warnings, and evacuation planning to improve community resilience and safety.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-5312', John Drury, 23 Nov 2025
AC1: 'Reply on RC1', Booloot Arshaghi, 27 Jan 2026
We would like to thank the reviewer for providing constructive comments to improve the manuscript. We provided a detailed answer to each comment in the following response.
1- Abstract. Consider changing ‘a threshold where physical danger overrides social cues’ to ‘a threshold where more obvious physical danger overrides social cues’.
We thank the reviewer for this comment. We agree that specifying “more obvious” physical danger more accurately reflects the intended meaning, and we have revised the sentence in the abstract accordingly. The updated wording is:
“...a threshold where more obvious physical danger overrides social cues.”
2- Is it necessary to use the term ‘herding’, which is more suitable for animals and their instincts and has been criticised in the literature (see Haghani et al., 2019)? ‘Social influence’ (also used) is a more neutral alternative term.
Thank you for your insightful comment. We agree that the term herding may suggest instinctive, animal-like behaviour, and we note that it has been criticised in the literature. To avoid this unintended connotation, we replaced herding in the manuscript with alternatives including social influence and social cue (where appropriate), which more accurately describe the deliberative, cognitively mediated processes suggested by our findings.
3- Observing the behaviour of the majority of people is a good guide for how one should behave (Gigerenzer, 2008), particularly when those people are judged to be self-relevant in some way (Spears, 2021).
We agree that these considerations are relevant to the scenarios of our VR studies. Participants observe the crowd's behaviour and incorporate it into their individual decision process. We also agree that this influence is stronger when the observed people are self-relevant. Therefore, we have included these points and the citations in the discussion of the revised version of the manuscript.
4- The term ‘natural disasters’ is criticized in the disaster literature, and the term ‘hazards’ is suggested instead (with disaster being the social effects of a hazard). See for example UNDRR https://www.undrr.org/our-impact/campaigns/no-natural-disasters.
We agree that the phrase ‘natural disasters’ is not conceptually accurate and, for this reason, has been widely criticised in the literature. As highlighted in the UNDRR No Natural Disasters campaign, disasters are not “natural” events, but the outcome of a natural hazard interacting with social vulnerability, exposure, and insufficient protective measures.
In line with contemporary literature and the UNDRR position, we replaced the term natural disasters with natural hazards throughout the revised version of the manuscript to ensure that our terminology reflects the established conceptual distinction.
5- How was the interview data analysed?
Thank you for mentioning this. To address this comment and provide additional clarification on data collection and analysis, a more detailed description of the behavioural measures and data analysis has been added to the end of the Procedure section, as follows:
“Participants completed a series of standard and custom questionnaires designed by authors at key stages of the experiment, including a demographic survey, the Simulator Sickness Questionnaire (Kennedy, Lane et al. 1993), the Igroup Presence Questionnaire (IPQ) (Schubert 2003), and a brief Likert-scale decision-making questionnaire designed by authors (appendix A), assessing the influence of environmental and social factors on route choice. Qualitative data were also collected through post-condition interviews (appendix B).
In addition to self-reported data, behavioural measures including path choice and pre-movement time were extracted for analysis after each experimental condition. Pre-movement time, also referred to as response time or pre-evacuation time, describes the interval between a stimulus and the initiation of action (Adrian, Amos et al. 2025). In this study, pre-movement time was defined as the interval between the onset of the simulated flood warning within the VR environment and the participant’s initiation of evacuation movement. Specifically, it was measured as the duration spent observing and assessing the environment before initiating movement toward the designated safe destination and was extracted from VR screen recordings.
Quantitative data from questionnaires, behavioural measures, and pre-movement times were analysed using descriptive statistics to summarise central tendency and variability (mean/SD), followed by inferential statistical analyses to test differences across experimental conditions. Qualitative interview data were analysed using reflexive thematic analysis (Braun and Clarke, 2006), following an inductive approach. Interviews were transcribed verbatim and repeatedly reviewed to achieve familiarisation with the data. Meaningful segments related to participants’ perceptions, decision-making processes, and experiences during the VR flood evacuation scenarios were systematically coded. Codes were then reviewed and grouped into candidate themes, which were iteratively refined through comparison across participants and experimental conditions.”
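As an illustration of the behavioural measure described in the quoted passage, the following minimal sketch computes pre-movement time as the interval between warning onset and the start of movement, then summarises it per condition as mean/SD. It is not the authors' analysis pipeline: the function names and all timestamps are hypothetical, and only the condition labels 1A and 1B are taken from the response.

```python
from statistics import mean, stdev


def pre_movement_time(warning_onset_s, movement_start_s):
    """Interval (seconds) between warning onset and start of evacuation movement."""
    if movement_start_s < warning_onset_s:
        raise ValueError("movement cannot start before the warning onset")
    return movement_start_s - warning_onset_s


def summarise(trials):
    """Group trials by condition and return {condition: (mean, SD)}."""
    by_condition = {}
    for condition, onset, start in trials:
        by_condition.setdefault(condition, []).append(
            pre_movement_time(onset, start)
        )
    return {c: (mean(t), stdev(t)) for c, t in by_condition.items()}


# Hypothetical timestamps (seconds into a screen recording); NOT study data.
trials = [
    ("1A", 5.0, 9.0), ("1A", 5.0, 11.0), ("1A", 5.0, 10.0),
    ("1B", 5.0, 14.0), ("1B", 5.0, 17.0), ("1B", 5.0, 16.0),
]
summary = summarise(trials)
```

The inferential step (testing differences across conditions) is deliberately left out, since the response does not name the specific tests used.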
Reference added:
BRAUN, V. & CLARKE, V. 2006. Using thematic analysis in psychology. Qualitative research in psychology, 3, 77-101.
6- Study 1. A large effect size is expected – but why? The fact that the authors suggest post hoc that the sample size was too small indicates that this assumption was unwarranted in this case.
Thank you for this comment. We agree that the assumption of a large effect size requires clarification. VR1 was designed as a feasibility study, with the primary aim of testing experiment task flow, VR usability etc. rather than conducting confirmatory hypothesis testing.
In this context, the assumed large effect size (f = 0.40) was used pragmatically to determine a minimum sample size, consistent with common research practice in pilot studies (Kunselman 2024). Our later observation that the sample size was insufficient to detect smaller effects does not contradict this assumption but rather demonstrates the intended limitation of VR1 and informed the increased sample sizes used in VR2-VR4. We revised the manuscript (Participants, under the Method section) to clarify this rationale. The revised paragraph is as follows:
“VR1 was conducted as a feasibility study to assess procedural feasibility, VR usability, and the suitability of the experimental measures, and to guide the design of subsequent experiments. Accordingly, the study was not intended for confirmatory hypothesis testing or precise effect size estimation. A power analysis was conducted using G*Power 3.1.9.7 (Faul et al., 2007) to estimate a minimum sample size, assuming a large effect (f = 0.40) for a one-way repeated-measures ANOVA with three conditions (α = 0.05, power = 0.80), resulting in a required sample of N = 12.”
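As an illustrative cross-check of the power analysis quoted above, the sketch below estimates power for a one-way repeated-measures ANOVA from the noncentral F distribution using only the Python standard library. It is not the authors' G*Power computation. It assumes a correlation among repeated measures of ρ = 0.5, no nonsphericity correction (ε = 1), and a noncentrality parameter λ = f²·N·m/(1 − ρ); none of these settings is stated in the response, and all function names are hypothetical.

```python
import math


def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_cdf(f, d1, d2, nc=0.0, terms=400):
    """CDF of the (noncentral) F distribution as a Poisson mixture of betas."""
    if f <= 0.0:
        return 0.0
    x = d1 * f / (d1 * f + d2)
    half = nc / 2.0
    weight = math.exp(-half)  # Poisson(j = 0) weight
    total = 0.0
    for j in range(terms):
        total += weight * betainc(d1 / 2.0 + j, d2 / 2.0, x)
        weight *= half / (j + 1)
        if weight < 1e-16 and j >= half:
            break
    return total


def f_critical(alpha, d1, d2):
    """Upper-alpha critical value of the central F distribution (bisection)."""
    lo, hi = 0.0, 1e4
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_cdf(mid, d1, d2) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def rm_anova_power(n, f=0.40, m=3, alpha=0.05, rho=0.5):
    """Power for n participants and m repeated measures (rho is ASSUMED)."""
    d1, d2 = m - 1, (n - 1) * (m - 1)
    nc = f * f * n * m / (1.0 - rho)  # assumed noncentrality convention
    return 1.0 - f_cdf(f_critical(alpha, d1, d2), d1, d2, nc)


def minimum_n(target=0.80, **kwargs):
    """Smallest n whose estimated power reaches the target."""
    n = 2
    while rm_anova_power(n, **kwargs) < target:
        n += 1
    return n
```

Calling `minimum_n()` gives a value that can be compared with the N = 12 reported in the revised paragraph. The result is sensitive to the assumed correlation: passing a smaller `rho` lowers the noncentrality parameter and raises the required sample size.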
Reference:
Kunselman, A. R. (2024). "A brief overview of pilot studies and their sample size justification." Fertility and Sterility 121(6): 899-901.
7- Can the authors provide a link to footage/ moving visualisation?
A link to the video footage of the VR experiment will be made available and stored permanently on the University of Nottingham repository, following FAIR data sharing guidelines, or on the journal repository.
8- Was the questionnaire developed by the authors themselves or did they use established items? We should at least see some example items (and preferably there should be a link to the whole questionnaire, so the wording of items can be seen).
The questionnaire was developed by the authors, and it is added to the Appendix of the revised version of the manuscript. It is also clarified in the revised manuscript, in the Procedure section, as below:
“Participants completed a series of standard and custom questionnaires designed by authors at key stages of the experiment, including a demographic survey, the Simulator Sickness Questionnaire (Kennedy, Lane et al. 1993), the Igroup Presence Questionnaire (IPQ) (Schubert 2003), and a brief Likert-scale decision-making questionnaire designed by authors (appendix A), assessing the influence of environmental and social factors on route choice. Qualitative data were also collected through post-condition interviews (appendix B).”
The Appendices have been modified as follows:
“Appendix A- After Experiment Questionnaires
The following table presents the questionnaire that participants completed after the experiments. Using a Likert scale, it provided insight into how strongly each decision factor influenced their choice of route to the safe destination.
Table A. Post-Experiment Questionnaire. Please rate to what extent the following factors influenced your decision-making in choosing your route to the safe destination: 1 = not at all to 5 = very high.
Decision Factors | Rating | VR Experiment
Presence of the crowd | 1 2 3 4 5 | 1 – 2 – 3 – 4
Crowd choice of path | 1 2 3 4 5 | 1 – 2 – 3 – 4
Size of the crowd | 1 2 3 4 5 | 2
Flood water overall condition | 1 2 3 4 5 | 1 – 2 – 3 – 4
Flood water level | 1 2 3 4 5 | 1 – 2 – 3 – 4
Trust on the crowd | 1 2 3 4 5 | 2 – 3 – 4
Appendix B- Interview Questions
The following table (Table B) shows the questions that participants responded to after the VR experiments.
Table B. Interview Questions
N | Question | VR Experiment
1 | Did you notice the people walking toward the safe destination? Please explain. Do you believe that you were “consciously” following others/avoiding following them, to reach the destination/rescue team? | 1 – 2 – 3 – 4
2 | How did you assess which route you needed to take to reach the safe destination? | 2 – 3 – 4
3 | Did you notice the size of the crowd present in the scene? How was it? Did the size of the crowd affect your decision to the destination? Please explain how. | 2 – 3 – 4
4 | Did you notice the level of water (water height) before you chose your path to the destination? Did you have any concern to walk through the flood water? | 1 – 2 – 3 – 4
5 | Were you aware of the risk of passing through the water first when you decided on your path to the safe destination? If yes, to what extent do you think it affected your action in this experiment? | 1 – 2 – 3 – 4
6 | Did you notice the distance from the destination when you were deciding which path you want to go through? | 1 – 2 – 3 – 4
7 | What other factors influenced your decision on choosing the path to the destination? | 1 – 2 – 3 – 4
8 | Do you think you could trust the crowd and the route they were taking to reach the safe destination? | 2 – 3 – 4”
9- Table 2 post hoc column seems to indicate that conditions were compared across experiments for VR3 and VR4, which is incorrect.
Thank you for this good point. This was a labelling mistake, which has been corrected in the revised version of the manuscript.
10- Questionnaire tables should include notes reminding us what the A, B, C, D conditions are.
Thanks for your suggestion. The descriptions of the labels are now provided in the captions of all Decision Factors results tables, as shown for Table C below. Please note that, following a suggestion made by Reviewer 2, these tables have been moved to the appendix to improve the readability of the paper.
“Table C: VR1 Decision Factors Questionnaire Results (1A = Crowd Behaviour: Safe; 1B = Crowd Behaviour: Risky; 1C = No Crowd).”
11- Page 18. It is unclear what is meant by ‘chaotic crowd behaviour’ in the analysis of study 4, as there is no indication earlier that ‘chaotic behaviour’ would be varied in the
This term was reported by participants when describing crowd behaviour and was therefore intended to be included as a quotation. However, due to the revision of the Discussion section, this part, including the term, has been removed.
12- The discussion makes a strong claim that social cues are more important than environmental cues, even for deep floodwater. However, the analysis of VR4 could make it much more clear whether there was a significant main effect of flood water level (rather than just the interaction/ tests across the four conditions).
We agree that the original wording of the Discussion could be interpreted as overstating the dominance of social cues, without sufficiently distinguishing the main effect of floodwater level in the VR4 study. We restructured the Discussion into three main subsections (as suggested by Reviewer 2) for greater clarity and readability. To address this comment, we also revised the relevant parts of this section to clarify that floodwater level exerted a meaningful main effect on route choice in VR4, with high water levels significantly discouraging risky route selection. The revised text below (in bold) emphasises that social cues strongly influence behaviour under conditions of uncertainty or moderate risk, but that their influence is constrained when environmental danger becomes visibly severe. These revisions better reflect the VR4 findings and clarify the interaction between social and environmental factors. The final changes to the Discussion are shown below in bold:
“4 Discussion
4.1 Key Findings
This study examined how social cues shape evacuation decision-making during flood events using a series of four immersive VR experiments. Overall, the results consistently highlighted the strong influence of social information, particularly crowd behaviour, on both route choice and decision latency, extending prior research on social influence in emergencies (Helbing, Farkas et al. 2002, Petrucci 2022, Wang, Zhuang et al. 2024).
Across the first three experiments (VR1–VR3), decision-making was shaped more dominantly by social dynamics than by physical environmental characteristics. Although the physical hazard indicators, such as floodwater depth (moderate, around waist level), were rated as important, they did not consistently produce independent behavioural effects. Instead, crowd-related cues, particularly behaviour, trust, and perceived intent, emerged as the primary determinants of route selection and pre-movement time.
Findings from the VR4 experiment introduce an important nuance by highlighting the role of environmental severity. Specifically, floodwater level exerted a meaningful effect on route choice, with high water levels (around shoulder/chest level) significantly discouraging risky route selection. Importantly, this effect interacted with social cues: while risky crowd behaviour in earlier scenarios typically reduced safe route selection, this influence weakened when floodwater levels were low (around ankle level) and became substantially constrained when floodwater was visibly deep. This suggests a boundary condition in which objective environmental risk can override or diminish the impact of social cues on evacuation behaviour.
Participants frequently relied on the behaviour of virtual crowds as heuristic indicators of safety, consistent with following others and social influence theories (Helbing, Farkas et al. 2002, Wang, Zhuang et al. 2024). Safe crowd behaviour increased selection of safer routes, while risky crowd behaviour encouraged following flooded paths, particularly under uncertainty, such as unclear destinations (VR3) or seemingly manageable water levels (VR4). These patterns align with previous work on following others in emergencies and highlight the role of perceived group consensus in shaping individual risk perception (Wang, Zhuang et al. 2024).
Crowd size also played a moderating role. While participants in VR1 frequently reported that crowd size affected perceived risk, VR2 showed that large crowds amplified social influence primarily when exhibiting safe behaviour. Large risky crowds increased uncertainty, reduced safe route selection, and prolonged pre-movement time, while also eliciting mixed emotional responses. These findings echo evidence that large groups can be perceived as either protective or threatening depending on context (Haghani, Sarvi et al. 2019, Kinateder and Warren 2021). Notably, large safe-behaving crowds produced the most consistent shift toward safe decisions, whereas crowd size alone was less effective when behaviour was risky. This pattern suggests that participants did not follow the majority automatically but instead evaluated the observed behaviour of others as informative cues for action, particularly when the crowd was perceived as relevant or credible.
Destination visibility further moderated reliance on social cues. In VR3, visible safe destinations increased confidence and reduced dependence on crowd behaviour, whereas ambiguous spatial conditions intensified social influence. This aligns with spatial cognition research showing that environmental legibility reduces reliance on external cues (Gärling, Böök et al. 1986) and mirrors findings from fire evacuation studies (Fu, Liu et al. 2024).
Environmental factors were consistently rated as influential but were experienced subjectively. Participants often inferred flood severity from others’ behaviour rather than direct appraisal, reinforcing evidence that hazard perception is socially modulated (Becker, Taylor et al. 2015, Bernardini, Camilli et al. 2017). Even in VR4, where high water levels discouraged risky choices more effectively, social influence remained context dependent.
Pre-movement time reflected internal conflict and uncertainty. In all studies except VR4, participants took significantly longer to act when exposed to risky social cues. VR4’s lack of significant variation in pre-movement time may reflect a boundary effect, whereby the physical extremity of floodwater made the danger sufficiently salient that social influence played a secondary role in shaping response timing.”
13- Adrian, J., M. Amos, C. Appert-Rolland, M. Baratchi, N. Bode, M. Boltes, T. Chatagnon, M. Chraibi, A. Corbetta and A. Cuesta (2025). "Glossary for Research on Human Crowd Dynamics." This needs to be properly cited as the second edition.
The reference has been corrected in the revised version of the manuscript as shown below:
ADRIAN, J., AMOS, M., APPERT-ROLLAND, C., BARATCHI, M., BODE, N., BOLTES, M., CHATAGNON, T., CHRAIBI, M., CORBETTA, A. & CUESTA, A. 2025. Glossary for Research on Human Crowd Dynamics. Collective Dynamics, Second edition 1-32.
RC2: 'Review on "Human Decision-Making in Crowds in a Virtual Flood Scenario"', Anonymous Referee #2, 16 Dec 2025
The paper investigates human behaviours in flood scenarios using virtual reality, by using some relevant testing conditions in terms of number of users (how large the crowd is), final gathering points, and interaction with low/high floodwater levels. Results are relevant for the audience and the scientific community, as well as for stakeholders, but some minor issues need to be addressed to improve the presentation quality:
- an overall framework of the experiments is recommended, i.e. to be placed at the beginning of Section 2. This can guide the interpretation of methods as well as of the results. In addition, the framework can clarify how future works could use the same approach to test other scenarios and experiments
- the way data are collected and analysed should be clarified in the methods rather than in the results, i.e. pre-movement timings, data from qualitative interviews
- rationale of each VR test is expressed in the results, but I suppose it is part of methods. In fact, in the current version, it is not clear the difference between previous works and the VR results
- tables in the result sections are quite difficult to read. I suggest the authors to focus on the most relevant results and then to simplify the main table contents by adding supplementary/appendix materials to corroborate results (e.g. comments could be limited)
- discussions are very complex and verbose. I suggest to divide into 3 sections to: 1) provide a shortlist of key findings, 2) define the implications of the research for policy makers and 3) clarify limitations in view of future works
- some limitations, in respect to surrounding conditions simulated for each individual, should be mentioned, e.g. qualitative water levels, still/flowing waters, qualitative crowd dimension, actions of the rescue team. These issues could be better discussed to pave the way for future tests
- moreover, the virtual scenario is very specific and could be associated only with some typologies of built environment and transportation scenarios. The implication of the methods in urban scenarios should be better discussed, considering the relevance in the activation of proper behaviors where more users are present in view of public intended uses of buildings/outdoor areas
- some terms should be better defined, i.e. herding, chaotic crowd behaviour or unclear group
- other relevant views of claimed behaviours in VR frames from test are encouraged, to better clarify the user response
Citation: https://doi.org/10.5194/egusphere-2025-5312-RC2
AC2: 'Reply on RC2', Booloot Arshaghi, 27 Jan 2026
We would like to thank the reviewer for providing constructive comments to improve the manuscript. We provided a detailed answer to each comment in the following response.
1- an overall framework of the experiments is recommended, i.e. to be placed at the beginning of Section 2. This can guide the interpretation of methods as well as of the results. In addition, the framework can clarify how future works could use the same approach to test other scenarios and experiments.
We thank the reviewer for this comment. To address this comment the following has been added under Method section in the revised version of the manuscript:
“2.1 Experimental Framework
This study used a VR framework to examine how social cues, particularly crowd behaviour, influence human decision-making during flood evacuation. Four within-subject VR experiments (VR1–VR4) were conducted using a consistent evacuation scenario in which participants chose between a safe route and a risky route to reach a designated safe destination.
Crowd behaviour was the primary variable across all experiments and was systematically combined with additional factors in a stepwise manner: crowd size (VR2), clarity of the safe destination (VR3), and floodwater level (VR4). VR1 served as a feasibility and baseline study which guided the rest of the research. This sequential design enabled isolation of individual effects, increasing experimental control and allowing clear interpretation of how social and environmental cues together affect evacuation choices and decision latency. Figure 1 demonstrates the overall sequential experimental design, illustrating the manipulated variables and contextual factors across the four VR experiments. Crowd behaviour was manipulated across all experiments, with crowd size (VR2), destination clarity (VR3), and water level (VR4) varied sequentially.”
For Figure 1 please see the attached file.
2- the way data are collected and analysed should be clarified in the methods rather than in the results, i.e. pre-movement timings, data from qualitative interviews.
To address this comment and provide additional clarification on data collection and analysis, a more detailed description of the behavioural measures and data analysis has been added to the end of the Procedure section (note that the definition of pre-movement time has been moved from the beginning of VR1 in the Results section) as follows:
“Participants completed a series of standard and custom questionnaires designed by authors at key stages of the experiment, including a demographic survey, the Simulator Sickness Questionnaire (Kennedy, Lane et al. 1993), the Igroup Presence Questionnaire (IPQ) (Schubert 2003), and a brief Likert-scale decision-making questionnaire designed by authors (appendix A), assessing the influence of environmental and social factors on route choice. Qualitative data were also collected through post-condition interviews (appendix B).
In addition to self-reported data, behavioural measures including path choice and pre-movement time were extracted for analysis after each experimental condition. Pre-movement time, also referred to as response time or pre-evacuation time, describes the interval between a stimulus and the initiation of action (Adrian, Amos et al. 2025). In this study, pre-movement time was defined as the interval between the onset of the simulated flood warning within the VR environment and the participant’s initiation of evacuation movement. Specifically, it was measured as the duration spent observing and assessing the environment before initiating movement toward the designated safe destination and was extracted from VR screen recordings.
Quantitative data from questionnaires, behavioural measures, and pre-movement times were analysed using descriptive statistics to summarise central tendency and variability (mean/SD), followed by inferential statistical analyses to test differences across experimental conditions. Qualitative interview data were analysed using reflexive thematic analysis (Braun and Clarke, 2006), following an inductive approach. Interviews were transcribed verbatim and repeatedly reviewed to achieve familiarisation with the data. Meaningful segments related to participants’ perceptions, decision-making processes, and experiences during the VR flood evacuation scenarios were systematically coded. Codes were then reviewed and grouped into candidate themes, which were iteratively refined through comparison across participants and experimental conditions.”
3- rationale of each VR test is expressed in the results, but I suppose it is part of methods. In fact, in the current version, it is not clear the difference between previous works and the VR results.
Thank you for this insightful comment. In agreement with your suggestion, a subsection, titled “Rationales” has been added under Methods section, and all rationales of VR1 to VR4 have been moved to this new subsection.
4- tables in the result sections are quite difficult to read. I suggest the authors to focus on the most relevant results and then to simplify the main table contents by adding supplementary/appendix materials to corroborate results (e.g. comments could be limited)
Thanks for this comment. In agreement with your suggestion, Tables 3 to 6 have been moved to the appendices in the revised version of the manuscript, while the main results from those tables are already reported in the manuscript.
5- discussions are very complex and verbose. I suggest to divide into 3 sections to: 1) provide a shortlist of key findings, 2) define the implications of the research for policy makers and 3) clarify limitations in view of future works.
Thank you for this insightful suggestion. We agree that the readability of the original Discussion section could be improved. Following your recommendation, the Discussion has been restructured into three subsections: (1) Key findings, (2) Implications for Risk Management, and (3) Limitations and future work. Please note that the “Future Work” section from the conclusion has been moved to the “Limitations and future work” section. This reorganization improves clarity and readability and better highlights the contributions and practical relevance of the study. Please see below for the revised Discussion:
“4 Discussion
4.1 Key Findings
This study examined how social cues shape evacuation decision-making during flood events using a series of four immersive VR experiments. Overall, the results consistently highlighted the strong influence of social information, particularly crowd behaviour, on both route choice and decision latency, extending prior research on social influence in emergencies (Helbing, Farkas et al. 2002, Petrucci 2022, Wang, Zhuang et al. 2024).
Across the first three experiments (VR1–VR3), decision-making was shaped more dominantly by social dynamics than by physical environmental characteristics. Although the physical hazard indicators, such as floodwater depth (moderate, around waist level), were rated as important, they did not consistently produce independent behavioural effects. Instead, crowd-related cues, particularly behaviour, trust, and perceived intent, emerged as the primary determinants of route selection and pre-movement time.
Findings from the VR4 experiment introduce an important nuance by highlighting the role of environmental severity. Specifically, floodwater level exerted a meaningful effect on route choice, with high water levels (around shoulder/chest level) significantly discouraging risky route selection. Importantly, this effect interacted with social cues: while risky crowd behaviour in earlier scenarios typically reduced safe route selection, this influence weakened when floodwater levels were low (around ankle level) and became substantially constrained when floodwater was visibly deep. This suggests a boundary condition in which objective environmental risk can override or diminish the impact of social cues on evacuation behaviour.
Participants frequently relied on the behaviour of virtual crowds as heuristic indicators of safety, consistent with following others and social influence theories (Helbing, Farkas et al. 2002, Wang, Zhuang et al. 2024). Safe crowd behaviour increased selection of safer routes, while risky crowd behaviour encouraged following flooded paths, particularly under uncertainty, such as unclear destinations (VR3) or seemingly manageable water levels (VR4). These patterns align with previous work on following others in emergencies and highlight the role of perceived group consensus in shaping individual risk perception (Wang, Zhuang et al. 2024).
Crowd size also played a moderating role. While participants in VR1 frequently reported that crowd size affected perceived risk, VR2 showed that large crowds amplified social influence primarily when exhibiting safe behaviour. Large risky crowds increased uncertainty, reduced safe route selection, and prolonged pre-movement time, while also eliciting mixed emotional responses. These findings echo evidence that large groups can be perceived as either protective or threatening depending on context (Haghani, Sarvi et al. 2019, Kinateder and Warren 2021). Notably, large safe-behaving crowds produced the most consistent shift toward safe decisions, whereas crowd size alone was less effective when behaviour was risky. This pattern suggests that participants did not follow the majority automatically but instead evaluated the observed behaviour of others as informative cues for action, particularly when the crowd was perceived as relevant or credible.
Destination visibility further moderated reliance on social cues. In VR3, visible safe destinations increased confidence and reduced dependence on crowd behaviour, whereas ambiguous spatial conditions intensified social influence. This aligns with spatial cognition research showing that environmental legibility reduces reliance on external cues (Gärling, Böök et al. 1986) and mirrors findings from fire evacuation studies (Fu, Liu et al. 2024).
Environmental factors were consistently rated as influential but were experienced subjectively. Participants often inferred flood severity from others’ behaviour rather than direct appraisal, reinforcing evidence that hazard perception is socially modulated (Becker, Taylor et al. 2015, Bernardini, Camilli et al. 2017). Even in VR4, where high water levels discouraged risky choices more effectively, social influence remained context dependent.
Pre-movement time reflected internal conflict and uncertainty. In all studies except VR4, participants took significantly longer to act when exposed to risky social cues. VR4’s lack of significant variation in pre-movement time may reflect a boundary effect, whereby the physical extremity of floodwater made the danger sufficiently salient that social influence played a secondary role in shaping response timing.
Qualitative findings further indicated that trust and familiarity shaped responses. Some participants followed others for social validation, while others resisted crowd cues due to mistrust or prior knowledge. Repeated exposure increased confidence in ignoring misleading social information, suggesting adaptive learning effects across trials.
4.2 Implications for Risk Management
These findings have important implications for flood risk management, evacuation planning, and emergency communication. Current evacuation models and warning systems often assume rational, hazard-based decision-making; however, the present results show that individuals frequently infer risk from the behaviour of others, particularly under uncertain conditions.
The strong influence of crowd behaviour suggests that unmanaged social cues may propagate unsafe decisions during evacuation. Conversely, visible guidance from trained personnel, coordinated group movement, and clearly signposted safe routes and destinations may help counteract risky social influence. These findings highlight the importance of managing not only physical hazards but also social information during flood emergencies.
The moderating role of floodwater severity further suggests that evacuation strategies should be adaptive. In low-to-moderate flood conditions, where social influence is strongest, targeted guidance and communication may be particularly critical. In high-severity scenarios, salient environmental cues may reduce reliance on social heuristics.
From a modelling perspective, the results support the integration of empirically grounded social behaviour dynamics into flood evacuation models, including agent-based and socio-hydrological frameworks (Simonovic and Ahmad, 2005; Alonso Vicario et al., 2020). Accounting for context-dependent social influence can improve behavioural realism and predictive accuracy.
4.3 Limitations and Future Work
Several limitations should be acknowledged. First, the sample size was relatively small and demographically narrow, which may limit the generalisability of the findings. Second, while immersive VR provides a safe and controlled environment for studying hazardous situations, questions remain regarding ecological validity, and caution is required when extrapolating results to real-world flood events.
Furthermore, the virtual scenario represented a specific typology of built environment and transportation setting, which may not capture the full diversity of urban and public-use contexts. Behavioural responses observed in this study may therefore differ from those occurring in more complex urban environments or indoors where higher user density, multiple destinations, and competing social cues could influence decision-making and the activation of appropriate behaviours. This scenario specificity represents an additional limitation, and the applicability of the findings to dense urban areas and public-use buildings or outdoor spaces with larger crowds should be interpreted with caution.
The within-subject design may also have introduced learning or carryover effects, despite counterbalancing. Participants appeared more confident and less reliant on social cues in later trials, suggesting that familiarity with the environment influenced decision-making. Future studies could address this through between-subject designs or varied scenario sequencing.
Additionally, the Likert-scale questionnaire used to assess decision influences requires further validation. Future work should refine and validate measurement tools and explore additional factors such as emotional states, prior flood experience, and group-level interactions.
Future research should also examine more diverse populations, particularly individuals from flood-prone communities, and integrate VR-derived behavioural data with real-world observations and computational models. Such approaches would further strengthen the ecological validity and practical relevance of this research.”
6- some limitations, in respect to surrounding conditions simulated for each individual, should be mentioned, e.g. qualitative water levels, still/flowing waters, qualitative crowd dimension, actions of the rescue team. These issues could be better discussed to pave the way for future tests.
We agree that the surrounding conditions simulated for each participant should be more clearly described and their qualitative nature explicitly acknowledged to contextualize the findings and guide future studies.
The Discussion now acknowledges these qualitative choices as limitations, suggesting future studies could include calibrated water depths, dynamic flood flows, adaptive crowd behaviours, and interactive rescue actions to enhance ecological validity.
The revised version of the Discussion is as follows:
“Several limitations should be acknowledged. First, the sample size was relatively small and demographically narrow, which may limit the generalisability of the findings. Second, while immersive VR provides a safe and controlled environment for studying hazardous situations, questions remain regarding ecological validity, and caution is required when extrapolating results to real-world flood events.
Furthermore, the virtual scenario represented a specific typology of built environment and transportation setting, which may not capture the full diversity of urban and public-use contexts. In addition, limitations related to the simulated and qualitative representation of environmental factors, including floodwater depth and crowd size, should be taken into account. Behavioural responses observed in this study may therefore differ from those occurring in more complex urban environments or indoor settings, where higher user density, multiple destinations, and competing social cues could influence decision-making and the activation of appropriate behaviours. This scenario specificity represents an additional limitation, and the applicability of the findings to dense urban areas and public-use buildings or outdoor spaces with larger crowds should therefore be interpreted with caution.
The within-subject design may also have introduced learning or carryover effects, despite counterbalancing. Participants appeared more confident and less reliant on social cues in later trials, suggesting that familiarity with the environment influenced decision-making. Future studies could address this through between-subject designs or varied scenario sequencing.
Additionally, the Likert-scale questionnaire used to assess decision influences requires further validation. Future work should refine and validate measurement tools and explore additional factors such as emotional states, prior flood experience, and group-level interactions.
Future research should also examine more diverse populations, particularly individuals from flood-prone communities, and integrate VR-derived behavioural data with real-world observations and computational models. Such approaches would further strengthen the ecological validity and practical relevance of this research.”
To support clarity across experimental conditions, the VR Configurations section has also been revised as follows:
“Rescue Team: This refers to the group of NPCs wearing yellow vests and red helmets, standing still, positioned around an ambulance, with an emergency helicopter located above the bridge, representing the safe destination.
Safe Destination: The designated end point where the rescue team is located, including an ambulance, an emergency helicopter, and several NPCs in yellow vests acting as the rescue team. The following describes the visibility conditions of the safe destination:
- Visible/Known Safe Destination: The condition in which the safe destination is clearly visible to participants from the starting point, located across the flooded road and elevated above the bridge.
- Invisible/Unknown Safe Destination: A condition in which participants start the experiment without seeing the safe destination and need to find it. The designated safe destination is at the end of the road over the bridge, on the opposite side of the experiment’s starting point.
Crowd: NPCs around participants at the start point take the route to the safe destination based on the experimental condition. The crowd behaviours are categorized as Risky and Safe. In the simulation, NPCs were allocated different speeds, with some running and others walking, to create a more realistic scenario. The crowd sizes are described below:
- Medium/relatively large crowd: A group of fifteen NPCs departing from the starting point toward the designated safe destination. The number of NPCs was determined by the researchers based on an intuitive assessment of the environmental context, as a baseline for the first simulation design. The Large and Small sizes were defined relative to the Medium/relatively large size and participants’ feedback after the VR simulations.
- Large Crowd: A group of twenty NPCs departing from the starting point toward the safe destination. After the VR1 experiment, participant feedback indicated that a crowd of fifteen NPCs was perceived as “medium” or “relatively large”. Therefore, the number of NPCs was increased to twenty when designing the Large Crowd condition for the VR2 experiment.
- Small Crowd: A group of five NPCs taking the same route from the start point (Figure 2). This number was chosen by the researchers for the VR2 experiment based on participants’ feedback on the group of fifteen NPCs in the VR1 experiment.
Path choice: The route that NPCs (crowd) and participants take in the experiment. The following describes the two path choices designed in the VR scene:
- Risky Route: The direct path to the safe destination that requires participants and NPCs to cross the floodwater.
- Safe Route: The alternate path leading to the same destination via a hilly route and over a bridge, allowing participants and NPCs to avoid floodwater.
Flooded Road: A road under the bridge, flooded with varying, relatively stagnant water levels depending on the experimental scenario. The following describes the different water levels:
- Medium Water Level: Water depth reaches approximately the NPCs’ hip to waist level and above the tyres of nearby flooded vehicles, depending on vehicle size. This qualitative medium water level was defined following the categorization proposed by Quagliarini, Romano et al. (2023), who classified flood water depths into four qualitative levels ranging from ankle height to above chest height. To represent an intermediate condition between the lowest and highest levels, water reaching the hip-to-waist region was designated as the medium water level for the first scenario.
- High Water Level: Water depth reaches approximately chest to shoulder height of NPCs and rises above the windows of nearby flooded vehicles (Figure 4).
- Low Water Level: Water depth is shallower than the previous levels, reaching ankle height of NPCs and the base of the cars’ wheels.
Pre-test Simulation: The simplified version of the VR scene (without floodwater, vehicles, crowd, or rescue elements) used for participant familiarisation.”
7- moreover, the virtual scenario is very specific and could be associated only with some typologies of built environment and transportation scenarios. The implication of the methods in urban scenarios should be better discussed, considering the relevance in the activation of proper behaviours where more users are present in view of public intended uses of buildings/outdoor areas.
Thank you for this comment. We acknowledge that the virtual scenario represents a specific built environment and transportation setting, which may not capture the diversity of urban contexts. To address this comment, the following has been integrated into Limitations and Future Work under the Method section:
“Furthermore, the virtual scenario represented a specific typology of built environment and transportation setting, which may not capture the full diversity of urban and public-use contexts. Behavioural responses observed in this study may therefore differ from those occurring in more complex urban environments or indoors where higher user density, multiple destinations, and competing social cues could influence decision-making and the activation of appropriate behaviours. This scenario specificity represents an additional limitation, and the applicability of the findings to dense urban areas and public-use buildings or outdoor spaces with larger crowds should be interpreted with caution.”
8- some terms should be better defined, i.e. herding, chaotic crowd behaviour or unclear group.
Thank you for your insightful comment. The term herding may imply instinctive, animal-like behaviour and has been criticised in the literature. To avoid this unintended connotation, we replaced herding in the manuscript with alternatives including social influence and social cue (where appropriate), which more accurately and naturally describe the deliberative, cognitively mediated processes suggested by our findings.
The term “chaotic crowd behaviour” was reported by participants in interviews when describing crowd behaviour and was therefore intended to be included as a quotation. However, due to revisions to the Discussion section, this part, including the term, has been removed from the manuscript.
“Unclear group goal” has been replaced by “unclear crowd destination” for greater clarity, as below:
“In some cases, some found comfort and direction in following others. In contrast, large effect size or unclear crowd destination led some participants to intentionally diverge, expressing distrust or a preference for quicker or drier routes.”
9- other relevant views of claimed behaviours in VR frames from test are encouraged, to better clarify the user response.
Thank you for this comment. To clarify user responses in the VR scenarios, we have revised the VR Configuration under the Methods section (described in the response to comment 6) to provide a more detailed description of the simulated crowd and rescue team. Terms such as Rescue Team and the different crowd sizes and behaviours are now more explicitly defined to improve understanding of the social cues and information that guided participant decisions.
Additionally, to further enhance transparency, video footage of the VR scenarios will be provided in the final version of the manuscript, allowing readers to directly observe the simulated crowd behaviours and rescue team roles. It will be made available and stored permanently in the University of Nottingham repository, following FAIR data sharing guidelines, or in the journal repository. Together, these revisions support a clearer interpretation of participants’ responses across experimental conditions.
Viewed
- HTML: 279
- PDF: 212
- XML: 29
- Total: 520
- BibTeX: 19
- EndNote: 26
This paper reports a series of experiments using virtual reality technology to test effects of environmental and social cues on individuals’ decisions to enter into flood water in an emergency evacuation. The topic is an important one, and the research would be of interest to both researchers and practitioners (emergency planners) who plan for floods. I am aware of the wider literature on crowd dynamics in evacuations, but there is much less on behaviour in floods, so the study is a welcome contribution. The use of virtual reality technology, while artificial, does allow for experimental control of scenarios and has been used for other types of emergency evacuations. The role of social influence is crucial in these events as previous research has shown, and these studies suggest strongly that it matters in floods, in interaction with certain environmental factors. The study was situated in an appropriate and up to date review of the relevant literature. The experiments and analysis were conducted competently, and the study is clearly written. Overall therefore I welcome this paper.
Comments, questions and suggestions:
Abstract. Consider changing ‘a threshold where physical danger overrides social cues’ to ‘a threshold where more obvious physical danger overrides social cues’.
Is it necessary to use the term ‘herding’, which is more suitable for animals and their instincts, and has been criticized in the literature (see Haghani et al., 2019)? ‘Social influence’ (also used) is a more neutral alternative term. Observing the behaviour of the majority of people is a good guide for how one should behave (Gigerenzer, 2008), particularly when those people are judged to be self-relevant in some way (Spears, 2021). The authors’ finding that pre-movement time was longer in risky conditions suggests that people are thinking about the examples they observe, rather than following others instinctively.
The term ‘natural disasters’ is criticized in the disasters literature, and the term ‘hazards’ is suggested instead (with disaster being the social effects of a hazard). See for example UNDRR https://www.undrr.org/our-impact/campaigns/no-natural-disasters
How was the interview data analysed?
Study 1. A large effect size is expected, but why? The fact that the authors suggest post hoc that the sample size was too small indicates that this assumption was unwarranted in this case.
Can the authors provide a link to footage/ moving visualization?
Was the questionnaire developed by the authors themselves or did they use established items? We should at least see some example items (and preferably there should be a link to the whole questionnaire, so the wording of items can be seen).
Table 2 post hoc column seems to indicate that conditions were compared across experiments for VR3 and VR4, which is incorrect.
Questionnaire tables should include notes reminding us what the A, B, C, D conditions are.
Page 18. It is unclear what is meant by ‘chaotic crowd behaviour’ in the analysis of study 4, as there is no indication earlier that ‘chaotic behaviour’ would be varied in the visualization.
The discussion makes a strong claim that social cues are more important than environmental cues, even for deep floodwater. However, the analysis of VR4 could make much more clear whether there was a significant main effect of flood water level (rather than just the interaction/ tests across the four conditions).
Adrian, J., M. Amos, C. Appert-Rolland, M. Baratchi, N. Bode, M. Boltes, T. Chatagnon, M. Chraibi, A. Corbetta and A. Cuesta (2025). "Glossary for Research on Human Crowd Dynamics." This needs to be properly cited as the second edition.
References
Gigerenzer, G. (2008). Why heuristics work. Perspectives on Psychological Science, 3, 20-29. doi:10.1111/j.1745-6916.2008.00058.x
Haghani, M., Cristiani, E., Bode, N. W., Boltes, M., & Corbetta, A. (2019). Panic, irrationality, and herding: three ambiguous terms in crowd dynamics research. Journal of Advanced Transportation, 2019(1), 9267643.
Spears, R. (2021). Social influence and group identity. Annual Review of Psychology, 72, 367-390.