the Creative Commons Attribution 4.0 License.
Quantifying gender gaps in seismology authorship
Abstract. According to 2018 demographic data of the American Geophysical Union Fall Meeting, seismology is among the Geoscience fields with the lowest female representation. To understand whether this reflects seismology more generally, we investigate female authorship of peer-reviewed publications, a key factor in career advancement. Building upon open-source tools for web-scraping, we create a database of bibliographic information for seismological articles published in 14 international journals from 2010 to 2020. We use the probabilities of author names being either male or female-gendered to analyse the representation of female authors in terms of author position and subsequently per journal, year, and publication productivity. The results indicate that: 1) The overall probability of the first (last) author being female is 0.28 (0.19); 2) With the calculated rate of increase from 2010 to 2020, equal probabilities of female and male authorship would be reached towards the end of the century; 3) Compared to the overall probability of male authorship (0.76), single-authored papers in our database are disproportionately published by male authors (with probability 0.83); 4) Female representation decreases among highly productive authors; 5) Rather than being random, the composition of authorship appears to be influenced by gender: Firstly, all-male author teams are more common than what would be expected if teams were composed randomly. Secondly, the probability that first or co-authors are female increases when the last author is female, but first female authors have a low probability of working with female co-authors.
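Finding 5 compares the observed share of all-male author teams with the share expected if teams were composed randomly. The following is a minimal sketch of one way such a baseline could be computed, assuming that author genders are independent and that every author slot has the same probability of being male. It is not necessarily the null model used in the paper. The function names and the team sizes are hypothetical; only the value 0.76 (the overall probability of male authorship) comes from the abstract.

```python
from math import prod


def p_all_male(p_female):
    """Probability that every author of one paper is male.

    p_female: per-author probabilities that the name is female-gendered.
    Assumes (for illustration) that author genders are independent.
    """
    return prod(1.0 - p for p in p_female)


def expected_all_male_fraction(team_sizes, p_male):
    """Expected share of all-male teams under random composition.

    team_sizes: number of authors on each paper (hypothetical input).
    p_male: probability that any single author slot is male.
    """
    return sum(p_male ** n for n in team_sizes) / len(team_sizes)


# Hypothetical team sizes; 0.76 is the overall male-authorship
# probability quoted in the abstract.
sizes = [1, 2, 3, 4]
baseline = expected_all_male_fraction(sizes, 0.76)
print(f"expected all-male share under independence: {baseline:.3f}")
```

If the observed share of all-male teams exceeds such a baseline, this would be consistent with the abstract's statement that all-male teams are more common than random composition would predict.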
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
- RC1: 'Comment on egusphere-2022-810', Anonymous Referee #1, 27 Oct 2022
The paper investigates female authorship of peer-reviewed publications in seismology. The topic is of high significance, as authorship of scientific peer-reviewed papers remains an important criterion for assessing researchers’ performance and consequent career advancement. Therefore, any bias against or underrepresentation of a demographic group may lead to a lower chance of recognition (job opportunities, career progression, funding, etc.).
I will not comment on the probabilistic approach of determining the gender of authors based on the first name, nor on the statistical method. However, the sample size is sound in terms of statistical significance. The overall reasoning and the justifications of the various decisions are very appropriate.
The results are quite relevant and of interest. The representation by journal is very useful and (potentially) an eye-opener to both female and male authors.
In the discussion, the authors compare their results to those of the European Commission. They observe a correlation between 24% of authorship by women and 30% representation of women in natural sciences. I am not sure this adds much to the discussion, as the comparison is very difficult and thus any correlation is oversimplified.
In the same vein, there are attempts to draw correlations between different fields (e.g. life sciences), which may oversimplify the unique dynamics of each scientific field. However, the lack of more data specific to geosciences in general and seismology in particular explains the comparison with other fields – even if I would prefer to see more caution in the next stage. In fact, perhaps this limitation of the interpretation / comparisons deserves to be better highlighted.
Correlations with and extrapolation from EGU data are an important asset of the paper, as they represent important, reliable data specific to geosciences and seismology. However, EGU data is rather complex. There are membership datasets and General Assembly (GA) registration datasets, and there have been changes in how data is collected (e.g. gender changed between an optional and a mandatory field a few times…). The last two years of the GA were also severely impacted by COVID-19 restrictions (online vs. onsite vs. hybrid), which adds further layers of complexity to the data. In conclusion, I would strongly encourage the authors to pursue comparisons with EGU data, but either in a dedicated chapter or even a separate paper.
The paper’s conclusions are interesting and relevant to all other fields. In particular, the last bullet, “Those evaluating research performance should remain aware that there are, as of now, gender gaps in high-productivity, solo, and high-impact authorship in seismology.”, is a point that deserves wide dissemination. The authors could be even more ambitious in their set of recommendations.
Citation: https://doi.org/10.5194/egusphere-2022-810-RC1
- AC1: 'Reply on RC1', Laura Ermert, 10 Feb 2023
- AC3: 'Editor and community comments', Laura Ermert, 10 Feb 2023
- RC2: 'Comment on egusphere-2022-810', Benjamin Fernando, 09 Nov 2022
SUMMARY:
This is an excellent paper which provides a sound evidential basis for an issue of under-representation which affects the entire geosciences community. Only a few minor changes (mostly further explanation or clarification) are needed before it is ready for publication, in my opinion.
TECHNICAL CHANGES:
- line 24 - 'role models' -> 'a lack of role models'?
- line 67 (and elsewhere) - is the apostrophe for separating numbers house style?
- Section 3 - did I miss AAGR being defined somewhere? Possibly…
- line 250 (and elsewhere): random capitalisations e.g. ‘As’
MINOR CHANGES:
- lines 20-21: 'the attrition of female graduates' - there is plenty of evidence for this at a pre-graduate (school) stage too in some of the precursor subjects to seismology (chemistry, physics, maths). I would add in a reference to an appropriate paper to highlight that this is a long-standing, societal issue.
- line 45: 'assuming a 1:1 gender ratio' - I think that it is worth commenting, even briefly, that a 50:50 assumption produces an underestimate of the scale of the problem as the population ratio isn't quite 50:50.
- line 48: ‘graduation rates in the US’ - is this for just undergraduates, or graduates too? Can you add in a reference for other countries if possible?
- line 150 (and elsewhere): can you please define exactly what you mean by ‘negative bias’ - overall representation being poor (<50%), poor relative to another benchmark, or something else?
- lines 149 and line 153: do the 93% and 95% probabilities agree? Or are they different measures? This is unclear to me.
- line 232: ‘corresponds reasonably well’ is not a particularly descriptive statement - is this directly in relation to the following lines? If so I would delete this sentence.
(Slightly more) MAJOR CHANGES:
- line 61: ‘14 international journals’ - can you give a more rigorous description of why you chose these 14? SCOPUS entries? Web of Science indices? Line 312 sounds a bit… glib.
- lines 114 - 121: the statements that ‘we assume the genders of the authors in the article are independent’ and ‘we derive conditional probabilities of the first author gender [which show correlation]’ seem to me to be mutually exclusive - is the mathematics correct in its detail here?
- Figure 2b: what is the statistical significance of the variations from bar to bar? It appears to level off quite quickly, but there’s a peak/trough at 8 authors. How many papers have 8 authors, and if not many, would a trend line plotted on top also be helpful?
- Sec 4: Is there another paper that could be written following individual authors (and their publishing trends) through time, reported in an anonymised and ethical way? That may be an interesting way of predicting future trends, by looking at whether early-career researchers’ co-authorship profiles are changing more quickly than those of their more senior colleagues.
- Sec 4 (end): I think that the discussion of the limitations of APIs is good, but needs to be more thorough. This is especially true if this is going to be a well-read paper which informs people who are not experts, which I assume it will be. Things to consider: Is there any data or suggestion for what would happen if you treat gender (as a concept rather than a probability) as non-binary? Do we see big changes, or is there just not enough study at the moment? Are there ethnicities or countries for which the chosen APIs are known to perform particularly badly (e.g. I’ve seen potential suggestion of names from Eastern Asia being particularly poorly sorted).
- Data availability - I don't know what data the authors are offering to share. Although it is publicly derived, it is worth considering whether it is identifiable and, if so, whether any GDPR constraints (or the like) apply if it is aggregated in a novel way. Not my area of expertise.
Citation: https://doi.org/10.5194/egusphere-2022-810-RC2
- AC2: 'Reply on RC2', Laura Ermert, 10 Feb 2023
- AC3: 'Editor and community comments', Laura Ermert, 10 Feb 2023
Laura Anna Ermert
Naiara Korta Martiartu