14 Dec 2022

Linking satellites to genes with machine learning to estimate major phytoplankton groups from space

Roy El Hourany, Juan Pierella Karlusich, Lucie Zinger, Hubert Loisel, Marina Levy, and Chris Bowler

Abstract. Ocean color remote sensing offers a two-decade-long time series of information on phytoplankton abundance. However, determining the structure of the phytoplankton community from this signal is not straightforward, and many uncertainties remain to be evaluated, despite multiple efforts to intercompare the available algorithms. Here, we use remote sensing and machine learning to infer the abundance of seven phytoplankton groups at a global scale, based on a new molecular method from Tara Oceans. To our knowledge, our dataset is the most comprehensive and complete one available for describing phytoplankton community structure at a global scale using a molecular marker that defines the relative abundances of all phytoplankton groups simultaneously. The methodology performs satisfactorily and provides robust estimates of phytoplankton groups from satellite data, with few limitations regarding its global generalization. Furthermore, this new satellite-based methodology allows a valuable global intercomparison with the pigment-based approach applied to in situ and satellite data to identify phytoplankton groups. Nevertheless, these datasets provide different yet coherent information on phytoplankton, which is valuable for understanding community structure. This makes remote sensing observations excellent tools for collecting Essential Biodiversity Variables and provides a foundation for developing marine biodiversity forecasts.

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Satellite observations offer valuable information on phytoplankton abundance and community structure. Here, we use satellite observations to infer the abundance of seven phytoplankton groups at a global scale, based on a new molecular method from Tara Oceans. The link between the two is established using machine learning approaches. The output of this work provides excellent tools for collecting Essential Biodiversity Variables and a foundation for monitoring the evolution of marine biodiversity.