<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" specific-use="SMUR" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">EGUsphere</journal-id>
<journal-title-group>
<journal-title>EGUsphere</journal-title>
<abbrev-journal-title abbrev-type="publisher">EGUsphere</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">EGUsphere</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub"></issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/egusphere-2026-1512</article-id>
<title-group>
<article-title>CloudMViT: Cloud Classification Using Ground-Based Remote Sensing Imagery and a Lightweight Hybrid Architecture</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Xu</surname>
<given-names>Wei</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Wu</surname>
<given-names>Ningning</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Feng</surname>
<given-names>Lin</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>School of Electronic and Information Engineering, Nanjing University of Information Science &amp; Technology, Nanjing 210044, China</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Jiangsu Key Laboratory of Meteorological Observation and Information Processing, Nanjing University of Information Science &amp; Technology, Nanjing 210044, China</addr-line>
</aff>
<pub-date pub-type="epub">
<day>11</day>
<month>05</month>
<year>2026</year>
</pub-date>
<volume>2026</volume>
<fpage>1</fpage>
<lpage>27</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Wei Xu et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1512/">This article is available from https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1512/</self-uri>
<self-uri xlink:href="https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1512/egusphere-2026-1512.pdf">The full text article is available as a PDF file from https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1512/egusphere-2026-1512.pdf</self-uri>
<abstract>
<p>Ground-based remote sensing cloud imagery can be used to analyze regional trends in cloud type, which in turn supports prediction of future water resource supply capacity. However, existing cloud classification methods for ground-based imagery often suffer from limited recognition accuracy because they extract fine-grained features poorly, and their large parameter counts hinder deployment on embedded terminals. To address these issues, this study proposes CloudMViT, a lightweight hybrid network architecture that fuses a dual-pooling channel attention module with cross-scale self-attention, enhancing both local and global feature representation of cloud images while improving computational efficiency. Specifically, the model suppresses sky background interference and strengthens cloud edge features via the dual-pooling channel attention module, which combines global average pooling (GAP) and global max pooling (GMP); captures cross-channel detail features (e.g., cirrus fibril structures and stratocumulus shadows) using depthwise separable convolution and a decoupling mechanism; and further reduces the parameter count with CloudGhost cascade compression, which eliminates linear feature redundancy. Experiments on the World Meteorological Organization (WMO)-compliant HBMCD (10 standard cloud genera) and GCD (7 sky conditions) datasets show that CloudMViT achieves classification accuracies of 98.81 % and 95.13 %, respectively, significantly outperforming lightweight models such as MobileViT and EfficientNet. Ablation experiments validate the effectiveness of the dual-pooling channel attention module (improving accuracy by 5.31 %) and the CloudGhost module (increasing inference speed by 50 %). When deployed on the RK3588 embedded platform, the INT8-quantized CloudMViT runs inference in real time, maintaining an accuracy of 94.79 % with a memory footprint of only 0.47 MB.
The proposed cloud classification method and hardware acceleration strategy provide a feasible solution for developing portable ground-based cloud observation and classification devices.</p>
</abstract>
<counts><page-count count="27"/></counts>
<funding-group>
<award-group id="gs1">
<funding-source>National Natural Science Foundation of China</funding-source>
<award-id>U2268217</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body/>
<back>
</back>
</article>