✨ TL;DR
This paper introduces CHASM, a dataset of 4,992 annotated instances from Chinese social media for detecting covert advertisements disguised as regular posts. The study reveals that current multimodal large language models fail to reliably identify these deceptive advertisements, highlighting a critical gap in content moderation capabilities.
Covert advertisements on social media platforms pose a significant ethical and legal threat: they disguise promotional content as authentic user posts to deceive consumers into making purchases. Current benchmarks and evaluation frameworks for large language models in social media moderation overlook this emerging threat, leaving platforms vulnerable to sophisticated deceptive marketing. The gap is particularly acute on platforms like Rednote, where genuine posts sharing product experiences can closely resemble covert advertisements, which makes detection difficult.
The authors created CHASM, a high-quality, manually curated dataset of 4,992 instances collected from the Chinese social media platform Rednote. The dataset was compiled under strict privacy protection and quality control protocols with careful anonymization. The authors evaluated multiple multimodal large language models (MLLMs) under zero-shot and in-context learning settings, then conducted fine-tuning experiments on open-source MLLMs to assess performance improvements and identify persistent challenges in covert advertisement detection.
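The difference between the zero-shot and in-context learning settings can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Post` fields, the instruction wording, the `ad` / `not_ad` label strings, and the plain-text rendering of images are hypothetical and are not taken from the paper or from the CHASM data format. A real multimodal setup would pass the images themselves to the model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Post:
    """Hypothetical container for one instance; field names are assumptions."""
    text: str
    comments: list[str] = field(default_factory=list)
    image_paths: list[str] = field(default_factory=list)


# Assumed instruction wording, not the prompt used in the paper.
INSTRUCTION = (
    "Decide whether the following social media post is a covert advertisement. "
    "Reply with exactly one word: 'ad' or 'not_ad'."
)


def render_post(post: Post) -> str:
    """Flatten a post into text; images are only counted in this sketch."""
    comments = "\n".join(f"- {c}" for c in post.comments) or "(none)"
    return (
        f"Post: {post.text}\n"
        f"Comments:\n{comments}\n"
        f"Images attached: {len(post.image_paths)}"
    )


def build_prompt(target: Post, examples: list[tuple[Post, str]] | None = None) -> str:
    """Zero-shot when `examples` is empty or None; in-context otherwise."""
    parts = [INSTRUCTION]
    # Each labeled demonstration is rendered the same way as the target.
    for demo, label in examples or []:
        parts.append(f"{render_post(demo)}\nAnswer: {label}")
    # The target post ends with an open answer slot for the model to fill.
    parts.append(f"{render_post(target)}\nAnswer:")
    return "\n\n".join(parts)
```

Calling `build_prompt(post)` yields the zero-shot prompt, while `build_prompt(post, [(demo_post, "ad")])` prepends one labeled demonstration. Keeping both settings in one function ensures the only difference between them is the presence of demonstrations.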
What the paper shows.
Current MLLMs performed inadequately at detecting covert advertisements in both the zero-shot and in-context learning settings. Fine-tuning open-source MLLMs on CHASM produced noticeable performance gains, indicating the dataset's utility for model improvement. However, the fine-tuned models still struggle with subtle contextual cues in comments and with the visual and textual structural differences that characterize covert advertisements.
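This summary does not state which metrics the paper reports. As a rough sketch of how detection quality on a binary task like this one is commonly scored, the function below computes precision, recall, and F1 for the positive class. The label strings `ad` and `not_ad` are hypothetical, and the choice of metric is an assumption, not the paper's protocol.

```python
def binary_prf(gold, pred, positive="ad"):
    """Return (precision, recall, f1) for the `positive` label.

    `gold` and `pred` are equal-length sequences of label strings.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1
```

Reporting recall alongside precision matters here because a moderation model that misses covert advertisements (low recall) fails differently from one that flags genuine product-experience posts (low precision).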
The study is limited to the Chinese social media platform Rednote, which may restrict generalizability to other platforms and languages. The paper does not provide specific quantitative performance metrics for the evaluated models, making it difficult to assess the size of the performance gaps. The dataset's focus on posts that share product experiences may not capture the full diversity of covert advertisement strategies. Additionally, the paper acknowledges but does not resolve the difficulty of detecting subtle cues and visual-textual distinctions, so the core problem remains only partly solved.
✨ Generated by Claude · Apr 25, 2026 · Read the PDF for authoritative content.