Multi-Granularity Semantic Inconsistency Modeling for Detecting LLM-Generated Fake Reviews in Online Marketplaces

Tianyi Luo; Wenhao Xu; Ruichen Tang

doi:10.54097/qgwkw932

Keywords:

Fake review detection, LLM-generated text, Semantic inconsistency, Multi-granularity modeling, Online marketplaces, Transformer-based detection

Abstract

The rapid advancement of large language models (LLMs) has fundamentally altered the landscape of online review manipulation, enabling the mass production of synthetic reviews that closely mimic authentic human writing. Existing fake review detection methods, predominantly designed for manually crafted deceptive content, exhibit significant performance degradation when confronted with LLM-generated text. This paper proposes a multi-granularity semantic inconsistency modeling (MGSIM) framework that captures latent contradictions embedded within LLM-generated reviews across word, sentence, and discourse levels. The framework integrates a hierarchical encoder with cross-granularity attention alignment and an inconsistency scoring module trained on contrastive review pairs collected from major e-commerce platforms. Experimental results on three benchmark datasets demonstrate that MGSIM achieves an F1 score of 91.3%, outperforming state-of-the-art baselines by an average margin of 6.8 percentage points. Ablation studies confirm that discourse-level inconsistency signals contribute the most discriminative power, particularly for reviews generated by instruction-tuned LLMs. This work offers both a practical detection tool and a theoretical characterization of the structural artifacts introduced by LLM generation, with implications for platform governance and consumer trust.
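As a rough illustration of the idea summarized above, the following minimal sketch scores a review by how much adjacent units disagree at each granularity and then combines the per-level scores. It is not the MGSIM implementation: the abstract does not specify the encoder, the attention alignment, or the loss. The function names, the use of cosine distance between adjacent embeddings, the level weights, and the margin-based pair loss are all assumptions chosen for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def level_inconsistency(units):
    """Mean (1 - cosine) over adjacent unit embeddings at one granularity.

    Assumption: disagreement between neighbouring units stands in for the
    "semantic inconsistency" signal. Returns 0.0 with fewer than two units.
    """
    if len(units) < 2:
        return 0.0
    gaps = [1.0 - cosine(a, b) for a, b in zip(units, units[1:])]
    return sum(gaps) / len(gaps)


def review_inconsistency(levels, weights):
    """Weighted sum of per-level scores.

    `levels` maps a level name (e.g. "word", "sentence", "discourse") to a
    list of embeddings; `weights` maps the same names to hypothetical weights.
    """
    return sum(weights[name] * level_inconsistency(units)
               for name, units in levels.items())


def contrastive_margin_loss(score_generated, score_authentic, margin=0.5):
    """Hinge-style pair loss (an assumed stand-in for contrastive training).

    The loss is zero once the generated review scores at least `margin`
    higher than its authentic counterpart.
    """
    return max(0.0, margin - (score_generated - score_authentic))


# Toy embeddings: word-level units agree, sentence-level units are orthogonal,
# discourse-level units point in opposite directions.
toy_levels = {
    "word": [[1.0, 0.0], [1.0, 0.0]],
    "sentence": [[1.0, 0.0], [0.0, 1.0]],
    "discourse": [[1.0, 0.0], [-1.0, 0.0]],
}
toy_weights = {"word": 0.2, "sentence": 0.3, "discourse": 0.5}
toy_score = review_inconsistency(toy_levels, toy_weights)
```

In this toy setup the discourse level is weighted most heavily, loosely echoing the ablation finding that discourse-level signals are the most discriminative. In a real system the weights would be learned rather than fixed by hand.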
