Cross-model evaluation of LLMs for generating formal specification of distributed Industrial Control Systems
Raptis, George E., Khan, Muhammad Taimoor ORCID: https://orcid.org/0000-0002-5752-6420, Koulamas, Christos and Serpanos, Dimitrios
(2025)
Cross-model evaluation of LLMs for generating formal specification of distributed Industrial Control Systems.
Procedia Computer Science, 270.
pp. 5846-5854.
ISSN 1877-0509 (Online)
(doi:10.1016/j.procs.2025.10.053)
Preview |
PDF (Open Access Conference Article)
51785 KHAN_Cross-Model_Evaluation_Of_LLMs_For_Generating_Formal_Specification_Of_Distributed_Industrial_Control_Systems_(OA)_2025.pdf - Published Version Available under License Creative Commons Attribution Non-commercial No Derivatives. Download (537kB) | Preview |
Abstract
The increasing complexity and decentralization of Industrial Control Systems (ICS) have expanded the attack surface for cyber-security threats, particularly in critical infrastructure domains. Inline monitoring using formal annotations like the Java Modeling Language (JML) offers a lightweight yet precise method to detect behavioral anomalies. However, the manual creation of such specifications is resource-intensive and requires domain expertise. This paper explores generative artificial intelligence (AI), specifically large language models (LLMs), to automate the synthesis of inline formal security monitors. We benchmark three state-of-the-art LLMs (GPT-4o, DeepSeek-V3, and Gemini 2.5 Flash) on their ability to generate JML annotations for ICS software drawn from the ASM2S water distribution system. Our evaluation across five dimensions (syntax, semantics, property coverage, alarm semantics, and effort savings) reveals distinct trade-offs. GPT-4o demonstrates strong syntactic and structural alignment with ASM2S, while DeepSeek-V3 offers richer behavioral modeling. Gemini 2.5 Flash showcases conceptual depth but introduces non-verifiable constructs. These findings demonstrate the potential of LLMs as co-pilots in secure-by-design ICS development and underscore the need for syntax-aware fine-tuning and interactive verification workflows.
| Item Type: | Article |
|---|---|
| Additional Information: | 29th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES 2025) |
| Uncontrolled Keywords: | inline security monitoring, industrial control systems, distributed ICS, water distribution systems, JML specifications, generative AI, large language models, cybersecurity, formal verification, runtime anomaly detection |
| Subjects: | Q Science > Q Science (General) Q Science > QA Mathematics Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
| Faculty / School / Research Centre / Research Group: | Faculty of Engineering & Science Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS) |
| Last Modified: | 26 Nov 2025 10:00 |
| URI: | https://gala.gre.ac.uk/id/eprint/51785 |
Actions (login required)
![]() |
View Item |
Downloads
Downloads per month over past year
Tools
Tools