ASWmark: a copyright protection approach for audio classification datasets

Fan, Xuefeng ORCID: https://orcid.org/0000-0003-1932-725X, Wang, Yili ORCID: https://orcid.org/0009-0005-7615-9071, Tian, Zhiyi ORCID: https://orcid.org/0000-0001-8905-0941, Xing, Fan ORCID: https://orcid.org/0009-0003-6350-5057, Ma, Jixin ORCID: https://orcid.org/0000-0001-7458-7412 and Zhou, Xiaoyi ORCID: https://orcid.org/0000-0003-3777-9479 (2026) ASWmark: a copyright protection approach for audio classification datasets. Expert Systems with Applications, 327:132743. ISSN 0957-4174 (Print), 1873-6793 (Online) (doi:10.1016/j.eswa.2026.132743)

PDF (Author's Accepted Manuscript)
53467 MA_ASWmark_A_Copyright_Protection_Approach_For_Audio_Classification_Datasets_(AAM)_2026.pdf - Accepted Version
Restricted to Repository staff only until 8 May 2027.
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: https://doi.org/10.1016/j.eswa.2026.132743

Abstract

The success of high-performance Deep Neural Network (DNN) models relies heavily on extensive training datasets. The valuable attributes of these datasets make them attractive targets for unauthorized exploitation, necessitating robust copyright protection mechanisms. Existing research has achieved a certain degree of protection by verifying whether third-party models are trained on specific datasets. However, several challenges remain unresolved: (1) watermark embedding often compromises model fidelity, thereby degrading performance on the original task; and (2) due to the lack of thorough robustness evaluations, verification reliability often falters under practical post-deployment transformations such as fine-tuning, compression, or input preprocessing. To address these issues, this paper proposes ASWmark, a novel watermarking framework for audio classification datasets. ASWmark integrates an adversarial example generation algorithm with a probabilistic heuristic search strategy to construct an optimized trigger set with inherent adversarial properties. We comprehensively evaluate the framework using two flexible watermark embedding strategies: the PreTrained and FromScratch methods. Experimental results demonstrate that ASWmark effectively protects dataset copyright, maintaining high fidelity, robustness, and portability.

Item Type:	Article
Uncontrolled Keywords:	dataset ownership verification, data poisoning, audio watermarking, deep neural network
Subjects:	Q Science > Q Science (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Z Bibliography. Library Science. Information Resources > ZA Information resources; Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4450 Databases
Faculty / School / Research Centre / Research Group:	Faculty of Engineering & Science; Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS)
Last Modified:	22 May 2026 08:50
URI:	https://gala.gre.ac.uk/id/eprint/53467
