ASWmark: a copyright protection approach for audio classification datasets
Fan, Xuefeng ORCID: https://orcid.org/0000-0003-1932-725X, Wang, Yili
ORCID: https://orcid.org/0009-0005-7615-9071, Tian, Zhiyi
ORCID: https://orcid.org/0000-0001-8905-0941, Xing, Fan
ORCID: https://orcid.org/0009-0003-6350-5057, Ma, Jixin
ORCID: https://orcid.org/0000-0001-7458-7412 and Zhou, Xiaoyi
ORCID: https://orcid.org/0000-0003-3777-9479
(2026)
ASWmark: a copyright protection approach for audio classification datasets.
Expert Systems with Applications, 327:132743.
ISSN 0957-4174 (Print), 1873-6793 (Online)
(doi:10.1016/j.eswa.2026.132743)
PDF (Author's Accepted Manuscript)
53467 MA_ASWmark_A_Copyright_Protection_Approach_For_Audio_Classification_Datasets_(AAM)_2026.pdf - Accepted Version (9MB). Restricted to Repository staff only until 8 May 2027. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
The success of high-performance Deep Neural Network (DNN) models relies heavily on extensive training datasets. The valuable attributes of these datasets make them attractive targets for unauthorized exploitation, necessitating robust copyright protection mechanisms. Existing research has achieved a certain degree of protection by verifying whether third-party models are trained on specific datasets. However, several challenges remain unresolved: (1) watermark embedding often compromises model fidelity, thereby degrading performance on the original task; and (2) due to the lack of thorough robustness evaluations, verification reliability often falters under practical post-deployment transformations such as fine-tuning, compression, or input preprocessing. To address these issues, this paper proposes ASWmark, a novel watermarking framework for audio classification datasets. ASWmark integrates an adversarial example generation algorithm with a probabilistic heuristic search strategy to construct an optimized trigger set with inherent adversarial properties. We comprehensively evaluate the framework using two flexible watermark embedding strategies: the PreTrained and FromScratch methods. Experimental results demonstrate that ASWmark effectively protects dataset copyright while maintaining high fidelity, robustness, and portability.
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | dataset ownership verification, data poisoning, audio watermarking, deep neural network |
| Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Z Bibliography. Library Science. Information Resources > ZA Information resources; Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4450 Databases |
| Faculty / School / Research Centre / Research Group: | Faculty of Engineering & Science; Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS) |
| Last Modified: | 22 May 2026 08:50 |
| URI: | https://gala.gre.ac.uk/id/eprint/53467 |