Adversarial Attacks on Copyright Detection Systems
Presented by
Luwen Chang, Qingyang Yu, Tao Kong, Tianrong Sun
1. Introduction
Copyright detection systems are among the most commonly deployed machine learning systems; however, the robustness of copyright detection and content control systems to adversarial attacks (inputs intentionally designed to cause the model to make a mistake) has not been widely addressed by the public. Copyright detection systems are vulnerable to attacks for three reasons.
1. Unlike physical-world attacks, where adversarial samples need to survive varying conditions such as resolution and viewing angle, digital files can be uploaded directly to the web without ever passing through a camera or microphone.
2. The detection system is open-set, meaning that uploaded files may not correspond to any existing (protected) class. A false match would therefore block users from uploading unprotected audio/video, even though most files uploaded nowadays are not protected.
3. The detection system needs to handle a vast amount of content with different labels but similar features. For example, in the ImageNet classification task, a system is easily attacked when two cats/dogs/birds look highly similar but belong to different classes.
3.3 Formulating the adversarial loss function
In the previous section, local maxima of the spectrogram are used by a CNN to generate fingerprints, but a loss quantifying how similar two fingerprints are has not yet been defined. Once such a loss is defined, standard gradient methods can be used to find a perturbation $\delta$ that, when added to a signal, tricks the copyright detection system. A bound is also imposed on the perturbation so that the perturbed signal stays close enough to the original audio signal: $$\|\delta\|_p\le\epsilon$$
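To make the attack concrete, below is a minimal sketch (not taken from the paper) of how such a bounded perturbation could be searched for with a projected-gradient attack in PyTorch, using the $\ell_\infty$ instance of the bound above. The names `fingerprint_model` and `fingerprint_similarity` are hypothetical placeholders for a differentiable fingerprint extractor and the fingerprint similarity loss described in this section.

```python
import torch

def adversarial_remix(audio, fingerprint_model, fingerprint_similarity,
                      epsilon=0.01, step_size=0.001, num_steps=100):
    """Sketch of a projected-gradient attack on a differentiable fingerprinting model.

    `fingerprint_model` and `fingerprint_similarity` are hypothetical stand-ins for
    the CNN fingerprint extractor and the fingerprint similarity loss described above.
    """
    original_fp = fingerprint_model(audio).detach()       # fingerprint of the clean signal
    delta = torch.zeros_like(audio, requires_grad=True)   # perturbation to optimize

    for _ in range(num_steps):
        perturbed_fp = fingerprint_model(audio + delta)
        # Minimize the similarity between the perturbed and original fingerprints,
        # so the detector no longer matches the copyrighted source.
        loss = fingerprint_similarity(perturbed_fp, original_fp)
        loss.backward()

        with torch.no_grad():
            delta -= step_size * delta.grad.sign()         # signed gradient descent step
            delta.clamp_(-epsilon, epsilon)                # project onto ||delta||_inf <= epsilon
        delta.grad.zero_()

    return (audio + delta).detach()                        # adversarially perturbed audio
```

Descending on the similarity loss pushes the perturbed fingerprint away from the original one, while the clamping step keeps every sample of the perturbation within $\epsilon$, so the remix stays perceptually close to the source audio.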
Conclusion
In this paper, different types of copyright detection systems are introduced. A widely used detection model from Shazam, a popular app for recognizing music, is discussed. Next, the paper describes how to generate audio fingerprints using a convolutional neural network and formulates an adversarial loss function that can be optimized with standard gradient methods. An example of remixing music is given to show how adversarial examples can be created. The adversarial attacks are then applied to industrial systems such as AudioTag and YouTube Content ID to evaluate their robustness, and a conclusion is drawn at the end.
5. Conclusion
In conclusion, many industrial copyright detection systems used by popular video and music services such as YouTube and AudioTag are significantly vulnerable to adversarial attacks established in the existing literature. By building a simple music identification system resembling that of Shazam with a neural network and attacking it with well-known gradient methods, this paper demonstrates the lack of robustness of current online detectors. The intention of the paper is to raise awareness of the vulnerability of current online systems to adversarial attacks and to emphasize the importance of strengthening copyright detection systems. Further approaches, such as adversarial training, need to be developed and examined in order to protect against the threat of adversarial copyright attacks.