Pixel Deflection - a simple transformation-based defense
Change the local pixel arrangement, then denoise with a wavelet transform
Paper: arXiv | Code: GitHub | Jupyter Notebook: Source
What
- Select a random pixel and replace it with another randomly selected pixel from its local neighborhood; this is called pixel deflection (PD). See the sketch after this list.
- Use a class-activation-style map (R-CAM) to select which pixels to deflect: the less important a pixel is for classification, the higher the chance it gets deflected.
- Apply soft shrinkage in the wavelet domain to remove the added noise (wavelet denoising, WD).
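A minimal NumPy sketch of the deflection step, under stated assumptions: `rcam` is a normalized HxW importance map (the paper derives one from a robust class activation map), and `n_deflections`, `window` are illustrative parameter names, not the original repo's API.

```python
import numpy as np

def pixel_deflection(img, rcam=None, n_deflections=200, window=10, seed=None):
    """Replace randomly chosen pixels with random local neighbors (PD).

    img  : HxWxC image array.
    rcam : optional HxW importance map in [0, 1]; pixels with LOW
           importance are deflected more often (assumed normalization).
    """
    rng = np.random.default_rng(seed)
    out = img.copy()
    h, w = img.shape[:2]
    done = 0
    while done < n_deflections:
        x, y = rng.integers(h), rng.integers(w)
        # Skip important pixels with probability equal to their map value,
        # so deflections concentrate outside regions of interest.
        if rcam is not None and rng.random() < rcam[x, y]:
            continue
        # Replace (x, y) with a pixel drawn uniformly from a local window.
        dx = rng.integers(-window, window + 1)
        dy = rng.integers(-window, window + 1)
        out[x, y] = img[np.clip(x + dx, 0, h - 1), np.clip(y + dy, 0, w - 1)]
        done += 1
    return out
```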
Why
- PD changes local statistics without affecting global ones. Adversarial examples rely on specific activations; PD disturbs those activations, but not enough to change the overall image category.
- Most attacks are agnostic to the presence of semantic objects in the image; by deflecting more pixels outside the regions of interest, we increase the likelihood of destroying the adversarial perturbation while preserving most of the content.
- Classifiers are trained on images without such (PD) noise. We can smooth its impact by denoising, for which we found that BayesShrink on the discrete wavelet transform (DWT) works best; see the sketch below.
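One way to realize the soft shrinkage with BayesShrink is scikit-image's `denoise_wavelet`; a minimal sketch, with illustrative parameter choices (the `channel_axis` keyword assumes a recent scikit-image version):

```python
import numpy as np
from skimage.restoration import denoise_wavelet

def wavelet_denoise(img):
    """Soft-threshold the image in the wavelet domain using BayesShrink."""
    # denoise_wavelet expects floats; scale uint8 input to [0, 1].
    img = img.astype(np.float64) / 255.0
    den = denoise_wavelet(img, method='BayesShrink', mode='soft',
                          channel_axis=-1)
    return np.clip(den * 255, 0, 255).astype(np.uint8)
```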