Audio-Visual Autoencoding for Privacy-Preserving Video Streaming
[IoTJ] H. Xu, Z. Cai, D. Takabi and W. Li, Audio-Visual Autoencoding for Privacy-Preserving Video Streaming [J]. IEEE Internet of Things Journal (IoTJ), 2021, 9(3): 1749-1761. (IF: 9.936)
The demand for sharing video streams has increased sharply in recent years due to the proliferation of Internet of Things (IoT) devices, and the explosive development of artificial intelligence (AI) detection techniques has made visual privacy protection more urgent and difficult than ever before. Although a number of approaches have been proposed, their inherent drawbacks limit the effectiveness of visual privacy protection in real applications. In this article, we propose a cycle vector-quantized variational autoencoder (cycle-VQ-VAE) framework to encode and decode the video with its extracted audio, which takes advantage of the multiple heterogeneous data sources in the video itself to protect individuals’ privacy. In our cycle-VQ-VAE framework, a fusion mechanism is designed to integrate the video and its extracted audio. In particular, the extracted audio works as random noise with a nonpatterned distribution, which outperforms noise that follows a patterned distribution for hiding visual information in the video. Under this framework, we design two models, the frame-to-frame (F2F) model and the video-to-video (V2V) model, to obtain privacy-preserving video streaming. In F2F, the video is processed as a sequence of frames, while in V2V, the relations between frames are utilized to process the video, greatly improving the performance of privacy protection, video compression, and video reconstruction. Moreover, the video stream is compressed in our encoding process, which can resist side-channel inference attacks during video transmission and reduce video transmission time. Through real-data experiments, we validate the superiority of our models (F2F and V2V) over existing methods in visual privacy protection, visual quality preservation, and video transmission efficiency. The code of our model implementation and more experimental results are available at https://github.com/ahahnut/cycle-VQ-VAE.
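To make the vector-quantization step behind a VQ-VAE concrete, the following is a minimal sketch in plain Python and is not the authors' implementation. It assumes a hypothetical additive fusion (`fuse`, with an illustrative `weight` parameter) in which an audio feature is mixed into a frame feature, followed by a nearest-neighbor lookup in a small hand-written codebook (`quantize`). All names, values, and the fusion rule are assumptions for illustration; in the actual framework the encoder, fusion mechanism, and codebook are learned.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuse(frame_feat: Vector, audio_feat: Vector, weight: float = 0.5) -> List[float]:
    """Hypothetical additive fusion: the audio feature is mixed into the
    frame feature as noise. The real fusion mechanism is learned."""
    return [f + weight * a for f, a in zip(frame_feat, audio_feat)]


def quantize(latent: Vector, codebook: Sequence[Vector]) -> Tuple[int, List[float]]:
    """Return the index and value of the codebook entry nearest to `latent`."""
    index = min(
        range(len(codebook)),
        key=lambda k: squared_distance(latent, codebook[k]),
    )
    return index, list(codebook[index])


# Toy codebook with three 2-D entries (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]

# Illustrative frame and audio features, standing in for encoder outputs.
frame_feat = [0.9, 0.2]
audio_feat = [0.4, 1.2]

fused = fuse(frame_feat, audio_feat)
index, code = quantize(fused, codebook)
print(index, code)
```

In this toy setting, only the integer `index` would need to be transmitted rather than the full latent vector, which illustrates why quantized encoding also compresses the stream; a receiver holding the same codebook can recover `code` from the index.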