by Ran Dubin,

Summary: Every day, hundreds of millions of Internet users view videos online, in particular on mobile phones, whose numbers are clearly going to increase. Currently, most video streaming web sites, including YouTube, use HTTP Adaptive Streaming (HAS). Dynamic Adaptive Streaming over HTTP (DASH) is the de facto standard method for HAS. In DASH, each quality representation is encoded at a variable bit rate (VBR). VBR encoding varies the amount of output data per time segment and does not attempt to control the encoder's output bit rate, so that distortion does not vary significantly. DASH often uses HTTP byte range mode. In this mode, the byte range of each segment request can differ, depending on the client's network conditions and playout buffer level. Previous research has shown that information can be extracted from multimedia streams. Saponas et al. uncovered security issues in consumer electronic gadgets that enable information retrieval, such as video title classification. So if it has already been done, what's new?
Since these studies were conducted, video traffic over the Internet has changed in several ways:
Adaptive byte range selection over HTTP - a single title downloaded at the same quality will look different on the wire each time.
VBR adaptive streaming with multiple representation layers - each title can be represented at multiple quality levels, with no explicit order.
HTTP version 2 - a new secure network protocol with multiplexed sessions, which makes classification much more difficult.
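To illustrate the first point, the following is a minimal sketch of how a client might split the same file into different byte range requests depending on its playout buffer level. The function names, chunk sizes, and the buffer threshold are hypothetical and chosen only for illustration; they are not taken from YouTube's player or from the algorithm presented in this paper. Only the `bytes=start-end` form of the HTTP Range header is standard.

```python
from typing import List, Tuple


def plan_byte_ranges(
    total_bytes: int,
    buffer_levels_s: List[float],
    small: int = 256_000,        # hypothetical chunk size when the buffer is low
    large: int = 1_024_000,      # hypothetical chunk size when the buffer is healthy
    low_buffer_s: float = 10.0,  # hypothetical threshold, in seconds of buffered video
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that cover the whole file.

    Assumed policy: request small chunks while the playout buffer is low
    and large chunks once it is healthy. The last buffer level is reused
    if more requests are needed than levels were supplied.
    """
    if not buffer_levels_s:
        raise ValueError("at least one buffer level is required")
    ranges = []
    start = 0
    i = 0
    while start < total_bytes:
        level = buffer_levels_s[min(i, len(buffer_levels_s) - 1)]
        size = small if level < low_buffer_s else large
        end = min(start + size, total_bytes) - 1
        ranges.append((start, end))
        start = end + 1
        i += 1
    return ranges


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


# Same title, same quality, two different network situations.
congested = plan_byte_ranges(2_000_000, [2.0, 5.0, 12.0, 20.0])
healthy = plan_byte_ranges(2_000_000, [30.0])

for label, plan in (("congested", congested), ("healthy", healthy)):
    print(label, [range_header(s, e) for s, e in plan])
```

Both plans fetch exactly the same bytes, yet an observer of the encrypted traffic sees a different number and size of responses in each case, which is why a fingerprint based on a fixed sequence of segment sizes no longer matches directly.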
This paper presents an algorithm for title classification of encrypted HTTP adaptive video streams. By exploiting HAS multi-bit-rate encoding, data mining, network reverse engineering, and machine learning, we demonstrate that current network encryption efforts are not enough to protect the anonymity of users' content: the titles they view can be recognized. We evaluated our algorithm on a new dataset of popular YouTube videos, collected from the Internet under real-world network conditions. Our algorithm's classification accuracy is 98%.