PrivacyProxy: Leveraging Crowdsourcing and In Situ Traffic Analysis to Detect and Mitigate Information Leakage
Gaurav Srivastava
Matt Fredrikson
Jason Hong
Computing Research Repository, August 2017

Abstract

Many smartphone apps transmit personally identifiable information (PII), often without the user's knowledge. To address this issue, we present PrivacyProxy, a system that monitors outbound network traffic and generates app-specific signatures to represent sensitive data being shared. PrivacyProxy uses a crowd-based approach to detect likely PII in an adaptive and scalable manner by anonymously combining signatures from different users of the same app. Furthermore, we do not observe users' network traffic and instead rely on hashed signatures. We present the design and implementation of PrivacyProxy and evaluate it with a lab study, a field deployment, a user survey, and a comparison against prior work. Our field study shows PrivacyProxy can automatically detect PII with an F1 score of 0.885. PrivacyProxy also achieves an F1 score of 0.759 in our controlled experiment for the 500 most popular apps. The F1 score improves to 0.866 with additional training data for the 40 apps that initially had the most false positives. We also show that the performance overhead of using PrivacyProxy is between 8.6% and 14.2%, slightly more than using a standard unmodified VPN, and most users report no perceptible impact on battery life or the network.
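
To make the crowd-based idea in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify the signature format or the detection rule, so everything here is an assumption: the function names (`hash_value`, `device_signature`, `likely_pii_keys`), the use of SHA-256, the treatment of a request as key-value pairs, and the `max_share` threshold are all hypothetical. The intuition it shows is that a value sent identically by most users of an app (such as an API version) is unlikely to be PII, while a value that differs for nearly every user (such as a device identifier) is a likely candidate.

```python
import hashlib
from collections import defaultdict


def hash_value(value):
    """Hash a raw value on the device so only the digest is shared (assumed: SHA-256)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def device_signature(app, request_params):
    """Build one user's signature: a map from (app, key) to the hashed value.

    The key-value request model is an assumption made for illustration.
    """
    return {(app, key): hash_value(val) for key, val in request_params.items()}


def likely_pii_keys(signatures, max_share=0.5):
    """Flag (app, key) pairs whose most common hashed value is shared by
    at most `max_share` of the reporting users (hypothetical rule)."""
    counts = defaultdict(lambda: defaultdict(int))
    for sig in signatures:
        for app_key, digest in sig.items():
            counts[app_key][digest] += 1

    flagged = set()
    for app_key, by_digest in counts.items():
        total = sum(by_digest.values())
        # A single report gives no basis for comparison, so skip it.
        if total > 1 and max(by_digest.values()) / total <= max_share:
            flagged.add(app_key)
    return flagged


# Three hypothetical users of the same hypothetical app.
crowd = [
    device_signature("weather", {"api_version": "2", "device_id": "a1"}),
    device_signature("weather", {"api_version": "2", "device_id": "b2"}),
    device_signature("weather", {"api_version": "2", "device_id": "c3"}),
]
flagged = likely_pii_keys(crowd)
```

In this sketch only digests are aggregated, which mirrors the abstract's statement that the system relies on hashed signatures rather than observing users' raw traffic.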

Bibtex

@article{DBLP:journals/corr/abs-1708-06384,
  author    = {Gaurav Srivastava and
               Saksham Chitkara and
               Kevin Ku and
               Swarup Kumar Sahoo and
               Matt Fredrikson and
               Jason I. Hong and
               Yuvraj Agarwal},
  title     = {PrivacyProxy: Leveraging Crowdsourcing and In Situ Traffic Analysis
               to Detect and Mitigate Information Leakage},
  journal   = {CoRR},
  volume    = {abs/1708.06384},
  year      = {2017},
  url       = {http://arxiv.org/abs/1708.06384},
  archivePrefix = {arXiv},
  eprint    = {1708.06384},
  timestamp = {Mon, 13 Aug 2018 16:49:11 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1708-06384},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Plain Text

Srivastava, G., Chitkara, S., Ku, K., Sahoo, S.K., Fredrikson, M., Hong, J.I., & Agarwal, Y. (2017). PrivacyProxy: Leveraging Crowdsourcing and In Situ Traffic Analysis to Detect and Mitigate Information Leakage. ArXiv, abs/1708.06384.