A note from the DTL Research Committee Chairs, Balachander Krishnamurthy and Nikolaos Laoutaris:
Of the 54 submissions received, 26 were discussed extensively online (with some submissions receiving a score of comments), and half of those were further vetted in the live PC meeting.
We focused on working end-user software, a collection platform, transparency, privacy protection, and novelty. Eight of the submissions discussed in the PC meeting were presented, in no particular order, to the DTL board on Wednesday, June 8. The board selected six submissions to fund.
We thank the PC for their hard work in reviewing, their extensive online discussions, and their participation in the PC meeting. We thank the board for their pointed questions and for selecting submissions that we all hope will generate transparency software. We thank all the submitters for their time. We expect the grant awardees to complete their software, make the code and data available, and present their results with a demo at DTL next year.
This project aims to develop an online service that enables users to see how much their Twitter accounts reveal about them. The project will bring together world-leading experts in geolocation of informal writing to provide users feedback on what state-of-the-art predictive models can infer about them, based on what they post on Twitter. The service will:
a) present users with predictions of demographics (gender, age, job, location) for a given Twitter profile; b) enable users to test whether their Twitter accounts can be identified from their publicly available or uploaded texts.
The analysis presented to users will also tell them exactly what their personal signatures are, distinguishing between revealing content words (place names, topical words, etc.), dialectal cues, and stylistic variation (e.g., use of creative spelling and emojis).
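The three kinds of signals described above can be illustrated with a small sketch. The word lists and heuristics here are purely hypothetical placeholders; a real system would rely on gazetteers, dialect lexicons, and trained models rather than hand-picked sets.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; a production system would
# use large gazetteers and dialect lexicons instead of these tiny sets.
PLACE_NAMES = {"london", "glasgow", "brooklyn"}
DIALECT_CUES = {"wee", "y'all", "hella"}

def signature(tweets):
    """Split a user's revealing signals into content, dialect, and style cues."""
    sig = {"content": Counter(), "dialect": Counter(), "style": Counter()}
    for text in tweets:
        for tok in re.findall(r"\w+['\w]*", text.lower()):
            if tok in PLACE_NAMES:
                sig["content"][tok] += 1          # revealing content word
            elif tok in DIALECT_CUES:
                sig["dialect"][tok] += 1          # dialectal cue
            elif re.search(r"(.)\1{2,}", tok):    # creative spelling: "soooo"
                sig["style"][tok] += 1
        for ch in text:                           # rough emoji heuristic:
            if ord(ch) > 0xFFFF:                  # non-BMP codepoints
                sig["style"]["<emoji>"] += 1
    return sig
```

For example, `signature(["Soooo happy to be back in Glasgow 😀"])` would attribute "glasgow" to content, "soooo" to style, and count one emoji.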
Ad-blocking has become a growing concern for web services that rely largely on advertising revenue. Such services operate under the implicit assumption that users agree to watch ads in exchange for these "free" services. Unfortunately, the economic pull of online advertising has made it an attractive target for various abuses driven by the incentive of higher monetary gain (e.g., drive-by downloads, overly annoying ads). Ad-blocking software can seamlessly block ads without requiring any user input, which not only improves the web experience but also protects user privacy by filtering the network requests that profile browsing behavior.
The advertising industry sees ad-blocking as a growing threat to its business model and has started fighting back with ad-block detection capabilities: scripts that detect the presence of an ad-blocker and refuse to serve users who run one. Many popular websites, such as The Guardian, WIRED, and Forbes, have recently started interrupting and/or blocking visitors who use ad-blockers. The ongoing arms race between ad-blockers and ad-block detectors has a significant impact on the future of user privacy and the way the Internet advertising industry operates. Yet little is known about the scale and technical details of this arms race.
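One common detection approach is the "bait element" technique: the page inserts an element with ad-like class names and checks whether the ad-blocker's cosmetic filters hid it. The sketch below models this logic with plain Python objects for illustration (a real detector runs as JavaScript in the browser, checking properties such as offsetHeight); the class names and objects are assumptions, not any particular detector's code.

```python
class Element:
    """Stand-in for a DOM element; offset_height mimics offsetHeight."""
    def __init__(self, class_name):
        self.class_name = class_name
        self.offset_height = 10   # rendered normally unless a filter hides it

# Class names commonly targeted by filter lists (illustrative subset).
AD_CLASSES = {"ad", "ads", "ad-banner", "adsbox"}

def apply_cosmetic_filters(page_elements):
    """Simulate an ad-blocker hiding any element whose class matches a rule."""
    for el in page_elements:
        if el.class_name in AD_CLASSES:
            el.offset_height = 0

def adblock_detected(bait):
    """The detection check: a hidden bait element implies an active ad-blocker."""
    return bait.offset_height == 0
```

A "stealthy" ad-blocker, as proposed here, would need to hide ads without letting such a bait check observe the difference.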
In this proposal, we plan to undertake two major research tasks: First, we will perform a systematic measurement and analysis of the ad-block detection phenomenon on the web. This involves understanding how many websites are performing ad-block detection; and what type of technical approaches are used. Second, from the gained understanding, we aim to design and implement new mechanisms, representing the next step in the arms race, in the form of a stealthy or invisible ad-blocker to counter or circumvent ad-block detection. All of the produced data and software will be shared publicly.
The modern web is home to many online services that request and handle sensitive private information from their users. Previous research has shown how websites may leak user information, either due to poor programming practices, or through the intentional outsourcing of functionality to third-party services.
Despite the magnitude of this problem, users today have few, if any, options for protecting their PII against accidental and intentional leakage. Generic anti-tracking extensions are based on manually curated blacklists which, due to their reactive nature, are destined to always be out of date. Moreover, these extensions only account for domains belonging to tracking companies, and thus cannot catch non-tracking third-party domains that happen to receive a user's PII because of the poor programming practices of the first-party website with which the user interacts.
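The reactive-blacklist limitation can be seen in a minimal sketch of how such extensions match requests, assuming a small hypothetical domain list (real extensions use curated lists such as EasyList):

```python
# Hypothetical curated blacklist; real lists contain tens of thousands of rules.
BLACKLIST = {"tracker.example", "ads.example"}

def is_blocked(request_host):
    """Block a request if its host or any parent domain is on the blacklist."""
    parts = request_host.lower().split(".")
    return any(".".join(parts[i:]) in BLACKLIST for i in range(len(parts)))
```

A request to `cdn.tracker.example` is blocked, but a newly registered `new-tracker.example` slips through until the list is manually updated — exactly the reactive gap described above.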
To effectively inform users about the privacy consequences of visiting particular websites, we propose to design, implement, and evaluate PrivacyMeter, a browser extension that, on-the-fly, computes a relative privacy score for any website that a user is visiting. This score will be computed based on each website's privacy practices and how these compare to the privacy practices of pre-analyzed websites. In addition to a numeric score, PrivacyMeter will also provide users with contextual information about the discovered privacy issues (e.g., "many aggressive trackers", or "many inputs are submitted to third parties"), and what actions are advised. The privacy practices that PrivacyMeter will be assessing go above and beyond the state of the art, thereby offering users a much more accurate view of a website's privacy practices, compared to existing tools.
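One simple way to realize such a relative score — sketched here as an assumption, since the proposal does not fix the scoring function — is a percentile rank of the visited site's issue count against the pre-analyzed corpus:

```python
from bisect import bisect_left

def privacy_score(site_issue_count, baseline_counts):
    """Relative score in [0, 1]: the fraction of pre-analyzed sites with
    fewer privacy issues than this one (0 = among the best, 1 = worst)."""
    ranked = sorted(baseline_counts)
    return bisect_left(ranked, site_issue_count) / len(ranked)
```

A real PrivacyMeter would likely combine several weighted practices (tracker count, third-party form submissions, etc.) rather than a single count, but the percentile framing keeps the score interpretable across very different websites.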
As in the browser context, mobile app developers use third-party services to add features to their apps such as analytics, user tracking, ad delivery and social network integration. While these services are valuable to app developers, they may also collect and share personal information about users. In fact, these services can access sensitive information by piggybacking on the permissions requested by the app developer and granted by the user. Unfortunately, these interactions with third-party services typically happen without any user awareness or consent.
The research community and the regulatory bodies do not have a broad understanding of the players' identities and the information that they collect. In this project we will investigate the third-party service ecosystem and its dynamics at scale. Our methods leverage data from ICSI's Haystack app. The results of our analysis will increase transparency by creating a public catalog and census of analytics services, their behavior, and their use across mobile apps.
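The census described above amounts to aggregating per-flow observations into a per-service view. A minimal sketch, assuming Haystack-style records of which app contacted which third-party domain with which data type (the record shape is an assumption for illustration):

```python
from collections import defaultdict

def build_census(flows):
    """Aggregate (app, third-party domain, data type) flow records into a
    per-service census: which apps contact each service, and what they send."""
    census = defaultdict(lambda: {"apps": set(), "data_types": set()})
    for app, domain, data_type in flows:
        census[domain]["apps"].add(app)
        census[domain]["data_types"].add(data_type)
    return census
```

From such a structure one can directly read off a service's prevalence (how many apps embed it) and the breadth of data it receives.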
Mobile devices generate the majority of Internet traffic today and also have access to a wealth of personal information. Visibility into the activity of mobile devices is of interest to end-users as well as to network operators, advertisers and a number of other players. In this project, we develop AntMonitor -- a tool that monitors the network activity of mobile devices and reveals privacy leaks directly (detecting PII leaking out of the device) or indirectly (profiling users based on minimal information).
In this proposal, we present the design of AntMonitor: a user-space mobile app based on a VPN service that runs only on the device (i.e., without the need for a remote VPN server). We show that AntMonitor significantly outperforms prior state-of-the-art approaches: it achieves speeds of over 90 Mbps (downlink) and 65 Mbps (uplink), which are 2x and 8x the throughput of existing mobile-only baselines and 94% of the throughput achieved without a VPN, all while using 2--12x less energy. We then showcase preliminary results from a pilot study showing that AntMonitor can efficiently perform (i) real-time detection and prevention of private information leakage from the device to the network and (ii) application classification and user profiling.
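The leak-detection task (i) boils down to matching known device identifiers against outgoing traffic. A minimal sketch, checking both plaintext and MD5 digests (hashing PII before transmission is a light obfuscation observed in practice); the function name and record shape are illustrative assumptions, not AntMonitor's actual API:

```python
import hashlib

def leak_matches(payload, pii_values):
    """Return the labels of known PII values found in an outgoing payload,
    checking both plaintext and MD5 digests (a common light obfuscation).
    payload: bytes of the outgoing request; pii_values: {label: value}."""
    found = []
    for label, value in pii_values.items():
        raw = value.encode()
        if raw in payload or hashlib.md5(raw).hexdigest().encode() in payload:
            found.append(label)
    return found
```

An on-device tool would run such a check inline on each intercepted flow, which is why the VPN-based interception throughput numbers above matter.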
Finally, we summarize the current state of the prototype, and our efforts in releasing the tool to end-users, commercial partners, and the research community. The mobile-only version of AntMonitor is currently in alpha-testing, and we request DTL support in order to complete the effort, release the tool to the community, and also get the opportunity to interact with the members of the DTL community.
Targeted advertising largely contributes to the support of free web services. However, it is also increasingly raising concerns among users, mainly due to its lack of transparency. The objective of this proposal is to increase the transparency of targeted advertising from the user's point of view by providing users with a tool to understand why they are targeted with a particular ad and to infer what information the ad engines possibly have about them. Concretely, we propose to build a browser plugin that collects the ads shown to a user and provides her with analytics about these ads. Our tool relies on an innovative collaborative approach to infer what information the ad engine may have.