Collusion is a really fun Firefox add-on that visualizes which sites track you as you browse the web. For this small data-gathering project, we'd like a simplified version of Collusion without the visualization ([url removed, login to view]).
1. Must read a CSV file containing a set of URLs (3 columns: URLId [bigint], SiteId [bigint], URL [string])
2. Must output two CSV files:
2.1 List of all third-party trackers encountered (TrackerId [autogenerated], TrackerDomainName [Collusion provides this])
2.2 Trackers encountered on each URL (URLId [from input CSV], TrackerId [connecting to TrackerId in 2.1], GoogleAnalyticsAnon [if Google Analytics tracks a URL, is it set to anonymization?])
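To illustrate the file formats above, here is a minimal Python sketch of the input and output handling only (the tracker detection itself is not shown). It rests on several assumptions that are not part of the requirements: the input CSV has a header row with the column names URLId, SiteId, URL; TrackerId is a sequential integer starting at 1; and GoogleAnalyticsAnon is written as 1, 0, or left empty when it does not apply. The names `read_urls`, `TrackerRegistry`, and `write_outputs` are hypothetical.

```python
import csv


def read_urls(handle):
    """Yield (url_id, site_id, url) from the input CSV.

    Assumes a header row naming the columns URLId, SiteId, URL.
    """
    for row in csv.DictReader(handle):
        yield int(row["URLId"]), int(row["SiteId"]), row["URL"]


class TrackerRegistry:
    """Assigns a sequential TrackerId to each distinct tracker domain."""

    def __init__(self):
        self._ids = {}

    def id_for(self, domain):
        domain = domain.lower()  # treat domain names case-insensitively
        if domain not in self._ids:
            self._ids[domain] = len(self._ids) + 1
        return self._ids[domain]

    def rows(self):
        return [(tracker_id, domain) for domain, tracker_id in self._ids.items()]


def write_outputs(observations, trackers_out, url_trackers_out):
    """Write output files 2.1 and 2.2.

    observations: iterable of (url_id, tracker_domain, ga_anon), where
    ga_anon is True, False, or None (None = not a Google Analytics hit).
    """
    registry = TrackerRegistry()

    link_writer = csv.writer(url_trackers_out)
    link_writer.writerow(["URLId", "TrackerId", "GoogleAnalyticsAnon"])
    for url_id, domain, ga_anon in observations:
        flag = "" if ga_anon is None else int(ga_anon)
        link_writer.writerow([url_id, registry.id_for(domain), flag])

    tracker_writer = csv.writer(trackers_out)
    tracker_writer.writerow(["TrackerId", "TrackerDomainName"])
    tracker_writer.writerows(registry.rows())
```

Generating TrackerId while writing file 2.2 keeps the two files consistent, since every TrackerId in 2.2 is guaranteed to have a row in 2.1.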
For information on checking whether Google Analytics is set to Anonymization, see: [url removed, login to view]
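As a rough illustration of one way the GoogleAnalyticsAnon column might be filled, the sketch below inspects the URL of an outgoing Google Analytics request and looks for the `aip` query parameter. This is an assumption on our part and not necessarily the method described at the link above: it assumes that the presence of `aip` on a hit to `/collect` or `/__utm.gif` signals IP anonymization. The host list, the path suffixes, and the function name `ga_anonymized` are likewise hypothetical and should be checked against real traffic.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed Google Analytics hosts; extend after checking real traffic.
GA_HOSTS = {
    "google-analytics.com",
    "www.google-analytics.com",
    "ssl.google-analytics.com",
}


def ga_anonymized(request_url):
    """Return True/False for a Google Analytics hit, None for anything else.

    Assumption: a hit carrying the `aip` query parameter is anonymized.
    """
    parts = urlsplit(request_url)
    if (parts.hostname or "").lower() not in GA_HOSTS:
        return None
    # Only tracking hits count, not the download of the script itself.
    if not parts.path.endswith(("/collect", "/__utm.gif")):
        return None
    params = parse_qs(parts.query, keep_blank_values=True)
    return "aip" in params
```

The three-valued result maps directly onto the output column: True and False for URLs tracked by Google Analytics, and None (an empty cell) for every other tracker.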
The project will be split into three deliverables with a percentage payment for each:
1. Freelancer provides sample output data for an input file that we will supply (15% upon approval)
2. Freelancer provides executables runnable on a Mac or Windows Server along with source code and documentation (55% upon approval)
3. Freelancer runs the tool with our own input file (the same one as for #1), identifies the top 10 trackers, and examines whether any of them offers an anonymization option. For each tracker that does, Freelancer will collect that information in the same fashion as for output file 2.2. If none of the other top 10 trackers offers such an option, Freelancer will be paid without additional work (30% upon approval).
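For deliverable #3, the top 10 trackers could be derived from the two output files along the lines of the sketch below. It assumes "top" means the trackers seen on the largest number of distinct URLs, which the brief does not define, and it assumes both files carry the header rows named in 2.1 and 2.2. The function name `top_trackers` is hypothetical.

```python
import csv
from collections import Counter


def top_trackers(url_trackers_handle, trackers_handle, n=10):
    """Return [(tracker_domain, url_count)] for the n most widespread trackers.

    Assumption: rank by number of distinct URLs on which a tracker appears.
    """
    names = {
        row["TrackerId"]: row["TrackerDomainName"]
        for row in csv.DictReader(trackers_handle)
    }
    # A set removes repeated (URL, tracker) pairs so each URL counts once.
    seen = {
        (row["URLId"], row["TrackerId"])
        for row in csv.DictReader(url_trackers_handle)
    }
    counts = Counter(tracker_id for _, tracker_id in seen)
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(
        counts.items(),
        key=lambda item: (-item[1], names.get(item[0], item[0])),
    )
    return [(names.get(tracker_id, tracker_id), count) for tracker_id, count in ranked[:n]]
```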