Minimizing privacy risks of location data collection
With the advent of smartphones and tablets – devices users often carry with them everywhere they go – location data has become a valuable source of information for both commercial entities and public sector organizations.
Users’ location data is routinely collected at a large scale by cellular network operators, location-based services, and location-enabled social network platforms. But can this type of information reveal too much about our lives – is it a threat to our privacy?
A group of researchers from the National University of Singapore and Europe-based SAP analyzed the movements (sampled every 15 minutes) of over half a million individuals over a period of one week. They found that “anonymizing” users by replacing their identities with a random identifier is not enough, and that “human mobility traces are highly identifiable with only a few spatio-temporal points.”
“We assume that the adversary may know a certain number of spatio-temporal points among the trace of a target user. We measure the number of trajectories that the adversary can find based on the existing knowledge,” they explained. “The trajectory is unique and re-identification is successful if only one trajectory is found.”
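The uniqueness test the researchers describe can be illustrated with a small sketch. This is not the researchers' code: the function names, the toy traces, and the representation of a spatio-temporal point as a (time slot, cell ID) pair are all assumptions made for illustration.

```python
import random

def count_matching(traces, known_points):
    """Count the traces that contain every point the adversary knows."""
    known = set(known_points)
    return sum(1 for points in traces.values() if known <= set(points))

def is_unique(traces, known_points):
    """Re-identification succeeds if exactly one trace matches."""
    return count_matching(traces, known_points) == 1

def unique_fraction(traces, k, rng):
    """Share of traces singled out by k randomly chosen points of their own."""
    hits = 0
    for points in traces.values():
        known = rng.sample(points, min(k, len(points)))
        if is_unique(traces, known):
            hits += 1
    return hits / len(traces)

# Hypothetical toy data: pseudonym -> list of (time_slot, cell_id) points.
traces = {
    "u1": [(0, "A"), (1, "B"), (2, "C")],
    "u2": [(0, "A"), (1, "B"), (2, "D")],
    "u3": [(0, "A"), (1, "E"), (2, "C")],
}

one_point = count_matching(traces, [(0, "A")])             # matches all three traces
two_points = count_matching(traces, [(1, "B"), (2, "C")])  # matches only u1
```

In this toy example a single known point leaves three candidates, while two well-chosen points narrow the match to one trace, which is the effect the quoted finding describes.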
Unfortunately, even within that large dataset (56 million records), more than 60 percent of the trajectories are unique. Fortunately, that number can be brought down to less than 30 percent with one simple adjustment.
“The main idea of the method is to ‘cut’ long trajectories into several short trajectories according to different time windows. These shorter trajectories are then assigned different user identifiers for each time window,” they noted, and added that the adjustment is both easily scalable and does not affect data utility.
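A minimal sketch of the "cutting" idea follows, under assumptions of our own. The article does not say how long the time windows are or how the new identifiers are generated, so the window length and the use of random UUIDs here are illustrative choices, and the function and variable names are hypothetical.

```python
import uuid

def split_by_window(records, window_slots):
    """Give each (user, time window) pair its own random pseudonym.

    records: iterable of (user_id, time_slot, cell_id) tuples.
    window_slots: number of consecutive time slots per window (an assumption).
    Returns a list of (pseudonym, time_slot, cell_id) tuples.
    """
    pseudonyms = {}
    released = []
    for user_id, time_slot, cell_id in records:
        key = (user_id, time_slot // window_slots)
        if key not in pseudonyms:
            # A fresh random identifier that cannot be linked to other windows.
            pseudonyms[key] = uuid.uuid4().hex
        released.append((pseudonyms[key], time_slot, cell_id))
    return released

# Hypothetical trace: one user observed in eight consecutive time slots.
records = [("alice", slot, "cell-%d" % (slot % 3)) for slot in range(8)]

# Windows of four slots cut the trace into two shorter trajectories.
released = split_by_window(records, window_slots=4)
pseudonym_count = len({pseudonym for pseudonym, _, _ in released})
```

With data sampled every 15 minutes there are 96 slots per day, so `window_slots=96` would correspond to daily windows. The locations and timestamps are released unchanged, and only the link between windows is removed.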
Still, they made sure to point out that while this change improves privacy protection, it does not provide full anonymity.