Social networks: Can robots violate user privacy?

High-Tech Bridge decided to conduct a simple technical experiment to verify how the 50 largest social networks, web services and free email systems respect, or indeed abuse, the privacy of their users. The experiment and its results can be reproduced by anyone, as we tried to be as neutral and objective as possible.

The nature of the experiment was quite simple: we deployed a dedicated web server and created a secret, totally unpredictable URL on it for each tested service, similar to:

http://www.our-domain-for-test.com/secret/18354832319/sgheAsZaLq/
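
To give an idea of how such a link can be produced, here is a minimal Python sketch that generates a path of a similar shape (the domain is a placeholder, and this is an illustration rather than the exact script we used):

import secrets

# Build an unguessable path from a random number and a random URL-safe token.
number = secrets.randbelow(10**11)
token = secrets.token_urlsafe(8)
print(f"http://www.our-domain-for-test.com/secret/{number}/{token}/")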

Then we used various legitimate features of the tested services (detailed in the table below) to transmit the secret URLs, while carefully monitoring our web server logs for incoming HTTP requests, to see which services followed a link that nobody else was supposed to know or access.
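
Any ordinary web server's access log is sufficient for this; purely as an illustration, a trap server can be as small as the following Python sketch (the port and log file name are assumptions, not part of our setup):

import logging
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(filename="trap.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")

class TrapHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the requested path, the source IP and the User-Agent header.
        logging.info("path=%s ip=%s ua=%s", self.path,
                     self.client_address[0],
                     self.headers.get("User-Agent", "-"))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

# Port 8080 so the sketch runs without root privileges.
HTTPServer(("", 8080), TrapHandler).serve_forever()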

During the 10 days of our experiment, we trapped only six services out of the 50. However, among those six were four widely used social networks: Facebook, Twitter, Google+ and Formspring. The remaining two were URL shortening services: bit.ly and goo.gl.

While such behavior may be part of the legitimate functionality of URL shortening services, the same cannot be said for social networks such as Facebook and Twitter. Taking into consideration that some services may run legitimate robots (e.g. to verify and block spam links) that automatically crawl every user-transmitted link, we also created a robots.txt file on our web server that disallowed bots from accessing the server and its content, as shown below. Only Twitter respected this restriction; all other trapped social networks simply ignored it and accessed the secret URL.
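
The file followed the standard robots exclusion format; a blanket restriction of the kind we deployed looks like this:

User-agent: *
Disallow: /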

Below is our table detailing the results of this experiment:

Below are the HTTP requests of the trapped services that accessed the secret URLs:

Bit.ly:
IP: 50.17.69.56
User-Agent: bitlybot

Facebook:
IP: 173.252.112.114
User-Agent: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

Formspring:
IP: 54.226.58.107
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31

goo.gl:
IP: 66.249.81.112
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.4 (KHTML, like Gecko; Google Web Preview) Chrome/22.0.1229 Safari/537.4

Google+:
IP: 66.249.81.112
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google (+https://developers.google.com/+/web/snippet/)

Twitter:
IP: 199.59.148.211
User-Agent: Twitterbot/1.0
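
For completeness, entries like the ones above can be pulled out of a standard access log by filtering on the secret path; a minimal Python sketch (the log path is a hypothetical example):

# Print every access-log line that requested the trap URL.
with open("/var/log/nginx/access.log") as log:
    for line in log:
        if "/secret/18354832319/sgheAsZaLq/" in line:
            print(line.rstrip())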

The results of this experiment are quite interesting. The four trapped social networks justify this activity as "automated verification". However, it is technically impossible to verify what is really going on, or how the information obtained from the user-transmitted URLs is being used. Today, quite a lot of web applications omit authentication and rely on temporary or unpredictable URLs to hide content; when users transfer such URLs via social networks, they cannot be sure that their information will indeed remain confidential. Unfortunately, there is no way to keep a URL and its content confidential (in the absence of authentication, of course) while transferring the URL via a social network.

Author: Marsel Nizamutdinov, Chief Research Officer at High-Tech Bridge.
