Google Takeout: viewing what Google knows about you

Off the Beat: Bruce Byfield's Blog
Ever wonder what information Google has collected about you? Now, you can find out, thanks to Google Takeout, which allows you to download most of the information that Google has collected about you.
The question should be of more than passing interest to just about everyone. Few people may have bought Google's Chromebook with its web-based applications, but Google still dominates our computer lives. We use it to receive emails. We store pictures and documents on it. We socialize on it -- and, all the time, Google is collecting information about us.
Google Takeout is a creation of the Data Liberation Front, which describes itself as
"an engineering team at Google whose singular goal is to make it easier for users to move their data in and out of Google products. We do this because we believe that you should be able to export any data that you create in (or import into) a product. We help and consult other engineering teams within Google on how to 'liberate' their products."
You can run Google Takeout to see what information Google has stored about you, then go to the Data Liberation Front site for instructions about how to remove data from specific Google products. Not all Google products have been "liberated," although most of the major ones have.
What is unclear, however, is how official Google Takeout or the Data Liberation Front is. Google Takeout appears to be hosted on Google servers, but the Data Liberation Front site gives no information about where it is hosted, doesn't identify anyone associated with the project, and gives no contact information other than a Twitter account.
Consequently, whether either is officially supported by Google, semi-official, or clandestine is uncertain. I'm guessing they may be projects Google employees have undertaken in their twenty percent time -- the time Google gives employees to work on private projects -- but I don't know.
What I can say is that running Google Takeout is an educational and somewhat alarming experience.
Revelations in the archive
Running Google Takeout is as easy as logging into your Google account, and selecting which services to include in the archive of your data. If you want, you can first review a summary of the information collected by each service by clicking the download button. Probably, the largest omissions are GMail, which I suspect is the most heavily used Google service, and search engine records.
Obviously, the time needed to create the archive depends on how heavily you have used Google, but the result is a zip archive named for your account saved to your hard drive, neatly divided with a separate folder for each service.
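If you want a quick overview of what the archive holds before opening it, a short script can total the files and bytes in each service folder. The following is only a minimal sketch in Python: the function name summarize_archive and its depth parameter are my own inventions for illustration, and it assumes nothing about the archive beyond the one-folder-per-service layout described above.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def summarize_archive(zip_path, depth=1):
    """Return {folder: (file_count, total_bytes)} for a zip archive.

    depth is the number of leading path components used as the group
    key (hypothetical parameter; adjust it to match your archive).
    """
    totals = defaultdict(lambda: [0, 0])
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count only files
            parts = PurePosixPath(info.filename).parts
            # Files that sit above the grouping level are lumped together
            if len(parts) > depth:
                key = "/".join(parts[:depth])
            else:
                key = "(top level)"
            totals[key][0] += 1
            totals[key][1] += info.file_size  # uncompressed size in bytes
    return {folder: tuple(counts) for folder, counts in totals.items()}
```

Calling summarize_archive("takeout.zip"), where the file name is a placeholder for whatever your download is called, returns one entry per top-level folder. If your archive wraps everything in a single folder named for your account, passing depth=2 groups by the service folders beneath it instead.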
In my case, the archive was just over 4.4 megabytes -- but, then, compared to other people, I am undoubtedly a light user of Google services, especially when GMail and searches are omitted. The service I most heavily use is Google+, and even that I've lost interest in because of its refusal to accept pseudonyms -- and even, in some cases, non-European names.
Still, even as a light user, I was taken aback by how much information Google was storing about me. I shouldn't have been, I know -- I willingly provided all that information, and I could see no sign that Google was storing anything I hadn't authorized.
All the same, seeing all the accumulated information was a shock. It is one thing to know that Google never throws out old information, and another one to realize that documents abandoned six years ago in Google Docs are still around. In my case, they are mostly test documents and likely to be of minimal interest to anyone, but if I had other work habits, their continued existence would raise concerns about privacy and security.
Similarly, while I was obviously aware that pictures posted on Google+ had to be stored somewhere, I was puzzled to see I had graphics stored in Picasa. Since I have never used Picasa separately, I took a few seconds to realize how they had got there. This experience convinced me that, in recently announcing the centralization of its services, Google was only making official what was already happening, but it also raises security questions. After all, making sure that your data is safe becomes harder when you are unaware of exactly which service it is stored in.
Then there's the information I was aware of. In the Streams folder, the archive included every posting I had ever made on Google+, as well as all my contacts and circles (groupings of people I follow, if you don't happen to use Google+). All those individual decisions to post, I quickly discovered as I read them together, add up to a thorough picture of my online persona, especially since many of the posts are links to articles I've published.
Even more seriously, the people I choose to follow and the circles in which I've arranged them easily reveal information that goes far beyond what I ever intended to give. From my circles, for example, anyone reading the information could tell something about my family and professional associates, and therefore about interests and connections I might prefer to keep private.
These discrete pieces of information could easily be combined with other archived information such as my pictures to tell more about me than I ever intended. For instance, from my pictures, one might deduce what I enjoy buying, and, from the circles, from whom I buy.
True, Google shows no signs of selling such information to advertisers or retailers. But what if Google's security is breached? Like most people, I have no informed opinion about the quality of Google's security. Yet, one piece at a time, I have entrusted more deeply personal information to Google. The fact that Google probably has less information about me than about most people is no comfort, because I find that, just by using Google's services, I have let my information be made available in ways to which I never consented.
Taking responsibility
I don't mean to be paranoid. Nor am I suggesting that Google is untrustworthy or deceptive. Its services are convenient, and perhaps some small losses of privacy are a reasonable exchange for that convenience.
Yet I am disturbed by how little Google emphasizes this potential lack of privacy, and how willingly I went along with it, less than half aware of what I was doing. If someone like me, with a reasonable lay knowledge of security and privacy issues, can fall into such complacent behavior, there must be millions of users who are even more naive than I was, and entrusting far more potentially damaging information to Google.
Even more importantly, what about the mail and search services not included in Google Takeout? If our uses of other services contain so much information, how much more do these popular services contain?
I can't answer that question. But I do know that over the next few days I will be using the Data Liberation Front's tools to remove unnecessary information from as many Google services as I can. At future intervals, I will repeat the process. In addition, I'll consider what Google services I might do without.
I have also decided that, instead of turning to Google twenty or thirty times a day for search results, I will transition completely to DuckDuckGo, a small search engine that claims not to store records of your searches. I consider these decisions not paranoid, but simply small steps towards being more responsible about my online habits.