Living with Statistics

Off the Beat: Bruce Byfield's Blog

Feb 20, 2012 GMT

Bruce Byfield

One of the prices of software freedom is the impossibility of getting accurate figures for usage. As a user, I consider that a small price to pay for not having to register or activate software. However, as a journalist I'm often frustrated, because accurate figures can be useful for establishing a point or debunking rumors.

The questions for which I would like accurate stats include: how many GNU/Linux users are there? Has Linux Mint really overtaken Ubuntu as the most popular distribution? Has GNOME gained or lost users with the start of its third release series? All these questions and more would benefit from reliable figures, yet we don't have any. Instead, we have a series of indicators that are approximate at best, and completely unreliable at worst.

One problem is external biases. For example, when NetApplications places Linux usage at 1.6%, that total is derived "from the browsers of site visitors to our exclusive on-demand network of live stats customers." But when I consider that the same methodology based on visits to my personal blog would suggest a figure of 19% for Linux, I have to wonder if NetApplications' figures aren't as skewed as mine, but in the opposite direction.

Similarly, since NetApplications' headquarters are in California, American companies are probably the most likely to use its services. Unofficially, I am always told that free software usage is lightest in North America, Microsoft's home, and higher in Europe or in developing countries.

However, other problems arise when I rely on sources that are more friendly to free software, such as Distrowatch's page views for distributions. My guess is that most people who visit Distrowatch are already familiar with free and open source software (FOSS), so that its figures reflect only the tastes of relatively experienced users.

Yet even that assumption may be questionable. Page views might tell what distributions people are curious about, but they are at best a rough indicator of what people are downloading and using.

Moreover, Distrowatch's numbers are small enough that a new release or a lively discussion elsewhere online can skew results for days or weeks at a time. A handful of fans might easily distort results, although nothing indicates that such an effort has ever been made. Armed with such doubts, you can easily dismiss Distrowatch figures altogether, as Canonical employee Michael Hall did when Distrowatch reported Linux Mint as receiving more views than Ubuntu.

User surveys share some of the problems of Distrowatch's figures, but also come with their own problems. For instance, FLOSSPOLS' survey of gender in the community frames all discussions of women's under-representation in FOSS. Yet the FLOSSPOLS data was collected seven to eight years ago, making it decidedly obsolete, especially in a field that changes as rapidly as FOSS. Today, we have no idea whether the situation in the community is better than the survey reports (it could hardly be worse).

Still, at least the FLOSSPOLS survey was designed according to research standards. Community surveys, such as the Linux Journal's Readers' Choice Awards or the LinuxQuestions' Members Choice Awards, can't even claim that. In both, participants are self-selected and answers are open-ended. The number of participants may or may not be given, and margins of error never are, although, if they were, they might be as high as five percent. If so, then in many cases where GNOME was declared the most popular desktop environment over KDE, or Mozilla the most popular web browser over Chrome, a more accurate result would probably be to declare a tie.
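
To make the "tie" point concrete, here is a minimal sketch of the arithmetic. It assumes a simple random sample, which self-selected surveys are not, so the real uncertainty is likely larger. The respondent count and vote shares are hypothetical, not taken from any actual survey, and the function names are invented for illustration.

```python
import math

def margin_of_error(share, respondents, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample; self-selected surveys do not
    meet that assumption, so treat the result as a lower bound.
    """
    return z * math.sqrt(share * (1 - share) / respondents)

def is_statistical_tie(share_a, share_b, respondents):
    """Call it a tie when the two confidence intervals overlap."""
    combined = (margin_of_error(share_a, respondents)
                + margin_of_error(share_b, respondents))
    return abs(share_a - share_b) <= combined

# Hypothetical figures: 400 respondents, 36% for one desktop, 33% for another.
print(round(margin_of_error(0.5, 400), 3))   # worst case, roughly five percent
print(is_statistical_tie(0.36, 0.33, 400))   # a three-point lead is within the noise
```

With only a few hundred respondents, a lead of a few percentage points falls inside the combined margins, which is why declaring a winner on such results is hard to justify.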

None of what I am saying is meant to be a reflection upon those who collect the data. With the exception of FLOSSPOLS and NetApplications, none of these sources has ever claimed to be providing scientifically reliable information. In some cases, entertainment is probably more of a motivation than anything else.

But for those of us in search of accurate information, the shortcomings of what is available are annoying, to say the least.

Living with Imperfection
So what's a writer to do? The high road would be to ignore such sources of information, and learn to live with uncertainty. As much as I want accurate information about FOSS, I might have to accept that it just doesn't exist.

However, that is hardly a solution. Even if I ignore these figures, others don't. Such sources as are available are always being cited to support various arguments, and, if nothing else, I might want to debunk an argument with something more than the reasonable doubt of meta-arguments.

Besides, the issues that such sources touch upon are ones that I, and many other people, want to talk about. As limited as these information sources may be, they at least give some context to discussions that would otherwise be even less informed.

As a result, the way I use these figures is an uneasy compromise. However briefly, I try to indicate that they're not reliable. I try not to make arguments that depend on a couple of percentage points of difference.

Most of all, I try not to base an argument on any single set of results. If a survey gets the same results several years running, I'm more likely to trust the figures than if they appear in a single year. Better yet are times when more than one source shows similar results over several years.

Of course, if I were paranoid enough, I might worry about whether all surveys were being manipulated by a small group of users or corporate employees. Realistically, though, I think that, under the conditions I describe, these statistical sources can indicate general trends to a degree that no other sources of information can. But I try not to forget that these sources are tentative, and can never be used with any precision.


Comments

better Linux usage stats

kornelix
Have you looked at Wikipedia's stats? Wikipedia is global and OS neutral. The latest stats put Linux usage at 4.44%, with the Android part at 2.87%. That leaves 1.57% for Ubuntu et al.

http://stats.wikimedia.org/...quidReportOperatingSystems.htm
