What is machine learning?

Wise from Experience

Article from Issue 264/2022

"I won't make this mistake again," you promise yourself. In other words, you'll learn from experience. If you translate experience into data, computers can do that, too. We'll introduce you to the fundamental forms of machine learning.

In May 2022, a friendly person from Linux Magazine wrote to me and asked if I could write a short article on machine learning. After a few quick queries, I agreed, as you can see here. I'm going to try to be more creative in fulfilling this request than just restating the Wikipedia entry on machine learning techniques.

Now that we have known each other for a few lines, let me ask you a personal question: Did you have a friendly person from Linux Magazine in your mind's eye while you were reading the opening lines? Second question: No matter who it was – can you imagine why? Most of us piece images together in our heads from the experience we gain while reading. Because you didn't have any information, your brain used the experience it had, or thought it had, in combination with your knowledge of media in general or Linux Magazine to be more specific.

In other words, you used experience to make inferences about a new, unknown situation, and you do it all the time without even noticing. Of course, this doesn't always work, but – as often as not – that isn't a problem. Machine learning endeavors to give computers the ability to learn and generalize from experience. Experience, in computer-speak, means data, and the learning we're talking about has nothing to do with consciousness or intelligence in the strict sense. It's about acquiring and, if necessary, improving skills from experience without being programmed to do so using a legacy approach.

That's Arthur L. Samuel's definition, or something like it, from back in 1959, and I think you can infer one thing from that date – that is, what machine learning isn't: It isn't a buzzword. It is a fairly old discipline for computer science with roots close to mathematics, which undeservedly gives it some potential for instilling fear in students and users.

In this article, I'll describe three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. In addition, there are a few variants – such as semi-supervised learning and the like – but I'll not dwell on them today. As you will learn in this article, the difference between these three fundamental types of machine learning lies primarily in the way experience is delivered.

Supervised Learning

Supervised learning relies on the existence of pairs of desired input X and output Y. For example, X could consist of images of animals, say dog(1), cat(2), mouse(3), and elephant(4). The numbers that follow are the Y values that people assign beforehand for a certain number of images. Using this database of examples, you can monitor the learning process and define a target. The computer then learns the function f:X->Y that assigns one of the four classes to each input image.

The set of images used for training is referred to as the training set. Ideally, at the end of the training, the computer will be able to assign each image from this set to the correct class. But simply repeating the assignments given in the training set doesn't add any value. The real power is in generalization, that is, using the lessons learned through training to process new examples. The reliability of a function (in machine learning it would typically be known as a model) is a measure of its ability to successfully process new examples.
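To make this concrete, here is a minimal sketch of supervised classification using a one-nearest-neighbor rule: the model simply assigns a new example the label of the most similar training example. The animal feature vectors (a made-up weight and height) are invented for illustration, not real data:

```python
# Supervised learning sketch: a 1-nearest-neighbor classifier learns the
# mapping f: X -> Y from labeled (x, y) pairs. The feature values below
# (weight in kg, height in cm) are invented for illustration.

import math

training_set = [
    ((30.0, 60.0), "dog"),
    ((4.0, 25.0), "cat"),
    ((0.02, 5.0), "mouse"),
    ((5000.0, 300.0), "elephant"),
]

def euclidean(a, b):
    """Distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def predict(x):
    """Generalize: assign x the label of its nearest training example."""
    _, label = min(training_set, key=lambda pair: euclidean(pair[0], x))
    return label

print(predict((28.0, 55.0)))   # a new, unseen example close to the dog
```

The point of the sketch is the `predict` function: it was never shown the example `(28.0, 55.0)`, yet it can still assign a class, which is exactly the generalization described above.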

The new examples used to test the model are known as the test set. Novices will often ask, "How many data pairs do you need for successful training?" or "How many pairs do you need for a meaningful test?" Both these questions are based on false premises. It's not so much a question of the number of examples or the size of the database – it's more about the information.

In the preceding example, for instance, if the training set only contains images of the dogs at the local dachshund kennel, and the test set contains images of the general dog population, the testing will not go well no matter how many training images you use. No matter how many times you take pictures of the dachshunds, and no matter the angle of the pictures or what the dogs are doing, the model has still never seen a husky or a mastiff. Of course, more is better when you're dealing with data, but it all boils down to whether the training set represents a good and representative mix.

This need for a good mix is also true for the test set. What the set doesn't include cannot be tested. Don't panic: In many models, learning by example can be used in conjunction with domain knowledge to deal with the unknown, but that approach will not work everywhere. Just to recap, supervised learning requires datasets that people have already labeled, and, although more is better than less, it's ultimately the mix that matters (Figure 1).

Figure 1: Supervised learning takes place wherever there is a teacher who knows the right answers and can correct the learner. Learning is by example, based on answers judged by the teacher to be right or wrong. © Ian Allenden, 123RF.com

By the way, the discussion so far has covered one of the two main types of supervised tasks: classification. The other is regression. In a certain sense, the difference lies primarily in the output space of the function f:X->Y. If Y is discrete – that is, if it contains only a few individual elements (dog, cat, mouse, elephant) – then you have classification. If Y is continuous, for example, representing a credit line, a temperature, or a probability, the result is a regression. From a math point of view, this is always a form of optimization problem based on stochastic data. The reason is that you basically add up the errors and naturally want to keep this total low.
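The summed-error view of regression can be sketched in a few lines. Fitting a line y = a*x + b by minimizing the summed squared error has a closed-form solution; the data points below are made up, roughly following y = 2x:

```python
# Regression sketch: fit y = a*x + b by minimizing the summed squared
# error, as described in the text. The data points are invented.

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # roughly y = 2x, with some noise

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares solution for slope a and intercept b
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(round(a, 2), round(b, 2))   # slope near 2, intercept near 0
```

Here Y is continuous: the model can output any value, not just one of a handful of class labels.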

Unsupervised Learning

The alternative to supervised learning is unsupervised learning. Unsupervised learning does not rely on target information in the sense of (x,y) pairs. You only have the X examples themselves. What can you do with them? Probably more than you think.

Unsupervised learning exclusively relies on working directly with the structures that exist in the examples. One of the most common applications is clustering these examples (Figure 2). You form groups based on the similarity of each example. For a computer to do this, however, you need to define the similarity cleanly in terms of a distance function – mathematicians refer to this as a metric. Once you've established a metric, you're ready to go, and the machine learning algorithms form groups from the data. For instance, these groups could be groups of customers or users.

Figure 2: An unsupervised classification of countries into clusters in 1994 and 2014. Countries shown in white are outliers that do not belong to any cluster. As you can see in the 2014 classification, the USA loses its special role as an outlier in this classification, and the EU is broken up into two groups due to the aftermath of the 2008/2009 financial and economic crisis.
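One common clustering algorithm that works along these lines is k-means (my choice here; the article doesn't prescribe one). It alternates between assigning each point to its nearest center under the Euclidean metric and moving each center to the mean of its group. The 2-D points and initial centers are invented:

```python
# Clustering sketch: k-means groups points by Euclidean distance (the
# metric). The points are invented; two clear groups are expected.

import math

points = [(1.0, 1.0), (1.5, 2.0), (1.2, 0.8),      # group near (1, 1)
          (8.0, 8.0), (8.5, 9.0), (7.8, 8.2)]      # group near (8, 8)

def kmeans(points, centers, steps=10):
    for _ in range(steps):
        # Assignment step: each point joins its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[i].append(p)
        # Update step: move each center to its cluster's mean
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters

clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print([len(c) for c in clusters])
```

Notice that no labels were involved anywhere: the groups emerge purely from the structure of the data and the chosen metric.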

What's the point of forming groups? For one thing, grouping the datapoints could help you determine purchase or product recommendations. For many applications, you don't need a defined class as required for supervised learning. If you manage to sort people with similar behavior into groups, you can draw conclusions on one from the other and use these conclusions to create recommendation systems.

You can also identify groups that have emerged due to the different viewpoints of disciplines. This could mean using unsupervised learning as an intermediate step in a scientific process, with the computer suggesting an alternative taxonomy. One example of a taxonomy is the practice of dividing languages into families based on the relationships among them.

The opposite of grouping can also be interesting. In this scenario, you still create groups, but you are not interested in the groups themselves but are instead focused on the outliers. One typical application would be detecting security breaches and attempted fraud.

In some systems, if one actor behaves completely differently from the others, it could be a sign of suspicious activity. However, you have to be careful how you apply this technique. With computer networks, for instance, you will immediately notice why this kind of analysis should only be used as an initial filter and not automated. It's typically users with administrative rights who exhibit the most suspicious behavior: working unusual hours, messing around with other people's files. It is often necessary to filter out these anticipated outliers in order to find the truly interesting information.
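As a toy illustration of this kind of outlier filtering (not a real intrusion detector), you can flag the actor whose activity deviates most strongly from the group mean; the user names and activity counts below are invented:

```python
# Outlier sketch: flag the actor whose behavior differs most from the
# rest, measured in standard deviations from the group mean.
# The user names and activity counts are invented.

activity = {          # e.g. files touched per day
    "alice": 12, "bob": 15, "carol": 11, "dave": 14, "root": 480,
}

values = list(activity.values())
mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

# Anyone more than 1.5 standard deviations from the mean is flagged
outliers = [u for u, v in activity.items() if abs(v - mean) > 1.5 * std]
print(outliers)
```

As the article warns, the flagged account here is the administrator – exactly the kind of anticipated outlier you would want to filter out before looking for genuinely suspicious activity.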

Reinforcement Learning

Supervised and unsupervised learning are best thought of as functions that assign an output, class, value, or cluster to an input. Reinforcement learning, instead, represents an object with methods, that is, an agent that acts autonomously in its own environment. That environment can be reality and the agent can be a robot, but it can also be a bot on the World Wide Web or something similar. This agent follows a policy, which is usually denoted in the literature by the Greek letter π (pi).

The policy is not hard-coded into the computer. Instead, the agent learns it from data. This data is different from the data used in supervised or unsupervised learning. In supervised learning, the system is more or less told how to get it right. The opposite is true in unsupervised learning, where you have to make do with the structures you have.

Reinforcement learning is sort of in between. The agent receives feedback (typically called a reward). The agent's goal is to obtain the highest possible reward. Of course, this goal cannot be achieved in a single step – that would be pretty short-sighted. Instead, the reward is added up across the whole task (Figure 3). The agent doesn't get any advice on how to get things right, just a feedback signal. In fact, the humans involved often have no idea what the best policy would look like. If they did, they could have programmed it in directly.

Figure 3: The agent's task is to reach the newspaper in as few moves as possible from any starting point. On the right, you can see the strategy acquired using an elementary reinforcement learning approach. The longer an arrow, the clearer the tendency to move in the corresponding direction.
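A minimal tabular Q-learning sketch captures the newspaper scenario from Figure 3, reduced to a one-dimensional corridor. Q-learning is one elementary reinforcement learning algorithm (my choice here; the article doesn't name one), and all the numbers – learning rate, discount, exploration rate, rewards – are illustrative assumptions:

```python
# Reinforcement learning sketch: tabular Q-learning in a tiny corridor.
# The agent starts anywhere and must reach the "newspaper" in state 4.
# Rewards: -1 per move, +10 on reaching the goal. All hyperparameters
# (alpha, gamma, epsilon) are illustrative choices.

import random

random.seed(0)
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)                      # move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration

def step(state, action):
    """Environment: returns (next_state, reward)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (10.0 if nxt == GOAL else -1.0)

for episode in range(500):
    s = random.randrange(N_STATES - 1)  # random start, never the goal
    while s != GOAL:
        # Epsilon-greedy: mostly exploit the best known action,
        # sometimes explore a random one
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        nxt, r = step(s, a)
        best_next = max(Q[(nxt, b)] for b in ACTIONS)
        # Q-learning update: nudge Q toward reward + discounted future value
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = nxt

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
print(policy)   # every non-goal state should point right (+1)
```

Note that the agent is never told "go right"; it only ever sees the reward signal, yet the summed-reward objective steers it toward the shortest path, just as the arrows in Figure 3 suggest.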

Consider an example with children and parents that would easily take on unethical overtones if you were serious about it. The target is a tidy room, but you don't tell your children. You only give them feedback in the form of the kind of unhealthy food they really want to eat. The good thing about being a computer agent is that it wouldn't matter if you are handing out candy (positive reward), varying degrees of slaps (negative reward), or a mix of both. But in this example, I'll stick with the less violent but unhealthier picture.

An important aspect of reinforcement learning is when to give the positive or negative feedback. For example, you could stay in the room and give the child a piece of chocolate every time a toy is taken from the floor and put in a cupboard. The agent/child will quickly understand that this is a desired behavior. But you could also give no feedback for a very long time before distributing an amount of chocolate equivalent to the amount of space freed up on the floor of the room right at the end. In this case, learning the desired behavior will likely take longer.

So should you always give feedback in as small a way as possible? Unfortunately, it's not that simple. After all, the agent's goal is not to clean up the room – it's to get the highest possible total reward. To do this, the agent needs to find the best strategy for the highest possible yield. Your goal, on the other hand, is to identify the best strategy for a real-world problem. There is often a massive difference between these goals.

To give a real-world example: Imagine you want a cleaning robot that will not bump into your furniture. A bumper on the front of the robot determines whether or not this happens. Your goal is to avoid wrecking the furniture. From the agent's point of view, the most likely solution is to move in reverse: The bumper is then never depressed, which solves the problem from the agent's point of view but does nothing to protect your furniture.

You can experience similar unwanted effects by using feedback loops that are too strict. In fact, your prior knowledge always drives the design of the feedback loop. If your prior knowledge is good, everything will probably work out fine. But if you knew what the solution was, why didn't you just define it clearly in the first place?

When it comes to cleaning up a room, parents usually take a path more oriented toward supervised learning: They show children how to do things, or clean up the room together with them, simply because they (think they) know how to do it best. The skill in reinforcement learning is often finding the right feedback function, along with the right measure of quality from which it is derived. You could mark off free space on the floor in the child's room – or do you just want to find everything dumped on the bed?

Incidentally, you again are dealing with optimization in stochastic environments: Optimization because the agent's goal is to maximize a target variable; stochastic because, from the agent's point of view, the environment usually has aspects that appear to be random.


