How Deep Is Your Chat?
Welcome
Dear Reader,
Books, academic journals, tech blogs, and social media posts have been trumpeting dire warnings about super-intelligent AI systems snuffing out civilization. This certainly is a real problem – I don't want to make light of it. But another serious, and perhaps more immediate, problem is really stupid, inept AI systems messing things up through sheer incompetence.
The Washington Post had a story recently [1] about a study by a European nonprofit [2] on the trouble AI chatbots had with answering basic questions about political elections. According to the story, Bing's AI chatbot, which is now called Microsoft Copilot, "gave inaccurate answers to one out of every three basic questions about candidates, polls, scandals, and voting in a pair of recent election cycles in Germany and Switzerland."
Before you write this off as yet another Linux guy ranting about Microsoft, I should add that the study focused on Microsoft's chat tool because Copilot can output its sources along with its chat responses, which made the answers easier to check. The story points out that "Preliminary testing of the same prompts on OpenAI's GPT-4, for instance, turned up the same kinds of inaccuracies." Google Bard wasn't tested because it isn't yet available in Europe.
The errors cited in the study included giving incorrect dates for elections, misstating poll numbers, and failing to mention when a candidate dropped out of the race. The study even documents cases of the chatbot "inventing controversies" about a candidate.
Note that I'm not talking about some arcane anomaly buried deep in the program logic. The bot literally couldn't read the very articles it was citing as sources.
Of course, Copilot got many of the answers right. "Two out of three" wouldn't have been too bad 10 years ago for an experimental system maintained by experts who knew what they were getting. The problem is that we have endured a year of continuous hype about the wonders of generative AI, and people are actually starting to believe it. It is one thing to ask an AI to write a limerick – it is quite another to ask it to chase down information you will use for voting in a critical election. Many elections are decided by one- to three-percent margins. The implications of a chatbot acting as a source for voters and getting 30 percent of the answers wrong are enormous.
The study also points out that accuracy varies with the language. Questions asked in German led to inaccurate responses 37 percent of the time, whereas answers to questions asked in English were wrong only 20 percent of the time (that's still way too many mistakes). French weighed in at a 24 percent error rate.
AI proponents answer that this is all a process, and the answers will get more accurate in time. The general sense is that this is just a matter of bug hunting. You make a list of the problems, then tick them off one by one. But it isn't clear that these complex issues will be solved in some pleasingly linear fashion. The AI industry made surprisingly little progress for years and slow-walked through most of its history before the recent breakthroughs that led to the latest generation. It is possible we'll need to wait for another breakthrough to make another incremental step, and in the meantime, we could do a lot of damage by encouraging people to put their trust in all the bots that are currently getting hyped in the press.
If you want to get an AI to draw a picture of your boss, go ahead and play. But it looks like, at least for now, questions about which candidate to vote for might require a human.
Joe Casad, Editor in Chief
Infos
[1] "AI Chatbot Got Election Info Wrong 30 Percent of the Time, European Study Finds" by Will Oremus, Washington Post, December 15, 2023: https://www.washingtonpost.com/technology/2023/12/15/microsoft-copilot-bing-ai-hallucinations-elections/ (paywalled)
[2] "Prompting Elections: The Reliability of Generative AI in the 2023 Swiss and German Elections," AI Forensics: https://aiforensics.org/work/bing-chat-elections