Putting free digital assistants to the test

Out of the Box

Sirius is no longer being developed, which is reason enough to take a look at its successor, Lucida [9]. Because compiling from the sources [17] did not succeed, the test team opted for Docker containers that provide a trimmed-down demo version. After installing docker and docker-compose, run the following two commands:

sudo docker pull claritylab/lucida:latest
sudo docker pull claritylab/lucida-asr
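
Because the two images are large (the test downloaded around 17GB), it can be worth checking the free disk space before pulling. The following is a minimal sketch, not part of the Lucida instructions: the helper name has_free_gb is hypothetical, and it simply compares the available space reported by df against a threshold you choose.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem that holds DIR has at
# least NEED_GB gigabytes available.
has_free_gb() {
    dir=$1
    need_gb=$2
    # df -P prints one POSIX-format line per filesystem; with -k, column 4
    # is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
```

Assuming Docker keeps its images in the default location on Linux, /var/lib/docker, a call such as `has_free_gb /var/lib/docker 20 || echo "less than 20GB free"` would warn you before the download starts; the 20GB figure is only an arbitrary safety margin.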

The downloads copy around 17GB to disk. Next, access the main container with this command:

sudo docker run -i -t claritylab/lucida /bin/bash

The docker-compose.yml file, which is necessary to start the demo version on the host system, is in the root directory. Copy its content to the clipboard, leave the Lucida container with exit, and paste the content into a new docker-compose.yml file on the host; then, start all the Lucida services from the directory that contains this file:

sudo docker-compose up
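
If you prefer not to go through the clipboard, docker cp can copy a file out of a container that has been created but not started. The following is only a sketch under stated assumptions: the article says no more than that the file is "in the root directory," so the path /docker-compose.yml is a guess that you should verify inside the container first, and the container name lucida-extract and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: extract docker-compose.yml from the Lucida image without the clipboard.
# Set DRY_RUN=1 to print the commands instead of executing them.

IMAGE="claritylab/lucida:latest"
# Assumption: location of the file inside the image; check it and adjust.
COMPOSE_IN_IMAGE="/docker-compose.yml"
# Hypothetical name for the throwaway container.
TMP_NAME="lucida-extract"

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

fetch_compose_file() {
    # Create (but do not start) a container so that its filesystem can be read.
    run sudo docker create --name "$TMP_NAME" "$IMAGE" || return 1
    # Copy the file from the container into the current directory.
    run sudo docker cp "$TMP_NAME:$COMPOSE_IN_IMAGE" ./docker-compose.yml
    status=$?
    # Remove the throwaway container whether or not the copy worked.
    run sudo docker rm "$TMP_NAME"
    return "$status"
}
```

Calling fetch_compose_file with DRY_RUN=1 first shows the three docker commands without touching anything; a second call without DRY_RUN then performs the copy.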

The developers recommend using the Lucida web interface (http://localhost:8081) with the Chrome browser or its free variant, Chromium. The demo does not include a Wikipedia data dump; instead, the web front end asks for text to form the evidence base for the question-answering system. You enter this text yourself and then click on Submit.
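
The services started by docker-compose up may need some time before the web interface responds. As a small convenience, and assuming curl is installed, the following sketch polls the address given above until it answers; the function name wait_for_url and the default number of attempts are hypothetical choices, not part of Lucida.

```shell
#!/bin/sh
# Hypothetical helper: poll URL until it answers or TRIES attempts have failed.
wait_for_url() {
    url=$1
    tries=${2:-120}
    attempt=0
    while [ "$attempt" -lt "$tries" ]; do
        # -s: silent, -f: treat HTTP errors (status 400 and above) as failure,
        # -o /dev/null: discard the body, --max-time: cap each attempt at 5 seconds.
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        sleep 1
        attempt=$((attempt + 1))
    done
    return 1
}
```

From a second terminal, `wait_for_url http://localhost:8081 && echo "Lucida is up"` would then report when the front end is reachable.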

Lucida then asks for permission to access the microphone, because the demo version communicates exclusively by speech. Click the microphone symbol to activate it, and speak your question. Kaldi operates in the background and attempts to recognize what you said. The recognition result is shown in the speech bubble on the left, with the answer opposite on the right (Figure 6).

Figure 6: The new Lucida web front end functions in Chrome and Chromium. It accepts questions by microphone, although the quality of the answers remains limited by the Kaldi speech recognition.

As previously seen in the Sirius experiment with Kaldi (Table 1), dictating a sentence and having it correctly recognized was only possible with great difficulty. In this Docker edition, it is not possible to send questions to the system from your keyboard, and image recognition is likewise lacking.

The demo provides no way to swap out the Kaldi speech recognition back end, so the test team was also unable to experiment with PocketSphinx or Sphinx4. In the GitHub repository, the developers state that they will publish the next generation of Lucida toward the end of summer 2016 [18]. Along with a new command center, it is expected to include, in particular, a better question-answering system and a guide that explains how users can swap individual components.

Future Helpers

Sirius and Lucida are not suitable for serious everyday use, so Linux users will probably have to wait a while before they can get reasonable answers or real help from a digital assistant. The free programs still lag a long way behind commercial alternatives, but presumably not because the software itself is of inferior quality. Firms such as Google or Apple have undoubtedly invested a large amount of money in training. In the case of speech recognition, for instance, training involves tedious tasks such as providing hours of recordings together with their phonetic transcriptions.

In principle, as a user, you can also train the free components and accustom them to your voice. However, linguistic knowledge and, most of all, staying power are needed to make that happen.

Infos

  1. Sirius: http://sirius.clarity-lab.org/sirius
  2. Hauswald, Johann, Michael A. Laurenzano, Yunqi Zhang, et al. "An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers." In: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS, 2015), New York: ACM, pp. 223-238.
  3. CMU Sphinx: http://cmusphinx.sourceforge.net
  4. Kaldi: http://kaldi-asr.org
  5. OpenCV: http://opencv.org
  6. OpenEphyra: http://www.ephyra.info
  7. Caffe: http://caffe.berkeleyvision.org
  8. Sirius seminar during ASPLOS-20: http://sirius.clarity-lab.org/tutorial
  9. Lucida: http://lucida.ai
  10. Sirius downloads: http://sirius.clarity-lab.org/downloads/#sirius
  11. Lemur's Indri: http://www.lemurproject.org/indri.php
  12. OpenEphyra architecture: https://mu.lti.cs.cmu.edu/trac/Ephyra/wiki/Docs/ArchitectureOverview
  13. Status of OpenEphyra: https://github.com/claritylab/lucida/issues/89
  14. Freeimages: http://www.freeimages.com
  15. Google Goggles: http://www.google.com/mobile/goggles
  16. CamFind: http://camfindapp.com
  17. Lucida GitHub repo: https://github.com/claritylab/lucida
  18. Next Lucida version: https://github.com/claritylab/lucida/issues/116
