Using voice-controlled interfaces via Amazon Alexa

Skills in Detail

To understand more exactly how an Alexa skill is composed, it is worth taking a look at its individual components. Every skill is defined by an interaction model that bundles all of its behaviors; this model in turn consists of several components.

The invocation is the name of the Alexa skill and serves as a keyword for its activation. You'll need to follow some rules when assigning names – the Alexa website explains them in detail.

An intent describes an action that the user wants to perform within the skill. Each intent requires at least one defined key phrase (sample utterance) that triggers the action. If required, several sample utterances can be stored per intent – for example, the variants "affirmative," "exactly," and "yeah" alongside a simple "yes." The more representative the utterances you define for an intent, the more reliably the skill recognizes it.

The slot types are self-defined data types used within the skill. The idea is similar to the enumeration types known from almost all programming languages: when asking for a car brand, for example, a car brand slot type could contain values such as BMW, Audi, or Ford.
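The enumeration analogy can be made concrete in Python itself. The following snippet is purely illustrative – Alexa slot types live in the skill's interaction model, not in Python code – but it shows the same idea of a named type restricted to a fixed set of values:

```python
from enum import Enum

# Illustrative Python counterpart to a custom slot type: a named type
# that only permits a fixed set of values, here "on" and "off".
class GPIO(Enum):
    ON = "on"
    OFF = "off"

# Looking up a member by its value mirrors how Alexa resolves a spoken
# word to a slot value:
status = GPIO("on")
```
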

The JSON editor, a simple text editor built into the interface, lets you edit the individual components of the skill directly. Although this requires some know-how, it makes it easier to keep track of complex skills.

In the Interfaces section, the user integrates multimedia content directly into an Alexa skill. Currently you will find three interfaces: Audio Stream, Video Stream, and Display Interface.

The endpoints determine which services on the Internet the Alexa skill communicates with. In the present case, the Raspberry Pi, which is linked to the skill via the Ngrok service, acts as the endpoint.

The sample skill needs to control three devices: a TV set, the light, and a socket outlet. To implement this, you create an intent for each of the three devices. Each intent supports the commands that name the corresponding device and transmit its status. The House command acts as the activation keyword. Define it below Invocation and then press the Save Model button (Figure 6). From experience, I can only advise you to save more often than seems necessary, rather than wondering later why the skill does not work as it should.

Figure 6: Finally, you define the activation keyword for the skill.

Slots and Intents

A slot type named GPIO transmits the status of the individual devices. GPIO can assume one of two values: on and off. The slot type can already be created at this point. Press Add to the right of the Slot Types entry in the sidebar.

When you get there, select a custom slot type and assign it the name GPIO (Figure 7). Then, on the following page, assign the on and off values. To save the values, remember to press Save Model at the top.

Figure 7: Create a slot type via the interface.

On top of this, you now create the intents. You could do this manually by clicking your way through the individual dialog boxes. However, to reach this goal far faster, simply copy the content from Listing 4 to the interaction model using the JSON editor.

Listing 4

Intents

 
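The listing itself is not reproduced in this excerpt. As a rough sketch only – the intent, slot, and utterance names here are illustrative assumptions, not necessarily those of the article's actual listing – an interaction model with one device intent and the GPIO slot type might look like this:

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "house",
      "intents": [
        {
          "name": "TvIntent",
          "slots": [
            { "name": "Status", "type": "GPIO" }
          ],
          "samples": [
            "tv {Status}",
            "switch the tv {Status}"
          ]
        }
      ],
      "types": [
        {
          "name": "GPIO",
          "values": [
            { "name": { "value": "on" } },
            { "name": { "value": "off" } }
          ]
        }
      ]
    }
  }
}
```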

Make sure that the intents in the interaction model use the same names as in the Python program, otherwise you can expect trouble later on when you try to execute the skill.

In addition, the skill must be saved after each change to the interaction model and then rebuilt by clicking on Build Model. If you forget this step, nothing happens and you keep the old model.

Now only the communication between the skill and the Python program is missing. Click on Endpoint and select the HTTPS option. In the Default Region field, enter the URL currently used by Ngrok. In the drop-down box below, also enable the option labeled My development endpoint is a subdomain of a domain that has a wildcard certificate from a certificate authority (Figure 8). A click on Save Endpoints in the dialog header saves the changes.

Figure 8: Creating an endpoint.
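The Python program running behind the Ngrok tunnel is not shown in this excerpt. As a rough, standard-library-only sketch of what the endpoint side involves – the intent names, the Status slot, and the pin numbers are illustrative assumptions, not the article's actual code – the following function parses an Alexa request payload and builds the JSON response Alexa expects:

```python
import json

# Hypothetical mapping of intent names to GPIO pins (BCM numbering).
PINS = {"TvIntent": 17, "LightIntent": 27, "SocketIntent": 22}

def handle_request(body: str) -> str:
    """Parse an Alexa request and return the JSON response Alexa expects."""
    request = json.loads(body)["request"]
    if request["type"] == "LaunchRequest":
        text = "Welcome to House. Which device should I switch?"
    else:
        intent = request["intent"]
        status = intent["slots"]["Status"]["value"]  # "on" or "off"
        pin = PINS[intent["name"]]
        # A real program would drive the pin here, for example with RPi.GPIO:
        # GPIO.output(pin, GPIO.HIGH if status == "on" else GPIO.LOW)
        text = f"Switching pin {pin} {status}."
    return json.dumps({
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": False,
        },
    })
```

In practice this function would sit behind a small HTTPS web server (the article uses Ngrok to expose it), and the names used as dictionary keys must match the intents defined in the interaction model.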

After all configuration work has been completed, you need to build the skill for the first time. The Build Model switch appears on all pages where the model can be changed. Click on Invocation and then on Build Model.

For an initial test, change to the tab labeled with the name of the skill. Test mode can be enabled using the slider below the tab bar. Then enter the activation keyword, House, in the text box. The skill should immediately welcome you in Australian English. The connected devices can then be switched; the LEDs on the Raspberry Pi's GPIO pins prove that the skill works.

Controlling this with Amazon Echo should work the same way: The Start house voice command loads the skill, then the TV on command switches on the TV set.

Conclusion

Alexa provides an easy path for adding voice activation to your Raspberry Pi automations, but you'll need to set up a tunnel and do a little basic programming. This article offers a simple example for how to get Alexa talking to your RaspPi. Once you have established a communication channel, the possibilities for your voice-activated Raspberry Pi are endless.


