Equipping Alexa with self-programmed skills

Utterances and Intent

Speech processing gurus divide user communication with the device into utterances and intent. The user might say, for example, "latest release" or "newest issue" (utterance), thus revealing the same intent in both cases – that is, calling the function that outputs the month and year of the current Snapshot in the Latest Snapshot skill. In the web flow of the newly created skill, you need to enter the data from Listings 1 and 2 [4] into the Interaction Model box (Figure 7).

Figure 7: Intent and utterances in the Alexa Skill Setup.

Listing 1 uses JSON format to define the intents referenced later and – in addition to LatestSnapshotIntent – also declares a generic help intent, whose text Alexa recites if the user says "Help." Listing 2 assigns various spoken sentences to each intent.
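The listings themselves are available online [4]; based on the description above, the schema presumably looks something along these lines (AMAZON.HelpIntent is the standard built-in help intent name, assumed here):

```json
{
  "intents": [
    { "intent": "LatestSnapshotIntent" },
    { "intent": "AMAZON.HelpIntent" }
  ]
}
```

SampleUtterances.txt then maps spoken phrases to intents, one per line, for example `LatestSnapshotIntent latest release` and `LatestSnapshotIntent newest issue`.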

Listing 1

IntentSchema.json

 

Listing 2

SampleUtterances.txt

 

My Old Pal, Lambda

The next screen in the development flow asks for the Endpoint of the request – that is, the web service to which the language assistant should forward the request. In the simplest case, this is a Lambda function in the Amazon Cloud, whose configuration was discussed in a previous edition of the Snapshot column [5]. You'll need to enter the Lambda function's Amazon Resource Name (ARN) in the text box shown in Figure 8, for which you can use cut and paste.

Figure 8: In this case, the endpoint of the request is a Lambda function.

The developer may write the Lambda function on AWS either in Java or in JavaScript in a Node.js environment; the newly released Alexa SDK [6] makes this very easy. Because the Lambda server won't have the alexa-sdk Node.js package preinstalled, developers need to bundle the alexa-sdk distribution along with the script in Listing 3. To install the SDK locally, run:

npm install --save alexa-sdk

Listing 3

index.js

 

After stepping into the newly created node_modules directory, copy Listing 3 there as well, name it index.js, and zip up the whole enchilada with

zip -r upload.zip index.js alexa-sdk/

to then upload it to a newly created Lambda server (Figure 9).

Figure 9: The endpoint web service in Lambda as JavaScript code.

The handlers for the defined intents to which the web service responds are found in lines 5-12 in Listing 3 as an associative array. For the new LatestSnapshot skill, line 7 calls the getData() function (lines 14-32) and passes the response object to it so that the function can send the response it finds directly back to the client.
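A minimal sketch of such a handler map, assuming alexa-sdk's convention of responding via this.emit() (the intent and function names follow the article; the help text and the getData() stub are invented for illustration):

```javascript
// Stub for the data fetcher; Listing 3's real getData() performs the
// HTTP request and answers via the response object handed to it.
function getData(response) {
    // ... fetch issue data and respond, e.g. response.emit(':tell', ...) ...
}

// Associative array mapping intent names to handler functions
var handlers = {
    'LatestSnapshotIntent': function () {
        getData(this); // hand the response object to the data fetcher
    },
    'AMAZON.HelpIntent': function () {
        this.emit(':tell', 'Ask me for the latest issue.');
    }
};
```

Because each handler runs with the response object as `this`, the fetcher can send its answer straight back to the client.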

Asynchronous HTTP

Line 15 then uses the http package included with Node.js and employs its get() method (line 16) to pick up the JSON salad from the Perlmeister website. In line with the asynchronous program flow in Node.js code, lines 19 and 23 register handlers for incoming data chunks and completion of the HTTP transfer. The former glues together the bits and pieces that arrive to create the json_string; the latter launches a JSON parser, which returns a populated data structure derived from the string. In production operations, the code should include error handlers, so that Alexa can inform the user via voice output what went wrong.

Because the entries in the JSON document are in reverse chronological order, line 25 just picks the first element from the array below issues. The code then extracts the month and year, as well as the title, from the data structure.
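With the assumed document shape (an issues array in reverse chronological order, each entry carrying a date and a title; the actual field names in the feed may differ), the extraction boils down to:

```javascript
// Hypothetical feed excerpt; field names assumed for illustration
var data = {
    issues: [
        { date: '04 2017', title: 'Home Run into the Cloud' },
        { date: '03 2017', title: 'Amazon Web Services' }
    ]
};

var latest = data.issues[0]; // newest entry comes first
var reply = latest.date + ', ' + latest.title;
```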

Using the response object's emit() method with the ":tell" option, line 29 sends the reply to the speech processor. To register the previously defined handlers, the script sets the exports.handler attribute to a function that first obtains a new Alexa object, calls registerHandlers on it, and then runs execute to start the control flow.

For the Lambda function to accept Alexa events, it needs the Alexa Skills Kit trigger set, which developers can activate under the Triggers tab in the web UI by clicking on the grayish dashed empty box (Figure 10), from which an arrow points to the orange shutters of the Lambda function. After this, back in the configuration flow of the Alexa skill, the Test section shows an orange button, set to Enabled, with a message stating that the skill has been enabled for the user's personal account. If the user now says, "Alexa, ask Latest Snapshot for release" (or "issue" or "last month"), Alexa answers "04 2017, Home Run into the Cloud" after a short pause for reflection, because that is indeed the date and title of the latest published edition of this column, as I write this in March 2017.

Figure 10: A selection of Alexa events.

If anything goes wrong, you can check the Alexa log on the phone app to discover what Alexa actually understood and then browse the logfile of the Lambda function to see whether the code crashed or logged any errors.

Skills can support multiple languages, so the endpoint must respond in the language in which the question was asked. Currently, Alexa supports English, both UK and US variants, and German. Developers can define multiple intents and utterances for each language for the same skill, and the speech processor will decide which language to use on the back end according to which language was selected on the Alexa device that took the request.

If you want to control your smart home with Alexa, you do not have to delve into the depths of programming to the extent demonstrated here. Instead, you can select the Smart Home Skill API as the Skill Type at the beginning of the flow (in the Skill Information tab) instead of selecting Custom Interaction Model. This gives you access at a higher command level, which is correspondingly easier to handle. The Flash Briefing Skill API is another option, for playing back previously bundled radio broadcasts.

If you prefer to use your own web server as the endpoint instead of a Lambda function on Amazon Web Services, you need to use SSL and deposit the certificate with Amazon. Of course, before any skill appears in the public list in Amazon's Alexa app store, Amazon wants to verify and certify it manually.

Mike Schilli

Mike Schilli works as a software engineer in the San Francisco Bay area of California. In his column, launched back in 1997, he focuses on short projects in Perl and various other languages. You can contact Mike at mschilli@perlmeister.com.

Infos

  1. "Getting Started with the Alexa Skills Kit": https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/getting-started-guide
  2. "Alexa Skills Store": https://www.amazon.com/b?node=13727921011
  3. Amazon Developer Services: https://developers.amazon.com
  4. Listings for this article: ftp://ftp.linux-magazine.com/pub/listings/magazine/199/
  5. "Programming Snapshot – Amazon Web Services" by Mike Schilli, Linux Magazine, issue 196, March 2017, pg. 52, http://www.linux-magazine.com/Issues/2017/196/Programming-Snapshot-Amazon-Web-Services
  6. Alexa Skills SDK on GitHub: https://github.com/alexa/alexa-skills-kit-sdk-for-nodejs
