AWS Account
Alexa Developer Console
https://developer.amazon.com/alexa/console/ask
- Custom Skills - complete control over the UX; define your own custom interaction model
- Smart Home Skills - communicate with smart home devices using a prebuilt model
- Flash Briefing Skills - short overviews of news or other content
- Video Skills - provide video content
- Music Skills - provide audio content
- List Skills - provide CRUD operations on a list
Adaptability - an adaptable skill understands what a user says and responds appropriately
- Plan for multiple utterances
- Account for over-answering (too much information)
- Request additional information (re-prompt the user)
- Handle corrections
- Plan for errors (handle errors gracefully; don't use generic responses)
- Anticipate Alexa not understanding (feature not available, etc.)
Personalization - a personalized skill remembers interactions and information about the user
- Custom welcome message (explain the skill the first time; say "welcome back" on later visits)
- Remember user information
- Remember user interactions (persist them to a database)
Availability - an available skill guides the user and keeps all options open
- Response time limits - the user has 8 seconds to respond; re-prompt after the time limit
- Effectively manage lists - keep options simple; use Speech Synthesis Markup Language (SSML)
- Complete the task and end the session - end the session once you fulfill the user's request
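The 8-second rule pairs with the reprompt field in the ASK response JSON: if the user stays silent, Alexa speaks the reprompt. A minimal sketch of that response body as a Python dict (the weather wording is invented; field names follow the standard ASK response schema):

```python
# Sketch of an ASK response body that re-prompts the user after the
# ~8-second silence timeout and keeps the session open.
response = {
    "version": "1.0",
    "response": {
        "outputSpeech": {
            "type": "PlainText",
            "text": "Which city would you like the weather for?",
        },
        "reprompt": {
            "outputSpeech": {
                "type": "PlainText",
                "text": "You can say a city name, for example Seattle.",
            }
        },
        # Keep the session open so the user can answer; set to True
        # once the task is fulfilled, per "complete task and end".
        "shouldEndSession": False,
    },
}
```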
Relatability - a relatable skill lets the user feel like they are having a conversation
Don't be formal; vary responses; use transition and timeline markers
- Think voice first, screen second (screens are not a requirement); provide context and mind the 8-second response rule (no information overload)
- Leverage dialog management
- Prepare for the global marketplace (localization)
- Use speechcons and SSML
Speechcons are special words and phrases that Alexa pronounces more expressively.
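Speechcons are triggered in SSML with the "interjection" value of the say-as tag. A minimal sketch (the surrounding sentence is invented):

```python
# Minimal SSML sketch: say-as with interpret-as="interjection" renders
# a speechcon ("wow") more expressively; text must be wrapped in <speak>.
ssml = (
    "<speak>"
    '<say-as interpret-as="interjection">wow</say-as>, '
    "you got every question right!"
    "</speak>"
)
```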
Define purpose and goals:
- Who are your users?
- What can users do with the skill?
- What information do you need?
Identify customer needs and how your skill will address those needs
- Write a script - a dialog between Alexa and users
- Develop a storyboard - add variations and error cases to the script
- Build the interaction model - use the Alexa Skills Kit (ASK) to implement the logic and voice interface (it can be defined in the Alexa Developer Console or via the Command Line Interface)
Custom (complete control) and Pre-built (requests and utterances are defined, less control) interaction models
Script and storyboard document the information flow
Flow consists of:
- Utterance
- Situation
- Response
- Prompt
An interaction model consists of 4 components:
- Intents - actions that fulfill a user's request (leverage built-in intents to save time)
- Utterances - spoken phrases the user says
- Custom Slots - possible values for a variable part of an utterance
- Dialog Model - defines the steps of a multi-turn conversation
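The four components come together in the interaction model JSON defined in the developer console. The sketch below expresses one as a Python dict; the invocation name, intent, slot type, and sample utterances are all invented for illustration:

```python
# Sketch of a custom interaction model (normally JSON in the developer
# console). All names below are illustrative, not from a real skill.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "daily horoscope",
            "intents": [
                {
                    "name": "GetHoroscopeIntent",
                    # Slot: the variable part of the utterance.
                    "slots": [{"name": "sign", "type": "ZodiacSign"}],
                    # Utterances: sample phrases that trigger the intent.
                    "samples": [
                        "what is the horoscope for {sign}",
                        "give me the {sign} horoscope",
                    ],
                },
                # Built-in intents need no samples of their own.
                {"name": "AMAZON.HelpIntent", "samples": []},
            ],
            # Custom slot type listing possible values.
            "types": [
                {
                    "name": "ZodiacSign",
                    "values": [
                        {"name": {"value": "aries"}},
                        {"name": {"value": "taurus"}},
                    ],
                }
            ],
        }
    }
}
```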
When you say the wake word ("Alexa"), everything after it is recorded and streamed to the cloud over Wi-Fi. The Alexa Voice Service (AVS) consists of services and APIs and uses machine learning and natural language understanding to process the request.
Frontend - Alexa Developer Console (Alexa Skills Kit (ASK))
Backend - AWS or whatever endpoint you prefer (HTTPS); verify the request is from Alexa and adhere to the ASK JSON format
Voice User Interface (VUI)
defines the intents (user requests) and words (utterances)
- Fulfillment logic (backend, typically Lambda)
- Connect the VUI to your code
- Building & testing (through the console and a real device)
- Distribution (store, business, personal)
- Certification (Amazon tests and verifies)
Collection of APIs, Tools, Documentation and Code Samples
npm install -g ask-cli
uses SMAPI behind the scenes
Manage your skill programmatically using API root api.amazonalexa.com
Graphical UI for creating skills, as an alternative to the CLI
AWS CodeStar, AWS CodePipeline, AWS CloudFormation
GUI for designers and story writers (content) and developers to implement those
Set up Skill Flow Builder as a Developer
npm install --global @alexa-games/sfb-cli
npx alexa-sfb
npx alexa-sfb vscode
npx alexa-sfb new my_best_story
npx alexa-sfb simulate <your_project_path>
Only used for Custom Skill, not needed for prebuilt model
Invoking using:
- IntentRequest ("Alexa, tell APP to do something")
- LaunchRequest (no intent: "Alexa, open APP")
- CanFulfillIntentRequest (no skill name; Alexa searches for a skill that matches the request)
An intent is an action that fulfills a user's request (it has a NAME and a list of UTTERANCES)
2 types of Intents:
- Built-in
Standard: AMAZON.CancelIntent, AMAZON.HelpIntent, AMAZON.StopIntent, AMAZON.FallbackIntent (no match), AMAZON.NextIntent, AMAZON.YesIntent, AMAZON.NoIntent, AMAZON.PauseIntent, AMAZON.SearchAction (look up information)
Standard with screen: AMAZON.NavigateHomeIntent (ends the skill session; the user leaves the skill)
- Custom (complete control)
Slot(s) are variable values captured from the user's utterance
- AudioPlayer - streaming audio and monitoring playback
- Display / Display Templates
- VideoApp - streaming video files
- GadgetController / GameEngine
- CanFulfillIntentRequest - invoked without a skill name
- Alexa Presentation Language (APL)
- Auto Delegation / Dialog
List of Alexa Interfaces and Supported Languages
- LaunchRequest - opening the skill
- IntentRequest - sent when intent corresponds with user query
- SessionEndedRequest - when session ends (any reason)
- CanFulfillIntentRequest - sent when Alexa is querying whether a skill can fulfill the intent (opt-in, but then you need to implement it for all intents)
Request Handlers - handle incoming requests and return responses
Exception Handlers - for errors
Handler Classes - implement the abstract class AbstractRequestHandler and its two methods, can_handle() and handle()
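The handler pattern can be sketched without the SDK installed. The AbstractRequestHandler below is a local stand-in for the class of the same name in the ASK SDK for Python (which has the same two methods); HelloWorldIntent and the dict-shaped handler_input are illustrative assumptions:

```python
from abc import ABC, abstractmethod

# Stand-in for the SDK's AbstractRequestHandler so the pattern runs
# without ask_sdk_core installed; the real class has the same shape.
class AbstractRequestHandler(ABC):
    @abstractmethod
    def can_handle(self, handler_input):
        ...

    @abstractmethod
    def handle(self, handler_input):
        ...


class HelloWorldIntentHandler(AbstractRequestHandler):
    """Handles a hypothetical HelloWorldIntent."""

    def can_handle(self, handler_input):
        # The real SDK provides utils like is_intent_name() for this check.
        request = handler_input["request"]
        return (request["type"] == "IntentRequest"
                and request["intent"]["name"] == "HelloWorldIntent")

    def handle(self, handler_input):
        # Return a minimal response body in ASK JSON shape.
        return {"outputSpeech": {"type": "PlainText",
                                 "text": "Hello, world!"}}


def dispatch(handler_input, handlers):
    # Minimal dispatch: first handler whose can_handle() is True wins,
    # which is how the SDK's request dispatcher behaves.
    for h in handlers:
        if h.can_handle(handler_input):
            return h.handle(handler_input)
    raise ValueError("no handler matched")
```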
Leverage userId and deviceId for personalization and segmentation along with DynamoDB
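A sketch of pulling those identifiers out of the request envelope for use as a DynamoDB key; the envelope layout follows the standard ASK request format, and the DynamoDB lookup is only indicated in a comment:

```python
# userId and deviceId live under context.System in the request envelope;
# userId is a natural DynamoDB partition key for per-user persistence.
def personalization_keys(envelope):
    system = envelope["context"]["System"]
    return {
        "user_id": system["user"]["userId"],
        "device_id": system["device"]["deviceId"],
    }

# e.g. table.get_item(Key={"id": keys["user_id"]}) with a boto3 DynamoDB
# Table resource to load remembered interactions (not executed here).
```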
Available standard built-in intents for Alexa-enabled devices with a screen
Must implement AMAZON.PreviousIntent and AMAZON.NextIntent
Card types:
- Simple (title and content)
- Standard (title, text and image)
- Linked Account
- AskForPermissionsConsent
Cards displayed on Alexa-enabled devices with a screen
When to include cards, and how often?
Limit the number of cards: too many will take the user out of the voice experience. Avoid pushing cards with every response unless it is absolutely necessary.
For screen devices, voice should remain the primary form of interaction, and as much as possible, the user should be able to navigate through the skill strictly by voice.
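As a sketch, a Simple card rides along in the response body next to the spoken output; field names follow the ASK response format, and the quiz wording is invented:

```python
# Response body with a Simple card (title + content). A Standard card
# would use "text" plus an "image" object instead of "content".
response = {
    "outputSpeech": {"type": "PlainText",
                     "text": "Your score is forty two."},
    "card": {
        "type": "Simple",
        "title": "Quiz Score",
        "content": "Score: 42 / 50",
    },
    # Task fulfilled, so the session ends.
    "shouldEndSession": True,
}
```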
Display templates:
- Body Templates (1 - 7) - can't select images
- List Templates - can select images
- Images
If a response includes both a display template and a card, the display template is shown
Check supportedInterfaces under the device object in the request to detect display support
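A minimal sketch of that check, assuming the standard request-envelope layout (context.System.device.supportedInterfaces):

```python
# Return True if the requesting device has a screen, so the skill knows
# whether to include a display template or APL document in the response.
def supports_display(envelope):
    device = envelope["context"]["System"]["device"]
    interfaces = device.get("supportedInterfaces", {})
    return "Display" in interfaces or "Alexa.Presentation.APL" in interfaces
```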
Offer more control than Display Template(s)
Understand Alexa Presentation Language (APL)
Leverage Voice & Tone tab to experiment with SSML tags
The Alexa.PlaybackController interface describes messages that are used to play, stop, and navigate playback for audio or video content.