Tag Archives: Voice User Interface

Call Centre Training e-poll

11 Oct

As part of our METALOGUE project, we have created an electronic poll (e-poll).


Our goal is to collect actual real-world requirements from Call Centre professionals that will inform our system pilot design and implementation. Through this and a number of other e-polls, we are asking some basic questions on Call Centre Agent training goals, Call Centre Agent preferences, target functionality of an automated agent training tool, etc.

We are inviting anyone from the Industry, from Call Centre Operators and Managers, Agent Trainers, to Call Centre Agents (experienced and novice) to participate. Feel free to add your own input and comments.

If you could also use the Contact form below to indicate whether you are a Call Centre Operator / Manager, Trainer, or Agent (or all of the above!), we would be able to collect some data on the demographics of the e-poll respondents.

Thank you in advance!


Cross-linguistic & Cross-cultural Voice Interaction Design

31 Jan

(update at the end)

2010 saw the first SpeechTEK Conference to have taken place outside of the US, SpeechTEK Europe 2010 in London. This year’s European Conference, SpeechTEK Europe 2011, will take place again in London (25 – 26 May 2011), but this time it will be preceded on Tuesday 24th May by a special Workshop on Cross-linguistic & Cross-cultural Voice Interaction Design organised by the Association for Voice Interaction Design (AVIxD). The main goal of AVIxD is to bring together voice interaction and experience designers from both Industry and Academia and, among other things, to “eliminate apathy and antipathy toward the need for good design of automated voice services” (that’s my favourite!). This is the first AVIxD Workshop to take place in Europe and I am honoured to have been appointed Co-Chair alongside Caroline Leathem-Collins from EIG.

Participation is free to AVIxD members and just £25 for non-members (which may be applied towards AVIxD membership). However, in order to participate in the workshop, you need to submit a brief position paper in English (approx. 500 words) on any of the special topics of interest of the Workshop (see CFP below). The deadline for electronic submissions is Friday 25 March, so you need to hurry if you want to be part of it!

Here’s the full Call for (Position) Papers from the AVIxD site:

Call for Position Papers

First European AVIxD Workshop

Cross-linguistic & Cross-cultural Voice Interaction Design

Tuesday, 24 May 2011 (just prior to SpeechTEK Europe 2011), 1 – 7 PM

London, England

The Association for Voice Interaction Design (AVIxD) invites you to join us for our first voice interaction design workshop held in Europe, Cross-linguistic & Cross-cultural Voice Interaction Design. The AVIxD workshop is a hands-on day-long session in which voice user interface practitioners come together to debate a topic of interest to the speech community. The workshop is a unique opportunity for them to meet with their peers and delve deeply into a single topic.

As in previous years with the AVIxD Workshops held in the US, we will write papers based on our discussions which we will then publish on www.avixd.org. Please visit our website to see papers from previous workshops, and for more details on the purpose of the organization and how you can be part of it.

In order to participate in the workshop, individuals must submit a position paper of approximately 500 words in English. Possible topics to touch upon in your submission (to be discussed in depth during the workshop) include:

  1. Language choice and user demographics
  2. Presentation of the language options to the caller and caller preference
  3. Creation and (co-)maintenance of dialogue designs, grammars, prompts across languages
  4. Political and sociolinguistic issues in system prompt choices and recognition grammars, such as code-switching, formal versus informal registers
  5. Guidelines for application localization, translation, and interpretation
  6. Setting expectations regarding availability of multilingual agents, Language- and culture-sensitive persona definition
  7. Coordinating usability testing and tuning across diverse linguistic / cultural groups
  8. Language choice and modality preference

We always encourage the use of specific examples from applications you’ve worked on in your position paper.

Participation is free to AVIxD members; non-members will be charged £25, which may be applied towards AVIxD membership at the workshop. Please submit your position papers via email no later than Friday 25 March 2011 to cfp@avixd.org. Letters of acceptance will be sent out on 30 March 2011.

We look forward to engaging with the European speech design community to discuss the particular challenges of designing speech solutions for users from diverse linguistic and cultural backgrounds. Feel free to contact either of the co-chairs below, if you have any questions.

Caroline Leathem-Collins, EIG  (caroline {at} eiginc {dot} com)

Maria Aretoulaki, DialogCONNECTION Ltd (maria {at} dialogconnection {dot} com)

UPDATE

SpeechTEK Europe 2011 has come and gone and I’ve got many interesting things to report (as I have been tweeting through my @dialogconnectio Twitter account).

But first, here are the slides for my presentation at the main conference on the outcome of the AVIxD Workshop on Cross-linguistic & Cross-cultural Voice Interaction Design organised by the Association for Voice Interaction Design (AVIxD). I only had 12 hours to prepare them – including sleep and London tube commute – so I had to practically keep working on them until shortly before the Session! Still, I think the slides capture the breadth and depth of topics discussed or at least touched upon at the Workshop. There are several people now writing up all these topics and there should be one or more white papers on them very soon (by the end of July we hope!). So the slides did their job after all!

Get the slides in PDF here:  Maria Aretoulaki – SpeechTEK Europe 2011 presentation.

2010 in review – Not bad at all :)

3 Jan

The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

Healthy blog!

The Blog-Health-o-Meter™ reads Wow.

Crunchy numbers


A Boeing 747-400 passenger jet can hold 416 passengers. This blog was viewed about 7,600 times in 2010. That’s about 18 full 747s.

In 2010, there were 9 new posts, not bad for the first year! There were 32 pictures uploaded, taking up a total of 5 MB. That’s about 3 pictures per month.

The busiest day of the year was September 15th with 326 views. The most popular post that day was The Social Media scene in Manchester (UK) is very sociable!.

Where did they come from?

The top referring sites in 2010 were linkedin.com, mail.live.com, mail.yahoo.com, twitter.com, and facebook.com.

Some visitors came searching, mostly for voice activated lift, scottish voice activated lift, voice activated lift scotland, and scottish voice activated elevator.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

  1. The Social Media scene in Manchester (UK) is very sociable! (September 2010, 32 comments)
  2. The voice-activated lift won’t do Scottish! (July 2010, 4 comments)
  3. Speech Recognition for Dummies (May 2010, 12 comments)
  4. The Loneliness of the long-distance … VUI Designer! (June 2010, 5 comments)
  5. About (May 2010, 1 comment)

The voice-activated lift won’t do Scottish! (Burnistoun S1E1 – ELEVEN!)

28 Jul

Voice recognition technology? …  In a lift? … In Scotland? … You ever TRIED voice recognition technology? It don’t do Scottish accents!

Today I found this little gem on YouTube and I thought I must share it, as, apart from being hilarious, it says a thing or two about speech recognition and speech-activated applications. It’s all based on the urban myth that speech recognisers cannot understand regional accents, such as Scottish and Irish.

Scottish Elevator – Voice Recognition – ELEVEN!

(YouTube – Burnistoun – Series 1 , Episode 1 [ Part 1/3 ])

What? No Buttons?!

These two Scottish guys enter a lift somewhere in Scotland and find that there are no buttons for the floor selection, so they quickly realise it’s a “voice-activated elevator”, as the system calls itself. They want to go to the 11th floor and they first pronounce it the Scottish way:

/eh leh ven/

That doesn’t seem to work at all.

“You need to try an American accent”, says one of them, so they try to mimic one, sadly very unsuccessfully:

/ee leh ven/

Then they try a quite funny, Cockney-like English accent:

/ä leh ven/

to no avail.

VUI Sin No. 1: Being condescending to your users

The system prompts them to “Please speak slowly and clearly”, which is exactly what they had been doing up to then in the first place! Instead, it should have said something along the lines of “I’m afraid I didn’t get that. Let’s try again.” and later “I’m really sorry, but I don’t seem to understand what you’re saying. Maybe you would like to try one more time?”. Of course, not having any buttons in the lift means that these guys could be stuck in there forever! That’s another fatal usability error: both modalities, speech and button presses, should have been allowed to cater for different user groups (easy accents, tricky accents) and different use contexts (people who have got their hands full with carrier bags vs people who can press a button!).
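To make the point about error prompts a bit more concrete, here is a minimal sketch in Python of how a dialogue could escalate politely and then hand over to the other modality. The two apologetic prompts are the ones suggested above; the function name, the number of retries and the wording of the button fallback are my own assumptions for illustration, not taken from any real lift or VUI platform.

```python
# Hypothetical escalation of "no match" prompts for a lift dialogue.
# The first two wordings come from the suggestions in this post;
# FALLBACK_PROMPT and the retry limit are assumptions for illustration.
NO_MATCH_PROMPTS = [
    "I'm afraid I didn't get that. Let's try again.",
    "I'm really sorry, but I don't seem to understand what you're saying. "
    "Maybe you would like to try one more time?",
]
FALLBACK_PROMPT = "Please press the button for your floor."


def next_prompt(failed_attempts: int) -> str:
    """Return the prompt to play after this many consecutive recognition failures."""
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts <= len(NO_MATCH_PROMPTS):
        # Escalate gently: the system takes the blame, not the user.
        return NO_MATCH_PROMPTS[failed_attempts - 1]
    # Speech has failed repeatedly: offer the second modality instead of looping.
    return FALLBACK_PROMPT
```

Calling `next_prompt(1)` and `next_prompt(2)` returns the two apologies in order, and any later failure returns the button fallback, so nobody is left repeating themselves forever.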

I’m gonna teach you a lesson!

One of them tries to teach the system the Scottish accent: “I keep saying it until she understands Scottish!”, a very reasonable expectation, which would work particularly well with a speaker-dependent dictation system of the kind you’ve got on your PC, laptop or hand-held device. This speaker-independent one (‘cos you can’t really have your personal lift in each building you enter!) will take a bit more time to learn anything from a single conversation! It requires time analysing the recordings, their transcriptions and semantic interpretations, comparing what the system understood with what the user actually said and using those observations to tune the whole system. We are talking at least a week in most cases. They would die of dehydration and starvation by then!
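The comparison step described above (what the user actually said vs what the system understood) can be sketched very roughly as follows. This is only a toy illustration in Python: the function name and the sample log are invented, and a real tuning exercise works on full recordings, transcriptions and semantic interpretations rather than on single words.

```python
from collections import Counter


def error_rates(log):
    """Per spoken word, the fraction of utterances the system got wrong.

    `log` is an iterable of (said, understood) pairs, where `said` is the
    human transcription and `understood` is the recogniser's result.
    """
    totals = Counter()
    errors = Counter()
    for said, understood in log:
        totals[said] += 1
        if said != understood:
            errors[said] += 1
    return {word: errors[word] / totals[word] for word in totals}


# Invented sample data, for illustration only.
sample_log = [
    ("eleven", "seven"),
    ("eleven", "eleven"),
    ("two", "two"),
]
rates = error_rates(sample_log)
```

Words with a high error rate are the first candidates for tuning, for example by adding pronunciations or adjusting the grammar.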

VUI Sin No.2: Patronising your users until they explode

After a while, the system makes it worse by saying what no system should ever dare say to a user’s face: “Please state which floor you would like to go to in a clear and calm manner.” Patronising or what! The guys’ reaction is not surprising: “Why is it telling people to be calm?! .. cos Scottish people would be going out for MONTHS at it!”.

Well, that’s not actually true. These days off-the-shelf speech recognition software is optimised to work with most main accents in a language, yes, including Glaswegian! Millions of real-world utterances spoken by thousands of people with all possible accents in a language (and this for many different languages too) are used to statistically train the recognition software to work equally well with most of them and for most of the time. These utterances are collected from applications that are already live and running somewhere in the world for the corresponding language. The more real-world data available, the better the software can be tuned and the more accurate the recognition of “weird” pronunciations will be, even when you take the software out of the box.

VUI Best Practice: Tune your application to cater for YOUR user population

An additional safeguarding and optimising technique is tuning the pronunciations for a specific speech recognition application.  So when you already know that your system will be deployed in Scotland, you’d better add the Scottish pronunciation for each word explicitly in the recognition lexicon.  This includes manually adding /eh leh ven/ , as the standard /ee leh ven/ pronunciation is not likely to work very well. Given that applications are usually restricted to a specific domain anyway (selecting floors in a lift, getting your bank account balance, choosing departure and arrival train times etc.), this only needs to be done for the core words and phrases in your application, rather than the whole English, French, or Farsi language! So do not despair, there’s hope for freedom (of speech) even for the Scottish! 🙂
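As a rough sketch of what adding a pronunciation to the recognition lexicon amounts to, here is a toy version in Python. It reuses the informal /eh leh ven/ notation from this post; real recognisers use proper phonetic alphabets and their own lexicon formats, so the data structure and function names below are assumptions for illustration only.

```python
# Toy recognition lexicon: each word maps to its accepted pronunciations.
# The notation is the informal one used in this post, not a real phonetic alphabet.
LEXICON = {
    "eleven": ["ee leh ven"],  # the "standard" pronunciation
}


def add_pronunciation(lexicon, word, pronunciation):
    """Add a pronunciation variant for a word, avoiding duplicates."""
    variants = lexicon.setdefault(word, [])
    if pronunciation not in variants:
        variants.append(pronunciation)


def lookup(lexicon, heard):
    """Return the words that accept the pronunciation that was heard."""
    return [word for word, variants in lexicon.items() if heard in variants]


# The system will be deployed in Scotland, so add the Scottish variant explicitly.
add_pronunciation(LEXICON, "eleven", "eh leh ven")
```

After the last line, `lookup(LEXICON, "eh leh ven")` finds "eleven", whereas before it would have come back empty. Because the lift only needs floor numbers and a handful of commands, this is a short list to maintain.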

For a full transcript of the video, check out EnglishCentral.

OK, so why Speech?

14 May


Businesses of all sizes, governments and other organisations are introducing Automatic Speech Recognition (ASR) in their existing Customer Relationship Management (CRM) processes, or upgrading their Touchtone (DTMF) IVRs, or even deploying brand new services from scratch. Their motivation is to keep Call Centre and Helpline costs down, aiming at the same time towards 24/7 availability of both information and services to their customers, as well as towards increasing customer satisfaction and loyalty.

A number of questions need to be answered, however, before going ahead with implementing a speech-activated or speech-enabled self-service:

  • Why use speech recognition in your CRM process at all?
  • Is speech really necessary or is touchtone sufficient or even more suited to your purposes?
  • Should perhaps your service combine both speech and touchtone? Which modality should be used where and when?
  • What is VUI Design and why and where will you need it?
  • How can you tell a “good” from a “bad” design?
  • How can you test a service and how can you ensure your customers will accept and even … like it?!
  • Is it possible to optimise an existing service and how?

This is where a Discovery Workshop will come in handy!

Define the solution through a series of Discovery Workshops

A proper VUI Designer will work closely with your organisation to help you answer questions such as the above and to help you decide on the potential business case for the introduction of speech and/or touchtone (DTMF) in your existing CRM processes. To this effect, intensive and productive 1-5 day Discovery Workshops should be organised, which will also be used for the conception and design of new services – if applicable.

In the process, the VUI Designer should talk extensively to both your Accounts and Marketing executives, and your IT staff, as all aspects of your business need to be taken into account in order to have a comprehensive, representative and realistic view of the existing and any potential issues, and the possibilities for optimisation. Part of this process involves identifying and interviewing real people, representative of your target market segments. The logic behind this is to pinpoint more accurately and effectively the needs, goals and expectations of both sides (the organisation and the end customer) regarding the planned service. Existing business processes, marketing strategies and channels are analysed, along with financial, logistical and technical constraints and targets.

The outcome of these brainstorming sessions and workshops should be a VUI Vision Proposal along with a Statement of Works report. The VUI Vision paper sketches out the proposed Voice User Interface, both in terms of suggested and desired functionality (what the system can and cannot do) but also in terms of hear-and-feel (communication style and tone). The accompanying Statement of Works is again a proposal on the corresponding list of tasks and deliverables towards the implementation of the VUI vision and feeds into the final Project Plan.

Some organisations decide – to their peril – to limit the time spent on such Brainstorming activities (too “fluffy” for them!) or even skip them altogether. The repercussions later on in the project cycle can be devastating. Erroneous or unrealistic assumptions about what a service should do and what its users expect or how they behave can mean that the whole time designing and implementing the solution could go to waste. After that, starting again from scratch is the only – very embarrassing! – option!