Tag Archives: VUI Designer

A speech recognition user interface works when it … disappears!

25 Oct

Today is a big day for me! I’m finally getting to meet in person one of the Coryphées of the VUI Design World (even though as far as I know he’s not a ballet dancer), Bruce Balentine of the Enterprise Integration Group (EIG).  Bruce is the author of one of the best books ever written on IVR / Speech applications / Voice User Interface Design, It’s Better to Be a Good Machine Than a Bad Person – Speech Recognition and Other Exotic User Interfaces at the Twilight of the Jetsonian Age.

Apart from the ingenuity of the title itself, which encapsulates the golden rule of good user experience / usability design, you can readily see to what great lengths Bruce has gone to serve his pearls of design wisdom in a most humorous and utterly witty way. This in no way diminishes the importance, relevance and truthfulness of his observations and recommendations. Bruce is a veteran designer and he has seen it all before, from the excitement and optimism to the disappointment and pessimism, to the final destination, design realism:

First we tried to make them human. Now it’s time to make them work 

To get a flavour of the type of UX design advice and messages conveyed in the book, here’s an extract from Chapter 132: Will Speech Technology Ever Work? (pp. 393-395 in my 2007 edition):

In closing, I must ask the question. Will it ever work? And, of course, the answer is, yes. Speech recognition—and its related technologies (e.g., speaker verification, text-to-speech, audio indexing, speech data mining, dictation) will work. Indeed they already do. They will fill their respective application niches almost completely. And, in fact, the majority will do so quite soon. What will change is the definition of “work”.

Speech recognition is primarily a user interface technology. As such, it works when it disappears. It’s really that simple. When the users are not thinking about the user interface, but instead are accomplishing the task to which they are connected by the user interface, then and only then can the interface be said to be “working.” We have to stay on message with this fundamental fact if we are ever to succeed at bringing speech to the performance level where we can legitimately claim that it “works.”

True words!!! As a bonus, Leslie Degler’s illustrations perfectly complement and enhance the messages conveyed in the text, once again in the wittiest and most original manner. Buy this book ASAP! After all, if you don’t agree with its theses, you can always return it. All you need to do is:

Write out in longhand, on a separate page, “I,” and add your name, “agree that there’s not a chance in Hell any refund will ever come of this claim.” Label this statement as your “declaration.”  

After you have received your refund, we’ll call you with an outbound IVR that asks you several hundred thought-provoking questions about your customer experience. We value your opinion—please give us your most honest and spontaneous responses. We’ll do our best to recognize them

It says it all really! :)

To date, I have only met Bruce virtually, through Skype calls and the Creative Speech Technology Network (CreST) of which we are both members, and I can already tell he is a very funny, witty, creative (musical!), interesting and intelligent person. So I can’t wait to meet him in person later today and hear some more fascinating stories and hilarious anecdotes from the world of speech recognition application design, voice interface usability and technology abuse!

UPDATE:

I went (to the dinner with Bruce) and (was) conquered by the brilliance and wit of the man! I got my long-awaited autograph in his book too, as I can now prove!

Speech Interaction on Mobile Devices at SpeechTEK 2011 (New York)

7 Aug

Today sees the launch of the Joint AVIxD / IxDA Workshop on Speech Interaction on Mobile Devices that kick-starts the mother of Voice Solutions Fairs, SpeechTEK 2011 in New York next week (8-10 Aug).

AVIxD is the Association for Voice Interaction Design, a professional organisation that aims to

“eliminate apathy and antipathy toward the need for good design of automated voice services”, 

which has become my favourite VUI mantra!

IxDA is the Interaction Design Association, a much bigger professional “un-organisation” which intends to:

“improve the human condition by advancing the discipline of Interaction Design”

A very worthy cause indeed, especially since it is true that “the human condition is increasingly challenged by poor experiences”!

Today’s Joint Workshop in New York aims to bring together interaction design practitioners from across the voice, interactive, and digital areas to identify the issues and challenges involved in speech interaction design on mobile devices, such as smartphones and tablets, and to come up, by the end of the day, with ways to approach or even tackle them. A very ambitious format that, however, really does work!

AVIxD organised another Workshop this year on Cross-linguistic & Cross-cultural Voice Interaction Design, which was also the 1st European Workshop, just before SpeechTEK Europe in London this past May. See what we all came up with in those 6 hours in the SpeechTEK Europe PDF presentation below.

And if you don’t manage to take part in today’s workshop, make sure you go to the SpeechTEK Conference and Exhibition itself that starts tomorrow and runs until Wednesday the 10th. Listen to presentations and see or even try for yourself market-ready products relating to:

  • multimodal applications
  • cross-channel applications
  • speech analytics
  • speaker identification and verification
  • in-car systems
  • natural language and say-anything technologies
  • speech translation
  • voice-enabled personal assistants
  • as well as the latest speech recognition techniques and technologies

I particularly recommend the Keynote Panel on “Mobility — A Game-Changer for Speech?” on Tuesday on how smartphones are dramatically changing how customers interact with businesses and with the devices themselves. Some really interesting issues and questions will be raised, such as:

* How will voice user interfaces be integrated with graphical user interfaces?

or

* Will users embrace voice as they have embraced keypads on mobile devices? 

Sadly I am in the UK today and next week, so I’m going to miss it all. But if you are lucky enough to be in or near New York, make sure you go and enjoy!

Cross-linguistic & Cross-cultural Voice Interaction Design

31 Jan

(update at the end)

2010 saw the first SpeechTEK Conference to have taken place outside of the US, SpeechTEK Europe 2010 in London. This year’s European Conference, SpeechTEK Europe 2011, will take place again in London (25 – 26 May 2011), but this time it will be preceded on Tuesday 24th May by a special Workshop on Cross-linguistic & Cross-cultural Voice Interaction Design organised by the Association for Voice Interaction Design (AVIxD). The main goal of AVIxD is to bring together voice interaction and experience designers from both Industry and Academia and, among other things, to “eliminate apathy and antipathy toward the need for good design of automated voice services” (that’s my favourite!). This is the first AVIxD Workshop to take place in Europe and I am honoured to have been appointed Co-Chair alongside Caroline Leathem-Collins from EIG.

Participation is free to AVIxD members and just £25 for non-members (which may be applied towards AVIxD membership). However, in order to participate in the workshop, you need to submit a brief position paper in English (approx. 500 words) on any of the special topics of interest of the Workshop (see CFP below). The deadline for electronic submissions is Friday 25 March, so you need to hurry if you want to be part of it!

Here’s the full Call for (Position) Papers from the AVIxD site:

Call for Position Papers

First European AVIxD Workshop

Cross-linguistic & Cross-cultural Voice Interaction Design

Tuesday, 24 May 2011 (just prior to SpeechTEK Europe 2011), 1 – 7 PM

London, England

The Association for Voice Interaction Design (AVIxD) invites you to join us for our first voice interaction design workshop held in Europe, Cross-linguistic & Cross-cultural Voice Interaction Design. The AVIxD workshop is a hands-on day-long session in which voice user interface practitioners come together to debate a topic of interest to the speech community. The workshop is a unique opportunity for them to meet with their peers and delve deeply into a single topic.

As in previous years with the AVIxD Workshops held in the US, we will write papers based on our discussions which we will then publish on www.avixd.org. Please visit our website to see papers from previous workshops, and for more details on the purpose of the organization and how you can be part of it.

In order to participate in the workshop, individuals must submit a position paper of approximately 500 words in English. Possible topics to touch upon in your submission (to be discussed in depth during the workshop) include:

  1. Language choice and user demographics
  2. Presentation of the language options to the caller and caller preference
  3. Creation and (co-)maintenance of dialogue designs, grammars, prompts across languages
  4. Political and sociolinguistic issues in system prompt choices and recognition grammars, such as code-switching, formal versus informal registers
  5. Guidelines for application localization, translation, and interpretation
  6. Setting expectations regarding availability of multilingual agents
  7. Language- and culture-sensitive persona definition
  8. Coordinating usability testing and tuning across diverse linguistic / cultural groups
  9. Language choice and modality preference

We always encourage the use of specific examples from applications you’ve worked on in your position paper.

Participation is free to AVIxD members; non-members will be charged £25, which may be applied towards AVIxD membership at the workshop. Please submit your position papers via email no later than Friday 25 March 2011 to cfp@avixd.org. Letters of acceptance will be sent out on 30 March 2011.

We look forward to engaging with the European speech design community to discuss the particular challenges of designing speech solutions for users from diverse linguistic and cultural backgrounds. Feel free to contact either of the co-chairs below, if you have any questions.

Caroline Leathem-Collins, EIG  (caroline {at} eiginc {dot} com)

Maria Aretoulaki, DialogCONNECTION Ltd (maria {at} dialogconnection {dot} com)

UPDATE

SpeechTEK Europe 2011 has come and gone and I’ve got many interesting things to report (as I have been tweeting through my @dialogconnectio Twitter account).

But first, here are the slides for my presentation at the main conference on the outcome of the AVIxD Workshop on Cross-linguistic & Cross-cultural Voice Interaction Design. I only had 12 hours to prepare them – including sleep and London tube commute – so I had to practically keep working on them until shortly before the Session! Still, I think the slides capture the breadth and depth of topics discussed or at least touched upon at the Workshop. There are several people now writing up on all these topics and there should be one or more white papers on them very soon (by the end of July we hope!). So the slides did their job after all!

Get the slides in PDF here:  Maria Aretoulaki – SpeechTEK Europe 2011 presentation.

The eternal battle between the VUI Designer & the Customer

7 Dec

I promised some time ago to put up the slides of my presentation at SpeechTEK Europe 2010 in London, the first SpeechTEK to have taken place outside of the US. My presentation, “The Eternal Battle Between the VUI Designer and the Customer”, was on Wednesday 26th May 2010 and opened the “Voice User Interface Design: Major Issues” Session. It went down really well, and afterwards several people in the audience told me about their own experience and asked me for tips on how to deal with similar issues.

Here is a PDF with the presentation slides:

Maria Aretoulaki – “The Eternal Battle Between the VUI Designer and the Customer” (SpeechTEK Europe 2010 presentation)

Maria Aretoulaki – SpeechTEK Europe 2010 presentation UPDATED ppt

And here’s the gist of it:

VUI Design is preoccupied with the conception, the design, the implementation, the testing, and the tuning of solutions that work in the manner that is most efficient, most secure and least irritating for the user. Well, realistically that’s what VUI Design can achieve. In an ideal world, the VUI Designer would actually strive to create speech applications that, apart from taking into consideration the customer’s financial and brand requirements, would also fit the caller’s needs, goals and preferences. The initial Requirements analysis should bring both into focus. This much is already known and accepted among both VUI Designers and customers.

The problems start just after they all leave the meeting room and start working on the implementation: Call flow design, system persona development and prompt crafting, and even recognition grammars, all seem to fall victim to a war of words and attitudes between the VUI Design expert, who has seen systems being developed and spurned before, and the customer with his tech-savvy business team and their technical architects and programming geniuses, who all think they know what callers want and how call flows should be structured, prompt wording crafted and grammars written, just because they have strong opinions! Even the results of Usability tests are liable to different interpretations by each side.

This presentation pinpoints common pitfalls in the communication between a VUI Designer and customer employees and recommends ways to resolve conflicts and disagreements on the application design and implementation.

Credits:

SpeechTEK Europe 2010 was organised by:

Information Today, Inc.
143 Old Marlton Pike
Medford NJ 08055 U.S.A.
Phone 1 (609) 654-6266.
http://www.infotoday.com

The Loneliness of the long-distance … VUI Designer!

13 Jun

On Friday 11th June, I took part in the “Pathways” event organised annually by the University of Manchester Career Service to support PhD researchers as well as research staff in “making career choices, exploring future plans and discovering the breadth of opportunities available to them”. I was Guest Panellist at 3 different Sessions:

  1. Opportunities for Engineering and Physical Sciences
  2. Working as a Freelancer or Consultant and
  3. Enterprise, Entrepreneurship and Business Start Up

As a University of Manchester graduate (well, technically UMIST), I felt compelled to take part in those Question and Answer panels in order to give some insight into how a career can develop: from a Bachelors in English & Linguistics in Greece, to a Masters of Science in Machine Translation and a Doctorate in Automatic Text Summarisation in the UK, to a Post-Doctoral Fellowship in Spoken Dialogue Management and a position as a Research Project Manager in Germany, to working in Industry both as a full-time employee and as an external contractor as a Voice User Interface (VUI) Designer in Germany, the UK, Switzerland, the US and further afield. It’s been a fascinating journey for sure! And I probably would never have arrived where I am now, if I hadn’t done those degrees or taken up those jobs in those specific places.

Have a look at the Guest Speaker profiles, including mine (p. 24), here: http://www.careers.manchester.ac.uk/media/media,172749,en.pdf

Some very inspiring career journeys!

I have to say I have thoroughly enjoyed the whole journey, the projects I have worked on, the people I have met on the way, the different organisational cultures I had the chance to experience. Plus, I wouldn’t change what I do now for the world! I love working as an external contractor and coming in to design speech self-service systems and voice-to-text services from scratch, or optimise existing ones, and the whole development, testing and tuning cycles:

  • writing functional specification documents
  • defining the system persona
  • drawing call flows
  • crafting system messages and coaching voice talents for the recordings
  • writing speech recognition grammars and pronunciations
  • devising and carrying out Wizard-of-Oz tests and Usability tests (including recording test subjects on video and interviewing them afterwards!)
  • transcribing and analysing phone calls
  • writing tuning reports

Everything is a lot of fun! It’s also great to be bringing the same VUI Design processes and skills to different organisations and projects, and to get to work at different places in the world at any one time! I love the variety of work and location of work, as well as the flexibility to work anytime and from anywhere! (Yes, working on your laptop – iPad soon – from a beach in the Caribbean is no longer a daydream but a realistic plan! :) )

Okay, it does get lonely. No gossiping in the kitchen during coffee breaks and no Christmas office parties. I still get to have probably as many face-to-face project meetings and conference calls as the average office worker, though. We all have to work independently and in isolation anyway when analysing data or composing a report. The only difference is that office workers also have to put up with the hectic running-around of their colleagues and lots of intrusive, loud phone calls that they have to witness unwillingly in silence. So my loneliness is a very contented one! :D

Does Your Customer Know What They are Signing off??

3 Jun

Just back from SpeechTEK Europe 2010, the first SpeechTEK to take place outside of the US, which was great fun. I gave a presentation on “The Eternal Battle Between the VUI Designer and the Customer”, which went down quite well (more on that in my next blog). I heard many interesting new ideas about how normal people view normal communication channels to a company or organisation (the Web is prevailing but multimodality and crosschannel communication will be indispensable in a couple of years), heard about new applications of speech and touchtone and the challenges they are facing, and met up with loads of people I know in the field from companies I’ve worked for and cities I have worked in. I have started a few projects and collaborations as a result (again to be announced in my next blog), but for now I would like to share my presentation at SpeechTEK 2007 in New York on Monday 20th August 2007 (how time passes!), entitled “Does Your Customer Know What They are Signing off?”.

Maria Aretoulaki – SpeechTEK 2007 presentation – opening slide

As it says in the accompanying blurb: “This presentation stresses the importance of incremental and modular descriptions of system functionality for targeted and phased reviews and testing. This strategy ensures clarity, consistency, and maintainability beyond the project lifetime and eliminates the need for changes midproject, thus both managing customer expectations and protecting the service provider from ad-hoc requests.”

Here is a PDF with the presentation slides:

Maria Aretoulaki – SpeechTEK 2007 presentation : “Does Your Customer Know What They are Signing off?”

You can also get the Powerpoint file from the SpeechTEK site itself at: http://conferences.infotoday.com/stats/documents/default.aspx?id=29&lnk=http%3A%2F%2Fconferences.infotoday.com%2Fdocuments%2F27%2FB105_Aretoulaki.pps

The idea is to have a standardised way to document speech application design both in terms of call flow depictions and in terms of functionality description. In addition, 3 different tiers of functionality and call flow representation are proposed, from the more abstract High-Level design (what range of tasks can a system perform?), to the rather detailed Macro-Level (all the user interaction and back-end processes and their interdependencies), to the very detailed Micro-Level which documents every single condition, system prompt and related recognition grammar.

Maria Aretoulaki – 3-tier speech app design representation

The point is that, in every speech project, a number of people with very different backgrounds, roles and expectations are involved, from the Business-minded, to the Techie, to the Usability expert: from Account Managers to the Marketing Strategists, to the Call Centre Managers, the IT Managers, the System Architects, the Programmers, and the VUI Designer themselves (more on these different characters in my next blog with my SpeechTEK 2010 presentation). The 3 different tiers of speech design representation and documentation are ideal for catering to the diverse information needs of those very different groups.

The Business and Marketing guys understand better the High-Level representation with the list of things that the system can do in different cases. The Call Centre Managers and some very involved (and worried!) business guys from the side of the customer feel better when they see the Macro-Level detail, because they feel they have more information and therefore more control over what is being designed and implemented. It is also something very concrete to sign off (and therefore difficult to dispute at will later on). The VUI Designer and the System Architect and the various application developers really need the excruciating detail of the Micro-Level: every single condition (including every case where things go wrong) needs to be documented, along with every different prompt that the system will utter (including when it doesn’t recognise or even hear what the caller says), and every speech recognition grammar that is activated every time the system expects a reaction from the caller / user.

The inherent modularity and the incremental nature of the design representation mean that it can be more easily maintained, more readily modified, and even more straightforwardly adopted and adapted for other speech and multimodal applications in the future. So everybody’s happy :)
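For the more Techie readers, here is a rough sketch in Python of how the 3 tiers could be nested so that each audience only sees the level of detail it needs. This is not the notation used in the presentation slides, just one possible illustration: all the class names, field names and the example "Check account balance" task are hypothetical and made up for this sketch.

```python
from dataclasses import dataclass, field
from typing import List

TIERS = ("high", "macro", "micro")


@dataclass
class MicroState:
    """Micro-Level (hypothetical): one dialogue state with its prompts and grammar."""
    name: str
    prompt: str           # what the system says on entering the state
    grammar: str          # name of the recognition grammar that is active
    no_match_prompt: str  # played when the caller is heard but not recognised
    no_input_prompt: str  # played when the caller is not heard at all


@dataclass
class MacroStep:
    """Macro-Level (hypothetical): one user interaction or back-end process."""
    name: str
    depends_on: List[str] = field(default_factory=list)
    states: List[MicroState] = field(default_factory=list)


@dataclass
class HighLevelTask:
    """High-Level (hypothetical): one task that the system can perform."""
    name: str
    steps: List[MacroStep] = field(default_factory=list)


def view(task: HighLevelTask, tier: str) -> List[str]:
    """Return the documentation lines that a reader of the given tier would see."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    lines = [f"Task: {task.name}"]
    if tier == "high":
        return lines
    for step in task.steps:
        deps = ", ".join(step.depends_on) or "none"
        lines.append(f"  Step: {step.name} (depends on: {deps})")
        if tier == "micro":
            for state in step.states:
                lines.append(f"    State: {state.name} [grammar: {state.grammar}]")
                lines.append(f"      prompt: {state.prompt}")
                lines.append(f"      no-match: {state.no_match_prompt}")
                lines.append(f"      no-input: {state.no_input_prompt}")
    return lines


# Made-up example task, for illustration only.
balance = HighLevelTask(
    name="Check account balance",
    steps=[
        MacroStep(
            name="Identify caller",
            states=[
                MicroState(
                    name="AskAccountNumber",
                    prompt="Please say your account number.",
                    grammar="account_number",
                    no_match_prompt="Sorry, I didn't get that. Please say your account number.",
                    no_input_prompt="Sorry, I didn't hear you. Please say your account number.",
                )
            ],
        ),
        MacroStep(name="Read out balance", depends_on=["Identify caller"]),
    ],
)

for line in view(balance, "macro"):
    print(line)
```

Calling view(balance, "high") would give the Business and Marketing view, "macro" the sign-off view, and "micro" the full detail for designers and developers. Because each tier simply nests inside the one above, adding a new state or step does not disturb the higher-level views, which is the modularity argument in a nutshell.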

I gave this presentation when I was Head of Speech Design at Vicorp, although the basic ideas behind it matured during the time I was Senior VUI Designer at Intervoice (now Convergys).

Credits:

SpeechTEK 2007 was organised by:

Information Today, Inc.
143 Old Marlton Pike
Medford NJ 08055 U.S.A.
Phone 1 (609) 654-6266.
http://www.infotoday.com

OK, so why Speech?

14 May

Businesses of all sizes, governments and other organisations are introducing Automatic Speech Recognition (ASR) in their existing Customer Relationship Management (CRM) processes, or upgrading their Touchtone (DTMF) IVRs, or even deploying brand new services from scratch. Their motivation is to keep Call Centre and Helpline costs down, aiming at the same time towards 24/7 availability of both information and services to their customers, as well as towards increasing customer satisfaction and loyalty.

A number of questions need to be answered, however, before going ahead with implementing a speech-activated or speech-enabled self-service:

  • Why use speech recognition in your CRM process at all?
  • Is speech really necessary or is touchtone sufficient or even more suited to your purposes?
  • Should perhaps your service combine both speech and touchtone? Which modality should be used where and when?
  • What is VUI Design and why and where will you need it?
  • How can you tell a “good” from a “bad” design?
  • How can you test a service and how can you ensure your customers will accept and even … like it?!
  • Is it possible to optimise an existing service and how?

This is where a Discovery Workshop will come in handy!

Define the solution through a series of Discovery Workshops

A proper VUI Designer will work closely with your organisation to help you answer questions such as the above and to help you decide on the potential business case for the introduction of speech and/or touchtone (DTMF) in your existing CRM processes. To this end, intensive and productive 1-5 day Discovery Workshops should be organised, which will also be used for the conception and design of new services – if applicable.

In the process, the VUI Designer should talk extensively to both your Accounts and Marketing executives, and your IT staff, as all aspects of your business need to be taken into account in order to have a comprehensive, representative and realistic view of the existing and any potential issues, and the possibilities for optimisation. Part of this process involves identifying and interviewing real people, representative of your target market segments. The logic behind this is to pinpoint more accurately and effectively the needs, goals and expectations of both sides (the organisation and the end customer) regarding the planned service. Existing business processes, marketing strategies and channels are analysed, along with financial, logistical and technical constraints and targets.

The outcome of these brainstorming sessions and workshops should be a VUI Vision Proposal along with a Statement of Works report. The VUI Vision paper sketches out the proposed Voice User Interface, both in terms of suggested and desired functionality (what the system can and cannot do) but also in terms of hear-and-feel (communication style and tone). The accompanying Statement of Works is again a proposal on the corresponding list of tasks and deliverables towards the implementation of the VUI vision and feeds into the final Project Plan.

Some organisations decide – to their peril – to limit the time spent on such Brainstorming activities (too “fluffy” for them!) or even skip them altogether. The repercussions later on in the project cycle can be devastating. Erroneous or unrealistic assumptions about what a service should do and what its users expect or how they behave can mean that all the time spent designing and implementing the solution goes to waste. After that, starting again from scratch is the only – very embarrassing! – option!
