Tag Archives: voice-to-text

UBIQUITOUS VOICE: Essays from the Field now on Kindle!

14 Oct

In 2018, a new book on “Voice First” came out on Amazon and I was proud and deeply honoured, as it includes one of my articles! Now it has come out on Kindle as an e-Book and we are even more excited at the prospect of a much wider reach!

“Ubiquitous Voice: Essays from the Field”: Thoughts, insights and anecdotes on Speech Recognition, Voice User Interfaces, Voice Assistants, Conversational Intelligence, VUI Design, Voice UX issues, solutions, Best practices and visions from the veterans!

I have been part of this effort since its inception, working alongside some of the pioneers in the field who now represent the Market Leaders (GOOGLE, AMAZON, NUANCE, SAMSUNG VIV, ...). Excellent job by our tireless and intrepid Editor, Lisa Falkson!

My contribution “Convenience + Security = Trust: Do you trust your Intelligent Assistant?” is on data privacy concerns and social issues associated with the widespread adoption of voice activation. It is thus platform-, ASR-, vendor- and company-agnostic.

You can get the physical book here and the Kindle version here.

Prepare to be enlightened, guided and inspired!

Meet META, the Meta-cognitive skills Training Avatar!

16 Jun


Since November 2013, I’ve had the opportunity to participate in the EU-funded FP7 R&D project, METALOGUE, through my company DialogCONNECTION Ltd, one of 10 Consortium Partners. The project aims to develop a natural, flexible, and interactive Multi-perspective and Multi-modal Dialogue system with meta-cognitive abilities; a system that can:

  • monitor, reason about, and provide feedback on its own behaviour, intentions and strategies, and the dialogue itself,
  • guess the intentions of its interlocutor,
  • and accordingly plan the next step in the dialogue.

The system tries to dynamically adapt both its strategy and behaviour (speech and non-verbal aspects) in order to influence the dialogue partner’s reaction, and, as a result, the progress of the dialogue over time, and thereby also achieve its own goals in the most advantageous way for both sides.

The project is in its 3rd and final year (ending in Oct 2016) and has a budget of € 3,749,000 (EU contribution: € 2,971,000). METALOGUE brings together 10 Academic and Industry partners from 5 EU countries (Germany, Netherlands, Greece, Ireland, and UK).

 

METALOGUE focuses on interactive and adaptive training situations, where negotiation skills play a key role in the decision-making processes. Reusable and customisable software components and algorithms have been developed, tested and integrated into a prototype platform, which provides learners with a rich and interactive environment that motivates them to develop meta-cognitive skills, by stimulating creativity and responsibility in the decision-making, argumentation, and negotiation process. The project is producing a virtual trainer, META, a Training Avatar capable of engaging in natural interaction in English (currently, with the addition of German and Greek in the future), using gestures, facial expressions, and body language.

METALOGUE Avatar

Pilot systems have been developed for 2 different user scenarios: a) debating and b) negotiation, both tested and evaluated by English-speaking students at the Hellenic Youth Parliament. We are currently targeting various industry verticals, in particular Call Centres, e.g. to semi-automate and enhance Call Centre Agent Training.

 

And here’s META in action!

 

In this video, our full-body METALOGUE Avatar is playing the role of a business owner, who is negotiating a smoking ban with a local Government Councillor. It is still imperfect (e.g. there is some slight latency before replying – and an embarrassing repetition at some point!), but you can also see the realistic facial expressions, gaze, gestures, and body language, and even selective and effective pauses. It can process natural spontaneous speech in a pre-specified domain (smoking ban, in this case) and it has reached an ASR error rate below 24% (down from almost 50% 2 years ago!). The idea is to use such an Avatar in Call Centres to provide extra training support on top of existing training courses and workshops. It’s not about replacing the human trainer, but rather empowering and motivating Call Centre Trainee Agents who are trying to learn how to read their callers and how to successfully negotiate deals and even complaints with them in an optimal way.
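For readers wondering how an ASR error rate like the one above is usually computed: a common metric is word error rate (WER), i.e. the number of word substitutions, insertions and deletions needed to turn the recogniser's output into the human transcript, divided by the number of words in that transcript. Whether METALOGUE reports exactly this metric is an assumption on my part; the function name and the sample sentences below are made up purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from the hypothesis
                            curr[j - 1] + 1,      # extra word in the hypothesis
                            prev[j - 1] + cost))  # substitution or exact match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of seven
said = "we should ban smoking in all bars"
heard = "we should ban smoking in small bars"
print(f"WER: {word_error_rate(said, heard):.0%}")
```

A rate of 24% on this measure would mean roughly one word in four needs correcting, which is why spontaneous speech remains a hard case.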


 

My company, DialogCONNECTION, is charged with the task of attracting interest and feedback from industry to gauge the relevance and effectiveness of the METALOGUE approach in employee training contexts (esp. negotiation and decision-making). We are looking in particular for Call Centres: both small and agile (serving multiple small clients) and large (and probably plagued by the well-known agent burn-out syndrome). Ideally, you would give us access to real-world Call Centre Agent-Caller/Customer recordings or even simulated Trainer – Trainee phone calls that are used for situational Agent training (either already available or collected specifically for the project). A total of just 15 hours of audio (and video if available) would suffice to train the METALOGUE speech recognisers and the associated acoustic and language models, as well as its metacognitive models.

However, if you don’t want to commit your organisation’s data, any type of input and feedback would make us happy! As an innovative pioneering research project, we really need guidance, evaluation and any input from the real world of industry! So, if we have sparked your interest in any way and you want to get involved and give it a spin, please get in touch!

TEDxManchester (13 Feb 2012): Best of!

15 May

2012 can easily be dubbed the year of TEDx for me, as by mid-February I had already attended two TEDx events! First up was TEDxSalford in late January, where I was just a mindblown attendee, and two weeks later it was TEDxManchester where I had the honour to be a speaker!

TEDxManchester took place on Monday 13th February this year at one of the iconic Manchester locations – and my “local” – the Cornerhouse. Among the luminary speakers were people I have always admired, such as the radio Goddess Mary Anne Hobbs, and people I have become very close friends with over the years – which has led me to an equal amount of admiration, such as Ian Forrester (@cubicgarden to most of us).

Here are their respective talks at TEDxManchester 2012 for you to get a taste of the atmosphere at the event and of the impact of the ideas and the immediacy of the sentiments circulated!

Mary Anne Hobbs

Ian Forrester

 My TEDxManchester talk

I spoke about the weird and wonderful world of Voice Recognition (“Voice Recognition FTW!”): from the inaccurate – and far too often funny – simple voice-to-text apps and dictation systems on your smartphones, to the most frustrating automated Call Centres, to the next generation, sophisticated SIRI and everything in-between. I explained why things go wrong and when things can go wonderfully right. The answer is “CONTEXT”; the more you have of it, the more accurate and relevant the interpretation of user intention will be, and the more relevant and impressive the system reaction / reply will be.
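To make the “CONTEXT” point a bit more concrete, here is a toy sketch of how context can decide between two hypotheses that sound almost identical. It is not how any particular recogniser works: the scores, the bonus value and the function name are all invented for illustration, and real systems use statistical language models rather than a simple word list.

```python
def pick_with_context(hypotheses, context_words, bonus=0.1):
    """Return the hypothesis text with the best context-adjusted score.

    hypotheses: list of (text, acoustic_score) pairs, higher score = better.
    context_words: set of words expected in the current domain.
    """
    def adjusted(item):
        text, acoustic_score = item
        hits = sum(1 for word in text.lower().split() if word in context_words)
        return acoustic_score + bonus * hits

    return max(hypotheses, key=adjusted)[0]


# Made-up scores: on sound alone, the wrong reading wins narrowly
n_best = [
    ("wreck a nice beach", 0.52),
    ("recognise speech", 0.48),
]
dictation_context = {"recognise", "speech", "voice", "dictation"}

print(pick_with_context(n_best, set()))              # no context at all
print(pick_with_context(n_best, dictation_context))  # talking about dictation
```

With no context the system can only go by sound; once it knows the conversation is about dictation, the less likely-sounding but more plausible reading wins.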

Here is finally my TEDxManchester video on YouTube.

And here are my TEDxManchester slides.

The (Re)Tweets

(in reverse chronological order)

@ar3toul4ki 17 Feb

thanks for the #TEDxMCR piccie @cubicgarden! http://farm8.staticflickr.com/7050/6875061121_69555f7eb3_b.jpg @TEDxManchester

Cornerhouse @CornerhouseMcr 16 Feb

For those of you who missed #TEDxMCR check out @cubicgarden’s pics! Videos should be with us in a couple of weeks http://www.flickr.com/photos/cubicgarden/tags/tedxmcr/

Retweeted by @ar3toul4ki

Martin Williams @ukcopywriting 15 Feb

@ar3toul4ki ‘s Next level awesome epic bio – http://www.tedxmanchester.com/#speakers #TEDxMCR
Retweeted by @ar3toul4ki
In reply to @ar3toul4ki

@ar3toul4ki 15 Feb

What an awesome (wicked, epic) bio the cool guys and girls @TEDxManchester have written for me!!! http://www.tedxmanchester.com/#speakers #TEDxMCR

@ar3toul4ki 15 Feb
RT @global_lingo: Maria Aretoulaki on voice recognition software. Will digital transcription ever be any good? #tedxmcr no, no it won’t

Lynne McCadden @lmccadden 14 Feb
Belated I know but many congrats to @herbkim for a fantastic TEDxMCR yesterday been thinking about some of it all day today !
Retweeted by @ar3toul4ki

TEDxManchester @TEDxManchester 14 Feb
Here’s to the #TEDxMCR speakers in Session 2 – @daveerasmus @martinsfp @ar3toul4ki @cubicgarden @brendandawes
Retweeted by @ar3toul4ki

TEDxManchester @TEDxManchester 13 Feb
Thanks to @BandXMedia all today’s #TEDxMCR talks were recorded, will be edited & put online soon 🙂 #TEDxMCR @s2martin
Retweeted by @ar3toul4ki

Lynne McCadden @lmccadden 13 Feb

#tedxmcr learning about quarks and leptons from @tarashears making particle physics easy – sort of
Retweeted by @ar3toul4ki

Lynne McCadden @lmccadden 13 Feb

watching this @TEDxManchester kevin slavin’s TED talk on how algorithms shape our world:  http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world.html
Retweeted by @ar3toul4ki

Dr Marieke Navin ‏ @lisamarieke

depends if fitting gaussians to your data is your thing… Question is, do you understand your data?!
Retweeted by @ar3toul4ki
13 Feb
@ar3toul4ki

Oh yes! © Bruce Balentine RT @LukeRobertMason: “It’s better to be a good Machine than a bad person” Discuss? #TEDxMCR
13 Feb
Luke Robert Mason ‏ @LukeRobertMason

How to Wreck a Nice Beach @TEDxManchester #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
Luke Robert Mason ‏ @LukeRobertMason

It’s a bright future if you are an algorithm or infomorph #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
Luke Robert Mason ‏ @LukeRobertMason

@RichardMichie A little bit of non-human agency can’t hurt… Or can it 😉 #TEDxMCH
Retweeted by @ar3toul4ki
In reply to RichardMichie
13 Feb
 Ian Forrester ‏ @cubicgarden 

Infomorphs or a weaver… #TEDxMCR love the idea 🙂 very cool! They could work with #perceptivemedia yfrog.com/gzeg2jij
Retweeted by @ar3toul4ki
13 Feb
Ian Pettigrew ‏ @KingfisherCoach

#TEDxMCR @skeuomorphology challenging ‘necessity is the mother of invention’; cars weren’t invented as a response to a shortage of horses!
Retweeted by @ar3toul4ki
13 Feb
Luke Robert Mason ‏ @LukeRobertMason 

Pure information technologies are the first evolutionary aware technologies. They are stochastic… Emerge from randomness #TEDxMCR @weavrs
Retweeted by @ar3toul4ki
13 Feb
Luke Robert Mason ‏ @LukeRobertMason 

Living software ‘bots’ or infomorphs via @weavrs #infomorph #TEDxMCR @skeuomorphology
Retweeted by @ar3toul4ki
13 Feb
Michael Di Paola ‏ @MichaelDiPaola

Robots made from programmable gel…where the hell am I? A parallel universe, the future. No. Just at #TEDxMCR listening to Dan O’Hara
Retweeted by @ar3toul4ki
13 Feb
 Luke Robert Mason ‏ @LukeRobertMason 

Infomorph, a form that exists just of information @skeuomorphology #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb

Luke Robert Mason ‏ @LukeRobertMason 

Another type of software agent that exhibits life, @weavrs #infomorph @skeuomorphology #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb

@ar3toul4ki 

@pgaval never mind, it will be on YouTube forever! (Mum!)
In reply to Petros Gavalakis
13 Feb

@ar3toul4ki 

Mondays are my favourite days of the week : D
13 Feb
 Matthew Brooks ‏ @brooksoid 

Great, great talk by @brendandawes on the value of pursuing ideas, and the ideas they spawn, without necessarily knowing where you’re going
Retweeted by @ar3toul4ki
from Manchester, Manchester
13 Feb
 RichardMichie ‏ @RichardMichie 

Failed art at school? You can still exhibit at #moma @brendandawes #tedxmcr great story love it
Retweeted by @ar3toul4ki
13 Feb
 @ar3toul4ki 

@brendandawes’ cinema redux of Hitchcock’s Vertigo #TEDxMCR twitpic.com/8jfs95
13 Feb

sphey1 ‏ @sphey1

If you make something, give it a name – re: Cinema Redux @brendandawes #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
@ar3toul4ki 

Things that @brendandawes has done with his 3-printer #TEDxMCR twitpic.com/8jfpn6
13 Feb
 

@ar3toul4ki 

@brendandawes : the creative process is iterative. ( but Battling it against time & cost constraints) #TEDxMCR
13 Feb
@ar3toul4ki 

RT @CMindsKelly: @brendandawes. The thing we in the room all share is curiosity. That’s why we’re always making new things #TEDxMCR”
13 Feb
 Martin Bryant ‏ @MartinSFP 

At #tedxmcr, @cubicgarden explained how @tdobson and @adew saved his life. instagr.am/p/G9BmXRStoc/
Retweeted by @ar3toul4ki
13 Feb

@ar3toul4ki

Ian Forrester: fear the fear #TEDxMCR
13 Feb
 Claire-Marie ‏ @CMBoggiano

‘We are complex & unique organisms And yes, I am still an atheist.’ Ian Forrester, #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
 @ar3toul4ki 

Indeed! RT @TonyChurnside: @cubicgarden really touching. Very nicely done!
13 Feb
@ar3toul4ki 

@brooksoid any time!
In reply to Matthew Brooks
13 Feb
 Matthew Brooks ‏ @brooksoid 

@ar3toul4ki great talk Maria, speech recog in focus at the beeb right now, be interesting to talk once I’ve worked out what our landscape is
Retweeted by @ar3toul4ki
13 Feb
 TEDxManchester ‏ @TEDxManchester

Link to the funny vid played by @ar3toul4ki – Scottish voice recognition problems.. http://youtu.be/sAz_UvnUeuU

Retweeted by @ar3toul4ki
13 Feb
 @ar3toul4ki 

Thank you! Did you see it by any chance? RT @pgaval: @ar3toul4ki Good luck!
13 Feb
 Claire-Marie ‏ @CMBoggiano 

‘When I was lying in bed dying, where were the real people?’ Ian Forrester, #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
 Tony Churnside ‏ @TonyChurnside 

Watching @cubicgarden talk about his #brushwithdeath. A very scary time. #TEDxMCR pic.twitter.com/hED5mimw
Retweeted by @ar3toul4ki
13 Feb

@ar3toul4ki 

@tdobson @cubicgarden is talking about you! : D
In reply to Tim Dobson
13 Feb
 Tim Dobson ‏ @tdobson 

so @cubicgarden is talking about it #brushwithdeath when I may or may not have been his flatmate at the time.. #tedxman
Retweeted by @ar3toul4ki
13 Feb
 Matthew Brooks ‏ @brooksoid 

And @cubicgarden ‘s talk is about… @cubicgarden ! He’s finally gone recursive. #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
 Tony Churnside ‏ @TonyChurnside

@cubicgarden you’re looking good! pic.twitter.com/n8xvkzJB
Retweeted by @ar3toul4ki
13 Feb
 Ian Forrester ‏ @cubicgarden

And next on at #TEDxMCR its @ianforrester. With the story of me…
Retweeted by @ar3toul4ki
13 Feb
 TEDxManchester ‏ @TEDxManchester 

Hilarious talk on Voice Recognition from Dr Maria Aretoulaki #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
 Tim Dobson ‏ @tdobson

@davemee it’s all about context! /cc @ar3toul4ki 😉
Retweeted by @ar3toul4ki
In reply to Dave Mee
13 Feb
 Tim Dobson ‏ @tdobson 

@davemee @ar3toul4ki “fetish cheese”
Retweeted by @ar3toul4ki
In reply to Dave Mee
13 Feb
 Dave Mee ‏ @davemee 

@tdobson @ar3toul4ki feed her through siri and send me a transcript!
Retweeted by @ar3toul4ki
In reply to Tim Dobson
13 Feb
 Kate Towey ‏ @katiemaymanc 

Fascinating talk from Tara Shears on particle physics. ‘2012 is year of the Higgs’ #tedxmcr
Retweeted by @ar3toul4ki
13 Feb
 Ian Pettigrew ‏ @KingfisherCoach 

So far at #TEDxMCR we’ve covered pursuing your passion, JDI (and make mistakes), technology, algorithms, and particle physics. I’m happy!
Retweeted by @ar3toul4ki
13 Feb
 Allie Johns ‏ @AllieJohns

I propose bringing back Tomorrow’s World and having Tara Shears present it #tedxmcr
Retweeted by @ar3toul4ki
13 Feb
@ar3toul4ki

@TaraShears @TEDxManchester: oh my Higgs! We’ve seen something! Or have we?? #TEDxMCR twitpic.com/8jdo56
13 Feb

Ian Forrester ‏ @cubicgarden 

The goddamn particle explained at #TEDxMCR yfrog.com/obsv5tmj
Retweeted by @ar3toul4ki
13 Feb

@ar3toul4ki 

@TaraShears @TEDxManchester: where’s that God-damned Higgs particle?! If we don’t find it, we’ll have to start all over again… #TEDxMCR
13 Feb
 Claire-Marie ‏ @CMBoggiano 

“@lmccadden: #tedxmcr learning about quarks and leptons from @tarashears making particle physics easy – sort of”
Retweeted by @ar3toul4ki
13 Feb
@ar3toul4ki

@TaraShears @TEDxManchester: symmetry, simplicity, elegance = beauty of the standard model of particle physics #TEDxMCR
13 Feb
 TEDxManchester ‏ @TEDxManchester 

Up next @TEDxManchester is @TaraShears – tune in live to ow.ly/92eRf #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
 @ar3toul4ki 

@TEDx video 1 @TEDxManchester: pragmatic chaos to describe fluid things such as culture #TEDxMCR
13 Feb
@ar3toul4ki 

@coralgrainger no worries sweetness : )
In reply to coralgrainger
13 Feb
@ar3toul4ki 

@TEDx video @TEDxManchester: what we don’t understand, we give a name and a story to #TEDxMCR
13 Feb
 @ar3toul4ki 

@maryannehobbs you were, nay ARE, awesome! Xx
In reply to maryanne hobbs
13 Feb
@ar3toul4ki

Dan O’Hara @skeuomorphology @TEDxManchester: from random relentless replication (cf. spambots) to guided transformation of chaos #TEDxMCR
13 Feb
 Kim Willis ‏ @KimberleyWillis 

Dan O’Hara: technology is not a selection of gadgets but a body of knowledge instagr.am/p/G8sGbrBVY7/ #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb

Ian Wareing ‏ @ianwareing 

#tedxmcr @skeuomorphology “Necessity is not the mother of invention. Invention is the mother of necessity”
Retweeted by @ar3toul4ki
13 Feb
 Ian Forrester ‏ @cubicgarden

Bloatware… or stimulation of the real on the virtual RT @maanasvarun: Skeumorphism. wait what? #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
@ar3toul4ki 

Dan O’Hara @skeuomorphology @TEDxManchester: the creation of living technology by merging the Arts and Sciences #TEDxMCR
13 Feb
@ar3toul4ki 

@tombloxhammbe @TEDxManchester: go through life making mistakes, otherwise you don’t take any decisions, just do it ! © #TEDxMCR
13 Feb
@ar3toul4ki 

@maryannehobbs @TEDxManchester: John Peel saving lives again #TEDxMCR
13 Feb
@ar3toul4ki 

@maryannehobbs @TEDxManchester: follow your passion! #TEDxMCR twitpic.com/8jcwpn
13 Feb
 TEDxManchester ‏ @TEDxManchester

Hi all we’re suggesting #TEDxMCR as the hashtag for the event today as it’s a bit shorter than #TEDxManchester 🙂
Retweeted by @ar3toul4ki
13 Feb
 TEDxManchester ‏ @TEDxManchester 

Sorry folks for the livestream #fail. We’re currently on this channel live.. bit.ly/y9kkZa #TEDxMCR
Retweeted by @ar3toul4ki
13 Feb
@ar3toul4ki 

@gazshaw cheers!
In reply to Gaz Shaw
13 Feb
@ar3toul4ki

😀 see you there Mike! It’s been a loooong time! RT @mike_higham: @ar3toul4ki @TEDxManchester Looking forward to it #TEDxMCR
13 Feb
@ar3toul4ki 

@heloukee oh nooo : s
In reply to Helen Keegan
13 Feb
@ar3toul4ki 

Excited & honoured to be speaking @TEDxManchester today. My talk “Voice Recognition FTW!” on the present+future of user interfaces #TEDxMCR
13 Feb
@ar3toul4ki 

See you there Matt! RT @matthbooth: A bit of work then @TEDxManchester. Looking forward to it.
13 Feb
 Allie Johns ‏ @AllieJohns 

“@maryannehobbs: interesting day: speaking about passion at @TEDxManchester 1pm.. ” > we can never have enough passion in our lives.
Retweeted by @ar3toul4ki
13 Feb
@ar3toul4ki 

😀 Will you be my groupie?? RT @technicalfault: @ar3toul4ki Dr Maria at TEDx!
12 Feb
@ar3toul4ki 

Looking forward to giving #TEDxMCR an insight into the wondrous+often misconstrued world of voice recognition @TEDxManchester tomorrow
12 Feb
 TEDxManchester ‏ @TEDxManchester 

And in other late-breaking news Dr. Maria @Ar3toul4ki will also be taking the stage tomorrow at #TEDxMCR 🙂
Retweeted by @ar3toul4ki
12 Feb
 TEDxManchester ‏ @TEDxManchester 

A big welcome for our latest speaker @MartinSFP – European Editor @TheNextWeb for #TEDxMCR. Like @MaryAnneHobbs a brave no-slide presenter!
Retweeted by @ar3toul4ki
11 Feb
 Anna Nachesa ‏ @ashalynd 

I’ll probably be very evil if I ask during an interview if tail-optimized recursion is possible in C. OTOH, it might be a great icebreaker:)
Retweeted by @ar3toul4ki
11 Feb
@ar3toul4ki 

RT @TEDxManchester: @ar3toul4ki:really excited bout Mons #TEDxMCR @maryannehobbs @brendandawes @cubicgarden @skeuomorphology @tombloxhammbe

10 Feb
@ar3toul4ki 

Can’t wait to finally meet you! : D RT @maryannehobbs: @ar3toul4ki 🙂

10 Feb
 @ar3toul4ki 

Getting really excited about Monday’s #TEDxMCR @CornerhouseMcr: @maryannehobbs @brendandawes @cubicgarden Dan O’Hara & my Uni’s Tom Bloxham

Cross-linguistic & Cross-cultural Voice Interaction Design

31 Jan

(update at the end)

2010 saw the first SpeechTEK Conference to have taken place outside of the US, SpeechTEK Europe 2010 in London. This year’s European Conference, SpeechTEK Europe 2011, will take place again in London (25 – 26 May 2011), but this time it will be preceded on Tuesday 24th May by a special Workshop on Cross-linguistic & Cross-cultural Voice Interaction Design organised by the Association for Voice Interaction Design (AVIxD). The main goal of AVIxD is to bring together voice interaction and experience designers from both Industry and Academia and, among other things, to “eliminate apathy and antipathy toward the need for good design of automated voice services” (that’s my favourite!). This is the first AVIxD Workshop to take place in Europe and I am honoured to have been appointed Co-Chair alongside Caroline Leathem-Collins from EIG.

Participation is free to AVIxD members and just £25 for non-members (which may be applied towards AVIxD membership). However, in order to participate in the workshop, you need to submit a brief position paper in English (approx. 500 words) on any of the special topics of interest of the Workshop (see CFP below). The deadline for electronic submissions is Friday 25 March, so you need to hurry if you want to be part of it!

Here’s the full Call for (Position) Papers from the AVIxD site:

Call for Position Papers

First European AVIxD Workshop

Cross-linguistic & Cross-cultural Voice Interaction Design

Tuesday, 24 May 2011 (just prior to SpeechTEK Europe 2011), 1 – 7 PM

London, England

The Association for Voice Interaction Design (AVIxD) invites you to join us for our first voice interaction design workshop held in Europe, Cross-linguistic & Cross-cultural Voice Interaction Design. The AVIxD workshop is a hands-on day-long session in which voice user interface practitioners come together to debate a topic of interest to the speech community. The workshop is a unique opportunity for them to meet with their peers and delve deeply into a single topic.

As in previous years with the AVIxD Workshops held in the US, we will write papers based on our discussions which we will then publish on www.avixd.org. Please visit our website to see papers from previous workshops, and for more details on the purpose of the organization and how you can be part of it.

In order to participate in the workshop, individuals must submit a position paper of approximately 500 words in English. Possible topics to touch upon in your submission (to be discussed in depth during the workshop) include:

  1. Language choice and user demographics
  2. Presentation of the language options to the caller and caller preference
  3. Creation and (co-)maintenance of dialogue designs, grammars, prompts across languages
  4. Political and sociolinguistic issues in system prompt choices and recognition grammars, such as code-switching, formal versus informal registers
  5. Guidelines for application localization, translation, and interpretation
  6. Setting expectations regarding availability of multilingual agents, Language- and culture-sensitive persona definition
  7. Coordinating usability testing and tuning across diverse linguistic / cultural groups
  8. Language choice and modality preference

We always encourage the use of specific examples from applications you’ve worked on in your position paper.

Participation is free to AVIxD members; non-members will be charged £25, which may be applied towards AVIxD membership at the workshop. Please submit your position papers via email no later than Friday 25 March 2011 to cfp@avixd.org. Letters of acceptance will be sent out on 30 March 2011.

We look forward to engaging with the European speech design community to discuss the particular challenges of designing speech solutions for users from diverse linguistic and cultural backgrounds. Feel free to contact either of the co-chairs below, if you have any questions.

Caroline Leathem-Collins, EIG  (caroline {at} eiginc {dot} com)

Maria Aretoulaki, DialogCONNECTION Ltd (maria {at} dialogconnection {dot} com)

UPDATE

SpeechTEK Europe 2011 has come and gone and I’ve got many interesting things to report (as I have been tweeting through my @dialogconnectio Twitter account).

But first, here are the slides for my presentation at the main conference on the outcome of the AVIxD Workshop on Cross-linguistic & Cross-cultural Voice Interaction Design organised by the Association for Voice Interaction Design (AVIxD). I only had 12 hours to prepare them – including sleep and London tube commute – so I had to practically keep working on them until shortly before the Session! Still I think the slides capture the breadth and depth of topics discussed or at least touched upon at the Workshop. There are several people now writing up on all these topics and there should be one or more White papers on them very soon (by the end of July we hope!). So the slides did their job after all!

Get the slides in PDF here:  Maria Aretoulaki – SpeechTEK Europe 2011 presentation.

2010 in review – Not bad at all :)

3 Jan

The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

Healthy blog!

The Blog-Health-o-Meter™ reads Wow.

Crunchy numbers


A Boeing 747-400 passenger jet can hold 416 passengers. This blog was viewed about 7,600 times in 2010. That’s about 18 full 747s.

In 2010, there were 9 new posts, not bad for the first year! There were 32 pictures uploaded, taking up a total of 5mb. That’s about 3 pictures per month.

The busiest day of the year was September 15th with 326 views. The most popular post that day was The Social Media scene in Manchester (UK) is very sociable!.

Where did they come from?

The top referring sites in 2010 were linkedin.com, mail.live.com, mail.yahoo.com, twitter.com, and facebook.com.

Some visitors came searching, mostly for voice activated lift, scottish voice activated lift, voice activated lift scotland, and scottish voice activated elevator.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

  1. The Social Media scene in Manchester (UK) is very sociable! (September 2010, 32 comments)
  2. The voice-activated lift won’t do Scottish! (July 2010, 4 comments)
  3. Speech Recognition for Dummies (May 2010, 12 comments)
  4. The Loneliness of the long-distance … VUI Designer! (June 2010, 5 comments)
  5. About (May 2010, 1 comment)

The voice-activated lift won’t do Scottish! (Burnistoun S1E1 – ELEVEN!)

28 Jul

Voice recognition technology? …  In a lift? … In Scotland? … You ever TRIED voice recognition technology? It don’t do Scottish accents!

Today I found this little gem on YouTube and I thought I must share it, as apart from being hilarious, it says a thing or two about speech recognition and speech-activated applications. It’s all based on the urban myth that speech recognisers cannot understand regional accents, such as Scottish and Irish.

Scottish Elevator – Voice Recognition – ELEVEN!

(YouTube – Burnistoun – Series 1 , Episode 1 [ Part 1/3 ])

What? No Buttons?!

These two Scottish guys enter a lift somewhere in Scotland and find that there are no buttons for the floor selection, so they quickly realise it’s a “voice-activated elevator“, as the system calls itself. They want to go to the 11th floor and they first pronounce it the Scottish way:

/eh leh ven/

That doesn’t seem to work at all.

“You need to try an American accent”, says one of them, so they try to mimic one, sadly very unsuccessfully:

/ee leh ven/

Then they try a quite funny, Cockney-like English accent:

/ä leh ven/

to no avail.

VUI Sin No. 1: Being condescending to your users

The system prompts them to “Please speak slowly and clearly“, which is exactly what they had been doing up to then in the first place! Instead, it should have said something along the lines of “I’m afraid I didn’t get that. Let’s try again.” and later “I’m really sorry, but I don’t seem to understand what you’re saying. Maybe you would like to try one more time?“. Of course, not having any buttons in the lift means that these guys could be stuck in there forever! That’s another fatal usability error: Both modalities, speech and button presses, should have been allowed to cater for different user groups (easy accents, tricky accents) and different use contexts (people who have got their hands full with carrier bags vs people who can press a button!).
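The retry behaviour suggested above can be sketched in a few lines: apologise, escalate gently, and after a couple of failures hand over to the other modality instead of blaming the user. The two reprompts are the ones proposed in this post; the fallback wording, the number of retries and the function name are my own assumptions for the sake of the example.

```python
REPROMPTS = [
    "I'm afraid I didn't get that. Let's try again.",
    "I'm really sorry, but I don't seem to understand what you're saying. "
    "Maybe you would like to try one more time?",
]
# Assumed wording: only makes sense if the lift actually has buttons!
FALLBACK = "Sorry about that. Please press the button for your floor instead."


def next_prompt(failed_attempts: int) -> str:
    """Return what the lift should say after `failed_attempts` misrecognitions."""
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts <= len(REPROMPTS):
        return REPROMPTS[failed_attempts - 1]
    return FALLBACK


for attempt in range(1, 4):
    print(attempt, "->", next_prompt(attempt))
```

Note that the system takes the blame every time and never tells the user to speak slowly, clearly or calmly.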

I’m gonna teach you a lesson!

One of them tries to teach the system the Scottish accent: “I keep saying it until she understands Scottish!“, a very reasonable expectation, which would work particularly well with a speaker-dependent dictation system of the kind you’ve got on your PC, laptop or hand-held device. This speaker-independent one (‘cos you can’t really have your personal lift in each building you enter!) will take a bit more time to learn anything from a single conversation! It requires time analysing the recordings, their transcriptions and semantic interpretations, comparing what the system understood with what the user actually said and using those observations to tune the whole system. We are talking at least a week in most cases. They would die of dehydration and starvation by then!
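The comparison step at the heart of tuning can be illustrated with a tiny sketch: take pairs of what the user actually said (from the human transcription) and what the system understood, and count the most frequent mismatches. The log entries and the function name are invented; real tuning works on thousands of transcribed utterances and looks at much more than single words.

```python
from collections import Counter


def misrecognition_counts(log):
    """Count (said, understood) pairs where the system got it wrong.

    log: iterable of (transcribed, recognised) string pairs.
    """
    return Counter((said, heard) for said, heard in log if said != heard)


# Hypothetical transcribed lift requests
call_log = [
    ("eleven", "seven"),
    ("eleven", "seven"),
    ("eleven", "eleven"),
    ("four", "floor"),
]
worst_first = misrecognition_counts(call_log).most_common()
print(worst_first)
```

The pairs at the top of such a list are the ones worth fixing first, for example by adding pronunciations or adjusting the grammar.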

VUI Sin No.2: Patronising your users until they explode

After a while, the system makes it worse by saying what no system should ever dare say to a user’s face: “Please state which floor you would like to go to in a clear and calm manner.” Patronising or what! The guys’ reaction is not surprising: “Why is it telling people to be calm?! .. cos Scottish people would be going out for MONTHS at it!“.

Well, that’s not actually true. These days off-the-shelf speech recognition software is optimised to work with most main accents in a language, yes, including Glaswegian! Millions of real-world utterances spoken by thousands of people with all possible accents in a language (and this for many different languages too) are used to statistically train the recognition software to work equally well with most of them and for most of the time. These utterances are collected from applications that are already live and running somewhere in the world for the corresponding language. The more real-world data available, the better the software can be tuned and the more accurate the recognition of “weird” pronunciations will be, even when you take the software out of the box.

VUI Best Practice: Tune your application to cater for YOUR user population

An additional safeguarding and optimising technique is tuning the pronunciations for a specific speech recognition application. So when you already know that your system will be deployed in Scotland, you’d better add the Scottish pronunciation for each word explicitly in the recognition lexicon. This includes manually adding /eh leh ven/, as the standard /ee leh ven/ pronunciation is not likely to work very well. Given that applications are usually restricted to a specific domain anyway (selecting floors in a lift, getting your bank account balance, choosing departure and arrival train times etc.), this only needs to be done for the core words and phrases in your application, rather than the whole English, French, or Farsi language! So do not despair, there’s hope for freedom (of speech) even for the Scottish! 🙂
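As a rough picture of what “adding a pronunciation to the lexicon” means, here is a minimal sketch in which each word maps to a set of accepted pronunciations. It uses the informal respellings from this post rather than a real phoneme alphabet, and the data structure and function names are hypothetical; actual recognisers each have their own lexicon format.

```python
from collections import defaultdict

# word -> set of accepted pronunciations (informal respellings, not real phonemes)
lexicon = defaultdict(set)


def add_pronunciation(word: str, pronunciation: str) -> None:
    """Register one more accepted pronunciation for a word."""
    lexicon[word.lower()].add(pronunciation)


def lookup(pronunciation: str) -> list:
    """Return every word that lists this pronunciation, in alphabetical order."""
    return sorted(w for w, prons in lexicon.items() if pronunciation in prons)


add_pronunciation("eleven", "ee leh ven")  # out-of-the-box pronunciation
add_pronunciation("eleven", "eh leh ven")  # Scottish variant added during tuning

print(lookup("eh leh ven"))  # the Scottish variant now maps to a word
print(lookup("ä leh ven"))   # the mock-Cockney attempt is still unknown
```

Because the lift only needs to know a handful of floor numbers, the list of variants to add by hand stays very short.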

For a full transcript of the video, check out EnglishCentral.

The Loneliness of the long-distance … VUI Designer!

13 Jun

On Friday 11th June, I took part in the “Pathways” event organised annually by the University of Manchester Career Service to support PhD researchers as well as research staff in “making career choices, exploring future plans and discovering the breadth of opportunities available to them“. I was Guest Panellist at 3 different Sessions:

  1. Opportunities for Engineering and Physical Sciences
  2. Working as a Freelancer or Consultant and
  3. Enterprise, Entrepreneurship and Business Start Up

The University of Manchester Logo

As a University of Manchester graduate (well, technically UMIST), I felt compelled to take part in those Question and Answer panels in order to give some insight on how a career can develop: from a Bachelors in English & Linguistics in Greece, to a Masters of Science in Machine Translation and a Doctorate in Automatic Text Summarisation in the UK, to a Post-Doctoral Fellowship in Spoken Dialogue Management and a position as a Research Project Manager in Germany, to working in Industry both as a full-time employee and as an external contractor as a Voice User Interface (VUI) Designer in Germany, the UK, Switzerland, the US and further afield. It’s been a fascinating journey for sure! And I probably would never have arrived where I am now, if I hadn’t done those degrees or taken up those jobs in those specific places.

Have a look at the Guest Speaker profiles, including mine (p. 24), here: http://www.careers.manchester.ac.uk/media/media,172749,en.pdf

Some very inspiring career journeys!

I have to say I have thoroughly enjoyed the whole journey, the projects I have worked on, the people I have met on the way, the different organisational cultures I had the chance to experience. Plus, I wouldn’t change what I do now for the world! I love working as an external contractor and coming in to design speech self-service systems and voice-to-text services from scratch, or optimise existing ones, and the whole development, testing and tuning cycles:

  • writing functional specification documents
  • defining the system persona
  • drawing call flows
  • crafting system messages and coaching voice talents for the recordings
  • writing speech recognition grammars and pronunciations
  • devising and carrying out Wizard-of-Oz tests and Usability tests (including recording test subjects on video and interviewing them afterwards!)
  • transcribing and analysing phone calls
  • writing tuning reports

Everything is a lot of fun! It’s also great to be bringing in the same VUI Design processes and skills in different organisations and projects, and also getting to work at different places in the world at any one time! I love the variety of work and location of work, as well as the flexibility to work anytime and from anywhere! (Yes, working on your laptop – iPad soon – from a beach in the Caribbean is no longer a daydream but a realistic plan! :))

working on a deserted beach in the Caribbean is no longer a daydream!

Okay, it does get lonely. No gossiping in the kitchen during coffee breaks and no Christmas office parties. I still get to have probably as many face-to-face project meetings and conference calls as the average office worker though. We all have to work independently and in isolation when analysing data or composing a report anyway. The difference is that office workers also have to put up with the hectic running-around of their colleagues and lots of intrusive and loud phone calls that they unwillingly witness in silence. So my loneliness is a very content one! 😀