Tag Archives: CxD

A.I.: from Sci-Fi to Science Reality

17 Jan

Just found this brief, nicely illustrated History of Artificial Intelligence at LiveScience.com, and I couldn’t help but share it!

We have come a long way! (e.g. getting a chatbot to pass the Turing Test, and now having to come up with a new test!) But we also still have a long way to go until the dreaded Singularity moment arrives!

A timeline of developments in computers and robotics.

Source: LiveScience

METALOGUE: Building a Multimodal Spoken Dialogue Tool for teaching Call Centre Agents metacognitive skills

21 Jul

METALOGUE logo

Since November 2013, I’ve had the opportunity to actively participate in the new EU-funded FP7 R&D project METALOGUE, through my company DialogCONNECTION Ltd, which is one of the 10 Consortium Partners. The project aims to develop a natural, flexible, and interactive Multi-perspective and Multi-modal Dialogue system with metacognitive abilities, i.e. a system that can monitor, reason about, and provide feedback on its own behaviour, intentions, and strategies, as well as on the dialogue itself; guess those of its interlocutor; and accordingly plan the next step in the dialogue. The goal is to dynamically adapt both its strategy and its behaviour (speech and non-verbal aspects) in order to influence the interlocutor’s reaction, and hence the progress of the dialogue over time, and thereby achieve its own goals in the way that is most advantageous for both sides. The project runs for 3 years (until Oct 2016) and has a budget of €3,749,000 (EU contribution: €2,971,000). METALOGUE brings together 10 academic and industry partners from 5 EU countries (Germany, the Netherlands, Greece, Ireland, and the UK).

metalogue_replay

The METALOGUE research focuses on interactive and adaptive coaching situations where negotiation skills play a key role in decision-making processes. Reusable and customisable software components and algorithms will be developed, tested, and integrated into a prototype platform, which will provide learners with a rich and interactive environment that will help them develop metacognitive skills, support motivation, and stimulate creativity and responsibility in the decision-making, argumentation, and negotiation process. The project will produce virtual trainers capable of engaging in natural interaction in English, German, and Greek, using gestures, facial expressions, and body language.
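To give a purely illustrative flavour of what such metacognitive behaviour could look like in code, here is a toy sketch of a single negotiation turn following a monitor, reason, adapt, plan cycle. This is my own sketch, not METALOGUE code, and every class, method, and threshold in it is invented:

```java
// Illustrative only: a toy monitor-reason-adapt-plan turn loop, not METALOGUE code.
import java.util.List;

class MetacognitiveDialogueSketch {

    enum Strategy { COOPERATIVE, ASSERTIVE, CONCESSIVE }

    private Strategy strategy = Strategy.COOPERATIVE;

    /** One dialogue turn: observe the interlocutor, reason about progress, adapt, act. */
    String takeTurn(String userUtterance, List<String> userGestures) {
        // 1. Monitor: estimate the interlocutor's stance from speech and non-verbal cues.
        double estimatedResistance = estimateResistance(userUtterance, userGestures);

        // 2. Reason about own behaviour: did the current strategy move the negotiation forward?
        boolean progressing = estimatedResistance < 0.5;

        // 3. Adapt the strategy if the dialogue is stalling.
        if (!progressing && strategy == Strategy.ASSERTIVE) {
            strategy = Strategy.CONCESSIVE;
        } else if (!progressing) {
            strategy = Strategy.ASSERTIVE;
        }

        // 4. Plan and realise the next move (speech plus matching non-verbal behaviour).
        return planNextMove(strategy);
    }

    private double estimateResistance(String utterance, List<String> gestures) {
        // Placeholder heuristic: a real system would use trained models over speech,
        // prosody and gesture rather than simple keyword spotting.
        return utterance.toLowerCase().contains("no") ? 0.8 : 0.3;
    }

    private String planNextMove(Strategy s) {
        switch (s) {
            case ASSERTIVE:  return "Restate the offer and justify it.";
            case CONCESSIVE: return "Offer a small concession to restart the negotiation.";
            default:         return "Acknowledge the other side's point and ask a clarifying question.";
        }
    }

    public static void main(String[] args) {
        MetacognitiveDialogueSketch agent = new MetacognitiveDialogueSketch();
        System.out.println(agent.takeTurn("No, that offer is not acceptable.", List.of("head shake")));
    }
}
```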

 

Pilot systems will be developed for 2 different sectors: Government and Industry. The corresponding user scenarios that have been selected are: a) Youth Parliamentarian Debating Skill Tutoring (for the Hellenic Youth Parliament) and b) Call Centre Agent Training (for multilingual UK Call Centres). Particularly for Call Centres, we have identified the following broad CC Agent training goals:

CC training goals

CC training goals B

These training goals translate into the following metacognitive skills that a Call Centre Agent needs to learn and which will be taught through the METALOGUE system:

CC training goals C

To this end, DialogCONNECTION Ltd and I are looking for UK-based Call Centres, preferably ones with multilingual agents, that would like to participate over the course of the project.

 

What we need from you

Ideally, we would get access to real-world Call Centre Agent-Caller/Customer recordings. However, simulated Trainer-Trainee phone calls used for situational Agent training are also acceptable (either already available or collected specifically for the project). Two hours of audio (and video, if available) would suffice for the 1st year of the project (needed by October 2014). By the 2nd year (Dec 2015) we would need a total of 15 hours of audio. The audio will be used to train the METALOGUE speech recognisers and the associated acoustic and language models, as well as its metacognitive models. We are looking for Call Centres that are either small and agile (serving multiple small clients) or large (and probably plagued by the well-known agent burn-out syndrome). Strict EU guidelines for data protection will be applied to all collected and published data (e.g. caller anonymisation, redaction of sensitive data), and ultimately YOU determine what can and cannot be published, both during and after the project.

 

What’s in it for you

  • Participation in, input into, early access to, and evaluation of all intermediate pilots as they are developed (no need to wait until the end of the project like the rest of the world!)
  • The chance to provide feedback and express wishes regarding your own requirements for features and functionality (i.e. the pilots and the end system will be customised to your own needs!)
  • The full METALOGUE system at the end, for free, customised to your needs and requirements (no source code or speech recogniser, just the system as-is)

 

If I have sparked your interest, please get in touch by leaving a comment on this post or by contacting us through our company website. Here is a handy PDF with the invitation to Call Centres (METALOGUE Poster).

 

You can get updates on the progress of the METALOGUE project by connecting with us on LinkedIn and on Twitter. Watch the future happen now!

 

EU FP7 logo

Develop your own Android voice app!

26 Dec

Voice Application Development for Android

My colleague Michael F. McTear has a new and very topical book out: Voice Application Development for Android, co-authored with Zoraida Callejas. Apart from a hands-on, step-by-step, yet condensed guide to voice application development, you get the source code to develop your own Android apps for free!

Get the book here or through Amazon. And have a look at the source code here.
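Before you open the book, here is a taste of what a bare-bones Android voice app looks like. The sketch below uses the platform's standard RecognizerIntent route to speech input; it is not code from the book, just the stock Android API the book builds on, and the activity name is my own:

```java
// A minimal sketch of speech input via Android's stock RecognizerIntent API.
import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.widget.Toast;

import java.util.ArrayList;

public class SimpleVoiceActivity extends Activity {

    private static final int SPEECH_REQUEST = 1;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        // Launch the platform speech recogniser and ask for free-form dictation.
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Say something...");
        startActivityForResult(intent, SPEECH_REQUEST);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if (requestCode == SPEECH_REQUEST && resultCode == RESULT_OK && data != null) {
            // The recogniser returns an n-best list; the first entry is its top hypothesis.
            ArrayList<String> results =
                    data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
            Toast.makeText(this, "You said: " + results.get(0), Toast.LENGTH_LONG).show();
        }
    }
}
```

A real app would hand the top hypothesis to a parser or dialogue manager rather than just displaying it, but this is the basic plumbing everything else builds on.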

Exciting times ahead for do-it-yourself Android speech app development!

The AVIxD 49 VUI Tips in 45 Minutes!

6 Nov


 

 

The illustrious Association for Voice Interaction Design (AVIxD) organised a workshop at SpeechTEK in August 2010, whose goal was “to provide VUI designers with as many tips as possible during the session”. Initially the target was 30 tips in 45 minutes, but they got overexcited and came up with a whopping 49 tips in the end! The session was moderated by Jenni McKienzie, and the panelists were David Attwater, Jon Bloom, Karen Kaushansky, and Julie Underdahl. The list dates back 3 years now, but it is by no means outdated. It is some of the soundest advice you will find on designing better voice recognition IVRs, and I hated seeing it buried in a PDF!

So I am audaciously plagiarising and bringing them to you here: the 49 VUI Tips for Better Voice User Interface Design! Or go and read the PDF yourself here:

[The 49 tips are reproduced here as a series of images taken from the AVIxD PDF.]

 

Have you got a VUI Tip you can’t find in this list that you’d like to share? Tell us here!

 

XOWi: The wearable Voice Recognition Personal Assistant

30 Oct

I just found out about the new venture of my colleagues, Ahmed Bouzid and Weiye Ma, and I’m all excited and want to spread the word!

They came up with the idea of a wearable, and hence ubiquitous, Personal Voice Assistant: XOWi (pronounced Zoe). The basic concept is that XOWi is small and unobtrusive (you wear it like a badge or pin it somewhere near you) but still connects to your smartphone, and through that to all kinds of apps and websites, for communicating with people (Facebook, Twitter, eBay) and controlling data and information (selecting TV channels, switching the aircon on). Moreover, it is completely voice-driven, so it is hands- and eyes-free. This means that it won’t distract you (if you’re driving, reading, working), and if you have a vision impairment or disability, you still stay fully connected and able to communicate. So, XOWi truly turns Star Trek into reality! The video below explains the concept:


The type of application context is exemplified by the following diagram.

XOWi architecture

And here is how it works:

Ahmed and Weiye have turned to Kickstarter for crowdfunding. If they manage to raise $100,000 by 21st November, XOWi will become a product and I will get one for my birthday in March 2014! 😀 Join the innovators and support the next generation of smart communicators!

TEDxManchester 2012: Voice Recognition FTW!

12 Sep

After the extensive TEDxSalford report, and the TEDxManchester Best-of, it’s about time I posted the YouTube video of my TEDxManchester talk!

TEDxManchester took place on Monday 13th February this year at one of the iconic Manchester locations – and my “local” – the Cornerhouse. Among the luminary speakers were people I have always admired, such as the radio goddess Mary Anne Hobbs, and people I have become very close friends with over the years – which has led to an equal amount of admiration – such as Ian Forrester (@cubicgarden to most of us). You can check out their respective talks, as well as some awesome others, in my TEDxManchester report below.

My TEDxManchester talk

I spoke about the weird and wonderful world of Voice Recognition (“Voice Recognition FTW!”): from the inaccurate – and far too often funny – simple voice-to-text apps and dictation systems on your smartphone, to the most frustrating automated Call Centres, to the next-generation, sophisticated Siri, and everything in between. I explained why things go wrong and when things can go wonderfully right. The answer is “CONTEXT”: the more of it you have, the more accurate and relevant the interpretation of the user’s intention will be, and the more relevant and impressive the system’s reaction or reply will be.
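To make the point concrete, here is a toy illustration of my own (not from the talk): the same recognised words can map to quite different intents depending on the context the system has to hand. All the names and values below are invented:

```java
// Toy example: context disambiguates an otherwise ambiguous recognised utterance.
import java.util.Map;

class ContextDemo {

    /** Pick an intent for an ambiguous utterance using whatever context is available. */
    static String interpret(String recognisedText, Map<String, String> context) {
        if (recognisedText.equals("call bill")) {
            // Without context this is ambiguous: a contact named Bill, or an invoice?
            if ("banking".equals(context.get("activeApp"))) {
                return "PAY_INVOICE";
            }
            if (context.containsKey("contact:Bill")) {
                return "PHONE_CONTACT(Bill)";
            }
        }
        return "ASK_FOR_CLARIFICATION";
    }

    public static void main(String[] args) {
        System.out.println(interpret("call bill", Map.of("activeApp", "banking")));         // PAY_INVOICE
        System.out.println(interpret("call bill", Map.of("contact:Bill", "07700 900123"))); // PHONE_CONTACT(Bill)
        System.out.println(interpret("call bill", Map.of()));                               // ASK_FOR_CLARIFICATION
    }
}
```

Real systems use statistical models rather than hand-written rules, but the principle is the same: context narrows down what the user most plausibly meant.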

Here, finally, is my TEDxManchester video on YouTube.

And below are my TEDxManchester slides.

A speech recognition user interface works when it … disappears!

25 Oct

Today is a big day for me! I’m finally getting to meet in person one of the Coryphées of the VUI Design World (even though as far as I know he’s not a ballet dancer), Bruce Balentine of the Enterprise Integration Group (EIG). Bruce is the author of one of the best books ever written on IVR / Speech applications / Voice User Interface Design, It’s Better to Be a Good Machine Than a Bad Person – Speech Recognition and Other Exotic User Interfaces at the Twilight of the Jetsonian Age.

Apart from the ingenuity of the title itself, encapsulating the golden rule of good user experience / usability design, you can readily see to what great lengths Bruce has gone to serve up his pearls of design wisdom in a most humorous and utterly witty way. This doesn’t in the least decrease the importance, relevance, and truthfulness of his observations and recommendations. Bruce is a veteran designer and he has seen it all before: from the excitement and optimism, to the disappointment and pessimism, to the final destination, design realism:

First we tried to make them human. Now it’s time to make them work 

To get a flavour of the type of UX design advice and messages conveyed in the book, here’s an extract from Chapter 132: Will Speech Technology Ever Work? (pp. 393-395 in my 2007 edition):

In closing, I must ask the question. Will it ever work? And, of course, the answer is, yes. Speech recognition—and its related technologies (e.g., speaker verification, text-to-speech, audio indexing, speech data mining, dictation)—will work. Indeed they already do. They will fill their respective application niches almost completely. And, in fact, the majority will do so quite soon. What will change is the definition of “work”.

Speech recognition is primarily a user interface technology*. As such, it works when it disappears. It’s really that simple. When the users are not thinking about the user interface, but instead are accomplishing the task to which they are connected by the user interface, then and only then can the interface be said to be “working.” We have to stay on message with this fundamental fact if we are ever to succeed at bringing speech to the performance level where we can legitimately claim that it “works.”

True words!!! As a bonus, Leslie Degler’s illustrations perfectly complement and enhance the messages conveyed in the text, once again in the wittiest and most original manner. Buy this book ASAP! After all, if you don’t agree with its theses, you can always return it. All you need to do is:

Write out in longhand, on a separate page, “I,” and add your name, “agree that there’s not a chance in Hell any refund will ever come of this claim.” Label this statement as your “declaration.”  

After you have received your refund, we’ll call you with an outbound IVR that asks you several hundred thought-provoking questions about your customer experience. We value your opinion—please give us your most honest and spontaneous responses. We’ll do our best to recognize them

It says it all really! 🙂

To date, I have only met Bruce virtually, through Skype calls and the Creative Speech Technology Network (CreST), of which we are both members, and I can already tell he is a very funny, witty, creative (musical!), interesting, and intelligent person. So I can’t wait to meet him in person later today and hear some more fascinating stories and hilarious anecdotes from the world of speech recognition application design, voice interface usability, and technology abuse!

UPDATE:

I went (to the dinner with Bruce) and (was) conquered by the brilliance and wit of the man! I got my long-awaited autograph in his book too, as I can now prove!