Archive | Voice Persona Design & Branding

How to Design & Optimise Conversational AI Dialogue Systems

26 Jun
Data Futurology EP122 with Dr Maria Aretoulaki

The latest episode of Data Futurology features me talking with Felipe Flores about how I got into AI, Machine Learning and Deep Learning (“When I discovered Artificial Neural Networks back in 1993, I thought I had discovered God!”) and later Data Science and Big Data Speech Analytics for Voice User Interface (VUI) and Voice First Design (for Speech IVRs and Voice Assistants).

In the podcast, I give an overview of all the different steps involved in VUI Design, the often tricky interaction with the different stakeholders (Coders, Business Units, Marketeers) and the challenges of working in an Agile PM framework, where skipping detailed Voice Design is suicide (and my worst nightmare!). I showcase some of the interesting and outright funny things you discover when you analyse real-world human-machine dialogues taken from Speech IVR systems (hint: the most original swearwords!) and I pinpoint the main differences between a Speech IVR and a Voice Assistant skill / action / capsule. I also reflect on where Voice is heading now, with the ubiquity of Voice First and the prevalence of skills developed by software engineers, web designers and Marketing copywriters without VUI Design expertise, Linguistics training or knowledge of how Speech Recognition works.

Data Futurology Podcast EP122 – Optimising Conversational AI Dialogue Systems with Dr. Maria Aretoulaki

I even provide tips on how to survive working for a Start-up and strategies on how to stay focused and strong when you run your own SME Business. Below is a brief checklist of some of the core qualities required and lessons I have learned working for Start-ups:


You can listen to the whole episode on Data Futurology, on Apple Podcasts, on Google Podcasts, on Spotify, or wherever you get your Podcasts, or you can watch it on YouTube:

#122 – Optimising Conversational AI Dialogue Systems with Dr. Maria Aretoulaki

It was a delight speaking to Felipe! It made me think about and consolidate some of the – occasionally hard – lessons I have learned along the way about Voice Design, NLP, AI, ML and running a Business, so that others can hopefully prepare for and ideally avoid the same heartaches!


The immortality of Data

10 Jun
Boundless Podcast Episode EP48

The latest Boundless Podcast Episode is out! It features myself in a deep conversation with Richard Foster-Fletcher about Big Data and Speech Analytics for Voice First & Voice Assistants (but not only), BigTech, AI Ethics and the need for a new legal framework for Responsible AI and Explainable Machine Learning. Below is a snippet of the conversation:

Boundless Podcast Episode EP48 – snippet

“My data is being used and exploited, and I can do nothing about it. We need to modernise the legal system. Apart from all the ethical, moral discussions that need to be made, we need a legal system that takes into consideration the fact that intelligence doesn’t need to be visible to be acting against me.”

I wouldn’t call myself a technophobe. Quite the opposite. I was learning how to program (BASIC!) back in 1986, in the Summer after getting my degree in English Language & Literature; I was teaching computers how to translate between human languages and chatting online with mainframes in University buildings 2 km away back in 1991; I was programming Artificial Neural Networks and using parallel computers back in 1993; I was reading internet newsgroups and downloading music – very much legally! – on a web browser (Netscape!) in 1994; I was designing Voice interfaces already in 1996 and voice-controlled home assistants back in 1998. I have even been using LinkedIn since 2005.

Yet, I am very sceptical and pessimistic about our uninhibited sharing of personal data and sensitive information all day every day on multiple channels, to multiple audiences, much of it willingly, much more unwillingly, in the name of sharing and human connection, service and product personalisation and ultimately, far too often, monetisation.

What will that mean for our legacy as individuals? Who will own, control and curate all the generated data after our death? Who can clone us and why? Will there be a second life after death? Will there ever be freedom from a dictator or will there ever be any point in bequeathing anything in a will? These and many more questions are discussed in this episode. I had lots of fun recording this! Thank you so much to Richard for creating this 🙏

You can listen to a snippet of the conversation here (Download)


You can listen to the full episode here or wherever you get your podcasts.

Alternatively, you can listen to it on YouTube (in 2 parts):

Part 1

Boundless EP48 – Part 1

And Part 2

Boundless EP48 – Part 2

#BigData #BigTech #SpeechAnalytics #VoiceFirst #VoiceAssistants #Alexaskills #GoogleAssistant #Bixby #AIethics #responsibleAI #explainableVUI #AI #ArtificialIntelligence #MachineLearning #ML #DeepLearning #ANNs

My baby, DialogCONNECTION, is 11!

4 Dec

This week, my company, DialogCONNECTION Limited, turned 11 years old! 🎉 🥂 😁

It feels like yesterday, when in December 2008 I registered it with Companies House and became Company Director (with multiple hats).

My very first client project was for the NHS Business Authority on their EHIC Helpline (which hopefully will survive the Brexit negotiations). Back then, whenever I was telling anyone what my company does (VUI Design for Speech IVRs), I was greeted by blank stares of confusion or incomprehension. It did feel a bit lonely at times!

Many more clients and thousands of long hours, long days and working weekends since, here we are in December 2019 and I suddenly find myself surrounded by VUI Designers and Voice Strategists who have now seen the potential and inescapable nature of speech interfaces and have followed in my footsteps. I feel vindicated, especially since I started in Voice back in 1996 with my Post-Doc in Spoken Dialogue Management at the University of Erlangen! 😎 (Yet another thing I’m hugely grateful to the EU for!)

We started with Voice-First VUI Design back in 1996, well before Samsung’s BIXBY (2017), Google’s ASSISTANT (2016), Amazon’s ALEXA (2014), Apple’s SIRI (2010) and even before the world started using GOOGLE for internet searches (1998)!

http://dialogconnection.com/who-designs-for-you.html

It’s quite frustrating when I realise that many of these newcomers have never heard of an IVR (Interactive Voice Response) system before, but they will eventually learn. 🤓 For the past 25 years it was the developers who insisted they could design conversational interfaces without any (Computational) Linguistics, Natural Language Processing (NLP) or Speech Recognition (ASR) background, and who therefore didn’t need a VUI Designer; we were an allegedly superfluous luxury and a rarity in those times. In the past couple of years it’s the shiny Marketing people, who make a living from their language mastery, and the edgy GUI Designers, who excel in visual design, who think they can design voice interfaces too, but still know nothing about NLP or ASR.

What they don’t know is that, by modifying, for instance, just the wording of what your system says (prompt tuning), you can achieve dramatically better speech recognition and NLU accuracy, because the user is covertly “guided” to say what we expect (and have covered in the grammar). The same holds for tuned grammars (for out-of-vocabulary words), word pronunciations (for local and foreign accents), tuned VUI designs (for error recovery strategies) and tuned ASR engine parameters (for timeouts and barge-ins). It’s all about knowing how the ASR software and our human brain’s language software work.
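To make the prompt-tuning point concrete, here is a minimal sketch (all prompts, phrases and the toy grammar are invented for illustration): a directed prompt covertly guides callers towards wording the recognition grammar actually covers, so a larger fraction of utterances is recognised in-grammar.

```python
# Toy "grammar": the set of phrases the recogniser is prepared to accept.
GRAMMAR = {"check my balance", "pay a bill", "speak to an agent"}

def in_grammar_rate(utterances):
    """Fraction of caller utterances covered by the recognition grammar."""
    hits = sum(1 for u in utterances if u.lower() in GRAMMAR)
    return hits / len(utterances)

# Open prompt ("How can I help you?") invites unconstrained replies,
# many of which fall outside the grammar.
open_prompt_replies = [
    "erm I got a letter about some charge",
    "check my balance",
    "is anyone there",
]

# Directed prompt ("Would you like to check your balance, pay a bill,
# or speak to an agent?") steers callers to in-grammar wording.
directed_prompt_replies = [
    "check my balance",
    "pay a bill",
    "speak to an agent",
]

print(round(in_grammar_rate(open_prompt_replies), 2))      # lower coverage
print(round(in_grammar_rate(directed_prompt_replies), 2))  # higher coverage
```

The same principle extends to the other tuning levers mentioned above: widening the grammar for out-of-vocabulary words, adding pronunciation variants for accents, and adjusting timeouts and barge-in settings all raise the share of caller behaviour the system can actually handle.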

Excited to see what the next decade is going to bring for DialogCONNECTION and the next quarter of a century for Voice! Stay tuned!

Towards EU collaboration on Conversational AI, Data & Robotics

22 Nov

I was really interested to read the BDVA – Big Data Value Association‘s and euRobotics‘ recent report on “Strategic Research, Innovation and Deployment Agenda for an AI PPP: A focal point for collaboration on Artificial Intelligence, Data and Robotics“, which you can find here.

Of particular relevance to me was the Section on Physical and Human Action and Interaction (pp. 39-41), which describes the dependencies, challenges and expected outcome of coordinated action on NLP, NLU and multimodal dialogue processing. The associated challenges are:

  • Natural interaction in unstructured contexts, which is the default in the case of voice assistants for instance, as they are expected to hold a conversation on any of a range of different topics and act on them
  • Improved natural language understanding, interaction and dialogue covering all European languages and age ranges, thus shifting the focus from isolated recognition to the interpretation of the semantic and cultural context, and the user intention
  • Development of verbal and non-verbal interaction models for people and machines, underlining the importance of gestures and emotion recognition and generation (and not only in embodied artificial agents)
  • Co-development of technology and regulation to assure safe interaction in safety-critical and unstructured environments, as the only way to assure trust and, hence, widespread citizen and customer adoption
  • The development of confidence measures for interaction and the interpretation of actions, leading to explainable AI and, hence, improved and more reliable decision-making

You can find the excellent and very comprehensive report here.

Voice Assistants and Kids

7 Nov

This week I bumped into Echodad and “the world’s harshest critics of Alexa skills”, i.e. his 2 kids. Hat tip for the idea by the way!! 😀

I read with particular interest his Medium post on “How Voice Interfaces are infiltrating society and changing our children”, a mockingly deceptive title for a list of witty repartees to all the usual arguments against Voice Assistants and speech interfaces being used by children. Highly enjoyable! 🙂

He addresses the perennial argument about children unlearning good manners, when they interact with VAs too much, and talks about his own experience with his kids assuring us that, actually, children know the difference between a human and a Voice Assistant. Phew, so there is no need to panic that they are going to suddenly be more and more rude to their teachers and grandparents and, hence, no need to reprogram voice interfaces to insist on “Pleases” and “Thank-yous” (Phew, again). I am fully in agreement there.

Similarly, he discusses the argument that children unlearn the importance and necessity of patience in the face of inescapable delays and life’s frustrations, when they can’t instantly get what they want (an answer to an obscure question about the animal kingdom or their favourite YouTube video). This seasoned dad knows that patience is something all kids have always struggled with and have to continually practise at that age. It is part of learning and growing up. And I can attest to that, too. (Oh the tantrums!)

Echodad & “the world’s harshest critics of Alexa skills”

However, in contrast to him, I am not at all comfortable or happy with the idea that my kid would go straight to Alexa or Google or Bixby for an answer to his Space or Transport questions and not even bother to approach me. I may well take recourse to voice search myself for the answer, but I want to be the Gatekeeper to the Information Castle and not a passive bystander in his quests for knowledge. At least that’s what I consider one of my – traditional – roles as a parent, and I would like him to think that way too, at least until he is older. And it’s not just a hunch: having studied Psycholinguistics and Child Developmental Psychology at University in an earlier life (last century though!), I know the paramount importance of parental input, continuous feedback and playful interaction for the quality and rate at which a child learns anything, but particularly language.

In this respect, Voice Assistants, Voicebots, Chatbots and Voice Search itself have changed parent-child roles and interaction forever, and neither I nor many people can predict the long-term effect of this societal development. Kids are already more independent and will become more so, for sure, and that is not a negative thing in itself, but will it at the same time make me – as a parent – less dependable and approachable? Will he automatically assume I’m lacking in encyclopaedic knowledge, and will that lessen his respect for me? And will the kid get the same quantity and quality of validation cues from Alexa or Google Assistant as he would have got from me? More soberingly, will it teach kids that they don’t need to rely on or even invest much in relationships with people, because interacting with technology is easier, more fun, safer and more efficient? It’s not just the relationship to the parents we should be worried about, but all other family, work, peer and romantic relationships as well.

I am all for the use of voice interfaces and spoken dialogue systems by adults, e.g. myself, naturally, and my own clients; VUIs have been my bread and butter for decades! Nonetheless, at the risk of sounding like a regressive technophobe, I am vehemently against their use by children and will keep mine shielded for as long as I can manage! It’s already hard to keep him from doing a regular Google voice search for “pictures for buses” (sic). 🤔 😅

Design Voice Assistants as Performers

4 Nov

I recently read a very interesting article in Fast Company, which resonated a lot with me. Entitled “It’s time to rethink voice assistants completely“, it addresses a common obsession with designing voice interfaces that could easily be mistaken for humans.

It undoubtedly seems like a very noble goal, given that humans are at the apex of the communication pyramid among living beings (let alone inorganic material!). However, the article makes the case, which I fully support, that human-like is not always desirable or even appropriate, especially if it fools you into thinking you are interacting with another human. This is a particularly painful faux pas when the user did indeed believe they were talking to a human, only to then be frustrated by the bot’s lack of understanding or the lack of rationale in its responses.

As the article suggests, human conversations should, of course, be studied thoroughly, but only be taken as inspiration for VUI and Voice Design, and not as the gold standard of interaction.

In fact, a disruptive idea is put forward, to consider Voice Assistants to be “performers, rather than human-like conversationalists”. That is precisely how you can create and craft more expressive, emotional, engaging and sticky conversational interfaces and the corresponding conversation designs.

UBIQUITOUS VOICE: Essays from the Field now on Kindle!

14 Oct

In 2018, a new book on “Voice First” came out on Amazon and I was proud and deeply honoured, as it includes one of my articles! Now it has come out on Kindle as an e-Book and we are even more excited at the prospect of a much wider reach!

“Ubiquitous Voice: Essays from the Field”: Thoughts, insights and anecdotes on Speech Recognition, Voice User Interfaces, Voice Assistants, Conversational Intelligence, VUI Design, Voice UX issues, solutions, Best practices and visions from the veterans!

I have been part of this effort since its inception, working alongside some of the pioneers in the field who now represent the Market Leaders (GOOGLE, AMAZON, NUANCE, SAMSUNG VIV ...). Excellent job by our tireless and intrepid Editor, Lisa Falkson!

My contribution “Convenience + Security = Trust: Do you trust your Intelligent Assistant?” is on data privacy concerns and social issues associated with the widespread adoption of voice activation. It is thus platform-, ASR-, vendor- and company-agnostic.

You can get the physical book here and the Kindle version here.

Prepare to be enlightened, guided and inspired!

DESIGN your conversational interface before Coding it!

20 Sep

I just listened to the latest Google Cloud Platform podcast with Google’s own Cathy Pearl and was delighted (and vindicated!) to hear her stress the importance of DESIGNING your conversational interface before CODING it. You really need to think hard about the user needs you’re trying to meet with the interface and the possible dialogue “scripts” that would help you do that. Only then should you start coming up with training phrases for your intents or actions. This will save you a lot of headaches, time and, actually, face among your customers too. I’ve seen so many instances myself of developers treating VUI Design and Voice UX design as an afterthought and then wondering why their conversational interface is not conversational at all, or indeed much of an interface to the intended data and services, as it doesn’t always respond appropriately to what the customer wants or even to what they say.

Listen to the enlightening podcast here.

User Interface Design is the new black!

14 Dec


 

LinkedIn Unveils The Top Skills That Can Get You Hired In 2017

 

Number 5: USER INTERFACE DESIGN!

“User interface design is the new black:

UI Design, which is designing the part of products that people interact with, is increasingly in-demand among employers. It ranked #14 in 2014, #10 last year, and #5 this year (second largest jump on this year’s Global Top Skills of 2016 list). Data has become central to many products, which has created a need for people with user interface design skills who can make those products easy for customers to use.”

 


Read all about it here.

https://blog.linkedin.com/2016/10/20/top-skills-2016-week-of-learning-linkedin

Our META Avatar immortalised in a TEDx talk!

31 Oct

 


Our very own Niels Taatgen from the University of Groningen gave a TEDx talk last Summer on “Why computers are not intelligent (yet!)” and why they still have a long way to go.

 

Computers can be better than humans at very specialised tasks, such as chess and Go, but they find it much more difficult to learn how to learn new tasks and to think about intentions and goals other than their own, as humans do. Enter our EU R&D Project METALOGUE.

 

METALOGUE logo

 

Niels briefly shows an example interaction with our META Avatar negotiating the terms of a smoking ban and actually “thinking about the (human) opponent”, a crucial human negotiation and life skill. You can see META from minute 12 onwards, but the whole talk is really interesting to watch, so don’t skip straight to it!

 

EU FP7 logo