Tag Archives: speech analytics

How to Design & Optimise Conversational AI Dialogue Systems

26 Jun
Data Futurology EP122 with Dr Maria Aretoulaki

The latest episode of Data Futurology features me talking with Felipe Flores about how I got into AI, Machine Learning and Deep Learning (“When I discovered Artificial Neural Networks back in 1993, I thought I had discovered God!”) and later Data Science and Big Data Speech Analytics for Voice User Interface (VUI) and Voice First Design (for Speech IVRs and Voice Assistants).

In the podcast, I give an overview of all the different steps involved in VUI Design, the often tricky interaction with the different stakeholders (Coders, Business Units, Marketeers) and the challenges of working in an Agile PM framework, where skipping detailed Voice Design is suicide (and my worst nightmare!). I showcase some of the interesting and outright funny things you discover when you analyse real-world human-machine dialogues taken from Speech IVR systems (hint: the most original swearwords!) and I pinpoint the main differences between a Speech IVR and a Voice Assistant skill / action / capsule. I also contemplate where Voice is heading now, with the ubiquity of Voice First and the prevalence of skills that were developed by software engineers, web designers and Marketing copywriters without VUI Design expertise, Linguistics training or knowledge of how Speech Recognition works.

Data Futurology Podcast EP122 – Optimising Conversational AI Dialogue Systems with Dr. Maria Aretoulaki
Episode list of contents

I even provide tips on how to survive working for a Start-up and strategies on how to stay focused and strong when you run your own SME Business. Below is a brief checklist of some of the core qualities required and lessons I have learned working for Start-ups:

[Image: checklist of core qualities required and lessons learned working for Start-ups]

You can listen to the whole episode on Data Futurology, on Apple Podcasts, on Google Podcasts, on Spotify, or wherever you get your Podcasts, or you can watch it on YouTube:

#122 – Optimising Conversational AI Dialogue Systems with Dr. Maria Aretoulaki

It was a delight speaking to Felipe! It made me think about and consolidate some of the – occasionally hard – lessons I have learned along the way about Voice Design, NLP, AI, ML and running a Business, so that others hopefully prepare for and ideally also avoid any heartaches!

The immortality of Data

10 Jun
Boundless Podcast Episode EP48

The latest Boundless Podcast Episode is out! It features me in a deep conversation with Richard Foster-Fletcher about Big Data and Speech Analytics for Voice First & Voice Assistants (but not only), BigTech, AI Ethics and the need for a new legal framework for Responsible AI and Explainable Machine Learning. Below is a snippet of the conversation:

Boundless Podcast Episode EP48 – snippet

“My data is being used and exploited, and I can do nothing about it. We need to modernise the legal system. Apart from all the ethical, moral discussions that need to be made, we need a legal system that takes into consideration the fact that intelligence doesn’t need to be visible to be acting against me.”

I wouldn’t call myself a technophobe. Quite the opposite. I was learning how to program (BASIC!) back in 1986, in the Summer after getting my degree in English Language & Literature; I was teaching computers how to translate between human languages and chatting online with mainframes in University buildings 2 km away back in 1991; I was programming Artificial Neural Networks and using Parallel computers back in 1993; I was reading internet newsgroups and downloading music – very much legally! – on a web browser (Netscape!) in 1994; I was designing Voice interfaces already in 1996 and voice-controlled home assistants back in 1998. I have even been using LinkedIn since 2005.

Yet, I am very sceptical and pessimistic about our uninhibited sharing of personal data and sensitive information all day every day on multiple channels, to multiple audiences, much of it willingly, much more unwillingly, in the name of sharing and human connection, service and product personalisation and ultimately, far too often, monetisation.

What will that mean for our legacy as individuals? Who will own, control and curate all the generated data after our death? Who can clone us and why? Will there be a second life after death? Will there ever be freedom from a dictator or will there ever be any point in bequeathing anything in a will? These and many more questions are discussed in this episode. I had lots of fun recording this! Thank you so much to Richard for creating this🙏

You can listen to a snippet of the conversation here

You can listen to the full episode here or wherever you get your podcasts.

Alternatively, you can listen to it on YouTube (in 2 parts):

Part 1

Boundless EP48 – Part 1

And Part 2

Boundless EP48 – Part 2

#BigData #BigTech #SpeechAnalytics #VoiceFirst #VoiceAssistants #Alexaskills #GoogleAssistant #Bixby #AIethics #responsibleAI #explainableVUI #AI #ArtificialIntelligence #MachineLearning #ML #DeepLearning #ANNs

UBIQUITOUS VOICE: Essays from the Field now on Kindle!

14 Oct

In 2018, a new book on “Voice First” came out on Amazon and I was proud and deeply honoured, as it includes one of my articles! Now it has come out on Kindle as an e-Book and we are even more excited at the prospect of a much wider reach!

“Ubiquitous Voice: Essays from the Field”: Thoughts, insights and anecdotes on Speech Recognition, Voice User Interfaces, Voice Assistants, Conversational Intelligence, VUI Design, Voice UX issues, solutions, Best practices and visions from the veterans!

I have been part of this effort since its inception, working alongside some of the pioneers in the field who now represent the Market Leaders (GOOGLE, AMAZON, NUANCE, SAMSUNG VIV ...). Excellent job by our tireless and intrepid Editor, Lisa Falkson!

My contribution “Convenience + Security = Trust: Do you trust your Intelligent Assistant?” is on data privacy concerns and social issues associated with the widespread adoption of voice activation. It is thus platform-, ASR-, vendor- and company-agnostic.

You can get the physical book here and the Kindle version here.

Prepare to be enlightened, guided and inspired!

Speech Interaction on Mobile Devices at SpeechTEK 2011 (New York)

7 Aug

Today sees the launch of the Joint AVIxD / IxDA Workshop on Speech Interaction on Mobile Devices that kick-starts the mother of Voice Solutions Fairs, SpeechTEK 2011 in New York next week (8-10 Aug).

AVIxD

AVIxD is the Association for Voice Interaction Design, a professional organisation that aims to

“eliminate apathy and antipathy toward the need for good design of automated voice services”, 

which has become my favourite VUI mantra!

IxDA is the Interaction Design Association, a much bigger professional “un-organisation” which intends to:

“improve the human condition by advancing the discipline of Interaction Design”

A very worthy cause indeed, especially since it is true that “the human condition is increasingly challenged by poor experiences”!

IxDA

Today’s Joint Workshop in New York aims to bring together interaction design practitioners from across the voice, interactive, and digital areas to identify the issues and challenges involved in speech interaction design on mobile devices, such as smartphones and tablets, and to come up with ways to approach or even tackle them by the end of the day. A very ambitious format that, however, really does work!

AVIxD organised another Workshop this year on Cross-linguistic & Cross-cultural Voice Interaction Design, which was also the 1st European Workshop, just before SpeechTEK Europe in London this May past. See what we all came up with in those 6 hours in the SpeechTEK Europe PDF presentation below.

And if you don’t manage to take part in today’s workshop, make sure you go to the SpeechTEK Conference and Exhibition itself that starts tomorrow and runs until Wednesday the 10th. Listen to presentations and see or even try for yourself market-ready products relating to:

  • multimodal applications
  • cross-channel applications
  • speech analytics
  • speaker identification and verification
  • in-car systems
  • natural language and say-anything technologies
  • speech translation
  • voice-enabled personal assistants
  • as well as the latest speech recognition techniques and technologies

I particularly recommend the Keynote Panel on “Mobility — A Game-Changer for Speech?” on Tuesday on how smartphones are dramatically changing how customers interact with businesses and with the devices themselves. Some really interesting issues and questions will be raised, such as:

* How will voice user interfaces be integrated with graphical user interfaces?

or

* Will users embrace voice as they have embraced keypads on mobile devices? 

Sadly I am in the UK today and next week, so I’m going to miss it all. But if you are lucky enough to be in or near New York, make sure you go and enjoy!

SpeechTEK 2011 New York

The Loneliness of the long-distance … VUI Designer!

13 Jun

On Friday 11th June, I took part in the “Pathways” event organised annually by the University of Manchester Career Service to support PhD researchers as well as research staff in “making career choices, exploring future plans and discovering the breadth of opportunities available to them”. I was a Guest Panellist at 3 different Sessions:

  1. Opportunities for Engineering and Physical Sciences
  2. Working as a Freelancer or Consultant and
  3. Enterprise, Entrepreneurship and Business Start Up

The University of Manchester Logo

As a University of Manchester graduate (well, technically UMIST), I felt compelled to take part in those Question and Answer panels in order to give some insight into how a career can develop: from a Bachelors in English & Linguistics in Greece, to a Masters of Science in Machine Translation and a Doctorate in Automatic Text Summarisation in the UK, to a Post-Doctoral Fellowship in Spoken Dialogue Management and a position as a Research Project Manager in Germany, to working in Industry both as a full-time employee and as an external contractor as a Voice User Interface (VUI) Designer in Germany, the UK, Switzerland, the US and further afield. It’s been a fascinating journey for sure! And I probably would never have arrived where I am now, if I hadn’t done those degrees or taken up those jobs in those specific places.

Have a look at the Guest Speaker profiles, including mine (p. 24), here: http://www.careers.manchester.ac.uk/media/media,172749,en.pdf

Some very inspiring career journeys!

I have to say I have thoroughly enjoyed the whole journey, the projects I have worked on, the people I have met on the way, and the different organisational cultures I had the chance to experience. Plus, I wouldn’t change what I do now for the world! I love working as an external contractor and coming in to design speech self-service systems and voice-to-text services from scratch, or to optimise existing ones, and I love taking part in the whole development, testing and tuning cycle:

  • writing functional specification documents
  • defining the system persona
  • drawing call flows
  • crafting system messages and coaching voice talents for the recordings
  • writing speech recognition grammars and pronunciations
  • devising and carrying out Wizard-of-Oz tests and Usability tests (including recording test subjects on video and interviewing them afterwards!)
  • transcribing and analysing phone calls
  • writing tuning reports

Everything is a lot of fun! It’s also great to be bringing in the same VUI Design processes and skills in different organisations and projects, and also getting to work at different places in the world at any one time! I love the variety of work and location of work, as well as the flexibility to work anytime and from anywhere! (Yes, working on your laptop – iPad soon – from a beach in the Caribbean is no longer a daydream but a realistic plan! :))

working on a deserted beach in the Caribbean is no longer a daydream!

Okay, it does get lonely. No gossiping in the kitchen during coffee breaks and no Christmas office parties. I still get to have probably as many face-to-face project meetings and conference calls as the average office worker though. We all have to work independently and in isolation when analysing data or composing a report anyway. Only, office workers have also got the hectic running-around of their colleagues and lots of intrusive and loud phone calls they have to unwillingly witness in silence. So my loneliness is a very contented one! 😀