ESPESY: Development of an emotional voice assistant


With our current R&D project ESPESY, we aim to make speech synthesis emotional and therefore more intuitive for people to understand. The goal is for voice assistants to sound like J.A.R.V.I.S., the intelligent assistant of Tony Stark, aka Iron Man. The dialogue systems needed for this are almost ready. What is still missing is the implicit information conveyed by speech melody: emotional prosody. This is exactly where ESPESY comes in.

There are many situations in which emotional prosody is of great benefit: instead of responding in the same generic tone every time, the voice output could sound alarmed and urgent in dangerous situations, particularly friendly and accommodating in after-sales customer care, and calm but firm in complaint management. The solution is intended primarily for service robots, but also for many other fields of application.

The aim of the project is to develop an algorithmic procedure that adds emotional prosody to speech synthesis. Delivered as a plug-in, it should be compatible with many speech-enabled systems and modularly adaptable. Editors should be able to adjust the system easily by switching between different modes.
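
To make the idea of mode-based prosody more concrete, here is a minimal sketch of how such a plug-in might look. It is not the ESPESY implementation: the mode names, the parameter values, and the names ProsodyMode, MODES and emotionalise are all hypothetical. The sketch assumes the downstream speech engine accepts the standard SSML prosody element with its rate, pitch and volume attributes.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class ProsodyMode:
    """Hypothetical prosody settings for one emotional mode."""
    rate: str    # SSML speaking rate, e.g. "fast"
    pitch: str   # SSML pitch, e.g. "+15%" or "default"
    volume: str  # SSML volume, e.g. "loud"


# Illustrative values only; they are assumptions, not ESPESY parameters.
MODES = {
    "neutral": ProsodyMode(rate="medium", pitch="default", volume="medium"),
    "alarmed": ProsodyMode(rate="fast", pitch="+15%", volume="loud"),
    "friendly": ProsodyMode(rate="medium", pitch="+5%", volume="medium"),
    "calm_firm": ProsodyMode(rate="slow", pitch="-5%", volume="medium"),
}


def emotionalise(text: str, mode: str = "neutral") -> str:
    """Wrap plain text in an SSML <prosody> element for the given mode.

    Unknown modes fall back to "neutral". The text is XML-escaped so that
    characters such as "&" do not break the markup.
    """
    settings = MODES.get(mode, MODES["neutral"])
    return (
        f'<prosody rate="{settings.rate}" pitch="{settings.pitch}" '
        f'volume="{settings.volume}">{escape(text)}</prosody>'
    )


print(emotionalise("Please leave the building now.", mode="alarmed"))
```

Keeping the modes in a plain lookup table means an editor could add or tune a mode without touching the wrapping logic, which is one possible reading of "modularly adaptable". A real system would likely vary prosody within a sentence rather than apply one setting to the whole utterance.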

ESPESY is funded by the Federal Ministry for Economic Affairs and Energy through the ZIM (Central Innovation Programme for SMEs).