
In late 2009 Roger Ebert, commentator, writer and film buff for the Chicago Sun-Times, found the Scottish company CereProc. He wrote at the time: "I have my fingers crossed. I can see my own voice hosting online or telecast video essays. I am greatly cheered." This week he took delivery of a prototype and is said to be pleased with the results and keen to demonstrate it in public.
Ebert's archive of TV tapes, DVD commentaries and radio shows provided a rich seam for the company's researchers to mine. They painstakingly reconstructed his voice, piecing together snippets of recordings to produce new words and sentences. Ebert now has a database of words and a system which can "speak" any typed sentence in the voice he had before his operation.
It will not sound exactly as he used to, but it will sound more like him than his existing generic voice does. And he should be able to modulate the new synthesised version to make it sound more natural.
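The idea of piecing recorded snippets together to "speak" a typed sentence can be illustrated with a toy sketch. This is not CereProc's method: the `SNIPPETS` database, the `crossfade` and `speak` functions and the sample values are all hypothetical, and a real system would work with much smaller units of sound and far more careful joining.

```python
from typing import Dict, List

# Hypothetical snippet database: each word maps to a short list of audio samples.
# Real archive audio would be far longer; these values are placeholders.
SNIPPETS: Dict[str, List[float]] = {
    "i": [0.1, 0.2, 0.1],
    "am": [0.3, 0.1, 0.0],
    "cheered": [0.2, 0.4, 0.2, 0.1],
}

def crossfade(a: List[float], b: List[float], overlap: int) -> List[float]:
    """Join two sample lists, linearly blending `overlap` samples at the seam."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b
    head, tail = a[:-overlap], a[-overlap:]
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight shifts from snippet a towards snippet b
        blended.append(tail[i] * (1 - w) + b[i] * w)
    return head + blended + b[overlap:]

def speak(sentence: str, db: Dict[str, List[float]] = SNIPPETS, overlap: int = 1) -> List[float]:
    """Build audio for a typed sentence by joining the stored snippet for each word."""
    audio: List[float] = []
    for word in sentence.lower().split():
        if word not in db:
            raise KeyError(f"no recorded snippet for {word!r}")
        audio = crossfade(audio, db[word], overlap)
    return audio

audio = speak("I am cheered")
```

Blending a sample or two at each join is one simple way to soften the seams between snippets, which is part of what makes the result sound smooth rather than stitched together.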
Matthew Aylett, CTO of CereProc, said: "One of the things that we specialise in is trying to produce voices which have got a bit of character and don't sound neutral or boring. In this case, we're using audio that has been recorded for commentaries on Casablanca or Citizen Kane, for example. We have to take this audio and try to produce something which sounds smooth and natural."
It is believed to be the first time that archive audio material has been used to benefit the speaker.
Aylett said: "We're giving Roger Ebert actually the same voice he had before he had surgery. When he uses it, people who listened to his commentaries in the past as a broadcaster will recognise his voice. Both are examples of the way the voice is so central to who we are as people and so much of our character and personality is expressed by the use of our voice."
Voice rehabilitation software
The Hardwicks spend 15 minutes a day working on the computer. Peter, a former engineer, operates the software, React2, and Sybil, a former ward sister, does the exercises.
Is Sybil making any progress, 12 years after her stroke? “Crikey, yes,” says Peter. “We use it as a stimulus for her brain.” Sybil agrees. “It’s important because of my speech,” she says. “I don’t get enough practice, so the software is very important.”
Therapy software is increasingly widely used by people who have problems using or understanding language as a result of a brain injury (a condition broadly known as aphasia).
Speech therapists encourage clients to use it alongside other treatments, including paper-based exercises and attending communication support groups – but its great advantages are that people can use it at home and that it tells them how they are doing as they go along. As the Hardwicks’ experience shows, it can help stimulate improvement years after a stroke.
Propeller Multimedia is based in Peebles, in the Borders. Its original program, React, has been a leading product for speech and language rehabilitation since 1998, and the program has now been completely rewritten and updated as React2.
React2 combines the development skills of a team of speech and language therapists from NHS Borders, Scotland (Patricia Mitchell, lead clinician; Louise Runciman; and Debbie Allcock), with input from specialists in the UK and around the world.
It is designed for adults with aphasia, but also for adults and children with learning difficulties and for children with delayed or disordered language.
The software focuses on comprehension, rather than speaking, though it can be used to help stimulate conversation too. It comprises 8000 exercises covering five key areas, including recognising sounds and getting the gist of a conversation; improving memory; and life skills, such as understanding bills.
Propeller’s development director, Dean Turnbull, believes React2 is unique in the English-speaking world in the breadth of exercises it offers. It has been developed from paper-based speech and language rehabilitation exercises, and Propeller hope to put it through a formal clinical trial soon.
The benefit of the software, as Turnbull sees it, is in giving people a means to practise after their face-to-face therapy has come to an end. Tricia Mitchell says the time a speech therapist has to spend with each person is never enough.
“Software would never replace face-to-face therapy,” she explains, “but it could mean people are getting more therapy without having a therapist there with them. The beauty of the program is that we have the computer telling them if they are right or wrong.”
We’re getting to the point where an individual’s programme can be tailor-made from software resources. Computer programs can be particularly useful in geographical areas where speech therapists must travel long distances to visit clients. If a patient has very good computer skills, their progress might be monitored remotely using software.
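The two features therapists value, immediate right-or-wrong feedback and a record of progress that could be reviewed later, can be sketched in a few lines. This is a minimal illustration only, not how React2 works: the `Exercise` and `Session` classes, their fields and the sample exercise are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Exercise:
    prompt: str          # what the client is asked
    choices: List[str]   # options shown on screen
    answer: str          # the correct option

@dataclass
class Session:
    # One entry per attempt: True if correct, False otherwise.
    results: List[bool] = field(default_factory=list)

    def attempt(self, exercise: Exercise, chosen: str) -> str:
        """Record the attempt and return feedback straight away."""
        correct = chosen == exercise.answer
        self.results.append(correct)
        if correct:
            return "Right"
        return f"Not quite: the answer is {exercise.answer}"

    def summary(self) -> str:
        """A running score that a therapist could review remotely."""
        return f"{sum(self.results)} of {len(self.results)} correct"

session = Session()
exercise = Exercise("Which word matches the picture of a cup?", ["cup", "cap"], "cup")
feedback = session.attempt(exercise, "cup")
```

Keeping every attempt in the session record, rather than only the latest score, is what would let a therapist see how a client is doing over time without being in the room.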
Emotional voice recognition and response
An early use of voice in cars was to ask passengers to fasten their seat belts. It came in a female voice (perceived as nagging) or a male one (judged dictatorial) and has since been replaced by an impartial but irritating pinging!
But in 2005 Toyota's experimental car, the Pod, was fitted with voice recognition software by Scottish company Affective Media, which uses the tone and pitch of the voice as indicators of the driver's emotion.
The experimental car signalled the mood of its driver to other road users by brightening, dimming or colouring the headlights. As well as altering the headlights, if the car judged the driver to be stressed it would release soothing perfume, turn on some mellow music or perhaps suggest a less-congested route. If it sensed a driver was about to doze off, it would wake them with an alarm.
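The behaviour described above amounts to two steps: estimate the driver's mood from the voice, then pick a response. A rough sketch follows, assuming a simple rule-based approach. It is not Affective Media's software: the `VoiceFeatures` fields, the thresholds in `estimate_mood` and the action names are all invented for illustration, and a real system would need to calibrate to each speaker.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class VoiceFeatures:
    mean_pitch_hz: float    # average pitch of the driver's speech
    pitch_variation: float  # hypothetical 0..1 measure of how much the pitch moves
    speech_rate: float      # words per second

def estimate_mood(f: VoiceFeatures) -> str:
    """Classify mood from voice features. Thresholds are illustrative guesses."""
    if f.speech_rate < 1.0 and f.pitch_variation < 0.1:
        return "drowsy"     # slow, flat speech
    if f.mean_pitch_hz > 220 and f.pitch_variation > 0.4:
        return "stressed"   # high, agitated speech
    return "calm"

# Responses taken from the article's description of the car's behaviour.
ACTIONS: Dict[str, List[str]] = {
    "stressed": [
        "adjust headlights",
        "release soothing perfume",
        "play mellow music",
        "suggest less-congested route",
    ],
    "drowsy": ["sound alarm"],
    "calm": [],
}

def respond(mood: str) -> List[str]:
    """Return the list of actions the car would take for a given mood."""
    return ACTIONS[mood]

mood = estimate_mood(VoiceFeatures(mean_pitch_hz=250.0, pitch_variation=0.5, speech_rate=3.0))
actions = respond(mood)
```

Separating mood estimation from the choice of response means the rules for what the car does can change without touching how the voice is analysed.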
"Stressed drivers are bad drivers," says Ray Warde, CEO of Affective Media, "and so is a person about to fall asleep. The voice of a driver gives a lot of information about the way they are actually behaving at the time," he explains. "At the moment only certain models of cars will talk to the driver but that will become more common and drivers will start to control cars through speech recognition." Affective Media's CTO is Christian Jones.
The Scottish games industry, remedial health care and learning systems are increasingly taking up the idea that people are more engaged with a system that seems to demonstrate an emotional response, and that computers can detect emotions and respond to them. Voice is one of the key monitors and responders, and Scotland has a rich mother lode of software developers who deserve attention.