




SpeechAnimator™ Technology
SpeechAnimator™ Technology creates synchronized facial
animation from speech in large volumes and also works in
real time (for example, from a microphone). The Technology
has been in use since 2000 and has been adopted by many
creative teams in television and advertising, by game
developers, and others. It is used on the Ren-TV channel to
produce broadcasts and an animated series.
SpeechAnimator™ Capabilities
- Conversion of a sound stream into facial animation,
including in real time - see SpeechAnimator Pro Real-Time
- Processing of any language, including languages unknown
to the program
- Processing of nonverbal vocal sounds made by a person,
for example laughter, clatter, and so on
- Accounting for noise level, voice loudness (whisper or
shout, for example), and speech sharpness (sharper,
normal, smoother, etc.)
- Import of user models from 3D Studio MAX and MAYA and
their visual animation - see SpeechAnimator Pro
- Automatic creation of NONVERBAL facial expressions
(blinking, for example) as a function of speech morphemes,
speech tempo, and other parameters
- Export of animation to 3D Studio MAX and MAYA - see
SpeechAnimator Pro
- Fast automatic processing of large numbers of sound
files, delivering the finished animation in a text format
for further processing in the user's own tools and for
animation - see StreamAnimator
These programs embody the SpeechAnimator™ technology,
which is based on "transcription" of a sound stream.
Transcription, that is, identification of the current
sound images, occurs automatically. First, a so-called
differential sound database is formed. It is built from a
large number of sound images and captures the distinctions
between groups of images. For the sound currently being
played, the database determines how the weights are
distributed across the groups and which group is nearest,
that is, most "similar" to the sound. Thus at any moment
we have a set of factors defining the similarity to each
of those groups. After that, the information on the sound
sequence is transformed into visual images according to a
match table.
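The two steps described above (weights over sound groups,
then a lookup in the match table) can be illustrated with
a minimal Python sketch. Everything in it is an assumption
made for illustration: the two-number feature vectors, the
three group names, the distance-based weighting, and the
table contents are hypothetical and are not taken from the
actual SpeechAnimator database or algorithm.

```python
import math

# Hypothetical "differential database": one reference feature
# vector per sound group (real features are not documented here).
GROUPS = {
    "a": [1.0, 0.0],
    "o": [0.0, 1.0],
    "m": [-1.0, 0.0],
}

# Hypothetical match table: sound group -> visual image (mouth shape).
MATCH_TABLE = {"a": "open", "o": "round", "m": "closed"}


def group_weights(features):
    """Return similarity weights over all groups; the weights sum to 1.

    Assumption: similarity decays exponentially with Euclidean distance.
    """
    scores = {g: math.exp(-math.dist(features, ref)) for g, ref in GROUPS.items()}
    total = sum(scores.values())
    return {g: s / total for g, s in scores.items()}


def nearest_group(features):
    """Return the group most similar to the current sound."""
    weights = group_weights(features)
    return max(weights, key=weights.get)


def visual_weights(features):
    """Map group weights to visual-image weights through the match table."""
    out = {}
    for group, weight in group_weights(features).items():
        shape = MATCH_TABLE[group]
        out[shape] = out.get(shape, 0.0) + weight
    return out
```

Calling `visual_weights` once per audio frame would give a
time series of mouth-shape weights that an animation layer
could blend between.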
The SpeechAnimator™ technology is used in many fields,
for example in dynamic stage shows, by pop and rock bands,
and in video clips. In these cases unique visual images
corresponding to the current sounds or signals are created
on stage or in the clip. By changing the displayed visual
image over time in step with the sound, we meet the
expectations of the audience.
Using the basic principles of the SpeechAnimator™
technology, SpeechAnimator Pro creates facial animation of
a speaking person from his or her voice. For this purpose
the differential database was constructed from about
20,000 sound samples of 30 voices. The database
distinguishes 56 types of human speech sounds. To support
multiple languages, the database includes both English and
Russian sound samples, so it covers the majority of speech
sounds found in different languages. Experiments indicated
that this sound set can generate facial animation not only
for these two languages but also for many others,
including European and Asian languages. Non-speech human
sounds are covered as well, for instance laughter,
clatter, etc. The software automatically takes into
account how loud the sounds are and increases the
amplitude of the output factors for their "visemes".
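As a rough illustration of loudness-dependent amplitude,
the sketch below scales viseme factors by the RMS level of
the current audio frame. The RMS measure, the reference
level, and the gain limit are assumptions chosen for the
example; the text above does not say how SpeechAnimator
Pro measures loudness or how strongly it scales the
output.

```python
import math


def rms(samples):
    """Root-mean-square level of one frame of audio samples (floats)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def scale_visemes(viseme_weights, samples, reference_rms=0.1, max_gain=2.0):
    """Scale viseme factors by loudness relative to a reference level.

    reference_rms and max_gain are hypothetical tuning parameters.
    Results are clamped to 1.0 so a shout cannot overdrive the face.
    """
    gain = min(rms(samples) / reference_rms, max_gain)
    return {v: min(w * gain, 1.0) for v, w in viseme_weights.items()}
```

With these assumed defaults, a whisper-level frame yields
smaller mouth movement than a normal one, and a shout
yields larger movement up to the clamp.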
The technology does not use textual "hints" and so has a
wider scope than many other utilities on the market.
SpeechAnimator Pro is a separate application and
communicates with other programs, such as its 3D Studio
MAX or Maya plug-ins, using an external protocol.
The standalone utility StreamAnimator can process large
volumes of sound, for example for game engines and
cartoons. StreamAnimator creates text files with the
morpheme factors for a set of sound files. StreamAnimator
complements SpeechAnimator Pro and can use its projects
and settings.
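The batch workflow described above can be pictured with
the following sketch, which writes one text file of
per-frame factors for each sound file. The file layout
(a time stamp followed by `label=weight` pairs), the
`analyze` callback, and the function name are all
hypothetical; the actual StreamAnimator output format is
not documented in this text.

```python
from pathlib import Path


def write_factor_files(sound_files, analyze, out_dir):
    """Write '<name>.txt' for each sound file and return the paths.

    analyze is a caller-supplied function (an assumption of this sketch)
    that takes a Path and returns (time_seconds, {label: weight}) frames.
    Each output line holds the time followed by 'label=weight' pairs.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sound in map(Path, sound_files):
        target = out_dir / (sound.stem + ".txt")
        with target.open("w", encoding="utf-8") as fh:
            for time_s, factors in analyze(sound):
                pairs = " ".join(f"{k}={v:.3f}" for k, v in sorted(factors.items()))
                fh.write(f"{time_s:.3f} {pairs}\n")
        written.append(target)
    return written
```

A plain line-per-frame text layout like this is easy to
parse from a game engine or an animation script, which is
the kind of downstream use the text mentions.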
The SpeechAnimator™ technology can also create facial
animation in real time (including "from a microphone"),
as in the SpeechAnimator Pro Real-Time program.
For improved automatic recognition, "personal
differential databases" are created for individual
actors' voices. These databases are built from sound
examples and contain information on sound distinctions
specific to a particular voice.
If you want to take advantage of the SpeechAnimator™
technology in your applications, please contact its
author.
©2001-2005
Alexander Okhota
hunt(dog)speechanimator.ru