Computers have made clear progress in understanding natural language in recent years. One of the experts working to combine this progress with advances in computer-generated animation is Louis-Philippe Morency, associate professor at the Language Technologies Institute (LTI) of Carnegie Mellon University.
Together with his colleague Chaitanya Ahuja, Morency is pursuing this goal with a new type of neural architecture called Joint Language-to-Pose, or JL2P. The model learns from written sentences and physical animations simultaneously.
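The article describes JL2P only at a high level. As a loose illustration of the general idea behind joint language-to-pose models (not the authors' actual architecture), a sentence and a pose sequence can each be mapped into a shared embedding space, where matching pairs are trained to score higher than mismatched ones. A minimal sketch with toy stand-in encoders; all names, dimensions, and the encoding schemes here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 8  # size of the shared embedding space (arbitrary toy value)

# Toy "encoders": in a real model these would be trained neural networks.
W_text = rng.standard_normal((26, EMBED_DIM))  # letter counts -> embedding
W_pose = rng.standard_normal((3, EMBED_DIM))   # joint coordinates -> embedding

def encode_sentence(sentence: str) -> np.ndarray:
    """Embed a sentence from its letter counts (stand-in for a text encoder)."""
    counts = np.zeros(26)
    for ch in sentence.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1
    v = counts @ W_text
    return v / np.linalg.norm(v)

def encode_pose(frames: np.ndarray) -> np.ndarray:
    """Embed a pose sequence (frames x 3 coordinates) by averaging frames."""
    v = frames.mean(axis=0) @ W_pose
    return v / np.linalg.norm(v)

def similarity(sentence: str, frames: np.ndarray) -> float:
    """Cosine similarity between a sentence and an animation in shared space."""
    return float(encode_sentence(sentence) @ encode_pose(frames))

# Training a joint model would push matching (sentence, animation) pairs
# to score higher than mismatched ones; here the scores are untrained.
walk = rng.standard_normal((10, 3))  # fake 10-frame animation
print(similarity("A person walks forward", walk))
```

The point of the shared space is that, once trained, generating an animation from text reduces to decoding a pose sequence from the sentence's embedding.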
Morency admits that we are not yet at the point where artificial intelligence can make a film from a script, but he calls this a "very exciting" moment, not least because, beyond animating virtual characters, such techniques could also be applied to robots.
A robot could then be operated by voice, carrying out any action we ask of it, even one that was never pre-programmed. For now, the researchers have succeeded in getting the algorithm to animate simple stylized figures, starting from simple expressions such as "A person walks forward" and moving on to somewhat more complex constructions such as "A person takes a step forward, then turns around" (see also the video below).
The researchers' aim, of course, is to arrive at complex animations and sequences involving multiple, even simultaneous, actions described in increasing detail.