Facebook’s AI Research team has created an AI called Vid2Game that can extract playable characters from videos of real people, creating a much higher-tech version of ’80s full-motion video (FMV) games.
The neural networks can analyze arbitrary videos of people performing specific actions, then re-create that character and action in any environment and let you control them with a joystick.
The team used two neural networks, called Pose2Pose and Pose2Frame. First, a video is fed into a Pose2Pose network trained for a specific type of action, like dancing, tennis or fencing.
The system then works out where the person is relative to the background and isolates them, along with their poses.
Then, Pose2Frame takes the person, along with their shadow and any objects they’re holding, and inserts them into a new scene with minimal artifacts. You can then control their movement, based on poses from the video, using a joystick or keyboard.
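To make that two-stage design concrete, here is a minimal sketch of how such a pipeline could be wired up in PyTorch. The module names mirror Pose2Pose and Pose2Frame, but the layers, dimensions and the stubbed-out pose heatmaps are simplified placeholders of my own, not Facebook’s actual architecture.

```python
import torch
import torch.nn as nn

class Pose2Pose(nn.Module):
    """Predicts the character's next pose from the current pose plus a
    control signal (e.g. a joystick direction). Placeholder: a small MLP
    over flattened 2D joint coordinates."""
    def __init__(self, num_joints=17, control_dim=2, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_joints * 2 + control_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_joints * 2),
        )

    def forward(self, pose, control):
        x = torch.cat([pose.flatten(1), control], dim=1)
        return self.net(x).view_as(pose)

class Pose2Frame(nn.Module):
    """Renders the predicted pose into a target background. Placeholder:
    a conv stack that outputs an RGB foreground plus a blending mask, so
    the character (and shadow) can be composited over any new scene."""
    def __init__(self, num_joints=17):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_joints + 3, 32, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 4, 3, padding=1),  # 3 RGB channels + 1 mask
        )

    def forward(self, pose_maps, background):
        out = self.net(torch.cat([pose_maps, background], dim=1))
        rgb, mask = out[:, :3], torch.sigmoid(out[:, 3:])
        return mask * rgb + (1 - mask) * background  # composite over scene

# One step of the "game loop": joystick in, composited frame out.
pose2pose, pose2frame = Pose2Pose(), Pose2Frame()
pose = torch.zeros(1, 17, 2)            # current 2D joint positions
joystick = torch.tensor([[1.0, 0.0]])   # control signal: "move right"
next_pose = pose2pose(pose, joystick)
pose_maps = torch.zeros(1, 17, 64, 64)  # joint heatmaps (rasterization stubbed)
frame = pose2frame(pose_maps, torch.zeros(1, 3, 64, 64))
print(frame.shape)  # torch.Size([1, 3, 64, 64])
```

In the real system, the character, their shadow and any held objects would come from the isolation step described above, with the networks trained on pose sequences pulled from the source footage.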
It only took a few short videos of each activity (fencing, dancing and tennis) to train the system. It was able to filter out other people and compensate for different camera angles.
The research resembles Adobe’s “content-aware fill,” which also uses AI to remove elements like tourists or garbage cans from video. Other companies, like NVIDIA, have also built AI that can transform real-life video into virtual landscapes suitable for games.
The motion is a bit screwy, with the characters looking like they’re playing on ice, a problem in 3D animation known as “foot slide.” On top of that, the range of motion is a bit limited.
However, the characters do appear fairly realistic against their new backgrounds compared to previous character-extraction efforts. It’s still early days for the research, so hopefully the team can solve the motion issues.
Facebook’s Vid2Game synthesis could make gaming more personal, letting you insert your own character or favorite YouTube personality into games.