Two Minute Papers has once again summarized an important current AI development in a visually appealing way for everyone. This time it's about semantic interpolation.
In contrast to classical morphing, a trained AI (ideally) has an extended understanding of the scene it is given. If it recognizes that the scene contains a face or a building, it can also infer that there is probably another ear on the side of the face facing away from the camera, or that tongue and teeth become visible behind a closed mouth once the mouth opens.
Semantic interpolation is not just about generating images more or less arbitrarily with a current AI architecture (as simple GANs do with faces, for example); instead, the generation of images can be controlled much more concretely. Typically, this is achieved by changing vectors in the so-called latent space, but in a special case shown in the video, another image can also serve as a style input for the desired properties.
You can think of it this way: the first image provides a "basic" face as a starting point, and the second image "contributes" properties such as skin color, glasses, mouth opening and/or head rotation. The second image thus describes the "semantic properties", or rather their intended state.
Semantic interpolation now goes one step further and, provided the semantic properties are correctly encoded in the latent space, allows switching individual semantic properties on and off: changing head rotation or mouth opening in a targeted way, adding glasses, or aging and rejuvenating faces.
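The idea behind such targeted edits can be sketched in a few lines: a face corresponds to a vector in latent space, and an attribute like "wearing glasses" corresponds to a direction in that space. The sketch below is a minimal illustration with NumPy; the 512-dimensional latent size, the random placeholder vectors, and the `edit` helper are all assumptions for demonstration, since the actual generator network is not shown in the video summary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent code of one face (512 dimensions is a common
# choice in StyleGAN-like models; here it is just a placeholder).
z = rng.standard_normal(512)

# Hypothetical learned "semantic direction", e.g. for "wearing glasses".
# In practice such directions can be found, for instance, by comparing
# latent codes of images with and without the attribute.
d_glasses = rng.standard_normal(512)
d_glasses /= np.linalg.norm(d_glasses)  # normalize to unit length

def edit(z, direction, strength):
    """Move a latent code along a semantic direction."""
    return z + strength * direction

# Positive strength adds the attribute, negative strength removes it.
z_with_glasses = edit(z, d_glasses, strength=3.0)

# Feeding z_with_glasses to the trained generator would (ideally)
# produce the same face, now wearing glasses.
```

The same mechanism covers the other examples from the video: a "head rotation" direction or an "age" direction would be applied in exactly the same way, only with a different direction vector.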
And once an AI has learned which properties change an object in which way, significantly improved morphs become possible, in which almost every intermediate image is a passable result on its own, even when the source and target images differ greatly.
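This improved morphing can be sketched as simple linear interpolation between two latent codes: every intermediate code is decoded into a full image by the generator, so (if the latent space is well trained) each frame is a plausible face rather than the blurry double exposure of classical pixel morphing. The code below is a hedged sketch; the latent size and the placeholder vectors are assumptions, and the generator call is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical latent codes for the source and the target face.
z_src = rng.standard_normal(512)
z_dst = rng.standard_normal(512)

def interpolate(z_a, z_b, steps):
    """Linearly blend between two latent codes in `steps` frames."""
    ts = np.linspace(0.0, 1.0, steps)
    return [(1.0 - t) * z_a + t * z_b for t in ts]

frames = interpolate(z_src, z_dst, steps=8)

# Each code in `frames` would be rendered by the trained generator;
# the first frame equals the source, the last equals the target, and
# the frames in between morph semantically from one to the other.
```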