SIE3D: Single-image Expressive 3D Avatar generation via Semantic Embedding and Perceptual Expression Loss

Zhiqi Huang, Dulongkai Cui, Jinglu Hu

Waseda University
Graduate School of Information, Production and Systems

Input: an image and the text prompts "happy" and "with bread".

Abstract

Generating high-fidelity 3D head avatars from a single image is challenging, and current methods lack fine-grained, intuitive text-based control over expressions. This paper proposes SIE3D, a framework that generates expressive 3D avatars from a single image and descriptive text. SIE3D fuses identity features from the image with semantic embeddings from the text through a novel conditioning scheme, enabling detailed control. To ensure that generated expressions match the text, it introduces a perceptual expression loss, which uses a pre-trained expression classifier to regularize the generation process. Extensive experiments show that SIE3D significantly improves controllability and realism, outperforming state-of-the-art methods in identity preservation and expression fidelity while using a single consumer-grade GPU.
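The idea of a classifier-based expression loss can be illustrated with a minimal sketch. It is not the SIE3D implementation: the abstract only states that a pre-trained expression classifier regularizes generation. The cross-entropy form, the label set `EXPRESSIONS`, the function names, and the weight `0.1` are all assumptions made for illustration.

```python
import math
from typing import Sequence

# Hypothetical label set; the source does not list the classifier's classes.
EXPRESSIONS = ["neutral", "happy", "sad", "angry", "surprised"]


def perceptual_expression_loss(logits: Sequence[float], target: str) -> float:
    """Cross-entropy between a frozen classifier's logits for the rendered
    avatar and the expression named in the text prompt (assumed form)."""
    idx = EXPRESSIONS.index(target)
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[idx]


def total_loss(base_loss: float, expression_loss: float, weight: float = 0.1) -> float:
    """Add the expression term to a base generation loss.
    The weight is an illustrative value, not one taken from the paper."""
    return base_loss + weight * expression_loss


# The loss is small when the classifier agrees with the prompt,
# and large when it predicts a different expression.
matching = perceptual_expression_loss([0.0, 4.0, 0.0, 0.0, 0.0], "happy")
mismatched = perceptual_expression_loss([0.0, 0.0, 4.0, 0.0, 0.0], "happy")
```

In a training loop the logits would come from running the frozen classifier on a rendered view of the avatar, so gradients flow into the generator while the classifier's weights stay fixed.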

Overall architecture of the SIE3D framework.

Qualitative comparison with other competitive methods.

Application showcase of SIE3D’s expressive generation capabilities.