TELA: Text to Layer-wise 3D Clothed Human Generation

Junting Dong1, Qi Fang2, Zehuan Huang3, Xudong Xu1, Jingbo Wang1, Sida Peng4, Bo Dai1

1Shanghai AI Laboratory    2NetEase Games AI Lab    3Beihang University    4Zhejiang University


This paper addresses the task of 3D clothed human generation from textual descriptions. Previous works usually encode the human body and clothes as a holistic model and generate the whole model in a single-stage optimization, which makes them struggle with clothes editing and lose fine-grained control over the generation process (e.g., specifying the layering order of clothes). To solve this, we propose a layer-wise clothed human representation combined with a progressive optimization strategy, which produces clothes-disentangled 3D human models while providing control over the generation process. The basic idea is to progressively generate a minimal-clothed human body and layer-wise clothes. During clothes generation, a novel stratified compositional rendering method is proposed to fuse multi-layer human models, and a new loss function is utilized to help decouple the clothes model from the human body. The proposed method, TELA, achieves high-quality disentanglement, which thereby provides an effective way for 3D garment generation. Extensive experiments demonstrate that our approach achieves better 3D clothed human generation than holistic modeling methods while also supporting clothes-editing applications such as virtual try-on.
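The fusion step above can be sketched as follows: samples from several layer-wise radiance fields (the body plus each garment layer) are merged along a ray by depth and then alpha-composited as in standard volume rendering. This is a minimal illustrative sketch, not the paper's implementation; the function name and the per-layer inputs are assumptions.

```python
# Hedged sketch of stratified compositional rendering for layered models.
# Assumption: each layer supplies its own ray samples (depths, densities,
# colors); fusion is a depth-sorted merge followed by alpha compositing.
import numpy as np

def composite_layers(depths_per_layer, sigmas_per_layer, colors_per_layer):
    """Fuse one ray's samples from multiple layers into a single RGB value.

    depths_per_layer: list of (N_i,) sample depths, one array per layer
    sigmas_per_layer: list of (N_i,) volume densities
    colors_per_layer: list of (N_i, 3) RGB values
    """
    # 1. Merge all layers' samples and sort them by depth along the ray.
    t = np.concatenate(depths_per_layer)
    sigma = np.concatenate(sigmas_per_layer)
    rgb = np.concatenate(colors_per_layer, axis=0)
    order = np.argsort(t)
    t, sigma, rgb = t[order], sigma[order], rgb[order]

    # 2. Standard volume rendering over the merged samples:
    #    alpha from density times interval length, then transmittance.
    delta = np.diff(t, append=t[-1] + 1e10)
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)
```

Because the merge interleaves samples from all layers before compositing, an inner layer (e.g., the body) is correctly occluded wherever an outer garment's samples lie closer to the camera.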

Overview video

Interactive generation

Key idea

Clothes generation