HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles


[Teaser gallery: hairstyles generated by HAAR for the prompts "afro hairstyle", "voluminous straight hair", "man haircut", "wavy short hairstyle", "bob haircut", "long wavy hairstyle", "long straight hair", and "short curly hairstyle".]

HAAR generates realistic 3D strand-based hairstyles from a text prompt.

Video

Abstract

We present HAAR, a new strand-based generative model for 3D human hairstyles. Based on textual inputs, HAAR produces 3D hairstyles that are ready to be used as assets in various computer graphics and animation applications. Current AI-based generative models take advantage of powerful 2D priors to reconstruct 3D content in the form of point clouds, meshes, or volumetric functions. However, by relying on 2D priors, they are intrinsically limited to recovering only the visible parts. Highly occluded hair structures cannot be reconstructed with those methods, which only model the "outer shell" and are therefore not ready to be used in physics-based rendering or simulation pipelines. In contrast, we propose the first text-guided generative method that uses 3D hair strands as the underlying representation. Leveraging 2D visual question-answering (VQA) systems, we automatically annotate synthetic hair models that are generated from a small set of artist-created hairstyles. This allows us to train a latent diffusion model that operates in a common hairstyle UV space. In qualitative and quantitative studies, we demonstrate the capabilities of the proposed model and compare it to existing hairstyle generation approaches.


Main idea

We present our new method for text-guided and strand-based hair generation. For each hairstyle $H$ in the training set, we produce latent hair maps $Z$ and annotate them with textual captions $P$ using off-the-shelf VQA systems and our custom annotation pipeline.
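The exact annotation pipeline is not reproduced here; as an illustration, the minimal Python sketch below shows how an off-the-shelf VQA model can turn hairstyle renders into captions. It assumes BLIP-2 via Hugging Face transformers as a stand-in VQA system, a hypothetical renders/ folder of frontal hairstyle renders, a CUDA device, and a hand-picked set of hair-related questions whose answers are concatenated into a caption $P$ per hairstyle.

import glob
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Stand-in VQA system; the model and questions used in the actual pipeline may differ.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16
).to("cuda")

# Hair-related questions; the answers are joined into one caption per hairstyle.
QUESTIONS = [
    "How long is the hair on this head?",
    "Is the hair straight, wavy, or curly?",
    "What hairstyle is shown in this image?",
]

def caption_hairstyle(render_path):
    image = Image.open(render_path).convert("RGB")
    answers = []
    for question in QUESTIONS:
        prompt = f"Question: {question} Answer:"
        inputs = processor(images=image, text=prompt, return_tensors="pt").to(
            "cuda", torch.float16
        )
        out = model.generate(**inputs, max_new_tokens=20)
        answers.append(processor.batch_decode(out, skip_special_tokens=True)[0].strip())
    return ", ".join(answers)

# "renders/" is a hypothetical folder of rendered synthetic hairstyles.
captions = {path: caption_hairstyle(path) for path in glob.glob("renders/*.png")}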

Then, we train a conditional diffusion model $\mathcal{D}$ that, given a textual description, generates the guiding strands in this latent space, and we use a latent upsampling procedure to reconstruct dense hairstyles containing up to a hundred thousand strands.
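A simplified view of this generation step, as a minimal PyTorch sketch: a DDIM-style reverse diffusion loop produces the guiding-strand latent map, bilinear interpolation stands in for the learned latent upsampler, and small placeholder networks stand in for the trained denoiser and strand decoder. The resolutions, step count, and noise schedule below are assumptions for illustration, not the released settings.

import torch
import torch.nn.functional as F

# Placeholder sizes (assumptions): guiding latent map resolution, strand latent dim, steps.
G, D, T = 32, 64, 50

class Denoiser(torch.nn.Module):
    """Stand-in for the conditional diffusion network: predicts noise from (Z, t, text)."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(D, D, 3, padding=1)
    def forward(self, z, t, text_emb):
        return self.net(z)  # a real model would also condition on t and text_emb

class StrandDecoder(torch.nn.Module):
    """Stand-in for the strand decoder: one latent vector -> one polyline of 3D points."""
    def __init__(self, points_per_strand=100):
        super().__init__()
        self.points = points_per_strand
        self.net = torch.nn.Linear(D, points_per_strand * 3)
    def forward(self, latents):
        return self.net(latents).view(-1, self.points, 3)

@torch.no_grad()
def generate(text_emb, denoiser, strand_decoder, dense_res=256):
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = torch.cumprod(1.0 - betas, dim=0)
    z = torch.randn(1, D, G, G)                                   # start from Gaussian noise
    for t in reversed(range(T)):                                  # deterministic reverse loop
        eps = denoiser(z, t, text_emb)
        a_t = alphas[t]
        a_prev = alphas[t - 1] if t > 0 else torch.tensor(1.0)
        z0 = (z - (1 - a_t).sqrt() * eps) / a_t.sqrt()            # predicted clean latent map
        z = a_prev.sqrt() * z0 + (1 - a_prev).sqrt() * eps
    # Latent upsampling: bilinear interpolation stands in for the learned upsampler
    # that produces a dense latent map over the scalp UV space.
    z_dense = F.interpolate(z, size=(dense_res, dense_res), mode="bilinear", align_corners=False)
    latents = z_dense.permute(0, 2, 3, 1).reshape(-1, D)          # one latent per scalp texel
    return strand_decoder(latents)                                # (dense_res**2, points, 3)

strands = generate(torch.randn(1, 768), Denoiser(), StrandDecoder())

Each texel of the dense scalp latent map decodes to one strand polyline, so the assumed 256x256 map yields on the order of 65k strands.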

The generated hairstyles are then rendered using off-the-shelf computer graphics techniques.

Generation results

Comparison

Prompt: "A woman with afro hairstyle"
First Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Prompt: "A woman with bob hairstyle"
First Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Prompt: "A woman with long wavy hair"
First Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Prompt: "A woman with straight long hair"
First Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
Second Large Image Second Image
Small Image 1A Small Image 1B
Small Image 2A Small Image 2B
TECA
HAAR

Interpolation between text embeddings

Hairstyle editing

Simulation results

Generated hairstyles can be readily simulated in modern computer graphics engines.
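As an illustration of such a hand-off, the sketch below (assuming the OpenUSD Python bindings, pxr, and an (N, L, 3) array of strand points from the generator; the authors' actual export path is not specified here) writes strands as USD BasisCurves that DCC tools such as Blender or Houdini can load as hair geometry for grooming, simulation, and rendering.

import numpy as np
from pxr import Usd, UsdGeom, Gf, Vt

def export_strands_usd(strands, path="hairstyle.usda", width=0.0005):
    """Write strand polylines (an (N, L, 3) array) as USD BasisCurves."""
    strands = np.asarray(strands, dtype=np.float32)
    n_strands, n_points, _ = strands.shape

    stage = Usd.Stage.CreateNew(path)
    curves = UsdGeom.BasisCurves.Define(stage, "/Hairstyle")
    curves.CreateTypeAttr(UsdGeom.Tokens.linear)                 # polyline curves
    curves.CreateCurveVertexCountsAttr(Vt.IntArray([n_points] * n_strands))
    points = strands.reshape(-1, 3).tolist()                     # plain floats for Gf.Vec3f
    curves.CreatePointsAttr(Vt.Vec3fArray([Gf.Vec3f(*p) for p in points]))
    curves.CreateWidthsAttr(Vt.FloatArray([width]))              # constant strand width
    curves.SetWidthsInterpolation(UsdGeom.Tokens.constant)
    stage.GetRootLayer().Save()

# e.g. export_strands_usd(strands) with the (N, L, 3) strand points produced by the generator.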

BibTeX


@article{sklyarova2023haar,
  title={HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles},
  author={Sklyarova, Vanessa and Zakharov, Egor and Hilliges, Otmar and Black, Michael J and Thies, Justus},
  journal={ArXiv},
  month={Dec},
  year={2023}
}