
Take Me to Another Dimension: This AI Model Can Generate Realistic Generative 3D Face Models


Generating anything, whether text or an image, in the digital world has never been easier, thanks to the advancement of neural networks over the last couple of years. From GPT models for text to diffusion models for images, we have seen revolutionary AI models that changed everything we know about generative AI. Nowadays, the line between human-generated and AI-generated content is getting blurry.

This is especially noticeable in image generation models. If you have ever played around with the latest release of MidJourney, you have seen how good it is at generating realistic human photos. In fact, these models have gotten so good that we now even have agencies that use virtual models to advertise clothing, products, etc. The best thing about using a generative model is that its strong generalization ability lets you customize the output however you wish and still get visually pleasing photos.

While these 2D generative models can output high-quality faces, we still need more capability for many applications of interest, such as facial animation, expression transfer, and virtual avatars. Using existing 2D generative models for these applications often leads to difficulties in effectively disentangling facial attributes like pose, expression, and illumination. We cannot simply use them to alter the fine details of the faces they generate. Moreover, a 3D representation of shape and texture is crucial to many entertainment industries, including games, animation, and visual effects, which are demanding 3D content at increasingly massive scales to create immersive virtual worlds.


There have been attempts at designing generative models to generate 3D faces, but the lack of diverse and high-quality 3D training data has limited the generalization of these algorithms and their use in real-world applications. Some tried to overcome these limitations with parametric models and derived methods that approximate the 3D geometry and texture of a 2D face image. However, these 3D face reconstruction techniques typically do not recover high-frequency details.

So, it is clear that we need a reliable tool that can generate realistic faces in 3D. We cannot simply stop at 2D when we have all these possible applications that could benefit from the advancement. It would be very nice to have an AI model that can generate realistic 3D faces, right? Well, we actually have one, and it is time to meet AlbedoGAN.

AlbedoGAN is a 3D generative model for faces that uses a self-supervised approach and can generate high-resolution texture and capture high-frequency details in the geometry. It leverages a pre-trained StyleGAN model to generate high-quality 2D faces and to generate light-independent albedo directly from the latent space.

Albedo is a critical aspect of a 3D face model because it largely determines the appearance of the face. However, generating high-quality 3D models with an albedo that generalizes over pose, age, and ethnicity requires a massive database of 3D scans, which can be costly and time-consuming to collect. To address this issue, the authors use a novel approach that combines image blending and Spherical Harmonics lighting to capture high-quality, 1024 × 1024 resolution albedo that generalizes well over different poses and handles shading variations.
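To make the role of Spherical Harmonics lighting more concrete, here is a minimal sketch of the textbook second-order SH shading model, in which a rendered color is the albedo multiplied by a shading term computed from the surface normal. This is not AlbedoGAN's actual code: the function names (`sh_basis`, `shade`, `recover_albedo`) and the per-point array layout are assumptions made for illustration only.

```python
import numpy as np

def sh_basis(normals):
    """Evaluate the 9 real second-order SH basis functions for unit normals of shape (N, 3)."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),     # constant (ambient) term
        0.488603 * y,                   # first-order terms
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,               # second-order terms
        1.092548 * y * z,
        0.315392 * (3.0 * z**2 - 1.0),
        1.092548 * x * z,
        0.546274 * (x**2 - y**2),
    ], axis=1)

def shade(albedo, normals, sh_coeffs):
    """Hypothetical helper: lit color = albedo * shading, with one shading value per point."""
    shading = sh_basis(normals) @ sh_coeffs          # shape (N,)
    return albedo * shading[:, None]                 # shape (N, 3)

def recover_albedo(image, normals, sh_coeffs, eps=1e-6):
    """Hypothetical helper: divide out the shading to get a light-independent albedo."""
    shading = sh_basis(normals) @ sh_coeffs
    return image / np.maximum(shading, eps)[:, None]
```

Under this model, knowing the lighting coefficients and the normals is enough to separate shading from albedo, which is why a lighting estimate helps produce an albedo that does not bake in shadows or highlights.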

For the shape component, the FLAME model is combined with per-vertex displacement maps guided by StyleGAN’s latent space, resulting in a higher-resolution mesh. The two networks for albedo and shape are trained in an alternating descent fashion. The proposed algorithm can generate 3D faces from StyleGAN’s latent space and can perform face editing directly in the 3D domain using the latent codes or text.
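As a rough illustration of how per-vertex displacements can add detail to a coarse mesh, the sketch below moves each vertex along its normal by a scalar offset. It assumes the displacement is a single signed distance per vertex applied along the vertex normal, which is a common convention but is not confirmed by the article; `vertex_normals` and `displace` are hypothetical helper names, and the mesh here is a plain triangle mesh rather than an actual FLAME model.

```python
import numpy as np

def vertex_normals(vertices, faces):
    """Area-weighted unit vertex normals for a triangle mesh."""
    vertices = np.asarray(vertices, dtype=float)
    tri = vertices[faces]                                        # (F, 3, 3)
    face_n = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    normals = np.zeros_like(vertices)
    for k in range(3):
        # Accumulate each face normal onto its three corner vertices.
        np.add.at(normals, faces[:, k], face_n)
    lengths = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.maximum(lengths, 1e-12)

def displace(vertices, faces, displacement):
    """Offset each vertex along its normal by a per-vertex scalar (assumed convention)."""
    vertices = np.asarray(vertices, dtype=float)
    return vertices + displacement[:, None] * vertex_normals(vertices, faces)
```

In a pipeline like the one described, the coarse vertices would come from the parametric face model and the displacement values from a learned network; here both are simply passed in as arrays.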

Check out the Paper and Code. Don’t forget to join our 21k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com


Ekrem Çetinkaya received his B.Sc. in 2018 and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis about image denoising using deep convolutional networks. He is currently pursuing a Ph.D. degree at the University of Klagenfurt, Austria, and working as a researcher on the ATHENA project. His research interests include deep learning, computer vision, and multimedia networking.

