
Memoji on Steroids: This AI Model Can Reconstruct 3D Avatars from Videos


We see digital avatars all over the place, from our favorite chat applications to virtual marketing assistants on our favorite e-commerce websites. They have become increasingly popular and are quickly integrating into our daily lives. You open your avatar editor, select a skin color, eye shape, accessories, and so on, and end up with an avatar ready to mimic you in the digital world.

Building a digital avatar face manually and using it as a living emoji can be fun, but it only scratches the surface of what is possible. The true potential of digital avatars lies in the ability to become a clone of our entire body. This kind of avatar has become an increasingly popular technology in video games and virtual reality (VR) applications.

Generating high-fidelity 3D avatars requires expensive and specialized equipment. Therefore, we only see them used in a limited number of applications, like the professional actors we see in video games.


What if we could simplify this process? Imagine you could generate a high-fidelity 3D full-body avatar from just a few videos captured in the wild. No professional equipment, no complicated sensor setup to capture every tiny detail, just a camera and a simple recording with a smartphone. This breakthrough in avatar technology could revolutionize many applications in VR, robotics, video games, movies, sports, etc.

That time has arrived. We now have a tool that can generate high-fidelity 3D avatars from videos captured in the wild. Time to meet Vid2Avatar.

Vid2Avatar learns 3D human avatars from in-the-wild videos. It does not need ground-truth supervision, priors extracted from large datasets, or any external segmentation modules. You simply give it a video of someone, and it will generate a robust 3D avatar for you.

Vid2Avatar has some smart tricks up its sleeve to achieve this. The first step is to separate the human from the background in a scene and model it as a neural field. The authors solve the tasks of scene separation and surface reconstruction directly in 3D. They model two separate neural fields to learn the human body and the background implicitly. This is normally a difficult task because you need to associate 3D points with the human body without relying on 2D segmentation.
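
To make the idea of two separate neural fields more concrete, here is a minimal sketch in PyTorch. It is not the authors' code; the layer sizes and the choice of outputs (a signed-distance-like geometry value plus color) are illustrative assumptions.

import torch
import torch.nn as nn

class ImplicitField(nn.Module):
    """A small MLP that maps 3D points to a geometry value and an RGB color."""
    def __init__(self, in_dim=3, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus(),
            nn.Linear(hidden, 4),  # 1 geometry value (e.g., an SDF) + 3 color channels
        )

    def forward(self, points):
        # points: (N, 3) sample locations along camera rays
        out = self.net(points)
        return out[:, :1], out[:, 1:]  # geometry, color

human_field = ImplicitField()       # queried in canonical (pose-normalized) space
background_field = ImplicitField()  # queried directly in world space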

The human body is modeled using a single temporally consistent representation of the human shape and texture in canonical space. This representation is learned from deformed observations using an inverse mapping of a parametric body model. Furthermore, Vid2Avatar uses an optimization algorithm to adjust multiple parameters related to the background, the human subject, and their poses in order to best fit the available data from a sequence of images or video frames.
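
The paragraph above describes warping observed, deformed points back into a shared canonical space with a parametric body model before querying the human field, while the body parameters themselves are optimized. The snippet below is a hedged sketch of that idea; the linear-blend inverse skinning and the argument names are assumptions for illustration, not the paper's exact formulation.

import torch

def inverse_skinning(points_world, bone_transforms, skinning_weights):
    # points_world: (N, 3) observed (deformed) points
    # bone_transforms: (B, 4, 4) per-bone transforms from a parametric body model
    # skinning_weights: (N, B) how strongly each bone influences each point
    inv_transforms = torch.linalg.inv(bone_transforms)                 # undo each bone's motion
    blended = torch.einsum("nb,bij->nij", skinning_weights, inv_transforms)
    ones = torch.ones(points_world.shape[0], 1, device=points_world.device)
    points_h = torch.cat([points_world, ones], dim=-1)                 # homogeneous coordinates
    return torch.einsum("nij,nj->ni", blended, points_h)[:, :3]        # canonical-space points

# Body pose/shape and camera parameters would be torch.nn.Parameters,
# updated by the same optimizer that trains the two neural fields.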

To further improve the separation, Vid2Avatar uses a special technique for representing the scene in 3D, where the human body is separated from the background in a way that makes it easier to analyze the motion and appearance of each individually. It also uses novel objectives, such as encouraging a clear boundary between the human body and the background, that guide the optimization toward more accurate and detailed reconstructions of the scene.
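
One way to read the "clear boundary" objective is as a penalty on rays whose accumulated opacity is split ambiguously between the human and the background. The entropy-style loss below is an illustrative stand-in for such an objective, not the paper's exact loss.

import torch

def boundary_sharpness_loss(human_ray_opacity, eps=1e-6):
    # human_ray_opacity: (R,) opacity accumulated from the human field along each ray, in [0, 1]
    o = human_ray_opacity.clamp(eps, 1 - eps)
    # Binary entropy: largest when opacity is near 0.5, smallest near 0 or 1,
    # so minimizing it pushes every ray to belong clearly to either human or background.
    return (-(o * o.log() + (1 - o) * (1 - o).log())).mean()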

Overall, the paper proposes a global optimization approach for robust and high-fidelity human body reconstruction. The method uses videos captured in the wild without requiring any further information. Carefully designed components achieve robust modeling, and in the end we get 3D avatars that can be used in many applications.

Check out the Paper and Project. All credit for this research goes to the researchers on this project. Also, don't forget to join our 15k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.


Ekrem Çetinkaya received his B.Sc. in 2018 and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis about image denoising using deep convolutional networks. He received his Ph.D. degree in 2023 from the University of Klagenfurt, Austria, with his dissertation titled "Video Coding Enhancements for HTTP Adaptive Streaming Using Machine Learning." His research interests include deep learning, computer vision, video encoding, and multimedia networking.


