Dynamic view synthesis is the process of reconstructing dynamic 3D scenes from captured videos and enabling immersive virtual playback. It has been a long-standing research problem in computer vision and graphics, and it holds significant promise for VR/AR, sports broadcasting, and artistic performance capture.
Traditional methods represent dynamic 3D scenes with textured mesh sequences, but producing these is complex and computationally expensive, making them impractical for real-time applications.
Recently, several methods have achieved impressive rendering quality for dynamic view synthesis. However, they still fall short on rendering speed when producing high-resolution images. This paper introduces 4K4D, a 4D point cloud representation that supports hardware rasterization and enables fast rendering.
4K4D models the dynamic scene as a point cloud whose properties are defined on a 4D feature grid, i.e., features indexed by 3D position plus time. This grid structure naturally regularizes the points and makes them easier to optimize. The model first derives a coarse point cloud of the scene geometry from the input videos using a space-carving algorithm, and a neural network then learns to model the 3D scene's appearance from this point cloud. A differentiable depth peeling algorithm is developed to render the point cloud representation, and a hardware rasterizer is leveraged to boost rendering speed.
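To make the idea of querying a 4D feature grid more concrete, here is a minimal PyTorch sketch. This is not the authors' code: the dense per-frame grid layout, the class name `FeatureGrid4D`, and all dimensions are illustrative assumptions, chosen only to show how point features indexed by space and time could be looked up.

```python
import torch
import torch.nn.functional as F

class FeatureGrid4D(torch.nn.Module):
    """Hypothetical sketch: a dense (T, C, Z, Y, X) feature volume queried
    at 3D point locations for a given frame. 4K4D's actual representation
    may be organized differently; this only illustrates the concept."""

    def __init__(self, num_frames=150, feat_dim=16, res=64):
        super().__init__()
        # One feature volume per frame; small init keeps optimization stable.
        self.grid = torch.nn.Parameter(
            0.01 * torch.randn(num_frames, feat_dim, res, res, res)
        )

    def forward(self, points, frame_idx):
        # points: (N, 3) coordinates normalized to [-1, 1].
        # grid_sample over a 5D volume expects coordinates of shape
        # (B, D_out, H_out, W_out, 3), ordered as (x, y, z).
        coords = points.view(1, -1, 1, 1, 3)
        feats = F.grid_sample(
            self.grid[frame_idx : frame_idx + 1], coords, align_corners=True
        )  # -> (1, C, N, 1, 1)
        return feats.squeeze(-1).squeeze(-1).squeeze(0).t()  # (N, C)
```

Per-point features retrieved this way would then be decoded by small networks into color and opacity; the regular grid structure is what makes the points easy to optimize.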
To further boost the rendering speed, the following acceleration techniques are applied (see the sketch after this list):
- Some model parameters are precomputed and stored in memory, allowing the graphics card to render the scene faster.
- The precision of the model is reduced from 32-bit to 16-bit floats, which increases the frame rate by about 20 FPS without any visible quality loss.
- Lastly, the number of rendering passes required by the depth peeling algorithm is reduced, which adds roughly another 20 FPS with no visible change in quality.
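As a rough illustration of how these three tricks combine, here is a hedged Python sketch. It is not the paper's implementation: `render_layer`, `precompute_point_attributes`, and the camera fields are hypothetical names, and the real depth peeling runs on the hardware rasterizer rather than in a Python loop.

```python
import torch

@torch.no_grad()
def render_fast(model, camera, num_passes=8):
    """Hypothetical sketch combining the three speedups described above:
    cached precomputation, FP16 inference, and fewer depth-peeling passes."""
    # (1) Precompute camera-independent per-point attributes once and cache
    #     them in memory, so later frames skip this work entirely.
    if getattr(model, "_cache", None) is None:
        model._cache = model.precompute_point_attributes()  # hypothetical

    # (2) Run the whole pipeline in half precision.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        h, w = camera.height, camera.width
        image = torch.zeros(h, w, 3, device="cuda")
        transmit = torch.ones(h, w, 1, device="cuda")

        # (3) Depth peeling with a reduced pass count: each pass rasterizes
        #     the next-closest layer of points and alpha-composites it
        #     front to back.
        for _ in range(num_passes):
            rgb, alpha = render_layer(model._cache, camera)  # hypothetical
            image = image + transmit * alpha * rgb
            transmit = transmit * (1.0 - alpha)
    return image
```

Capping the number of passes trades completeness for speed: layers beyond the cap are dropped, but because transmittance decays quickly from front to back, the visual difference is negligible.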
The researchers evaluated 4K4D on multiple datasets, including DNA-Rendering and ENeRF-Outdoor. Their method renders at over 400 FPS at 1080p on the former and at 80 FPS at 4K on the latter. That is roughly 30 times faster than ENeRF, the previous state-of-the-art real-time dynamic view synthesis method, while also delivering superior rendering quality. The ENeRF-Outdoor dataset is particularly challenging, featuring multiple actors, yet 4K4D still produced better results than competing models, which rendered blurry outputs and exhibited black artifacts around image edges in some renderings.
In conclusion, 4K4D is a new method that tackles the slow rendering speed of real-time view synthesis of dynamic 3D scenes at 4K resolution. It is a neural point cloud-based representation that achieves state-of-the-art rendering quality with a more than 30× increase in rendering speed. However, there are a few limitations, such as high storage requirements for long videos and the lack of point correspondences across frames, which the researchers plan to address in future work.
Check out the Paper and Project. All credit for this research goes to the researchers on this project.
Arham Islam
I am a Civil Engineering graduate (2022) from Jamia Millia Islamia, New Delhi, and I have a keen interest in Data Science, especially Neural Networks and their applications in various areas.