KAIST Researchers Propose SyncDiffusion: A Plug-and-Play Module that Synchronizes Multiple Diffusions through Gradient Descent from a Perceptual Similarity Loss


In a recent research paper, a team of researchers from KAIST introduced SYNCDIFFUSION, a module that aims to improve the generation of panoramic images using pretrained diffusion models. The researchers identified a major problem in panoramic image creation: visible seams appear when multiple fixed-size images are stitched together. To address this issue, they proposed SYNCDIFFUSION.

Creating panoramic images, those with wide, immersive views, poses challenges for image generation models, as they are typically trained to produce fixed-size images. When attempting to generate panoramas, the naive approach of stitching multiple images together often leads to visible seams and incoherent compositions. This issue has driven the need for new methods that blend images seamlessly and maintain overall coherence.

Two prevalent methods for generating panoramic images are sequential image extrapolation and joint diffusion. The former generates a final panorama by extending a given image sequentially, fixing the overlapped region at each step. However, this method often struggles to produce realistic panoramas and tends to introduce repetitive patterns, leading to less-than-ideal results.
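As a rough illustration of the sequential approach, the Python sketch below grows a one-dimensional "panorama" window by window, keeping the overlapped region fixed at each step. The names `extend_panorama` and `copy_context_outpaint` are hypothetical, and the outpainting function is a deliberately crude stand-in for a generative model, so this is a caricature of the idea rather than any published implementation.

```python
def extend_panorama(first_view, total_width, overlap, outpaint):
    """Grow a 1-D panorama left to right, one window at a time."""
    window = len(first_view)
    if not 0 < overlap < window:
        raise ValueError("overlap must be between 1 and window - 1")
    panorama = list(first_view)
    while len(panorama) < total_width:
        known = panorama[-overlap:]  # overlapped region, kept fixed
        panorama.extend(outpaint(known, window - overlap))
    return panorama[:total_width]


def copy_context_outpaint(known, count):
    """Hypothetical stand-in for a model: repeats the known context."""
    return [known[i % len(known)] for i in range(count)]


panorama = extend_panorama(
    [1, 2, 3, 4], total_width=10, overlap=2, outpaint=copy_context_outpaint
)
print(panorama)
```

Because each new window only sees the overlapped region, this toy generator keeps echoing the same context, which is an exaggerated version of the repetitive patterns mentioned above.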

Joint diffusion, on the other hand, runs the reverse generative process concurrently across multiple views and averages the intermediate noisy images in overlapping regions. While this approach effectively generates seamless montages, it falls short in maintaining content and style consistency across the views. As a result, it frequently combines images with different content and styles within a single panorama, leading to incoherent outputs.
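The averaging step in joint diffusion can be sketched in a few lines of Python. This is a minimal one-dimensional toy, assuming each view is a list of numbers and the panorama is at least as wide as one window; `window_starts` and `average_overlaps` are hypothetical helper names, not functions from any library.

```python
def window_starts(total_width, window, stride):
    """Left edges of overlapping windows covering [0, total_width).

    Assumes total_width >= window.
    """
    starts = list(range(0, total_width - window + 1, stride))
    if starts[-1] + window < total_width:
        starts.append(total_width - window)  # make sure the right edge is covered
    return starts


def average_overlaps(views, starts, total_width):
    """Merge per-window values by averaging wherever windows overlap."""
    sums = [0.0] * total_width
    counts = [0] * total_width
    for view, start in zip(views, starts):
        for offset, value in enumerate(view):
            sums[start + offset] += value
            counts[start + offset] += 1
    return [s / c for s, c in zip(sums, counts)]


starts = window_starts(total_width=8, window=4, stride=2)
# Three windows holding different constant "content": 0.0, 1.0 and 2.0.
views = [[float(i)] * 4 for i in range(len(starts))]
merged = average_overlaps(views, starts, total_width=8)
print(merged)
```

In this toy run the transitions between windows are smoothed by the averaging, yet the left and right ends of the merged result still carry different values. That mirrors the limitation described above: seams disappear, but nothing forces the views to agree on content or style.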

The researchers introduced SYNCDIFFUSION as a module that synchronizes multiple diffusions by applying gradient descent on a perceptual similarity loss. The key innovation lies in using the predicted denoised images at each denoising step to calculate the gradient of the perceptual loss. This approach offers meaningful guidance for creating coherent montages, because it ensures that the images blend seamlessly while maintaining content consistency.
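The idea of nudging a noisy view using a loss computed on its predicted denoised image can be sketched as follows. This is a toy under strong assumptions, not the authors' code: views are short lists of numbers, the perceptual loss is replaced by the squared gap between mean values, the noise estimate is held fixed when differentiating (a real model would backpropagate through the noise-prediction network), and all function names and the `weight` parameter are hypothetical.

```python
import math


def predict_denoised(noisy, noise_estimate, alpha_bar):
    """DDIM-style estimate of the clean sample from a noisy one."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [(x - sigma * e) / scale for x, e in zip(noisy, noise_estimate)]


def mean(values):
    return sum(values) / len(values)


def toy_loss(denoised, anchor_denoised):
    """Stand-in for a perceptual loss: squared gap between mean values."""
    return (mean(denoised) - mean(anchor_denoised)) ** 2


def sync_step(noisy, noise_estimate, anchor_denoised, alpha_bar, weight):
    """One gradient-descent step that pulls a noisy view toward the anchor."""
    denoised = predict_denoised(noisy, noise_estimate, alpha_bar)
    gap = mean(denoised) - mean(anchor_denoised)
    # Analytic gradient of toy_loss with respect to each noisy value,
    # treating the noise estimate as a constant (a simplification).
    grad = 2.0 * gap / (len(noisy) * math.sqrt(alpha_bar))
    return [x - weight * grad for x in noisy]


alpha_bar = 0.5
zeros = [0.0] * 4
anchor_denoised = predict_denoised([1.0] * 4, zeros, alpha_bar)
view = [3.0] * 4

before = toy_loss(predict_denoised(view, zeros, alpha_bar), anchor_denoised)
view = sync_step(view, zeros, anchor_denoised, alpha_bar, weight=0.5)
after = toy_loss(predict_denoised(view, zeros, alpha_bar), anchor_denoised)
print(before, after)
```

The loss is measured on the predicted clean samples rather than on the noisy values themselves, which is the point the paragraph above makes: a similarity measure is more informative on a denoised estimate than on an image that is still mostly noise. After one step the toy loss shrinks, and the updated view would then continue through the usual denoising update.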

In a series of experiments using SYNCDIFFUSION with the Stable Diffusion 2.0 model, the researchers found that their method significantly outperformed previous techniques. In the user study, participants preferred SYNCDIFFUSION 66.35% of the time, versus 33.65% for the previous method. This marked improvement demonstrates the practical advantages of SYNCDIFFUSION in generating coherent panoramic images.

SYNCDIFFUSION is a notable addition to the field of image generation. It effectively tackles the challenge of generating seamless and coherent panoramic images, which has been a persistent issue in the field. By synchronizing multiple diffusions and applying gradient descent from a perceptual similarity loss, SYNCDIFFUSION enhances the quality and coherence of generated panoramas. As a result, it offers a valuable tool for a wide range of applications that involve creating panoramic images, and it showcases the potential of gradient descent for improving image generation processes.


Check out the Paper and Project Page. All credit for this research goes to the researchers on this project.



Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a tech enthusiast with a keen interest in software and data science applications, and she is always reading about developments in different fields of AI and ML.

