
How Can We Mitigate Background-Induced Bias in Fine-Grained Image Classification? A Comparative Study of Masking Strategies and Model Architectures


Fine-grained image classification aims to distinguish closely related subclasses within a broader category. For instance, instead of merely labeling a picture as a "bird," such a model would identify the specific bird species. Because these tasks are so difficult, models frequently and unintentionally rely on small cues from image backgrounds. Background information can offer useful context, but it can also introduce bias. For example, a model may unintentionally associate all urban backgrounds with sparrows if it repeatedly observes birds in urban environments during training. Eliminating this background-induced bias is crucial for accurate results, because it can limit the model's real-world applicability.

Modern algorithms for fine-grained image classification frequently rely on convolutional neural networks (CNNs) and vision transformers (ViTs) as their backbone. A fundamental issue remains, though: the context in which an object appears can strongly influence both humans and machines. Deep learning models often pay too much attention to backgrounds, sometimes to the point where they classify based on the background alone. When deployed in scenarios with unusual or unfamiliar backgrounds, these models suffer significant performance degradation.

To counteract the challenges posed by background bias, a research team from the University of Montpellier in France recently published a study analyzing two primary strategies:

  • Early Masking: background details are removed at the very outset, at the image level.
  • Late Masking: background-related features are masked at a later, more abstract stage within the model.

The key contribution of the research is its thorough investigation of background-induced bias in fine-grained image classification. It carefully analyzes how modern architectures such as CNNs and ViTs behave when faced with these biases and proposes masking techniques to address them.

Concretely, early masking removes the background at the image's input stage: before classification by a model such as a CNN or Vision Transformer, the background regions are masked using a binary segmentation network, so the model attends only to the object of interest. In contrast, late masking lets the model process the entire image at first but masks the background at a more advanced stage: after the main backbone has processed the image, high-level spatial features corresponding to the background are selectively excluded. Both methods aim to ensure the model focuses on the object of interest, reducing biases arising from background details, which is especially important for fine-grained classification, where distinctions between categories can be subtle.
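To make the two strategies concrete, here is a minimal sketch of how image-level and feature-level masking could be wired up. It assumes PyTorch/torchvision, a ConvNeXt-Small backbone, and a precomputed binary foreground mask; it is an illustration of the general idea, not the authors' exact implementation.

```python
# Illustrative sketch only (assumes PyTorch and torchvision; the paper's exact
# pipeline, segmentation network, and hyperparameters may differ).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import convnext_small


class EarlyMaskingClassifier(nn.Module):
    """Early masking: zero out background pixels at the input, then classify."""

    def __init__(self, num_classes: int = 200):
        super().__init__()
        self.backbone = convnext_small(weights=None, num_classes=num_classes)

    def forward(self, image: torch.Tensor, fg_mask: torch.Tensor) -> torch.Tensor:
        # fg_mask: (B, 1, H, W) float {0, 1} foreground mask from a segmentation network.
        return self.backbone(image * fg_mask)


class LateMaskingClassifier(nn.Module):
    """Late masking: let the backbone see the full image, then mask background
    locations in the high-level spatial feature map before pooling."""

    def __init__(self, num_classes: int = 200):
        super().__init__()
        model = convnext_small(weights=None)
        self.features = model.features            # spatial feature extractor
        self.head = nn.Linear(768, num_classes)   # ConvNeXt-Small feature dim

    def forward(self, image: torch.Tensor, fg_mask: torch.Tensor) -> torch.Tensor:
        feats = self.features(image)                                  # (B, C, h, w)
        mask = F.interpolate(fg_mask, size=feats.shape[-2:], mode="nearest")
        feats = feats * mask                                          # drop background locations
        pooled = feats.sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1.0)
        return self.head(pooled)


# Example forward pass with dummy data:
# model = EarlyMaskingClassifier()
# x = torch.randn(2, 3, 224, 224)
# m = (torch.rand(2, 1, 224, 224) > 0.5).float()
# logits = model(x, m)   # shape (2, 200)
```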

To evaluate the two strategies, the researchers conducted an experimental study. Models were trained on the CUB dataset, which contains images of 200 bird species, and evaluated on the CUB test set as well as on the Waterbirds dataset, an out-of-distribution (OOD) set in which the backgrounds of CUB images are replaced with backgrounds from the Places dataset. The researchers compared several architectures, namely ConvNeXt and ViT, at Small, Base, and Large model sizes. The results showed that models trained with early masking generally outperformed those trained without it, particularly on the OOD Waterbirds test set, indicating that early masking reduces background-induced bias and improves generalization.
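As a rough illustration of how such an evaluation could be set up, the sketch below computes accuracy on an in-distribution loader and an OOD loader. The data loaders (`cub_test_loader`, `waterbirds_loader`) and the masked-input model interface are assumptions for illustration, not the authors' code.

```python
# Hypothetical evaluation loop contrasting in-distribution (CUB test) and
# out-of-distribution (Waterbirds) accuracy. Loaders are assumed to yield
# (image, foreground_mask, label) batches.
import torch


@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    model.eval()
    correct = total = 0
    for image, fg_mask, label in loader:
        image, fg_mask, label = image.to(device), fg_mask.to(device), label.to(device)
        pred = model(image, fg_mask).argmax(dim=1)
        correct += (pred == label).sum().item()
        total += label.numel()
    return correct / total


# The gap between the two numbers gives a rough estimate of background-induced bias:
# print(f"CUB (in-distribution): {accuracy(model, cub_test_loader):.3f}")
# print(f"Waterbirds (OOD):      {accuracy(model, waterbirds_loader):.3f}")
```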

In conclusion, the authors examined how background-induced bias affects the generalization of CNN and ViT models to out-of-distribution (OOD) images. They tested several background-masking techniques and found early masking to be the most effective for both model types. The study highlights the importance of background considerations in image classification and presents strategies to reduce bias and improve generalization.


Check out the Paper and GitHub. All credit for this research goes to the researchers on this project. Also, don't forget to join our 30k+ ML SubReddit, 40k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter.


Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical science and a master's degree in telecommunications and networking systems. His current areas of research concern computer vision, stock market prediction, and deep learning. He has produced several scientific articles about person re-identification and the study of the robustness and stability of deep networks.


