
Recent achievements in supervised deep learning can be attributed to the availability of enormous amounts of labeled training data. Yet gathering accurate labels takes considerable effort and money, and in many practical contexts only a small fraction of the training data has labels attached. Semi-supervised learning (SSL) aims to boost model performance by using both labeled and unlabeled data. Many effective SSL approaches, when applied to deep learning, adopt unsupervised consistency regularisation to make use of the unlabeled data.
State-of-the-art consistency-based algorithms attain excellent performance, but they typically introduce several configurable hyper-parameters, and it is common practice to tune these to optimal values. Unfortunately, hyper-parameter search is often unreliable in many real-world SSL scenarios, such as medical image processing, hyper-spectral image classification, network traffic recognition, and document recognition. This is because the annotated data are scarce, resulting in high variance when cross-validation is adopted. Algorithm performance being sensitive to hyper-parameter values makes this issue even more pressing. Furthermore, the computational cost may become unmanageable for cutting-edge deep learning algorithms, since the search space grows exponentially with the number of hyper-parameters.
Researchers from Tsinghua University introduced a meta-learning-based SSL algorithm called Meta-Semi that leverages the labeled data more effectively. Meta-Semi achieves outstanding performance in many scenarios while requiring the tuning of just one additional hyper-parameter.
The team was inspired by the observation that the network can be trained successfully using appropriately "pseudo-labeled" unannotated examples. Specifically, during the online training phase, they produce pseudo-soft labels for the unlabeled data based on the network's predictions. Next, they filter out the samples with unreliable or incorrect pseudo labels and use the remaining data to train the model. This work shows that the distribution of correctly "pseudo-labeled" data should be comparable to that of the labeled data: if the network is trained with the former, the final loss on the latter should also be minimized.
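To make the pseudo-labeling step concrete, here is a minimal sketch in plain NumPy: the network's logits on unlabeled samples are converted to pseudo-soft labels (the predicted class distributions), and low-confidence samples are filtered out. The confidence-threshold rule and its value of 0.9 are illustrative assumptions, not the paper's exact filtering criterion.

```python
import numpy as np

def softmax(logits):
    """Convert raw logits to a probability distribution per sample."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pseudo_label_and_filter(logits, threshold=0.9):
    """Produce pseudo-soft labels and keep only confident samples.

    Returns (kept_indices, soft_labels), where soft_labels are the
    full predicted distributions for the retained samples.
    """
    probs = softmax(logits)
    confidence = probs.max(axis=1)          # peak probability per sample
    keep = np.where(confidence >= threshold)[0]
    return keep, probs[keep]

# Unlabeled batch of 3 samples, 3 classes: two confident, one ambiguous.
logits = np.array([[8.0, 0.0, 0.0],   # confident  -> kept
                   [0.5, 0.6, 0.4],   # ambiguous  -> dropped
                   [0.0, 9.0, 0.0]])  # confident  -> kept
kept, soft = pseudo_label_and_filter(logits)
print(kept)  # -> [0 2]
```

Only the retained samples, with their pseudo-soft labels, would then enter the training loss for that iteration.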
They defined the meta-reweighting objective to minimize the final loss on the labeled data by choosing the most appropriate weights (throughout the paper, "weights" always refers to the coefficients used to reweight each unlabeled sample, not to the parameters of the neural network). The researchers encountered computational difficulties when tackling this problem with standard optimization algorithms.
For this reason, they propose an approximate formulation from which a closed-form solution can be derived. Theoretically, they show that each training iteration needs only a single meta gradient step to obtain the approximate solutions.
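The intuition behind the single meta gradient step can be sketched on a toy linear model: an unlabeled sample gets weight 1 when the gradient of its pseudo-labeled loss points in the same direction as the gradient of the labeled loss, i.e. a step on that sample would also reduce the labeled loss to first order. This is a conceptual illustration under simplified assumptions (a squared loss and a sign-of-inner-product rule), not the paper's exact closed-form solution.

```python
import numpy as np

def grad_sq_loss(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w, for one sample."""
    return (w @ x - y) * x

def meta_weights(w, labeled, unlabeled_pseudo):
    """Assign 0-1 weights to pseudo-labeled samples in one meta step.

    A sample is kept (weight 1) when its loss gradient has positive
    inner product with the meta gradient of the labeled loss.
    """
    g_meta = sum(grad_sq_loss(w, x, y) for x, y in labeled)
    weights = []
    for x, y in unlabeled_pseudo:
        g = grad_sq_loss(w, x, y)
        weights.append(1.0 if g @ g_meta > 0 else 0.0)
    return np.array(weights)

w = np.array([1.0, -1.0])
labeled = [(np.array([1.0, 0.0]), 2.0)]   # labeled loss wants w[0] to grow
helpful = (np.array([1.0, 0.0]), 3.0)     # gradient aligned    -> weight 1
harmful = (np.array([1.0, 0.0]), 0.0)     # gradient opposed    -> weight 0
print(meta_weights(w, labeled, [helpful, harmful]))  # -> [1. 0.]
```

Because the weight comes from one extra gradient evaluation rather than an inner optimization loop, the per-iteration overhead stays modest.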
Finally, they propose a dynamic weighting approach that reweights previously pseudo-labeled samples with 0-1 weights. The results show that this approach eventually reaches a stationary point of the supervised loss function. On popular image classification benchmarks (CIFAR-10, CIFAR-100, SVHN, and STL-10), the proposed technique has been shown to outperform state-of-the-art deep networks. On the challenging CIFAR-100 and STL-10 SSL tasks, Meta-Semi achieves much higher performance than state-of-the-art SSL algorithms such as ICT and MixMatch, and slightly higher performance than them on CIFAR-10. Furthermore, Meta-Semi complements consistency-based approaches: incorporating consistency regularisation into the algorithm further boosts performance.
According to the researchers, one drawback of Meta-Semi is that it requires somewhat more time to train. They plan to investigate this issue in the future.
Check out the Paper and Reference Article. All credit for this research goes to the researchers on this project. Also, don't forget to join our 26k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.
Dhanshree Shenwai is a Computer Science Engineer with solid experience in FinTech companies covering the Financial, Cards & Payments, and Banking domains, and a keen interest in applications of AI. She is passionate about exploring new technologies and advancements in today's evolving world that make everyone's life easier.