
Efficient technique improves machine-learning models’ reliability

Powerful machine-learning models are being used to help people tackle tough problems such as identifying disease in medical images or detecting road obstacles for autonomous vehicles. But machine-learning models can make mistakes, so in high-stakes settings it's critical that humans know when to trust a model's predictions.

Uncertainty quantification is one tool that improves a model's reliability: the model produces a score along with each prediction that expresses how confident it is that the prediction is correct. While uncertainty quantification can be useful, existing methods typically require retraining the entire model to give it that ability. Training involves showing a model hundreds of thousands of examples so it can learn a task. Retraining then requires hundreds of thousands of new data inputs, which can be expensive and difficult to acquire, and also consumes huge amounts of computing resources.

Researchers at MIT and the MIT-IBM Watson AI Lab have now developed a technique that enables a model to perform more effective uncertainty quantification, while using far fewer computing resources than other methods and no additional data. Their technique, which does not require a user to retrain or modify the model, is flexible enough for many applications.

The technique involves creating a simpler companion model that assists the original machine-learning model in estimating uncertainty. This smaller model is designed to identify different types of uncertainty, which can help researchers drill down on the root cause of inaccurate predictions.

“Uncertainty quantification is essential for both developers and users of machine-learning models. Developers can utilize uncertainty measurements to help develop more robust models, while for users, it can add another layer of trust and reliability when deploying models in the real world. Our work leads to a more flexible and practical solution for uncertainty quantification,” says Maohao Shen, an electrical engineering and computer science graduate student and lead author of a paper on this technique.

Shen wrote the paper with Yuheng Bu, a former postdoc in the Research Laboratory of Electronics (RLE) who is now an assistant professor at the University of Florida; Prasanna Sattigeri, Soumya Ghosh, and Subhro Das, research staff members at the MIT-IBM Watson AI Lab; and senior author Gregory Wornell, the Sumitomo Professor in Engineering who leads the Signals, Information, and Algorithms Laboratory in RLE and is a member of the MIT-IBM Watson AI Lab. The research will be presented at the AAAI Conference on Artificial Intelligence.

Quantifying uncertainty

In uncertainty quantification, a machine-learning model generates a numerical score with each output to reflect its confidence in that prediction's accuracy. Incorporating uncertainty quantification by building a new model from scratch or retraining an existing model typically requires a large amount of data and expensive computation, which is often impractical. What's more, existing methods sometimes have the unintended consequence of degrading the quality of the model's predictions.
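As a simple illustration of what such a score can look like (this is not the researchers' method; the model and inputs below are toy stand-ins for a real pretrained network and dataset), a classifier's softmax output already yields one naive confidence value per prediction:

    import torch
    import torch.nn.functional as F

    # Toy stand-ins: a random "pretrained" classifier and a batch of inputs.
    pretrained_model = torch.nn.Linear(16, 3)
    x = torch.randn(8, 16)

    logits = pretrained_model(x)                 # shape: (batch, classes)
    probs = F.softmax(logits, dim=-1)
    confidence, prediction = probs.max(dim=-1)   # one score and one label per input
    # A high score means the model is sure of its prediction; a low score
    # flags an output a human may want to double-check.

The new technique aims to produce scores that are more informative than this raw softmax confidence, without touching the original model's training.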

The MIT and MIT-IBM Watson AI Lab researchers have thus zeroed in on the following problem: Given a pretrained model, how can they enable it to perform effective uncertainty quantification?

They solve this problem by creating a smaller and simpler model, known as a metamodel, that attaches to the larger, pretrained model and uses the features the larger model has already learned to help it make uncertainty quantification assessments.

“The metamodel can be applied to any pretrained model. It is better to have access to the internals of the model, because we can get much more information about the base model, but it will also work if you just have a final output. It can still predict a confidence score,” Sattigeri says.
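A rough sketch of that idea follows; the architecture, feature choice, and sizes here are illustrative assumptions rather than the configuration from the paper. A small metamodel reads the frozen base model's intermediate features and maps them to a confidence score, with no retraining of the base model:

    import torch
    import torch.nn as nn

    # Toy stand-in for a large pretrained base model; it stays frozen.
    base = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))
    for p in base.parameters():
        p.requires_grad = False

    # Metamodel: a small head on top of the base model's hidden features
    # that outputs a confidence score between 0 and 1.
    feature_extractor = base[:2]      # reuse the frozen hidden layer as features
    metamodel = nn.Sequential(nn.Linear(32, 16), nn.ReLU(),
                              nn.Linear(16, 1), nn.Sigmoid())

    x = torch.randn(8, 16)            # dummy inputs
    with torch.no_grad():
        features = feature_extractor(x)
    confidence = metamodel(features)  # one score per input; only the metamodel trains

Because only the small head is trained, the approach avoids the cost of retraining the original network and needs no additional labeled data for the base task.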

They design the metamodel to produce the uncertainty quantification output using a technique that includes both types of uncertainty: data uncertainty and model uncertainty. Data uncertainty is caused by corrupted data or inaccurate labels and can only be reduced by fixing the dataset or gathering new data. With model uncertainty, the model is not sure how to explain the newly observed data and might make incorrect predictions, most likely because it hasn't seen enough similar training examples. This is an especially difficult but common problem when models are deployed: in real-world settings, they often encounter data that differ from the training dataset.
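One common way to separate the two kinds of uncertainty in the literature, shown here with made-up ensemble probabilities and not necessarily the construction used in the paper, is to decompose the entropy of an averaged prediction: the expected entropy of the individual predictions tracks data uncertainty, and the remainder tracks model uncertainty.

    import torch
    import torch.nn.functional as F

    def entropy(p, eps=1e-12):
        # Shannon entropy along the class dimension.
        return -(p * (p + eps).log()).sum(dim=-1)

    # Fake class probabilities from 5 ensemble members (or stochastic passes),
    # shape (members, batch, classes).
    member_probs = F.softmax(torch.randn(5, 8, 3), dim=-1)

    mean_probs = member_probs.mean(dim=0)
    total_unc = entropy(mean_probs)                # total predictive uncertainty
    data_unc = entropy(member_probs).mean(dim=0)   # expected entropy: data uncertainty
    model_unc = total_unc - data_unc               # mutual information: model uncertainty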

“Has the reliability of your decisions changed when you use the model in a new setting? You want some way to have trust in whether it is working in this new regime or whether you need to collect training data for this particular new setting,” Wornell says.

Validating the quantification

Once a model produces an uncertainty quantification score, the user still needs some assurance that the score itself is accurate. Researchers often validate accuracy by creating a smaller dataset, held out from the original training data, and then testing the model on the held-out data. However, this technique does not work well for measuring uncertainty quantification, because the model can achieve good prediction accuracy while still being over-confident, Shen says.

They created a new validation technique by adding noise to the data in the validation set; this noisy data is more like out-of-distribution data that can cause model uncertainty. The researchers use this noisy dataset to evaluate uncertainty quantifications.
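A minimal sketch of that kind of check, with random tensors and a toy model standing in for a real validation split and pretrained network, and simple Gaussian noise as the assumed perturbation, compares average confidence against accuracy on the corrupted data:

    import torch
    import torch.nn.functional as F

    # Toy stand-ins for a pretrained model and a held-out validation split.
    model = torch.nn.Linear(16, 3)
    val_x = torch.randn(256, 16)
    val_y = torch.randint(0, 3, (256,))

    # Perturb the held-out inputs so they behave more like out-of-distribution data.
    noisy_x = val_x + 0.5 * torch.randn_like(val_x)

    with torch.no_grad():
        probs = F.softmax(model(noisy_x), dim=-1)
    confidence, pred = probs.max(dim=-1)

    # On the noisy split, a trustworthy confidence score should fall roughly
    # in line with accuracy rather than staying high while accuracy drops.
    accuracy = (pred == val_y).float().mean()
    print(f"mean confidence {confidence.mean().item():.3f} vs accuracy {accuracy.item():.3f}")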

They tested their approach by seeing how well a metamodel could capture different types of uncertainty for various downstream tasks, including out-of-distribution detection and misclassification detection. Their method not only outperformed all the baselines on each downstream task but also required less training time to achieve those results.
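For instance, misclassification detection is commonly scored with the area under the ROC curve (AUROC), treating the confidence score as a ranking of correct versus incorrect predictions. The numbers below are synthetic and only show the mechanics of that evaluation, not results from the paper:

    import numpy as np
    from sklearn.metrics import roc_auc_score

    # Synthetic results: whether the base model's prediction was correct,
    # and the metamodel's confidence score for each test example.
    rng = np.random.default_rng(0)
    correct = rng.integers(0, 2, size=500)
    confidence = np.clip(0.6 * correct + 0.4 * rng.random(500), 0.0, 1.0)

    # A good uncertainty score ranks correct predictions above incorrect ones.
    auroc = roc_auc_score(correct, confidence)
    print(f"misclassification-detection AUROC: {auroc:.3f}")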

This method could help researchers enable more machine-learning models to effectively perform uncertainty quantification, ultimately aiding users in making better decisions about when to trust predictions.

Moving forward, the researchers want to adapt their technique for newer classes of models, such as large language models, which have a different structure than a traditional neural network, Shen says.

The work was funded, in part, by the MIT-IBM Watson AI Lab and the U.S. National Science Foundation.
