Extreme weather events have become a common occurrence in recent years. Climate change is the primary factor responsible for such extreme weather phenomena, from the torrential downpours that have submerged large portions of Pakistan under water to the exceptional heat waves that have fueled wildfires across Portugal and Spain. The Earth’s average surface temperature is predicted to rise by about 4 degrees over the next decade if the right actions are not taken soon. According to scientists, this temperature rise will contribute to even more frequent extreme weather events.
General circulation models (GCMs) are the tools that scientists use to forecast future weather and climate. GCMs are systems of differential equations that can be integrated over time to produce forecasts for variables such as temperature, wind speed, and precipitation. These models are relatively easy to interpret and produce reasonably accurate results. However, the core problem with these models is that running the simulations requires significant computational power. Moreover, fine-tuning the models becomes difficult when there is a lot of training data.
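To make "integrating differential equations across time" concrete, here is a toy illustration (not an actual GCM): a zero-dimensional energy-balance model stepped forward with forward Euler. Real GCMs integrate far larger systems of equations on a three-dimensional grid, but the basic idea of time-stepping is the same; all constants below are standard textbook values used purely for illustration.

```python
# Toy energy-balance model: C dT/dt = S(1 - alpha)/4 - eps * sigma * T^4,
# integrated forward in time with a simple Euler step.

SIGMA = 5.67e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)
S = 1361.0        # solar constant (W m^-2)
ALPHA = 0.3       # planetary albedo
EPS = 0.61        # effective emissivity
C = 4.0e8         # heat capacity of the surface layer (J m^-2 K^-1)

def step(T, dt):
    """Advance the surface temperature T (K) by one time step dt (s)."""
    net_flux = S * (1.0 - ALPHA) / 4.0 - EPS * SIGMA * T**4
    return T + dt * net_flux / C

T = 288.0                    # initial temperature (K)
dt = 86400.0                 # one-day time step (s)
for day in range(365 * 50):  # integrate for roughly 50 years
    T = step(T, dt)
print(f"Temperature after 50 simulated years: {T:.2f} K")
```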
This is where machine learning techniques have proven useful. In "weather forecasting" and "spatial downscaling" in particular, these algorithms have shown themselves to be competitive with more established climate models. Weather forecasting refers to predicting future climate variables. For example, we may need to forecast the amount of rainfall for the upcoming week in Meghalaya using data on the daily rainfall (in cm) for the previous week. Spatial downscaling refers to refining spatially coarse climate model projections, for example from a 100 km x 100 km grid to a 1 km x 1 km grid; a simple baseline sketch follows below.
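The sketch below frames spatial downscaling as image super-resolution. The coarse "temperature field" is random toy data; in practice it would be a climate-model output, and a learned model (for example, a CNN) would replace the simple bilinear-interpolation baseline shown here.

```python
import torch
import torch.nn.functional as F

# Coarse field with shape (batch, channel, lat, lon).
coarse = torch.randn(1, 1, 32, 64)

# Upsample 4x in each spatial dimension as a naive downscaling baseline.
fine = F.interpolate(coarse, scale_factor=4, mode="bilinear", align_corners=False)

print(coarse.shape, "->", fine.shape)  # (1, 1, 32, 64) -> (1, 1, 128, 256)
```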
Forecasting and downscaling are analogous to a variety of computer vision tasks. However, the major distinction between weather forecasting or spatial downscaling and other computer vision tasks is that the machine learning model must utilize exogenous inputs in various modalities. For example, several factors, like humidity and wind speed, together with historical surface temperatures, may have an impact on future surface temperatures. These variables must be provided as inputs to the model alongside the surface temperatures.
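One common way to feed such exogenous variables to a vision-style model is to treat each variable as one input channel, exactly like the RGB channels of an image. The variable names, shapes, and the tiny CNN below are illustrative assumptions, not ClimateLearn's API.

```python
import torch
import torch.nn as nn

lat, lon = 32, 64
temperature = torch.randn(1, 1, lat, lon)
humidity    = torch.randn(1, 1, lat, lon)
wind_speed  = torch.randn(1, 1, lat, lon)

# Stack the variables along the channel dimension: (batch, 3, lat, lon).
x = torch.cat([temperature, humidity, wind_speed], dim=1)

# A tiny CNN that maps the 3 input channels to one predicted field,
# e.g. the surface temperature at a future time step.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),
)
prediction = model(x)   # shape (1, 1, lat, lon)
print(prediction.shape)
```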
Deep learning research has exploded in recent years, and researchers studying machine learning and climate change are now looking into how deep learning techniques might address weather forecasting and spatial downscaling problems. When it comes to applying machine learning, the two communities take contrasting approaches. Machine learning researchers place more emphasis on which architectures are best suited to which problems and how to process data in a way that works well with modern machine learning methods, whereas climate scientists make more use of physical equations and focus on the evaluation metrics that matter most.
However, ambiguous terminology ("bias" in climate modeling versus "bias" in machine learning), a lack of standardization in applying machine learning to climate science problems, and limited familiarity with the evaluation of climate data have hindered the two communities from unlocking their full potential. To address these issues, researchers at the University of California, Los Angeles (UCLA) have developed ClimateLearn, a Python package that enables easy, standardized access to large climate datasets and cutting-edge machine learning models. A variety of datasets, state-of-the-art baseline models, and a suite of metrics and visualizations are all accessible through the package, which enables large-scale benchmarking of weather forecasting and spatial downscaling techniques.
ClimateLearn delivers data in a format that current deep learning architectures can easily consume. The package includes data from ERA5, the fifth-generation reanalysis of the historical global climate and weather produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). A reanalysis dataset uses modeling and data assimilation techniques to merge historical observations into global estimates. Thanks to this combination of real observations and modeling, reanalysis products can provide complete global coverage with reasonable accuracy. ClimateLearn also supports preprocessed ERA5 data from WeatherBench, a benchmark dataset for data-driven weather forecasting, in addition to the raw ERA5 data.
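As a rough idea of what "model-ready" reanalysis data looks like, here is a minimal sketch that turns an ERA5-style NetCDF file into PyTorch tensors. The file name and variable name are hypothetical, and the sketch uses plain xarray/NumPy rather than ClimateLearn's own loaders.

```python
import xarray as xr
import torch

# Hypothetical ERA5 NetCDF file containing 2-m temperature.
ds = xr.open_dataset("era5_2m_temperature.nc")
t2m = ds["t2m"]  # DataArray with dimensions (time, lat, lon)

# Normalize and convert to a tensor with a channel dimension,
# giving shape (time, 1, lat, lon) for a CNN-style forecasting model.
arr = (t2m.values - t2m.values.mean()) / t2m.values.std()
x = torch.from_numpy(arr).unsqueeze(1).float()
print(x.shape)
```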
The baseline models implemented in ClimateLearn are well-tuned for these climate tasks and can easily be extended to other downstream pipelines in climate science. Simple statistical techniques like linear regression, persistence, and climatology are just a few examples of the standard machine learning baselines supported by ClimateLearn. More sophisticated deep learning architectures like residual convolutional neural networks, U-nets, and vision transformers are also available. The package also supports quickly evaluating model predictions with metrics like (latitude-weighted) root mean squared error, anomaly correlation coefficient, and Pearson's correlation coefficient. Moreover, ClimateLearn provides visualization of model predictions, the ground truth, and the discrepancy between the two.
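For intuition, latitude-weighted RMSE (the headline metric in WeatherBench-style evaluation) weights each grid cell by the cosine of its latitude so that the physically smaller polar cells do not dominate the score. The sketch below is an independent NumPy implementation of that formula, not ClimateLearn's own code.

```python
import numpy as np

def lat_weighted_rmse(pred, truth, lats_deg):
    """pred, truth: arrays of shape (lat, lon); lats_deg: latitudes in degrees."""
    w = np.cos(np.deg2rad(lats_deg))
    w = w / w.mean()                           # weights average to 1
    sq_err = (pred - truth) ** 2 * w[:, None]  # broadcast weights over longitude
    return np.sqrt(sq_err.mean())

# Toy example on a 32 x 64 latitude-longitude grid.
lats = np.linspace(-90, 90, 32)
pred = np.random.randn(32, 64)
truth = np.random.randn(32, 64)
print(lat_weighted_rmse(pred, truth, lats))
```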
The researchers’ primary goal in developing ClimateLearn was to close the gap between the climate science and machine learning communities by making climate datasets easily accessible, providing baseline models for straightforward comparison, and offering metrics and visualizations to understand model outputs. In the near future, the researchers intend to add support for new datasets, like CMIP6 (the sixth phase of the Coupled Model Intercomparison Project). The team will also support probabilistic forecasting with new uncertainty quantification metrics and additional machine learning methods such as Bayesian neural networks and diffusion models. The researchers are excited about the new opportunities that machine learning researchers can open up by learning more about model performance, expressiveness, and robustness. Moreover, climate scientists will be able to understand how altering the values of the input variables changes the distributions of the outputs. The team also plans to make the package open-source and looks forward to the community’s contributions.
Check out the Tool, Colab, and Blog. All credit for this research goes to the researchers on this project. Also, don’t forget to join our Reddit page, Discord channel, and email newsletter, where we share the latest AI research news, cool AI projects, and more.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of Machine Learning, Natural Language Processing, and Web Development. She enjoys learning more about the technical field by participating in various challenges.