
A comprehensive assessment of time series resampling of solar irradiance using VBA, using Python, and using Python via Excel VBA
Recently, I even have been working with Excel-based interface and models ceaselessly. During this time, I became acquainted with the Visual Basic for Application (VBA) for Office, which is a robust programming language to increase Office applications. VBA is used to automate repetitive tasks, extend user interactions, and interact between different Office applications, making the routine tasks more efficient and effective.
With a good programming background in Python, it took me relatively steep learning curve towards the start to learn different features and functionalities of VBA, which steadily turned less steep with time. I even have realized that learning VBA may be very useful because it interacts with Excel workbook directly and may automate the tasks on the workbook itself that otherwise would require coding in a distinct programming language. Nonetheless, if it is feasible to run a script in a distinct programming language (e.g., Python) inside Excel VBA, it might be much more helpful in automating our routine tasks.
On this post, I’ll share my experience of using Excel VBA and Python to perform a sure bet — time series resampling of solar irradiance data. Moreover, I’m going to present how a Python script could be run via Excel VBA to perform the identical task. Let’s start.
Data
The info used is hourly All Sky Surface Shortwave Downward Irradiance (ALLSKY_SFC_SW_DWN
) for 2020 (UTC time zone) downloaded from NASA Power website for 4 cities: Chitwan (Nepal), Recent York (USA), Sydney (Australia), and Bonn (Germany). The info was accessed and downloaded using the API service of NASA Power with a Python script, which I intend to document in a distinct post.
Solar irradiance is the facility per unit area (W/m2) obtained from the sun in the shape of electromagnetic radiation within the wavelength range of the measuring instrument. Solar irradiance integrated over a time period gives the solar irradiation, known as solar insolation (Wh/m2).
Based on the definition of NASA Power, the parameter used All Sky Surface Shortwave Downward Irradiance (ALLSKY_SFC_SW_DWN
) refers to the overall solar irradiance (direct plus diffused) incident on a horizontal plane on the surface of the earth under all sky conditions. Another term for total solar irradiance is Global Horizontal Irradiance (GHI). For the reason that values are hourly, its unit is Wh/m².
Data in Excel file
Since 2020 was a intercalary year, I obtained the hourly solar irradiance values for 8784 hours for the 4 cities. These values are placed in column B, C, D and E of the Excel file as shown below.
To get the values in column A in date format, the next formula was utilized in Excel (e.g., in cell A2):
=TEXT(DATE(2020,1,1)+ROW(A1)/24, “yyyy-mm-dd hh:mm:ss”)
ROW(A1)/24
is used to get the worth of hour inside a day (0–23 hr).
Also, I named cell A1 as datetime
.
Plotting the unique hourly solar irradiance data for 2020 for the 4 cities in Excel looks as follows:
Time series resampling refers to converting the frequency level of time series data. In easy words, resampling is a time-based groupby, followed by a discount method on each of the groups. The info will need to have a datetime-like index for this purpose.
1. Time Series Resampling using Excel VBA
There is no such thing as a default function in Excel for time series resampling. Subsequently, I wrote a few subroutines to get monthly and hourly average values as described within the sections a and b respectively below.
a. VBA code for converting hourly values to average monthly frequency
This section describes the code snippet (below) used to convert hourly value to average monthly value in VBA.
An array is a gaggle of variables in VBA. The default lower sure of an array element in VBA is 0. Mentioning Option Base 1
on the highest of the sub-routine changes the lower sure of array element to 1. I defined an array called columns(4)
as a gaggle of strings containing 4 variables. And I passed the strings B, C, D and E inside this array.
Cell A1 in Excel is defined as a named cell datetime
. To discuss with this cell in VBA, I declared datetime
as a spread and assigned it to the range with same name in Excel.
To discuss with the month from the datetime column in Excel, I used the MONTH()
function and assigned it to an integer variable called mnth
. To loop through hourly values in each row, I declared one other integer called row
. Finally, I declared sum
and num_hours
to calculate the monthly average values.
'Use Option Base 1 before this subroutine if you would like to start the list from 1 as a substitute of 0.
'https://excelchamps.com/vba/arrays/
Option Base 1
Sub GetMonthlyAverage()'defining an array for 4 strings
Dim columns(4) As String
columns(1) = "B"
columns(2) = "C"
columns(3) = "D"
columns(4) = "E"
'Check with cell A1
Dim datetime As Range
Set datetime = Range("datetime")
'Definining mnth because Month is a function
Dim mnth As Integer
Dim row As Integer
Dim sum As Double
Dim num_hours As Double
Next, I created a for-loop to iterate through the columns for values for every city. Inside this for-loop, there are two other nested loops to iterate through every month and hour of the yr, respectively. The sum
aggregates the hourly solar irradiance values for every month, and the num_hours
aggregates the variety of hours in every month. Finally, by dividing the sum
with num_hours
, monthly average values of solar irradiance are obtained for every month for every city.
'Loop through column for every city
For Each column In columns'Loop through every month of the yr
For mnth = 1 To 12
sum = 0
num_hours = 0
'Loop through each row
For row = 2 To 8785
If MONTH(Cells(row, datetime.column)) = mnth Then
Range(column & row).Interior.Color = RGB(255, 255, 0)
num_hours = num_hours + 1
sum = sum + Range(column & row).Value
End If
Next row
Range(column & mnth).Offset(1, 7).Value = sum / num_hours
Next mnth
Next column
End Sub
As a note, Range(column & row).Interior.Color = RGB(255, 255, 0)
highlights each cell with yellow color while iterating through the row (city) and column (mnth).
b. VBA code for converting hourly values in a yr to average hourly values for every of the 24 hours (0–23 hr) of the day in 2020
The code used to convert hourly values (of a yr) to average hourly values for every of the 24 hours of the day in 2020.
Within the code for this section, first, I extracted the worth for the last_row
(8785) in the information table using
Cells(datetime.row, datetime.column).End(xlDown).row
to loop through each row for further processing.
And I used the HOUR()
function in Excel to retrieve the corresponding hour in column A in each row using:
Hour(Cells(row, datetime.column).column).Value
The entire code for the aim of this section is given within the snippet below:
Option Base 1
Sub GetHourlyAverage()‘defining an array for 4 strings
Dim columns(4) As String
columns(1) = “B”
columns(2) = “C”
columns(3) = “D”
columns(4) = “E”
'Definining mnth because Month is a function
Dim mnth As Integer
Dim row As Integer
Dim sum As Double
Dim num_hours As Double
Dim wb As Workbook
Dim ws As Worksheet
Dim datetime As Range
Dim last_row As Integer
Set wb = ThisWorkbook
Set ws = ThisWorkbook.Sheets("Sheet1")
Set datetime = ws.Range("datetime")
last_row = Cells(datetime.row, datetime.column).End(xlDown).row
Debug.Print datetime.Value
Debug.Print "Row: " & datetime.row & " Column: " & datetime.column
Debug.Print "Last row: " & last_row
'Loop through column for every city
For Each column In columns
'Loop through each hour of the day
For hr = 0 To 23
sum = 0
num_hours = 0
'Loop through each row
For row = datetime.row + 1 To last_row
If Hour(Cells(row, datetime.column).Value) = hr Then
Range(column & row).Interior.Color = RGB(255, 255, 0)
num_hours = num_hours + 1
sum = sum + Range(column & row).Value
End If
Next row
Range(column & hr + 2).Offset(0, 14).Value = sum / num_hours
Next hr
Next column
End Sub
Within the above code snippet, Debug.Print
command is used to print the intermediate ends in the intermediate window within the VBA developer space as shown below:
2. Time Series Resampling using Pandas
The pandas library in Python provides an in-built method for time series resampling using df.resample()
and passing the rule for resampling. For instance, “M” is for monthly, “W” is for weekly, “Q” is for quarterly, “D” is for day by day, “B” is for Business Day, etc. The entire algorithm available for resampling to different frequency levels could be found here.
A pre-requisite for time series resampling is that the dataframe index must be converted right into a datetime type using pd.to_datetime()
.
Any built-in method available via Groupby is obtainable as a technique of the returned object of df.resample()
, including min()
, max()
, mean()
, median()
, std()
, first()
, last()
, ohlc()
, and sem()
. On this post, I’m simply assessing the mean values of the solar irradiance.
Within the Python code below, I allow the user to enter the frequency that they wish to return and display for the solar irradiance values. The choices include Original, Monthly average, Day by day average, Weekly average, Quarterly average, All the above, and Hourly average (for every of the 24 hours inside a day).
import pandas as pd
import matplotlib.pyplot as plt
import os
import sys#Enter os system to current working directory
os.chdir(sys.path[0])
file = "solar_irradiance.xlsm"
#read all rows and first 5 columns
df = pd.read_excel(file).iloc[:, :5]
df["Datetime"] = pd.to_datetime(df["Datetime"])
df.set_index(["Datetime"], inplace = True)
frequency = input("Enter the frequency you would like to display? n1. Original n2. Monthly averagen3. Day by day average n4. Weekly averagen 5.Quarterly average n 6.All the above n 7. Hourly average n? ")
if frequency == "Original":
print (df)
df.plot()
plt.title("Original solar irradiance in 2020")
plt.ylabel("Wh/m$^2$")
plt.legend()
plt.show()
elif frequency == "Monthly average":
print (df.resample(rule = "M").mean())
df.resample(rule = "M").mean().plot()
plt.title("Monthly average solar irradiance in 2022")
plt.ylabel("Wh/m$^2$")
plt.legend()
plt.show()
elif frequency == "Day by day average":
print (df.resample(rule = "D").mean())
df.resample(rule = "D").mean().plot()
plt.title("Day by day average solar irradiance in 2022")
plt.ylabel("Wh/m$^2$")
plt.show()
elif frequency == "Weekly average":
print (df.resample(rule = "W").mean())
df.resample(rule = "W").mean().plot()
plt.title("Weekly average solar irradiance in 2022")
plt.ylabel("Wh/m$^2$")
plt.legend()
plt.show()
elif frequency == "Quarterly average":
print (df.resample(rule = "Q").mean())
df.resample(rule = "Q").mean().plot()
plt.title("Quarterly average solar irradiance in 2022")
plt.ylabel("Wh/m$^2$")
plt.legend()
plt.show()
elif frequency == "All the above":
fig, axs = plt.subplots(2, 2, figsize = (20, 10), sharex = True, sharey = True)
df.resample(rule = "D").mean().plot(ax = axs[0, 0])
axs[0, 0].set_title("Day by day mean")
axs[0, 0].set_ylabel("Wh/m$^2$")
df.resample(rule = "W").mean().plot(ax = axs[0, 1])
axs[0, 1].set_title("Weekly mean")
df.resample(rule = "M").mean().plot(ax = axs[1, 0])
axs[1, 0].set_title("Monthly mean")
axs[1, 0].set_ylabel("Wh/m$^2$")
df.resample(rule = "Q").mean().plot(ax = axs[1, 1])
axs[1, 1].set_title("Quarterly mean")fig.suptitle("Mean solar irradiance in 4 locations converted to different temporal frequencies")
plt.show()
elif frequency == "Hourly average":
#average value in each hour inside 24 hours of a day
print (df.groupby(df.index.hour).mean())
df.groupby(df.index.hour).mean().plot()
plt.title("Hourly average solar irradiance in 2022")
plt.ylabel("Wh/m$^2$")
plt.legend()
plt.show()
else:
print ("The frequency you entered is wrong.")
This script could be run by going to the terminal/command prompt and typing python -m python_script.py
if one is identical path because the script file. To abrubt the run, one can type Ctrl+C
.
3. Running Python script via Excel VBA
The above Python script may also be run via Excel VBA. For this purpose, I saved the above script as python_script.py
file.
The entire subroutine written in VBA to run the Python script is given in code snippet below.
Sub RunPythonScript()
Dim objShell As Object
Dim PythonExePath As String, PythonScriptPath As String
ActiveWorkbook.Save'Enter into the trail of given workbook
ChDir Application.ThisWorkbook.Path
Set objShell = VBA.CreateObject(“Wscript.Shell”)
'Enter into the trail of given workbook
ChDir Application.ThisWorkbook.Path
Set objShell = VBA.CreateObject("Wscript.Shell")
'Goto cmd. Type where python to get this path. Note that there are three quotes below.
' The hash symbol # below must be stuffed with the trail in your system.
PythonExePath = """C:Users#######################python.exe"""
'Get the trail of the file.
PythonScriptPath = Application.ThisWorkbook.Path & "python_script.py"
objShell.Run PythonExePath & PythonScriptPath
End Sub
First, I declared objShell
variable to discuss with the address of an object (Wscript.shell). This object allows to access the Windows functionality to run the external program (here, Python script).
The PythonExePath
refers back to the path of the Python application in the pc system. In a Windows system, this path could be found by typing where python
within the Windows command prompt.
The trail of the python script is defined as string in PythonScriptPath
. Note that this path shouldn’t have space for the script to run
Finally, the next line is used to run the Python script using the Python application via the Excel VBA interface.
objShell.Run PythonExePath & PythonScriptPath
Output
I assigned the above sub-routine/macro (command written using VBA language) to a button in Excel file as shown below:
Clicking the button runs the Python script as shown below and asks the user to enter the frequency the user desires to display output as:
Upon choosing “All the above”, I get the output plot for the day by day, weekly, monthly, and quarterly average values of solar irradiance within the 4 cities in 2020. It’s observed that because the time resolution increases from right to left and top to bottom, the road/curve becomes smoother because the variabilities in lower time resolutions are averaged out.
Conclusion
On this post, I presented three techniques for time series resampling of hourly solar irradiance data for 4 cities in 2020:
- Using Excel VBA
- Using pandas in Python
- Running Python script via Excel VBA interface
Excel VBA could be very handy while working with data in Excel since it allows us to perform various operations and have direct interactions with data in several sheets in the identical or different Excel file where the sub-routines/macros are written. Also, by writing macros and assigning them to interactive buttons/userforms or other widgets, it is straightforward to share Excel file with other users to whom only the functionalities matter and never what happens with the code within the backend.
One among the most important benefits of Python is that there are packages with several in-built functionalities, which makes writing a separate code for routine tasks redundant. On this context, combining the strengths of Excel VBA and Python could be very advantageous. That is depicted by the way in which I ran the Python script for time series resampling by clicking a button on an Excel file, which is related to just a few easy lines of VBA code.
The Excel file including the macros, and the Python script can be found within the script
folder inside this repository. Thanks for reading!