This post demonstrates how to handle time zones in Python through a comparative assessment of hourly solar irradiance data for four cities in 2020 based on different time zones.
When I start work in Bonn, Germany at 9 am on the first day of October, it is already 12:45 pm in my hometown of Chitwan, Nepal. My friend in Sydney, Australia has already finished his working day, as it is 6 pm there on the same day. Another friend in New York, USA is still asleep because it is 3 am there. These four places are in different time zones.
A time zone is an area that observes a uniform standard time for legal, commercial, and social purposes. The world is not divided into time zones strictly by longitude. Instead, time zones tend to follow the boundaries between and within countries.
All time zones are defined as offsets from Coordinated Universal Time (UTC), and these offsets range from UTC-12:00 to UTC+14:00. While the offsets are usually a whole number of hours, a few zones are offset by an additional 30 or 45 minutes. For example, Nepal has a time offset of UTC+05:45. In total, there are 38 time zones in the world.
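As a minimal sketch of what an offset from UTC means in practice, the snippet below uses only the standard-library datetime module to express one instant at the UTC+05:45 offset. The variable names and the chosen instant (07:00 UTC on October 1, 2020) are my own illustration, and a fixed offset like this carries no daylight saving rules.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset of UTC+05:45; the label "NPT" is only a display name.
nepal_offset = timezone(timedelta(hours=5, minutes=45), name="NPT")

# 07:00 UTC on October 1, 2020 as a time zone aware datetime.
utc_time = datetime(2020, 10, 1, 7, 0, tzinfo=timezone.utc)

# Express the same instant at the Nepal offset.
nepal_time = utc_time.astimezone(nepal_offset)

print(nepal_time.isoformat())  # 2020-10-01T12:45:00+05:45
print(nepal_time.utcoffset())  # 5:45:00
```

Both objects describe the same instant; only the wall-clock representation differs.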
If I have solar irradiance data for four cities in Nepal, Germany, Australia, and the USA in the UTC time zone, a given timestamp does not correspond to the same hour of the day in each of those countries. In this post, I discuss how time zones can be handled for datetime objects, including pandas dataframes, in Python.
For this purpose, I download solar irradiance data for 2020 for these four cities, and compare and analyze the data when:
- The data for each country is in the UTC time zone, and
- The data refers to the respective local time zone of each country.
Let’s start.
Geocoding to retrieve the coordinates of 4 cities
In the first step, I retrieve the coordinates of the four cities in four countries because I need them to extract the solar irradiance data. The process of extracting geographical coordinates from the name of a place is called geocoding.
As shown below, I wrote a function for geocoding using the geopy package. The function uses Nominatim, an open-source geocoding service that uses OpenStreetMap data to find locations on Earth by name and address.
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="app")

def get_coordinates(place):
    """Return the (latitude, longitude) of the place."""
    place_details = geolocator.geocode(place)
    # geocode() returns None when the place cannot be found.
    if place_details is None:
        raise ValueError(f"No geocoding result for {place!r}")
    return (place_details.latitude, place_details.longitude)
I used the function to extract the coordinates of the individual cities and created a pandas dataframe from them, as depicted in the screenshot below.
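A possible way to assemble such a dataframe is sketched here. The helper build_coordinates_frame and the column names are hypothetical, and the coordinates are rough placeholder values standing in for a live call to get_coordinates(), so the snippet runs without network access.

```python
import pandas as pd

def build_coordinates_frame(places, geocode):
    """Return a dataframe with one row per place and Latitude/Longitude columns."""
    rows = {place: geocode(place) for place in places}
    return pd.DataFrame.from_dict(rows, orient="index",
                                  columns=["Latitude", "Longitude"])

# Hypothetical stand-in for get_coordinates(); rough placeholder values,
# not results returned by Nominatim.
approximate_coordinates = {
    "Chitwan, Nepal": (27.5, 84.4),
    "New York, USA": (40.7, -74.0),
    "Bonn, Germany": (50.7, 7.1),
    "Sydney, Australia": (-33.9, 151.2),
}

df_places = build_coordinates_frame(approximate_coordinates,
                                    approximate_coordinates.get)
print(df_places)
```

Passing the geocoding function as an argument means get_coordinates can be swapped in for the placeholder lookup without changing the helper.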
Accessing data using NASA Power API
The Application Programming Interface (API) service of NASA Power allows users to retrieve Analysis Ready Data (NASA Power, 2023a). For this post, I download the solar irradiance data for the four cities in hourly resolution from NASA Power Data (NASA Power, 2023b). The parameter I use is All Sky Surface Shortwave Downward Irradiance (ALLSKY_SFC_SW_DWN) for 2020, which is described in more detail in the section below.
I request the data in the UTC time standard, although the hourly API returns the data in Local Solar Time (LST) by default.
The base_url configuration looks as follows:
base_url = r"https://power.larc.nasa.gov/api/temporal/hourly/point?parameters=ALLSKY_SFC_SW_DWN&community=RE&time-standard=UTC&longitude={longitude}&latitude={latitude}&format=JSON&start=2020&end=2020"
Next, I loop through the longitude and latitude of each place defined by geocoding in a list called places and request the hourly solar irradiance data for 2020. The complete code for this step is given in the GitHub gist below:
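The loop could look roughly like the sketch below. This is not the code from the gist: the function names are hypothetical, and the assumption that the hourly values sit under properties → parameter → ALLSKY_SFC_SW_DWN in the JSON response should be checked against the NASA Power documentation. The demonstration at the end uses a made-up payload and a placeholder URL instead of a live request.

```python
import json
from urllib.request import urlopen

import pandas as pd

PARAMETER = "ALLSKY_SFC_SW_DWN"

def parse_hourly_values(payload, parameter=PARAMETER):
    """Return {timestamp: value}; the nesting used here is an assumption."""
    return payload["properties"]["parameter"][parameter]

def fetch_payload(url):
    """Download and decode one JSON response (needs network access)."""
    with urlopen(url, timeout=60) as response:
        return json.load(response)

def collect_irradiance(places, base_url, fetch=fetch_payload):
    """places: iterable of (name, latitude, longitude) tuples."""
    columns = {}
    for name, latitude, longitude in places:
        url = base_url.format(longitude=longitude, latitude=latitude)
        columns[name] = parse_hourly_values(fetch(url))
    df = pd.DataFrame(columns)
    # Timestamps arrive as strings such as "2020010100"; store them as integers.
    df.index = [int(key) for key in df.index]
    return df

# Offline demonstration with a made-up payload and a placeholder URL.
def fake_fetch(url):
    return {"properties": {"parameter": {PARAMETER: {"2020010100": 0.0,
                                                     "2020010101": 12.5}}}}

demo = collect_irradiance(
    [("Bonn", 50.7, 7.1)],
    "https://example.invalid/?lon={longitude}&lat={latitude}",
    fetch=fake_fetch,
)
print(demo)
```

For real data, call collect_irradiance with base_url and leave fetch at its default.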
Parameter description
The solar irradiance data refers to the total power (direct + diffuse) received from the sun per unit area per hour (Wh/m²) on a horizontal plane at the surface of the earth under all sky conditions (NASA Power, 2023c).
This parameter, also known as Global Horizontal Irradiance (GHI), is relevant for calculating the size of the solar PV module needed to meet a given electricity demand, as given in the formula below:
Basic statistics of given data
The downloaded data is depicted in the plot above. The data shows higher solar irradiance in Sydney towards the beginning and end of the year, and lower towards the middle of the year. This pattern is the opposite in the other three cities, which can be explained by the location of Sydney in the Southern hemisphere and the other cities in the Northern hemisphere.
Chitwan, Nepal received the highest annual solar irradiance (1669 kWh/m²) in 2020, followed by Sydney, Australia (1631 kWh/m²) and New York, USA (1462 kWh/m²), while Bonn, Germany received the least (1193 kWh/m²).
However, the maximum solar irradiance received in a particular hour is highest for Sydney (1061.3 W/m²), followed by Chitwan (997 W/m²).
The minimum solar irradiance and the 25th percentile values for each city are zero because there is no solar irradiance during night hours.
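Statistics of this kind can be obtained with describe() and sum(), as in the sketch below. The two-day series is invented toy data for a hypothetical "City A", not the NASA Power data, so only the pattern matters: the dark hours pull the minimum and the 25th percentile to zero.

```python
import pandas as pd

# Toy hourly values in Wh/m²: 6 dark hours, 12 daylight hours, 6 dark hours.
one_day = ([0.0] * 6
           + [50, 150, 300, 450, 550, 600, 600, 550, 450, 300, 150, 50]
           + [0.0] * 6)
toy = pd.DataFrame({"City A": one_day * 2})  # two identical days

stats = toy.describe()        # count, mean, std, min, quartiles, max
total_kwh = toy.sum() / 1000  # hourly Wh/m² summed, converted to kWh/m²

print(stats.loc[["min", "25%", "max"]])
print(total_kwh)
```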
Time zone Handling
1. Default pandas dataframe without “datetime” format index
As 2020 was a leap year, there were 366 days and, as a result, the data was obtained for 8784 hours.
When the data is first downloaded, its index is of integer (int64) type as shown below:
2. Converting integer type index to “naive” datetime index
The dataframe index can be converted into datetime type using pd.to_datetime() and specifying the format %Y%m%d%H for year, month, day, and hour respectively.
This change is also reflected when the dataframe is plotted, as the months Jan to Dec of 2020 are visible in the xticks as shown below:
Although this dataframe has a datetime index, it does not have any information about time zones and daylight saving. Hence, the dataframe index is a naive datetime object. This is evident from checking the time zone info of one of the index entries of the pandas dataframe.
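The conversion and the naive check can be sketched on a three-row toy dataframe (the values are made up). Casting the integers to strings before parsing is my own precaution so that the format string is matched against text.

```python
import pandas as pd

# Toy dataframe with an integer index in YYYYMMDDHH form.
df = pd.DataFrame({"Bonn": [0.0, 0.0, 5.2]},
                  index=[2020010100, 2020010101, 2020010102])
print(df.index.dtype)  # int64

# Parse year, month, day, and hour from the index.
df.index = pd.to_datetime(df.index.astype(str), format="%Y%m%d%H")

print(df.index)
print(df.index.tz)         # None: the index carries no time zone
print(df.index[0].tzinfo)  # None for each individual timestamp as well
```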
3. Localizing “naive” datetime object to “time zone aware” datetime object
The datetime module of Python can be used to access, retrieve, and manipulate date and time information.
By default, the datetime.now() function returns the current local date and time. However, the result carries no time zone or daylight saving information: time_now.tzinfo returns None in the code snippet below, which means it is a naive datetime object.
As of now (21 April 2023), I am in Nepal. Therefore, I localize the current time to the "Asia/Kathmandu" time zone using the localize() method of a pytz timezone object. Now, time_in_nepal is a time zone aware datetime object.
To get the current local time in Germany, I can use time_in_nepal.astimezone(timezone("Europe/Berlin")), which is also a time zone aware datetime object.
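The steps above can be sketched as follows. To keep the output reproducible, the snippet uses a fixed naive timestamp (noon on 21 April 2023, my own choice) in place of datetime.now().

```python
from datetime import datetime

from pytz import timezone

# Fixed naive timestamp standing in for datetime.now().
naive_time = datetime(2023, 4, 21, 12, 0)
print(naive_time.tzinfo)  # None

# Attach the Nepal time zone to the naive wall-clock time.
time_in_nepal = timezone("Asia/Kathmandu").localize(naive_time)

# Express the same instant in German local time.
time_in_germany = time_in_nepal.astimezone(timezone("Europe/Berlin"))

print(time_in_nepal.isoformat())    # 2023-04-21T12:00:00+05:45
print(time_in_germany.isoformat())  # 2023-04-21T08:15:00+02:00
```

localize() keeps the wall-clock time and attaches a zone, whereas astimezone() keeps the instant and changes the wall-clock time.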
4. Localizing timezone of pandas dataframe
Next, I localize the naive index of the pandas dataframe to the UTC time zone using df.tz_localize(tz = "UTC") as shown in the code screenshot below.
The index of df is thereby converted from a naive index to a time zone aware index in the UTC time zone, as shown above.
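A minimal sketch of this step on a toy dataframe with made-up values is given here. Note that tz_localize returns a new object, so the result has to be assigned; the name df_utc is my own.

```python
import pandas as pd

# Toy dataframe with a naive hourly datetime index.
df = pd.DataFrame(
    {"Bonn": [0.0, 0.0, 5.2]},
    index=pd.to_datetime(["2020-01-01 00:00",
                          "2020-01-01 01:00",
                          "2020-01-01 02:00"]),
)
print(df.index.tz)  # None

# Declare that the existing timestamps are UTC; the clock values do not change.
df_utc = df.tz_localize(tz="UTC")
print(df_utc.index.tz)  # UTC
print(df_utc.index[0])  # 2020-01-01 00:00:00+00:00
```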
5. List of all possible time zone addresses
The list of all time zone addresses that can be used is available through the all_timezones attribute of the pytz package. There are 594 such addresses. Several addresses can refer to the same time zone. For example, Europe/Berlin, Europe/Amsterdam, and Europe/Copenhagen all refer to the same time zone.
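The list can be inspected as sketched below. The number of entries depends on the installed pytz release, so no count is hard-coded here, and the comparison instant (noon on October 1, 2020) is my own choice.

```python
from datetime import datetime

import pytz

print(len(pytz.all_timezones))  # depends on the installed pytz release
print("Asia/Kathmandu" in pytz.all_timezones)

# Compare the UTC offsets of three addresses at the same local wall-clock time.
moment = datetime(2020, 10, 1, 12, 0)
offsets = {
    name: pytz.timezone(name).localize(moment).utcoffset()
    for name in ["Europe/Berlin", "Europe/Amsterdam", "Europe/Copenhagen"]
}
for name, offset in offsets.items():
    print(name, offset)
```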
6. Create a new dataframe for each city and convert the UTC time zone to the corresponding local time zone
df contains the solar irradiance data of the four cities in the UTC time zone. In this step, I create four dataframes, one from each column of df. I then convert the time zone of each new dataframe from UTC to the local time zone of the city it belongs to. For example, the time zone of df_chitwan is converted using df_chitwan.tz_convert(tz = "Asia/Kathmandu").
Note that for countries that observe daylight saving time, this is automatically accounted for in the time zone conversion. For example, Nepal time stays at UTC+05:45 all year round. For Sydney, however, the conversion automatically handles daylight saving, as the offset from UTC is 10 or 11 hours depending on the time of year.
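The daylight saving behaviour can be checked with a small sketch. It uses two made-up rows, one in January and one in July 2020, and assumes "Australia/Sydney" as the time zone address for Sydney.

```python
from datetime import timedelta

import pandas as pd

# Two UTC timestamps: southern summer (January) and southern winter (July).
utc_index = pd.to_datetime(["2020-01-01 00:00",
                            "2020-07-01 00:00"]).tz_localize("UTC")
df_utc = pd.DataFrame({"value": [0.0, 0.0]}, index=utc_index)

df_sydney = df_utc.tz_convert(tz="Australia/Sydney")
df_chitwan = df_utc.tz_convert(tz="Asia/Kathmandu")

sydney_offsets = [stamp.utcoffset() for stamp in df_sydney.index]
chitwan_offsets = [stamp.utcoffset() for stamp in df_chitwan.index]

print(sydney_offsets)   # 11 hours in January, 10 hours in July
print(chitwan_offsets)  # 5 hours 45 minutes on both dates
```

tz_convert keeps the underlying instants and only changes how they are displayed, which is why the values in the dataframe stay aligned across time zones.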
7. Comparing the plots of solar irradiance data in different time zones
In this final step, I compare how the solar irradiance looks in the four cities when the data corresponds to:
a. The UTC time zone and
b. The local time zone of each city.
In the code snippet below, I create two subplots to plot the solar irradiance in the four cities. In the left subplot, the solar irradiance data for October 1, 2020 based on the UTC time zone is plotted. In the right subplot, the solar irradiance data for October 1, 2020 based on the local time of each city is plotted.
import matplotlib.pyplot as plt
import numpy as np

fig, (ax1, ax2) = plt.subplots(1, 2, figsize = (20, 6))
fig.suptitle("Solar irradiance on October 1, 2020")

# Left subplot: all four cities on the shared UTC index.
ax1.plot(df.loc["2020-10-01"])
ax1.set_title("Based on UTC time zone")
ax1.xaxis.set_ticks(ticks = df.loc["2020-10-01"].index[::4], labels = np.arange(0, 24, 4))
cities = df.columns.tolist()
ax1.legend(ax1.get_lines(), cities, loc = "upper right")
ax1.set_xlabel("Hour of day")
ax1.set_ylabel("W/m$^2$")

# Right subplot: each city against its own local hour of day.
# The plotting order must match the column order of df for the legend to be correct.
ax2.plot(df_chitwan.loc["2020-10-01"].values.tolist())
ax2.plot(df_newyork.loc["2020-10-01"].values.tolist())
ax2.plot(df_bonn.loc["2020-10-01"].values.tolist())
ax2.plot(df_sydney.loc["2020-10-01"].values.tolist())
ax2.xaxis.set_ticks(ticks = np.arange(0, 24, 4), labels = np.arange(0, 24, 4))
ax2.legend(ax2.get_lines(), cities)
ax2.set_title("Based on local time zone of each city/country")
ax2.set_xlabel("Hour of day")
ax2.set_ylabel("W/m$^2$")

# The output folder must already exist.
plt.savefig("output/solar irradiance on october 1.jpeg", dpi = 300)
plt.show()
The plot looks as shown below:
As of October 1, 2020, the offsets of the four cities from UTC are: Chitwan (UTC+05:45), New York (UTC-04:00), Bonn (UTC+02:00), and Sydney (UTC+10:00). Thus, in the plot on the left, the solar irradiance peaks at around 4 am, 3 pm, 10 am, and 3 am UTC for Chitwan, New York, Bonn, and Sydney respectively.
The plot on the right shows that the solar irradiance has a similar shape over the local hours of the day in each city. It starts to increase from zero at around 5 or 6 am in each city, peaks around noon, and then declines until it reaches zero again at 5 or 6 pm. On this day of the year, Sydney received the highest solar irradiance, followed by Chitwan, New York, and Bonn.
In this post, I demonstrated methods to handle time zones while working with datetime objects, including pandas dataframes, in Python. I used the example of solar irradiance data for four cities across the world. These methods can be very handy when working with time series data where time zones matter, such as meteorological data. I have summarized the key techniques from this post for handling time zones in Python in the following numbered list:
1. It is possible to check the time zone of a datetime object using its tzinfo attribute.
2. When a datetime object does not contain any information about time zones and daylight saving, it is called a naive datetime object.
3. Using the timezone function of the pytz package, it is possible to convert a naive time to a time zone aware local time. For example,
time_in_nepal = timezone("Asia/Kathmandu").localize(datetime.now())
4. The new object is now time zone aware. It is possible to get the time in a different time zone using the astimezone method of the datetime object. For example,
german_timezone = timezone("Europe/Berlin")
time_in_germany = time_in_nepal.astimezone(german_timezone)
5. To work with time series data, it makes sense to convert the index of the pandas dataframe to a datetime index.
6. A naive dataframe index can be localized using the tz_localize method of df and specifying the time zone. For example,
df_utc = df.tz_localize(tz = "UTC")
7. The dataframe can also be converted to a different time zone using the tz_convert method of df. For example,
df_nepal = df_utc.tz_convert(tz = "Asia/Kathmandu")
The data, code, and output plots for this post are available in the notebooks/Timezone_handling folder of this GitHub repository. Thanks for reading!
References
OpenStreetMap, 2023. Copyright and license.
NASA Power, 2023a. NASA Power APIs.
NASA Power, 2023b. POWER|Data Access Viewer.
NASA Power, 2023c. Parameters definitions.