
Use to_string() to Stop Python from Hiding the Body of the Printed DataFrames


3-Minutes Pandas

How can we see the entire printed data frame when running a Python script?

Photo by Pascal Müller on Unsplash

Running a Python script without errors is not the only goal of debugging. We also want to make sure the functions are executed as expected. Checking what the data looks like before and after a specific processing step is a typical part of exploratory data analysis.

So, we would like to print some data frames or essential variables during the execution of the script to check whether they are "correct". However, a simple print command sometimes shows only the top and bottom rows of the data frame (as in the example below), which makes the check unnecessarily hard.

Often, the data frames are pandas.DataFrame objects, and if you use the print command directly, you may get something like this:

import pandas as pd
import numpy as np

data = np.random.randn(5000, 5)
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D', 'E'])

print(df.head(100))

Printing the top 100 rows (image by author)

You may have already noticed that the middle part of the data frame is hidden behind three dots. What if we really need to check all of the top 100 rows? For example, we may want to inspect the result of a particular step in the middle of a large Python script to make sure the functions are executed as expected.

set_option()

One of the most straightforward solutions is to change the default maximum number of rows that pandas displays:

pd.set_option('display.max_rows', 500)
print(df.head(100))

Printing the top 100 rows after raising the maximum number of rows that pandas displays (image by author)

Here, set_option is a function that lets you control how pandas behaves, including the maximum number of rows or columns to display, as we did above. The first argument, display.max_rows, names the option that controls the maximum number of rows to display, and 500 is the value we set for it.

Though this method is widely used, it is not ideal inside an executable Python file, especially if you have multiple data frames to print and each should display a different number of rows.

For example, I have a script structured as follows:
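The effect of display.max_rows can be checked with a small sketch like the one below. It uses pd.get_option and pd.reset_option alongside pd.set_option; the 100-row frame and the variable names are chosen only for illustration and are not from the original script.

```python
import numpy as np
import pandas as pd

# Illustrative data: 100 rows, which is more than the pandas default limit.
df = pd.DataFrame(np.random.randn(100, 5), columns=["A", "B", "C", "D", "E"])

# Start from the default so the example does not depend on earlier settings.
pd.reset_option("display.max_rows")
default_limit = pd.get_option("display.max_rows")

# With the default limit, the printed form elides the middle rows.
truncated_lines = len(repr(df).splitlines())

# After raising the limit above the row count, every row is rendered.
pd.set_option("display.max_rows", 500)
full_lines = len(repr(df).splitlines())

print("default limit:", default_limit)
print("lines before:", truncated_lines, "lines after:", full_lines)

# Restore the default so later prints are unaffected.
pd.reset_option("display.max_rows")
```

Counting the lines of repr(df) is just a convenient way to confirm the change without scrolling through the output.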

## Code Block 1 ##
...
print(df1.head(20))
...

## Code Block 2 ##
...
print(df2.head(100))
...

## Code Block N ##
...
print(df_n)
...

We print different numbers of top rows throughout the script. Sometimes we want to see the entire printed data frame, and sometimes we only care about the dimensions and structure of the data frame and do not need to see all the data.

In such a case, we would probably need to call pd.set_option() to set the desired display limit, or pd.reset_option() to restore the default, each time before we print a data frame, which quickly becomes messy and tedious.

## Code Block 1 ##
...
pd.set_option('display.max_rows', 20)
print(df1.head(20))
...

## Code Block 2 ##
...
pd.set_option('display.max_rows', 100)
print(df2.head(100))
...

## Code Block N ##
...
pd.reset_option('display.max_rows')
print(df_n)
...
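As an aside, pandas also provides pd.option_context, a context manager that applies an option only inside a with block and restores the previous value afterwards. It is not part of the approach described in this article, but a minimal sketch (with a made-up df1) shows how it avoids the explicit reset calls:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one of the data frames in the script.
df1 = pd.DataFrame(np.random.randn(200, 3), columns=["A", "B", "C"])

pd.reset_option("display.max_rows")
before = pd.get_option("display.max_rows")

# The option is changed only while the with block is active.
with pd.option_context("display.max_rows", 100):
    inside = pd.get_option("display.max_rows")
    print(df1.head(100))

# Outside the block the previous value is back in effect.
after = pd.get_option("display.max_rows")
print(before, inside, after)
```

This still requires choosing a limit for every print, so it reduces the clutter rather than removing it.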

There is a more flexible and effective way to show the entire data frame without setting any pandas display options at all.

to_string()

to_string() converts the pd.DataFrame object directly to a string, so when we print it, the pandas display limit no longer applies.

pd.set_option('display.max_rows', 10)
print(df.head(100).to_string())

Printing the top 100 rows using to_string() (image by author)

We can see above that although I set the maximum number of rows to display to 10, to_string() still prints the entire 100-row data frame.

The to_string() function converts the whole data frame to a string, so all of the values and index labels are kept when it is printed. Since the limit set with set_option() only applies when pandas objects themselves are displayed, the printed string is not restricted by the maximum number of rows set earlier.

So, the strategy is that you do not need to set anything via set_option(); you only need to call to_string() to see the entire data frame. This saves you from thinking about which option to set in which part of the script.
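The difference can be verified with a short sketch that compares the default printed form with the to_string() output under a deliberately small display.max_rows. The variable names here are illustrative only.

```python
import numpy as np
import pandas as pd

data = np.random.randn(5000, 5)
df = pd.DataFrame(data, columns=["A", "B", "C", "D", "E"])

# A deliberately small display limit.
pd.set_option("display.max_rows", 10)

head = df.head(100)
as_repr = repr(head)          # what print(head) shows; honors display.max_rows
as_string = head.to_string()  # plain str containing every row

repr_line_count = len(as_repr.splitlines())
string_line_count = len(as_string.splitlines())

print("lines in default print:", repr_line_count)
print("lines in to_string():", string_line_count)

# Restore the default display limit.
pd.reset_option("display.max_rows")
```

The to_string() result has one header line plus one line per row, while the default printed form is cut down to the display limit.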

Takeaways

  1. Use set_option('display.max_rows', n) when you want a consistent number of rows displayed across the entire script.
  2. Use to_string() if you want to print the entire pandas data frame regardless of which pandas options have been set.

Thanks for reading! I hope you find this pandas trick useful in your work!

