In this article, we will be learning how to write data (data frame) from python to Google sheets and how to read already existing data from Google sheets to Python (in the form of pandas data frame).
Creating The Data Frame In Python
I am writing this article in continuation with the previous article How to read data from API in Python. You can either refer to that article and create a data frame from the data fetched from API or you can use the below code to create a data frame.
There are various methods to create a data frame in python using pandas. The one we are going to use here is dict of narray/lists :
#Import the required package import pandas as pd # intialise data of lists. data = {'Name':['India', 'China', 'USA', 'Russia'], 'Population (In crores)':[133.92, 138.64, 32.72, 14.45]} # Create DataFrame df = pd.DataFrame(data) df.head()
Once you have your data frame created, the next step will be to import the pygsheets python package. Before we start with importing the package in python and start reading and writing data into google sheets, we have to authorize the same from google. In order to do so, you can refer to the pygsheets official documentation for authorization, this will provide you a step by step guidance.
Once we have the authorization file from google API, we will have a client email id mentioned in the file. In order to let pygsheets read/write into the particular google sheet, we will have to grant read and write access to this email id. Once that is done, our python code will be able to read and write data from that particular google sheet.
Read From Google Sheets Into Python
The first step you will do in order to read from google sheet is, you should have that google sheet available on your drive. I have created a new google sheet called “Test Google Sheet”. I have shared this file with the email id I got in client.json file, and I have granted edit permission to this email id.
Now we will write our python code to export the values in the Google sheet as data frame into python.
#Import Pygsheets Python package. import pygsheets #Authorization gc = pygsheets.authorize(service_file='pygsheets.json') #Open the google spreadsheet sh = gc.open('Test Google Sheet') #Define which sheet to open in the file wks = sh[0] #Get the data from the Sheet into python as DF read = wks.get_as_df() #Print the head of the datframe read.head()
Every single code has been explained with the comment to make it easier to understand. I have named my authorization file as “pygsheets.json”, so I have mentioned the same in the authorization process. We also need to select the sheet from where we need to import the data. Here is the output of this code.
Write From Google Sheet Into Python
Now we already have some data in our Google sheet which we will want to clear, then we would want to add our data frame, which we created above to write into “Test Google Sheet”. In order to clear the data already existing in the sheet, we will use the clear() function.
#In order to clear all the data in the sheet. wks.clear()
Once we have our sheet cleared. We can write the code to write out already created data frame.
#We will use set_dataframe() in pygsheets to write df data into GS wks.set_dataframe(df,(1,1)) #(Row_Number, Column_Number)
Using set_dataframe() in pygsheets we write the already created df from python to Google Sheets. In the function set_dataframe(a,b), a is the data frame which we want to write and b is the row number and column number where we want to write. Let us again try to read the data from our sheet and see what we have now.
#Get the data from the Sheet into python as DF read = wks.get_as_df() #Print the head of the datframe read.head()
Here is the output for the same:
There can be a lot of practical application of this feature, to mention few which I have used personally are:
- I am collecting Air Pollution data from data.gov.in using the API provided by them. The API provides data on an hourly basis. Every hour I get new data (approx 1240 rows). Since there is no other way to get prior hours of data, I use google sheet to append the new data below the last hour data.
- In my last company, I used this method to automate the calling process. We use to call few customers based on a few parameters, the details for these customers were fetched from the DB, which I did with the use of Python, then it was updated in a google sheet (which was new for every single day). This sheet was then referred by calling agents, they use to call these people and the responses from them were stored in the same sheet by every agent. The next day, new data was added and the already existing data was appended in the log google sheet.
I hope this tutorial helps. If you need any help, you can write your questions in the comment below.