Reading specific columns in an Excel file

When we read an Excel file with the read_excel method of the pandas module, we can load only specific columns instead of the whole sheet. To do this, we pass the usecols parameter to the read_excel method.

Now, let’s look at an example that reads specific columns from an Excel file. Create a script called rd_excel_pandas1.py and write the following content in it:

import pandas as pd

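excel_file = 'sample.xlsx'
# Zero-based column positions: the second, third, and fourth columns
cols = [1, 2, 3]
# The parameter is sheet_name (singular); its value must match a sheet in the workbook
df = pd.read_excel(excel_file, sheet_name='sheet1', usecols=cols)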

print(df.head())

Run the preceding script as follows:

student@ubuntu:~/test$ python3 rd_excel_pandas1.py

You will get the following output:

    Region      Rep    Item
0  Central    Smith    Desk
1  Central   Kivell    Desk
2  Central     Gill  Pencil
3  Central  Jardine  Binder
4  Central  Andrews  Pencil

In the preceding example, we first imported the pandas module. Then, we created a string called excel_file to hold the filename. Next, we defined the cols variable and put the index values of the columns we want inside it. These indexes are zero-based, so [1, 2, 3] selects the second, third, and fourth columns of the sheet. When we called the read_excel method, we passed cols as the usecols parameter so that only those columns are fetched. Therefore, after running the script, we get only the Region, Rep, and Item columns from the Excel file, and head() prints the first five rows.
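Besides a list of index values, usecols also accepts a list of column names or a string of Excel column letters. The following is a minimal sketch of the three forms, not part of the original script. It assumes the openpyxl package is installed, and it writes a small made-up workbook (demo.xlsx, with hypothetical data) to a temporary directory so that it runs without sample.xlsx:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data standing in for sample.xlsx (an assumption for this sketch)
data = pd.DataFrame({
    "OrderDate": ["2018-01-06", "2018-01-23", "2018-02-09"],
    "Region": ["Central", "Central", "Central"],
    "Rep": ["Smith", "Kivell", "Gill"],
    "Item": ["Desk", "Desk", "Pencil"],
    "Units": [2, 5, 7],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.xlsx")
    data.to_excel(path, index=False)  # writing .xlsx requires openpyxl

    # 1. Zero-based column positions, as in rd_excel_pandas1.py
    by_index = pd.read_excel(path, usecols=[1, 2, 3])
    # 2. Column names taken from the header row
    by_name = pd.read_excel(path, usecols=["Region", "Rep", "Item"])
    # 3. Excel column letters; "B:D" is an inclusive range
    by_letter = pd.read_excel(path, usecols="B:D")

print(by_index.columns.tolist())
print(by_index.equals(by_name) and by_index.equals(by_letter))
```

All three calls select the same Region, Rep, and Item columns. Column names are usually the easiest form to read, while positions and letters keep working when the header text changes.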

We can also perform various operations on Excel files using the pandas module, such as reading an Excel file with missing data, skipping particular rows, and reading multiple Excel sheets.
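As a rough preview of those operations, the following sketch uses the na_values, skiprows, and sheet_name parameters of read_excel. The workbook, the sheet names (sales and returns), and the "missing" marker are all made up for illustration, and the sketch assumes the openpyxl package is installed:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data; the text "missing" marks an absent value in this made-up sheet
sales = pd.DataFrame({"Rep": ["Smith", "Kivell", "Gill"],
                      "Units": [2, "missing", 7]})
returns = pd.DataFrame({"Rep": ["Gill"], "Units": [1]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.xlsx")
    with pd.ExcelWriter(path) as writer:  # writing .xlsx requires openpyxl
        sales.to_excel(writer, sheet_name="sales", index=False)
        returns.to_excel(writer, sheet_name="returns", index=False)

    # Missing data: treat the text "missing" as NaN
    with_missing = pd.read_excel(path, sheet_name="sales", na_values=["missing"])
    # Skipping rows: file row 0 is the header, so row 1 is the first data row
    skipped = pd.read_excel(path, sheet_name="sales", skiprows=[1])
    # Multiple sheets: sheet_name=None returns a dict of DataFrames keyed by sheet name
    all_sheets = pd.read_excel(path, sheet_name=None)

print(with_missing)
print(skipped)
print(list(all_sheets))
```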
