Python - Pandas


Pandas is a Python library used for working with data sets. The name is derived from the term "panel data".

Installing a package (pandas) and using it by importing
Pandas is a package used for data cleaning, manipulation and analysis.
python3 -m pip install pandas
import pandas

The problem with the import above is that we have to type "pandas" every time we refer to the library, which makes the code wordier. To address this, we can import it under an alias.

python3 -m pip install pandas
import pandas as pd   # here pd is the alias for pandas

After importing pandas, we can create a DataFrame (a table of rows and columns) and work with it. For example, if we have a dictionary called authors, we can convert it to a DataFrame as below.

authors_df = pd.DataFrame(authors)
authors_df
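As a minimal sketch of the conversion above, assume authors is a dictionary that maps column names to lists of equal length. The names and values here are made up for illustration and do not come from a real data set.

```python
import pandas as pd

# Hypothetical data: each key becomes a column, each list holds that column's values.
authors = {
    "name": ["Author A", "Author B", "Author C"],
    "books": [3, 5, 2],
}

# Convert the dictionary to a DataFrame and display it.
authors_df = pd.DataFrame(authors)
print(authors_df)
```

Each dictionary key ends up as a column header, and the rows are numbered automatically starting from 0.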

 

I initially had no luck installing with pip. Using the workaround below, I was able to install pandas successfully on a Linux server. Below are the commands, along with the screenshot.

  • cat /etc/redhat-release
  • python3 --version
  • python -m pip install pandas
  • python3 -m pip install pandas
  • python -m ensurepip --default-pip
  • python3 -m pip install pandas
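Once the last command succeeds, one quick way to confirm the installation from Python is to check whether the package can be found. This is only a sketch using the standard library; the helper name is_installed is made up for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the import system can find the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("pandas"):
    import pandas as pd
    print("pandas version:", pd.__version__)
else:
    print("pandas is not installed for this interpreter")
```

Run it with the same interpreter used for the install (python3 here), since each interpreter keeps its own set of packages.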

If we have a dictionary named authors, pandas can present the same data as a DataFrame using the commands below.
  • import pandas as pd
  • <dictionary>_df = pd.DataFrame(<dictionary>) (to convert the dictionary to a DataFrame)
    • E.g. authors_df = pd.DataFrame(authors)
  • <dictionary>_df (to view the DataFrame)
    • E.g. authors_df

The type of authors_df can be verified using type(), which returns pandas.core.frame.DataFrame as shown below.

type(authors_df)
<class 'pandas.core.frame.DataFrame'>

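In a script, rather than reading the printed class name, the same check is often written with isinstance. A small sketch, using a made-up authors dictionary:

```python
import pandas as pd

# Hypothetical dictionary used only for this example.
authors = {"name": ["Author A", "Author B"], "books": [3, 5]}
authors_df = pd.DataFrame(authors)

print(type(authors))      # the original object is a plain dict
print(type(authors_df))   # the converted object is a DataFrame
print(isinstance(authors_df, pd.DataFrame))
```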
 
Reading from a CSV file
  • import pandas as pd
  • authors_df = pd.read_csv("filename.csv")
  • authors_df
We can preview the DataFrame using head(): DataFrame.head() returns the first 5 rows by default.
E.g. authors_df.head()
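To try this without an existing file, the sketch below first writes a small CSV and then reads it back. The file name authors.csv and its contents are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical CSV content: a header row followed by seven data rows.
csv_text = "name,books\n" + "".join(f"Author {i},{i}\n" for i in range(1, 8))

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "authors.csv")
    with open(csv_path, "w") as f:
        f.write(csv_text)

    # The file name is passed to read_csv as a string.
    authors_df = pd.read_csv(csv_path)

print(authors_df.head())    # first 5 rows (the default)
print(authors_df.head(3))   # first 3 rows
```

Passing a number to head() changes how many rows are shown, which is handy for larger files.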
