Pandas in Python – Part 2


Hello friends,
I hope you are doing well. Today we will go a bit deeper into our Pandas practice. If you have not yet checked out the basics of Pandas, you can start with Pandas in Python – Part 1.
In this blog, we will learn some basic computations that help us manage and search large datasets in our day-to-day work, so let's get started.
Let's say we have the dataset below and want to know the highest or the lowest age. Pandas provides the min() and max() methods for this. Called on the whole DataFrame, they work column by column and return the largest (or smallest) value of each column.

import pandas as pd

firstDataset = {
  'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
  'age': [19, 21, 25, 23, 21, 24]
}

a = pd.DataFrame(firstDataset)

a.max()

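The output holds the maximum of each column: Rey for name (the last name in alphabetical order) and 25 for age. These two values come from different rows, so this is not the record of the oldest person.

If you just want the maximum age and not the maximum of every column, you can write the code below,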

a['age'].max()

Output,

25

For the minimum of each column,

a.min()

The output is Charls for name (the first name in alphabetical order) and 19 for age.

To get only the minimum age,

a['age'].min()

Output,

19
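
Since max() and min() on the whole DataFrame work column by column, they do not tell us which person is the oldest or the youngest. One possible way to get the full record is idxmax() and idxmin(), which return the index label of the largest and smallest value. This is only a small sketch that assumes the ages are numeric with no missing values; the variable names oldest and youngest are made up for illustration.

```python
import pandas as pd

firstDataset = {
    'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
    'age': [19, 21, 25, 23, 21, 24]
}
a = pd.DataFrame(firstDataset)

# idxmax()/idxmin() return the index label of the first max/min value,
# and loc then fetches the whole row for that label
oldest = a.loc[a['age'].idxmax()]
youngest = a.loc[a['age'].idxmin()]

print(oldest['name'], oldest['age'])      # Jiya 25
print(youngest['name'], youngest['age'])  # Jhon 19
```

If several people share the highest age, idxmax() only points to the first of them.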

Now, if you want summary statistics for your dataset, you can use the describe() method. For the numeric columns, it gives you the count of non-null records, the mean, the standard deviation, the minimum and maximum values, and the 25%, 50% and 75% percentiles.

a.describe()

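For our dataset, the output shows a count of 6, a mean of about 22.17, a standard deviation of about 2.23, a minimum of 19, percentiles of 21 (25%), 22 (50%) and 23.75 (75%), and a maximum of 25.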
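
If you only need one of these numbers, you can pick it out of the describe() result or call the matching method directly. Here is a small sketch of that idea; the variable name stats is only for illustration.

```python
import pandas as pd

a = pd.DataFrame({
    'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
    'age': [19, 21, 25, 23, 21, 24]
})

# describe() summarises only the numeric columns by default
stats = a.describe()

print(stats.loc['count', 'age'])  # number of non-null ages
print(stats.loc['75%', 'age'])    # 75th percentile
print(a['age'].mean())            # same value as stats.loc['mean', 'age']
print(a['age'].median())          # same value as stats.loc['50%', 'age']
```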

If I want to see which people are above a specific age, I can filter the DataFrame with a condition, as shown below,

gt_20 = a[a['age'] > 20]
print(gt_20)

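The output contains the five rows with an age above 20 (Krish, Jiya, Charls, Kristin and Rey); only Jhon, aged 19, is left out.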

Just like '>', we can also use '==', '>=', '<', '<=' and the other comparison operators to find exactly the data we need.
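
These comparisons can also be combined. The sketch below shows one way to do it with & (and) and | (or); the variable names between and young_or_old are just examples. Note that each condition needs its own parentheses.

```python
import pandas as pd

a = pd.DataFrame({
    'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
    'age': [19, 21, 25, 23, 21, 24]
})

# Both conditions must hold: age from 21 up to (but not including) 24
between = a[(a['age'] >= 21) & (a['age'] < 24)]
print(between)

# At least one condition must hold: younger than 20 or older than 24
young_or_old = a[(a['age'] < 20) | (a['age'] > 24)]
print(young_or_old)
```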

We can also fetch the records that match specific values. For example, if we want the people whose age is 21 or 24, we can get them with the isin() method,

getData = a[a['age'].isin([21, 24])]
print(getData)

The output contains the rows for Krish (21), Kristin (21) and Rey (24).
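
The isin() check can also be turned around with the ~ operator to get everyone whose value is not in the list, and it works on text columns too. A minimal sketch, with variable names chosen only for this example:

```python
import pandas as pd

a = pd.DataFrame({
    'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
    'age': [19, 21, 25, 23, 21, 24]
})

# ~ inverts the boolean mask: keep ages that are NOT 21 or 24
not_21_24 = a[~a['age'].isin([21, 24])]
print(not_21_24)

# isin() on a text column
picked = a[a['name'].isin(["Jiya", "Rey"])]
print(picked)
```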

Now I want only the names of the people whose age is greater than 21. For that, I can use loc with a condition and a column name,

names = a.loc[a['age'] > 21, 'name']
print(names)

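The output is a Series with the names Jiya, Charls and Rey, which keep their original index labels 2, 3 and 5.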
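
The same loc pattern can return more than one column, and the resulting Series can be turned into a plain Python list if that is easier to work with. A short sketch, where name_list and name_age are illustrative names:

```python
import pandas as pd

a = pd.DataFrame({
    'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
    'age': [19, 21, 25, 23, 21, 24]
})

# A single column label returns a Series; tolist() makes it a Python list
name_list = a.loc[a['age'] > 21, 'name'].tolist()
print(name_list)  # ['Jiya', 'Charls', 'Rey']

# A list of column labels returns a DataFrame
name_age = a.loc[a['age'] > 21, ['name', 'age']]
print(name_age)
```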

Now I want only the records that do not have a null value in the age column. For that, I can use the notna() method,

firstDataset = {
  'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
  'age': [19, 21, 25, None, 21, 24]
}

a = pd.DataFrame(firstDataset)

not_na = a[a['age'].notna()]
print(not_na)

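The output contains every row except the one for Charls, whose age is missing. Because of the missing value, the age column is stored as floats (19.0, 21.0 and so on).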
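
Two related helpers may be useful here: isna() to count the missing values, and dropna() as another way to remove those rows. This is only a sketch; missing_count and cleaned are names picked for the example.

```python
import pandas as pd

a = pd.DataFrame({
    'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
    'age': [19, 21, 25, None, 21, 24]
})

# isna() marks missing values as True; sum() counts them
missing_count = a['age'].isna().sum()
print(missing_count)  # 1

# dropna() with subset only looks at the listed columns
cleaned = a.dropna(subset=['age'])
print(cleaned)
```

For this dataset, cleaned holds the same rows as a[a['age'].notna()].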

To end this blog, let's see how to fetch a specific range of rows from the dataset. iloc selects rows by position, and the end of the range is excluded, so 2:5 returns the rows at positions 2, 3 and 4.

firstDataset = {
  'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
  'age': [19, 21, 25, 21, 21, 24]
}

a = pd.DataFrame(firstDataset)

a.iloc[2:5]

The output contains the rows for Jiya (25), Charls (21) and Kristin (21).
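
It can help to compare iloc with a few related ways of slicing rows. The sketch below assumes the default 0 to 5 index of our dataset; the variable names are only for illustration.

```python
import pandas as pd

a = pd.DataFrame({
    'name': ["Jhon", "Krish", "Jiya", "Charls", "Kristin", "Rey"],
    'age': [19, 21, 25, 21, 21, 24]
})

# iloc slices by position; the end position is excluded
middle = a.iloc[2:5]
print(middle.index.tolist())    # [2, 3, 4]

# loc slices by index label; the end label is included
by_label = a.loc[2:5]
print(by_label.index.tolist())  # [2, 3, 4, 5]

# head() and tail() return the first and last n rows
first_two = a.head(2)
last_two = a.tail(2)
print(first_two)
print(last_two)
```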

I hope you have enjoyed this blog. In upcoming posts, we will learn more about this amazing library, Pandas.

Thank you for reading 🙂

Happy coding 🙂
