Pandas Sort Function


Sort Function – Sorting Data Types in Pandas

Pandas is a widely used Python library for data cleaning, preparation, and analysis. Sorting is one of the most common operations in data analysis: ordering rows by value makes patterns easier to spot and conclusions easier to draw. Pandas provides the sort_values method for this purpose. In this article, we will discuss how to use the Pandas Sort function to sort data in Python.

The Pandas Sort function, sort_values, sorts a dataframe by the values in one or more columns. It is a versatile function with many options for controlling the sort. By default it sorts in ascending order; descending order is available through the ascending parameter.

Syntax – df.sort_values(by, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False)

Parameters:

  • by: Name or list of names to sort the dataframe by.
  • axis: 0 or 'index' to sort rows (the default), 1 or 'columns' to sort columns.
  • ascending: Boolean or list of booleans to specify the sorting order.
  • inplace: Boolean, makes changes in the dataframe itself if True.
  • kind: Sorting algorithm to use. It can be 'quicksort', 'mergesort', 'heapsort', or 'stable' (see numpy.sort). For a dataframe, this option is only applied when sorting on a single column or label.
  • na_position: Position of null values. It can be ‘last’ or ‘first’.
  • ignore_index: Boolean; if True, the result is labeled 0, 1, …, n - 1 instead of keeping the original index labels.
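The less obvious parameters above can be seen in a small sketch. The dataframe, its column names, and the scores are made up for illustration and are not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing score.
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Dave"],
    "score": [88.0, np.nan, 92.0, 75.0],
})

# na_position='first' puts the missing value at the top;
# ignore_index=True relabels the result 0..n-1.
nulls_first = df.sort_values(by="score", na_position="first", ignore_index=True)
print(nulls_first)

# inplace=True sorts the dataframe itself and returns None.
df_copy = df.copy()
result = df_copy.sort_values(by="score", inplace=True)
print(result)    # None
print(df_copy)   # sorted, original index labels kept, NaN last
```

Because inplace=True returns None, assigning its result to a variable is a common mistake; either sort in place or keep the returned dataframe, not both.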

Creating a Pandas DataFrame:

A Pandas DataFrame is a 2-dimensional table-like data structure, where each column can have different data types. It can be created using various methods like loading data from an external file, manually creating an empty dataframe, or converting a list, dictionary, or numpy array to a dataframe.

1. Creating an empty DataFrame:

An empty dataframe can be created using the pd.DataFrame() function.

import pandas as pd
df = pd.DataFrame()
print(df)

Output:

Empty DataFrame
Columns: []
Index: []

2. Creating a DataFrame from a list:

A list of values can be converted into a dataframe using the pd.DataFrame() function.

import pandas as pd
data = [['Alice', 25], ['Bob', 30], ['Charlie', 35]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)

Output:

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35

3. Creating a DataFrame from a dictionary:

A dictionary can be converted into a dataframe using the pd.DataFrame() function.

import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
print(df)

Output:

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35

4. Loading data from an external file:

Pandas can read data from various file formats like CSV, Excel, SQL, etc. using functions like pd.read_csv(), pd.read_excel(), pd.read_sql() etc.

import pandas as pd
df = pd.read_csv('data.csv')
print(df)

Output:

    Name  Age  Gender
0  Alice   25       F
1    Bob   30       M
2   Carl   35       M
3   Dave   40       M
4  Emily   45       F
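The file data.csv is not included with the article, so the following sketch reads the same rows from an in-memory string instead. The CSV text is an assumption reconstructed from the output shown above. It also prints the column dtypes, since the dtype of a column determines how it is sorted.

```python
import io

import pandas as pd

# Assumed contents of 'data.csv', reconstructed from the output above.
csv_text = """Name,Age,Gender
Alice,25,F
Bob,30,M
Carl,35,M
Dave,40,M
Emily,45,F
"""

# read_csv accepts a file-like object as well as a file path.
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Age is parsed as an integer column; Name and Gender are text.
print(df.dtypes)
```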

Sorting in Ascending Order:

To sort data in ascending order, we can simply call the sort_values function on the dataframe and name the column to sort by. Ascending order is the default.

import pandas as pd
df = pd.read_csv('data.csv')
df_sorted = df.sort_values(by='column_name')
print(df_sorted.head())

Output:

   column_name
3            1
0            2
2            3
4            4
1            5
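For a version that runs without a CSV file, here is a minimal sketch with an in-memory dataframe. The values are chosen (as an assumption) so that the result matches the output above. It also shows that sort_values returns a new dataframe and leaves the original untouched.

```python
import pandas as pd

# Assumed data; 'column_name' is the placeholder name used in the article.
df = pd.DataFrame({"column_name": [2, 5, 3, 1, 4]})

# Ascending is the default, so no extra argument is needed.
df_sorted = df.sort_values(by="column_name")
print(df_sorted)

# The original dataframe keeps its order; rows in the result keep
# their original index labels.
print(df)
```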

Sorting in Descending Order:

To sort data in descending order, we can set the ascending parameter to False.

import pandas as pd
df = pd.read_csv('data.csv')
df_sorted = df.sort_values(by='column_name', ascending=False)
print(df_sorted.head())

Output:

   column_name
1            5
4            4
2            3
0            2
3            1
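A common use of a descending sort is picking the largest few rows. The sketch below uses assumed in-memory data in place of data.csv and combines ascending=False with head().

```python
import pandas as pd

# Assumed data; 'column_name' is the placeholder name used in the article.
df = pd.DataFrame({"column_name": [2, 5, 3, 1, 4]})

# Largest values first.
df_sorted = df.sort_values(by="column_name", ascending=False)

# The first three rows of the descending sort are the three largest values.
top3 = df_sorted.head(3)
print(top3)
```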

Sorting by Multiple Columns:

We can sort data based on multiple columns by passing a list of column names to the by parameter.

import pandas as pd
df = pd.read_csv('data.csv')
df_sorted = df.sort_values(by=['column_1', 'column_2'])
print(df_sorted.head())

Output:

   column_1  column_2  column_3
1         1         5         8
3         1         9         6
2         2         3         1
0         3         7        10
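Since the ascending parameter also accepts a list of booleans, each column can have its own direction. The sketch below uses assumed in-memory data that matches the output above: it first repeats the plain multi-column sort, then sorts column_1 ascending and column_2 descending.

```python
import pandas as pd

# Assumed data, chosen to match the output shown above.
df = pd.DataFrame({
    "column_1": [3, 1, 2, 1],
    "column_2": [7, 5, 3, 9],
    "column_3": [10, 8, 1, 6],
})

# Sort by column_1 first; ties are broken by column_2.
both_ascending = df.sort_values(by=["column_1", "column_2"])
print(both_ascending)

# One boolean per column in 'by': column_1 ascending, column_2 descending.
mixed = df.sort_values(by=["column_1", "column_2"], ascending=[True, False])
print(mixed)
```

The list passed to ascending must have the same length as the list passed to by.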

Sorting Numeric Data:

Numeric data can be sorted based on their values using the sort_values function.

import pandas as pd
df = pd.read_csv('data.csv')
df_sorted = df.sort_values(by='numeric_column')
print(df_sorted.head())

Output:
   column_1  column_2  numeric_column
3    grault    garply               1
1       baz       qux               2
4     waldo      fred               4
0       foo       bar               5
2      quux     corge               7
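One pitfall worth a sketch: numbers stored as text are sorted character by character, not by value. The example below uses made-up data and assumes pandas 1.1 or later, where sort_values accepts a key argument (not listed in the syntax above); pd.to_numeric is one possible key function.

```python
import pandas as pd

# Hypothetical data in which the amounts were loaded as strings.
df = pd.DataFrame({
    "label": ["a", "b", "c", "d"],
    "amount": ["10", "9", "100", "2"],
})

# Text sort: "10" < "100" < "2" < "9".
as_text = df.sort_values(by="amount")
print(as_text)

# key is applied to the column before comparing, so the order is numeric
# while the stored values remain strings.
as_numbers = df.sort_values(by="amount", key=pd.to_numeric)
print(as_numbers)
```

If the column is meant to be numeric everywhere, converting it once with pd.to_numeric and then sorting normally is usually the simpler choice.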

Sorting Categorical Data:

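Categorical data can be sorted with the sort_values function as well. A text column read from a CSV file is sorted alphabetically, while a column converted to the pandas category dtype is sorted by the order of its categories.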

import pandas as pd
df = pd.read_csv('data.csv')
df_sorted = df.sort_values(by='category_column')
print(df_sorted.head())

Output:

   category_column  column_1  column_2
2                A         5         4
0                B         1         2
3                B         6         1
1                C         3         3
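When categories have a natural order that is not alphabetical, an ordered category dtype lets sort_values follow that order. The sketch below is illustrative only: the column names and the small/medium/large categories are assumptions, and kind='stable' is used so rows with equal values keep their original relative order.

```python
import pandas as pd

# Hypothetical data with text labels that have a natural order.
df = pd.DataFrame({
    "size": ["medium", "small", "large", "small"],
    "qty": [5, 1, 6, 3],
})

# Plain text column: alphabetical order (large, medium, small).
alphabetical = df.sort_values(by="size", kind="stable")
print(alphabetical)

# Ordered categorical column: sorted by the declared category order.
df["size"] = pd.Categorical(
    df["size"], categories=["small", "medium", "large"], ordered=True
)
by_category = df.sort_values(by="size", kind="stable")
print(by_category)
```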

Conclusion

In this article, we have discussed how to use the Pandas Sort function to sort data in Python. Sorting is an essential operation in data analysis that organizes data and makes it easier to analyze. The Sort function is versatile: it can sort numeric and categorical data, in ascending or descending order, based on one or more columns in a dataframe.



PythonGeeks Team
