To reshape a Pandas DataFrame, you can use the pivot() and pivot_table() functions.
These functions allow you to rearrange the data in a DataFrame by creating new columns, rows, or levels based on the values in the original DataFrame.
pivot()
The pivot() method reshapes a DataFrame so that each unique value in a specified column becomes its own column. Consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({'A': ['apple', 'banana', 'orange'],
                   'B': ['red', 'yellow', 'orange'],
                   'C': [1, 2, 3]})
# display the DataFrame
df
This code will produce the following output:
        A       B  C
0   apple     red  1
1  banana  yellow  2
2  orange  orange  3
This DataFrame has three columns: A, B, and C.
For example, suppose we want to create a new DataFrame with the values of column A as the index, one column for each unique value in column B, and the values of column C as the cell values.
We can use the pivot() method as follows:
# reshape the dataframe using the pivot() method
df_pivot = df.pivot(index='A', columns='B', values='C')
# display the resulting dataframe
df_pivot
This code will produce the following output:
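B       orange  red  yellow
A
apple      NaN  1.0     NaN
banana     NaN  NaN     2.0
orange     3.0  NaN     NaN
As you can see, the pivot() method has transformed the data into a new format, with the values of column A as the index and the values of column C as the cell values.
The resulting DataFrame has one column for each unique value in column B, sorted alphabetically, and each value from column C is placed in the column that matches its row's value in column B.
Combinations of A and B that do not occur in the original data are filled with NaN, which is also why the integers in C are displayed as floats.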
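One limitation is worth illustrating: pivot() does not aggregate, so it raises a ValueError when the same index/column pair occurs more than once. The following is a minimal sketch; the DataFrame df_dup, its duplicated ('apple', 'red') row, and the pivot_failed flag are assumptions added for illustration and are not part of the example above.

```python
import pandas as pd

# Hypothetical data: the pair ('apple', 'red') appears twice
df_dup = pd.DataFrame({'A': ['apple', 'apple', 'banana'],
                       'B': ['red', 'red', 'yellow'],
                       'C': [1, 5, 2]})

try:
    df_dup.pivot(index='A', columns='B', values='C')
    pivot_failed = False
except ValueError as err:
    # pivot() has no rule for choosing between the two competing values
    pivot_failed = True
    print(f"pivot() failed: {err}")
```

When the data can contain such duplicates, an aggregating reshape is the usual alternative.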
pivot_table()
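The pivot_table() method reshapes a DataFrame in the same way as pivot(), but it also aggregates the values when several rows share the same index and column combination.
For example, suppose we want to create a new DataFrame with the values of column A as the index, one column for each unique value in column B, and the sum of C as the values.
We can use the pivot_table() method as follows: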
import pandas as pd
df = pd.DataFrame({'A': ['apple', 'banana', 'orange'],
                   'B': ['red', 'yellow', 'orange'],
                   'C': [1, 2, 3]})
# reshape the DataFrame using the pivot_table() method
# (the string 'sum' selects the built-in pandas aggregation)
df_pivot_table = df.pivot_table(index='A', columns='B', values='C', aggfunc='sum')
# display the resulting dataframe
df_pivot_table
This code will produce the following output:
B       orange  red  yellow
A
apple      NaN  1.0     NaN
banana     NaN  NaN     2.0
orange     3.0  NaN     NaN
The pivot_table() method has aggregated the data using the sum function and arranged it with the values of column A as the index and one column for each unique value in column B.
Each cell holds the sum of the values in column C for that combination of A and B. Because every combination occurs only once in this data, each sum is simply the original value of C.
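Because the sample data has no repeated A/B pairs, the aggregation has nothing to combine. The sketch below uses a hypothetical DataFrame, df_dup, with one duplicated pair to show where pivot_table() actually sums values. The fill_value argument is an existing pivot_table() parameter; using it here to replace missing combinations with 0 is an illustrative choice, not something the example above does.

```python
import pandas as pd

# Hypothetical data: the pair ('apple', 'red') appears twice
df_dup = pd.DataFrame({'A': ['apple', 'apple', 'banana'],
                       'B': ['red', 'red', 'yellow'],
                       'C': [1, 5, 2]})

# Sum C for each (A, B) pair; combinations that never occur become 0
summed = df_dup.pivot_table(index='A', columns='B', values='C',
                            aggfunc='sum', fill_value=0)
print(summed)
```

Here the two ('apple', 'red') rows are combined into a single cell holding 1 + 5, which pivot() would refuse to do.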