The diff() function in Pandas computes the difference between each element of a series or dataframe and the element before it.
This is often used to calculate the change in values between adjacent rows or columns.
Here's an example of using the diff() function in Pandas:
import pandas as pd
# create a sample series
s = pd.Series([1, 2, 3, 4, 5])
# compute the difference between consecutive elements in the series
s_diff = s.diff()
# display the result
print(s_diff)
This subtracts the previous element from each element and returns a new series with the same index as the original series.
The output will be:
0    NaN
1    1.0
2    1.0
3    1.0
4    1.0
dtype: float64
Note that the first element in the result is NaN, because there is no previous element to calculate the difference with. The NaN is also why the result has dtype float64 even though the input contained integers.
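One way to see where the leading NaN comes from is to rebuild the same result by hand. The sketch below assumes the default behaviour of diff() (one step back) and compares it with subtracting a copy of the series shifted down by one row; the name manual is only illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# With its default setting, diff() subtracts the previous element.
# shift(1) moves every value down one row and leaves NaN in the first slot,
# so subtracting the shifted copy gives the same differences.
manual = s - s.shift(1)

# equals() treats NaN values in matching positions as equal
print(s.diff().equals(manual))
```

The first row of the shifted copy has nothing to hold, so the subtraction there produces NaN, exactly as in the diff() output.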
You can use the dropna() function to drop the missing values, like this:
# compute the differences and drop the leading missing value
s_diff = s.diff().dropna()
# display the result
print(s_diff)
This returns a new series without the missing value, so its index now starts at 1 instead of 0.
The output will be:
1    1.0
2    1.0
3    1.0
4    1.0
dtype: float64
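The examples so far compare each element with the one directly before it. diff() also accepts a periods argument that changes how far back (or forward) it looks. The following is a minimal sketch with made-up sample values; the variable names two_back and forward are only illustrative.

```python
import pandas as pd

# illustrative sample values (not from the examples above)
s = pd.Series([1, 4, 9, 16, 25])

# periods=2 subtracts the element two positions earlier,
# so the first two results are NaN
two_back = s.diff(periods=2)

# a negative value subtracts the following element instead,
# so the last result is NaN
forward = s.diff(periods=-1)

print(two_back.tolist())
print(forward.tolist())
```

The number of NaN values in the result matches the size of the step, which matters if you follow the call with dropna().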
You can also use the diff() function on a dataframe to compute the difference between consecutive elements in one or more columns.
For example:
# create a sample dataframe
df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
                   "B": [2, 3, 6, 9, 10]})
# compute the difference between consecutive elements in the A and B columns
df_diff = df[["A", "B"]].diff()
# display the result
print(df_diff)
This will compute the difference between consecutive rows in the A and B columns and return a new dataframe with the same index as the original dataframe.
The output will be:
     A    B
0  NaN  NaN
1  1.0  1.0
2  1.0  3.0
3  1.0  3.0
4  1.0  1.0
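The example above takes differences down the rows, which is the default. To take differences across columns instead, diff() on a dataframe accepts axis=1. The sketch below reuses the same sample dataframe; the name col_diff is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
                   "B": [2, 3, 6, 9, 10]})

# axis=1 subtracts the column to the left within each row,
# so column A has no predecessor and is filled with NaN,
# while column B holds B minus A
col_diff = df.diff(axis=1)

print(col_diff)
```

Here the missing values appear in the first column rather than the first row, because the first column has nothing to its left to subtract.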
As you can see, the diff() function computes the difference between consecutive elements in a series or dataframe, which is useful for a variety of analysis and visualization tasks.