Coding Ref

What does diff() do in Pandas?

The diff() method in Pandas computes the difference between each element of a Series or DataFrame and the element before it.

By default it compares each row with the previous row, so it is commonly used to calculate the change in value from one row to the next. It can also be applied across columns.

Here's an example of using the diff() function in Pandas:

main.py
import pandas as pd

# create a sample series
s = pd.Series([1, 2, 3, 4, 5])

# compute the difference between consecutive elements in the series
s_diff = s.diff()

# display the result
print(s_diff)

The result is a new series with the same index as the original, where each value is the current element minus the previous one.

The output will be:

output
0    NaN
1    1.0
2    1.0
3    1.0
4    1.0
dtype: float64

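Note that the first element in the result is NaN, because there is no previous element to subtract from it. This is also why the result has dtype float64 even though the input holds integers: NaN cannot be stored in an integer series.

You can use the dropna() method to drop the missing values, like this: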
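To make the NaN easier to reason about, here is a minimal sketch showing that diff() gives the same result as subtracting a shifted copy of the series, and how the periods parameter changes which earlier element is used. The variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# diff() is equivalent to subtracting the series shifted down by one row;
# the first row has nothing above it, which is where the NaN comes from
manual = s - s.shift(1)
same = s.diff().equals(manual)
print(same)  # True

# periods=2 compares each element with the one two rows earlier,
# so the first two results are NaN
two_back = s.diff(periods=2)
print(two_back.tolist())  # [nan, nan, 2.0, 2.0, 2.0]
```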

main.py
# compute the difference between consecutive elements in the series
s_diff = s.diff().dropna()

# display the result
print(s_diff)

This returns the differences without the leading NaN. The remaining rows keep their original index labels, so the result starts at index 1.

The output will be:

output
1    1.0
2    1.0
3    1.0
4    1.0
dtype: float64
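If the gap in the index is unwanted, one option is to renumber the result after dropping the NaN. The following is a small sketch of that idea; converting back to integers is optional and only safe here because no missing values remain.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])
s_diff = s.diff().dropna()

# dropna() keeps the original labels, so the index starts at 1
print(s_diff.index.tolist())  # [1, 2, 3, 4]

# reset_index(drop=True) renumbers the rows from 0
renumbered = s_diff.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]

# with the NaN gone, the values can be cast back to integers
as_int = renumbered.astype("int64")
print(as_int.tolist())  # [1, 1, 1, 1]
```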

You can also call diff() on a dataframe to compute the row-to-row difference in one or more columns.

For example:

main.py
# create a sample dataframe
df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
                   "B": [2, 3, 6, 9, 10]})

# compute the difference between consecutive elements in the A and B columns
df_diff = df[["A", "B"]].diff()

# display the result
print(df_diff)

This computes the row-to-row difference separately for columns A and B and returns a new dataframe with the same index as the original. Because A and B are the only columns in this dataframe, df.diff() would give the same result.

The output will be:

output
     A    B
0  NaN  NaN
1  1.0  1.0
2  1.0  3.0
3  1.0  3.0
4  1.0  1.0
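The differences above are taken down the rows. As a sketch of the column-wise case, passing axis=1 subtracts each column from the one to its right instead. The example reuses the same sample data.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5],
                   "B": [2, 3, 6, 9, 10]})

# axis=1 takes differences across columns: B minus A in each row.
# Column A has no column to its left, so it is all NaN.
across = df.diff(axis=1)
print(across["B"].tolist())  # [1.0, 1.0, 3.0, 5.0, 5.0]
```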

As you can see, diff() is a quick way to compute the change between consecutive elements in a series or dataframe, which helps with a variety of analysis and visualization tasks.
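As one illustration of such a task, the sketch below turns a running total into per-day changes and finds the day with the largest increase. The data and variable names are hypothetical and chosen only for the example.

```python
import pandas as pd

# hypothetical running total recorded once per day
cumulative = pd.Series(
    [10, 15, 15, 22],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# per-day change; the first day is NaN because nothing precedes it
daily = cumulative.diff()
print(daily.dropna().tolist())  # [5.0, 0.0, 7.0]

# idxmax() skips NaN by default and returns the label of the largest change
largest = daily.idxmax()
print(largest.date())  # 2024-01-04
```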
