How to use ffill() in Pandas

The ffill method in Pandas is used to forward-fill missing values in a Series or DataFrame.

It replaces each missing value with the most recent non-null value that precedes it, and is commonly used to fill gaps in time series data. A missing value with no earlier non-null value to copy from is left unchanged.
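As a rough sketch of the time series case, assume a hypothetical series of readings recorded on only some days. The values, dates, and the names `readings`, `daily`, and `daily_filled` are made up for illustration. Reindexing to a daily frequency with `asfreq` exposes the missing days as NaN, and `ffill` then carries each reading forward until the next one arrives.

```python
import pandas as pd

# Hypothetical readings recorded on only three days (illustrative data)
readings = pd.Series(
    [20.5, 21.0, 19.8],
    index=pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-06"]),
)

# Reindex to a daily frequency; days without a reading become NaN
daily = readings.asfreq("D")

# Carry each reading forward until the next recorded value
daily_filled = daily.ffill()

print(daily_filled)
```

Whether carrying a value forward is appropriate depends on the data. It suits quantities that hold until they are next updated, and is less suitable for values that change continuously between observations.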

For example, consider the following Series object:

main.py
import pandas as pd

s = pd.Series([1, 2, None, 4, 5, None, 7, 8])

This Series has eight elements, with the third and sixth elements (index 2 and 5) being None. Pandas stores these missing values as NaN, which is why the Series has dtype float64.

To forward-fill the missing values in this Series, you could do the following:

main.py
import pandas as pd

s = pd.Series([1, 2, None, 4, 5, None, 7, 8])

# Forward-fill the missing values
s_filled = s.ffill()

# Print the resulting Series
print(s_filled)
output
0    1.0
1    2.0
2    2.0
3    4.0
4    5.0
5    5.0
6    7.0
7    8.0
dtype: float64

In the code above, calling ffill on the Series replaces each missing value with the last non-null value that comes before it.

The result is a new Series object with the missing values filled in; the original Series s is left unchanged. In this case, the resulting Series has the values 1, 2, 2, 4, 5, 5, 7, 8.
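Two edge cases are worth a quick sketch: a missing value at the very start of a Series, and a long run of consecutive missing values. The example below uses an illustrative Series that is not part of the example above. It assumes the `limit` keyword of `ffill`, which caps how many consecutive missing values are filled.

```python
import pandas as pd

# Illustrative Series: a leading gap, then a run of three missing values
s = pd.Series([None, 1, None, None, None, 5])

# Plain forward fill: the leading NaN has nothing before it, so it stays NaN
filled = s.ffill()

# limit=1 fills at most one consecutive missing value after each valid one
limited = s.ffill(limit=1)

print(filled)
print(limited)
```

Here `filled` keeps NaN at index 0 and fills every later gap with 1.0, while `limited` fills only index 2 and leaves indexes 3 and 4 as NaN.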

You can also use the ffill method on a DataFrame object to forward-fill missing values in multiple columns.

For example, consider the following DataFrame:

main.py
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4, 5, None, 7, 8],
    'B': [10, 20, 30, None, 50, 60, None, 80],
    'C': [100, 200, 300, 400, None, 600, 700, 800]
})

This DataFrame has three columns A, B, and C, with eight rows of data. Some of the elements in the DataFrame are missing values, indicated by None.

To forward-fill the missing values in this DataFrame, you could do the following:

main.py
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4, 5, None, 7, 8],
    'B': [10, 20, 30, None, 50, 60, None, 80],
    'C': [100, 200, 300, 400, None, 600, 700, 800]
})

# Forward-fill the missing values
df_filled = df.ffill()

# Print the resulting DataFrame
print(df_filled)

In the code above, calling ffill on the DataFrame replaces each missing value with the last non-null value above it in the same column. Each column is filled independently.

The result is a new DataFrame object with the missing values filled in:

output
     A     B      C
0  1.0  10.0  100.0
1  2.0  20.0  200.0
2  2.0  30.0  300.0
3  4.0  30.0  400.0
4  5.0  50.0  400.0
5  5.0  60.0  600.0
6  7.0  60.0  700.0
7  8.0  80.0  800.0
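By default a DataFrame is filled down each column. As a minimal sketch of the alternative, the `axis=1` argument fills across each row instead, from left to right. The small DataFrame below is illustrative and separate from the one above.

```python
import pandas as pd

# Illustrative DataFrame with gaps in different positions
df = pd.DataFrame({
    "A": [1.0, None, 3.0],
    "B": [None, 20.0, None],
    "C": [100.0, None, None],
})

# Default (axis=0): fill down each column
by_column = df.ffill()

# axis=1: fill across each row, left to right
by_row = df.ffill(axis=1)

print(by_column)
print(by_row)
```

With `axis=1`, the first value in column A of the second row stays NaN because nothing lies to its left, in the same way a leading missing value in a column stays NaN with the default axis.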
