The ffill method in Pandas forward-fills missing values in a Series or DataFrame.
It replaces each missing value with the most recent non-null value that precedes it, and is commonly used to fill gaps in time series data.
For example, consider the following Series object:
import pandas as pd
s = pd.Series([1, 2, None, 4, 5, None, 7, 8])
This Series has eight elements; the third and sixth are None (i.e. missing values), which pandas stores as NaN in a numeric Series.
To forward-fill the missing values in this Series, you could do the following:
import pandas as pd
s = pd.Series([1, 2, None, 4, 5, None, 7, 8])
# Forward-fill the missing values
s_filled = s.ffill()
# Print the resulting Series
print(s_filled)
0    1.0
1    2.0
2    2.0
3    4.0
4    5.0
5    5.0
6    7.0
7    8.0
dtype: float64
In the code above, the ffill method is applied to the Series and replaces each missing value with the last non-null value before it.
The result is a new Series object with the missing values filled in; the original s is left unchanged. In this case, the resulting Series has the values 1, 2, 2, 4, 5, 5, 7, 8 (stored as floats).
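Two details of this behaviour are worth seeing in code: a missing value with nothing before it cannot be filled, and the limit parameter caps how many consecutive missing values are filled. The sketch below uses a small made-up Series (not taken from the example above) to illustrate both.

```python
import pandas as pd

# Hypothetical data: a leading missing value, then a gap of two missing values
s = pd.Series([None, 1, None, None, 4])

# Position 0 has no earlier value to copy, so it stays NaN
filled = s.ffill()
print(filled.tolist())  # [nan, 1.0, 1.0, 1.0, 4.0]

# limit=1 fills at most one consecutive missing value in each gap
limited = s.ffill(limit=1)
print(limited.tolist())  # [nan, 1.0, 1.0, nan, 4.0]
```

If leading gaps also need a value, one option is to chain a backward fill afterwards, for example s.ffill().bfill().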
You can also use the ffill method on a DataFrame object to forward-fill missing values in multiple columns.
For example, consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, None, 4, 5, None, 7, 8],
    'B': [10, 20, 30, None, 50, 60, None, 80],
    'C': [100, 200, 300, 400, None, 600, 700, 800]
})
This DataFrame has three columns, A, B, and C, with eight rows of data. Some of the elements in the DataFrame are missing values, indicated by None.
To forward-fill the missing values in this DataFrame, you could do the following:
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, None, 4, 5, None, 7, 8],
    'B': [10, 20, 30, None, 50, 60, None, 80],
    'C': [100, 200, 300, 400, None, 600, 700, 800]
})
# Forward-fill the missing values
df_filled = df.ffill()
# Print the resulting DataFrame
print(df_filled)
In the code above, the ffill method is applied to the DataFrame and replaces each missing value with the last non-null value above it in the same column.
The result is a new DataFrame object with the missing values filled in:
     A     B      C
0  1.0  10.0  100.0
1  2.0  20.0  200.0
2  2.0  30.0  300.0
3  4.0  30.0  400.0
4  5.0  50.0  400.0
5  5.0  60.0  600.0
6  7.0  60.0  700.0
7  8.0  80.0  800.0
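Since forward-filling is most often used with time series, here is a minimal sketch of that case. The dates and readings are invented for illustration: a Series recorded on only some days is reindexed to a full daily range, and ffill then carries the most recent reading into the days that had none.

```python
import pandas as pd

# Hypothetical readings recorded on only three days
readings = pd.Series(
    [10.0, 12.0, 15.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-06"]),
)

# Reindex to every calendar day; days without a reading become NaN
daily_index = pd.date_range("2024-01-01", "2024-01-06", freq="D")
daily = readings.reindex(daily_index)

# Carry the most recent reading forward into each gap
daily_filled = daily.ffill()
print(daily_filled)
```

Here January 2 takes the value from January 1, and January 4 and 5 take the value from January 3. Whether carrying a stale value forward is appropriate depends on the data, so this is a sketch of the technique rather than a recommendation for every time series.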