To calculate the standard deviation for a column in a Pandas DataFrame, you can use the std() method.
When called on a Series object, it returns the standard deviation of the elements in that Series.
For example, consider the following DataFrame:
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
This DataFrame has three columns, A, B, and C, each with five rows of data.
To calculate the standard deviation for a specific column, you could do the following:
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
# Calculate the standard deviation for column B
std = df['B'].std()
# Print the resulting value
print(std)
15.811388300841896
In the code above, the std() method is applied to column B of the DataFrame, which calculates the standard deviation of the elements in that column.
In this case, the resulting standard deviation is 15.811388300841896.
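The value printed for column B is the sample standard deviation: by default, std() uses ddof=1 and divides by n - 1. The following is a minimal sketch, assuming you want the population standard deviation instead (dividing by n), which std() supports through ddof=0. The variable names sample_std and population_std are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

# Default behaviour: sample standard deviation (ddof=1, divides by n - 1)
sample_std = df['B'].std()

# Population standard deviation (ddof=0, divides by n)
population_std = df['B'].std(ddof=0)

print(sample_std)
print(population_std)
```

With five values the two results differ noticeably, so it is worth choosing ddof deliberately when comparing against figures from other tools.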
You can also calculate the standard deviation for several columns at once by selecting them before calling std().
For example, if you wanted to calculate the standard deviation for both columns B and C, you could do the following:
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
# Calculate the standard deviation for columns B and C
std = df[['B', 'C']].std()
# Print the resulting Series
print(std)
B     15.811388
C    158.113883
dtype: float64
In the code above, the std() method is applied to the B and C columns of the DataFrame, which calculates the standard deviation of the elements in each of those columns.
The result is a new Series object containing the standard deviation for each column, indexed by column name.
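Because the result is a Series indexed by column name, individual values can be read from it by label. This is a short sketch under that assumption; std_by_column, b_std, and c_std are names chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

# Standard deviation for columns B and C, returned as a Series
std_by_column = df[['B', 'C']].std()

# Look up a single column's standard deviation by its label
b_std = std_by_column['B']
c_std = std_by_column['C']

print(b_std)
print(c_std)
```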
You can also calculate the standard deviation for every column in the DataFrame at once.
One way to do this is to use the apply() method in combination with the std() method, as shown in the following example; calling df.std() directly on the DataFrame returns the same result.
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})
# Calculate the standard deviation for the entire DataFrame
std = df.apply(lambda x: x.std())
# Print the resulting Series
print(std)
A      1.581139
B     15.811388
C    158.113883
dtype: float64
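The per-column result can also be obtained without apply(), since DataFrame has its own std() method. The sketch below, which assumes the same sample data, compares the two approaches and also shows axis=1, which computes the standard deviation across each row instead of down each column. The names direct, via_apply, and row_std are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

# Column-wise standard deviation, called directly on the DataFrame
direct = df.std()

# The same calculation expressed with apply(), as in the example above
via_apply = df.apply(lambda x: x.std())

# Row-wise standard deviation: one value per row, computed across A, B, and C
row_std = df.std(axis=1)

print(direct)
print(row_std)
```

Calling df.std() directly is usually the simpler choice; apply() is mainly useful when each column needs a custom calculation.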