
To normalize a column in Pandas, you can use the `apply` method to apply a normalization function to the column. When called on a DataFrame, `apply` passes each selected column to the function as a `Series` and collects the normalized values into a new object, leaving the original DataFrame unchanged.

For example, consider the following DataFrame:

main.py

```
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

# Print the DataFrame
print(df)
```

output

```
   A   B    C
0  1  10  100
1  2  20  200
2  3  30  300
3  4  40  400
4  5  50  500
```

This DataFrame has three columns, `A`, `B`, and `C`, with five rows of data.

To normalize the values in a specific column, you could do the following:

main.py

```
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

# Define a min-max normalization function that operates on a whole Series
def normalize(x):
    return (x - x.min()) / (x.max() - x.min())

# Apply the normalization function to column B
normalized = df[['B']].apply(normalize)

# Print the resulting DataFrame
print(normalized)
```

output

```
      B
0  0.00
1  0.25
2  0.50
3  0.75
4  1.00
```

In the code above, a normalization function is defined that takes a `Series` object as an argument and returns a new `Series` object containing the normalized values.

This function calculates the minimum and maximum values of the `Series`, and then applies the min-max formula `(x - x.min()) / (x.max() - x.min())` to every element of the `Series` at once, rescaling the values to the range 0 to 1.
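To make the arithmetic concrete, here is a minimal sketch that works through the formula for the values of column `B`. The variable names `b`, `col_min`, and `col_max` are illustrative and not part of the original example.

```python
import pandas as pd

# The values of column B from the example DataFrame
b = pd.Series([10, 20, 30, 40, 50])

# The two statistics the formula depends on
col_min = b.min()  # 10
col_max = b.max()  # 50

# Worked by hand for the middle value: (30 - 10) / (50 - 10) = 0.5
middle = (30 - col_min) / (col_max - col_min)
print(middle)

# The same formula applied to the whole Series in one vectorized step
scaled = (b - col_min) / (col_max - col_min)
print(scaled.tolist())
```

The smallest value always maps to 0 and the largest to 1, with everything else spaced proportionally in between.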

The `normalize` function is then applied to the `B` column of the DataFrame using the `apply` method. Because `df[['B']]` (with double brackets) selects a one-column DataFrame, `apply` passes column `B` to `normalize` as a `Series` and returns a new DataFrame containing the normalized values. In this case, the resulting column has the values 0.0, 0.25, 0.5, 0.75, and 1.0.
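As a small sketch of how the selection style affects the result type, the same function can also be called directly on a single column selected with single brackets. The names `as_frame` and `as_series` are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

def normalize(x):
    return (x - x.min()) / (x.max() - x.min())

# Double brackets select a one-column DataFrame, so apply returns a DataFrame
as_frame = df[['B']].apply(normalize)

# Single brackets select a Series, so the function can be called on it directly
as_series = normalize(df['B'])

print(type(as_frame).__name__)   # DataFrame
print(type(as_series).__name__)  # Series
print(as_series.tolist())        # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Both forms compute the same values. Which one to use mostly depends on whether the next step expects a DataFrame or a `Series`.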

You can also select multiple columns before calling the `apply` method to normalize several columns of a Pandas DataFrame at once.

For example, if you wanted to normalize both columns `B` and `C`, you could do the following:

```
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

# Define a min-max normalization function that operates on a whole Series
def normalize(x):
    return (x - x.min()) / (x.max() - x.min())

# Apply the normalization function to columns B and C
normalized = df[['B', 'C']].apply(normalize)

# Print the resulting DataFrame
print(normalized)
```

output

```
      B     C
0  0.00  0.00
1  0.25  0.25
2  0.50  0.50
3  0.75  0.75
4  1.00  1.00
```

In the code above, the normalization function is applied to the `B` and `C` columns of the DataFrame using the `apply` method. Each column is normalized independently, using its own minimum and maximum.
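The examples so far print the normalized columns on their own. If you want them alongside the untouched columns, one possible approach is to assign the result back into a copy of the DataFrame. This is a sketch, and the names `cols` and `result` are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
    'C': [100, 200, 300, 400, 500]
})

def normalize(x):
    return (x - x.min()) / (x.max() - x.min())

# Columns to normalize (illustrative choice)
cols = ['B', 'C']

# Work on a copy so the original DataFrame keeps its raw values
result = df.copy()
result[cols] = df[cols].apply(normalize)

print(result)
```

Column `A` keeps its original values in `result`, while `B` and `C` are replaced by their normalized versions. `df` itself is not modified.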
