Coding Ref

How to use astype() in Pandas

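The astype() method in Pandas casts a Series, or one or more columns of a DataFrame, to a specified data type. It returns a new object rather than modifying the original in place, so you need to assign the result.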

Here's an example:

main.py
import pandas as pd

# create a sample dataframe
df = pd.DataFrame([
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]])

# change the data type of the 2nd column to float
df[1] = df[1].astype(float)

# check the data types of the columns
print(df.dtypes)

In this example, the data type of the 2nd column (index 1) is changed to float, and the dtypes attribute is used to check the data types of all the columns in the dataframe. The output will be:

output
0      int64
1    float64
2      int64
dtype: object
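Because astype() returns a new object, the original column keeps its dtype until the result is assigned back. A minimal sketch of that behavior, reusing the same sample data (the variable names as_float, original_dtype and converted_dtype are only illustrative):

```python
import pandas as pd

# same sample data as above
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# astype() returns a new Series; df itself is not modified here
as_float = df[1].astype(float)

original_dtype = str(df[1].dtype)       # still an integer dtype
converted_dtype = str(as_float.dtype)   # float64

print(original_dtype, converted_dtype)
```

This is why the example above writes the result back with df[1] = df[1].astype(float).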

Change the data type of multiple columns

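You can also use astype() to change multiple columns at once by calling it on the DataFrame and passing a dictionary as the dtype argument. The keys of the dictionary must match the column labels exactly (in this DataFrame the labels are the integers 0, 1 and 2, not strings), and the values should be the desired data types.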

For example:

main.py
# change the data type of multiple columns
# the keys are the integer column labels 0 and 2
df = df.astype({0: float, 2: str})

# check the data types of the columns
print(df.dtypes)

This will change the data type of the 1st and 3rd columns (index 0 and 2) to float and str, respectively.

The output will be:

output
0    float64
1    float64
2     object
dtype: object
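As a self-contained sketch of the dictionary form, the snippet below rebuilds the sample DataFrame and also shows what happens when a key does not match any column label. The names converted and key_error_raised are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# keys are the integer column labels; columns not listed keep their dtype
converted = df.astype({0: float, 2: str})
print(converted[2].tolist())

# a key that is not a column label (the string "0" instead of the integer 0)
# is rejected with a KeyError
try:
    df.astype({"0": float})
    key_error_raised = False
except KeyError:
    key_error_raised = True

print(key_error_raised)
```

Columns that are not mentioned in the dictionary, such as column 1 here, are left as they were.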

Change the data type of the entire DataFrame

You can also use astype() to change the data type of the entire DataFrame by calling it on the dataframe itself, rather than a column.

This will change the data type of all the columns to the specified data type:

main.py
# change the data type of the entire DataFrame
df = df.astype(int)

# check the data types of the columns
print(df.dtypes)

In this case, the output will be:

output
0    int64
1    int64
2    int64
dtype: object
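The target type does not have to be a Python type. A dtype name given as a string should work as well. A small sketch, assuming the same sample data and using "float32" purely as an example target:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# cast every column to 32-bit floats using the dtype name as a string
as_float32 = df.astype("float32")

dtype_names = [str(t) for t in as_float32.dtypes]
print(dtype_names)
```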

Converting to an incompatible data type

If you use astype() to convert a column to a data type that is not compatible with the values in that column, pandas raises an exception. For example, converting a column of strings to int raises a ValueError if any of the strings cannot be parsed as an integer.

You can use the errors parameter to specify how to handle such errors.

Possible values for errors are 'raise' (the default) and 'ignore'. astype() does not accept 'coerce'; that option belongs to functions such as pd.to_numeric(), which replace values that cannot be converted with NaN (missing values).

With 'ignore', a failed conversion does not raise an exception. Instead, astype() returns the original object unchanged, so none of the values are converted, including those that could have been.

For example:

main.py
# create a sample DataFrame
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], ["a", "b", "c"]])


# try to convert the 3rd column (3, 6, "c") to int;
# "c" cannot be converted, so errors='ignore' returns the column unchanged
df[2] = df[2].astype(int, errors='ignore')

# check the data types of the columns
print(df.dtypes)

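Every column in this DataFrame mixes integers and strings, so all three columns start out with the object dtype, and the failed conversion leaves column 2 as it was. In this case, the output will be: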

output
0    object
1    object
2    object
dtype: object
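For comparison, here is a sketch of the default errors='raise' behavior, followed by one way to get NaN for values that cannot be converted. The second part uses pd.to_numeric() with errors="coerce", which is a separate pandas function and not part of astype(). The Series s and the names raised and coerced are illustrative.

```python
import pandas as pd

# object-dtype Series where one value cannot be parsed as an integer
s = pd.Series(["1", "2", "x"], dtype=object)

# default behavior (errors='raise'): the bad value causes a ValueError
try:
    s.astype(int)
    raised = False
except ValueError:
    raised = True

print(raised)

# pd.to_numeric (not astype) supports errors="coerce":
# values that cannot be parsed become NaN
coerced = pd.to_numeric(s, errors="coerce")
print(coerced.isna().tolist())
```

Because NaN is a floating-point value, the coerced result is a float column rather than an integer one.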
