To typecast an integer column to a string column in pyspark, we use the cast() function with StringType() as the argument. To typecast a string column to an integer column, we use cast() with IntegerType() as the argument. Let's walk through an example of each conversion:
- Type cast an integer column to string column in pyspark
- Type cast a string column to integer column in pyspark
We will be using a dataframe named df_cust, which has an integer column named zip.
Typecast an integer column to string column in pyspark:
First, let's get the data type of the zip column:

    # Get the data type of the zip column
    df_cust.select("zip").dtypes

The data type of the zip column is integer.
Now let's convert the zip column to a string by calling cast() with StringType() as the argument. The result is stored in a dataframe named output_df.

    # Typecast an integer column to a string column in pyspark
    from pyspark.sql.types import StringType

    output_df = df_cust.withColumn("zip", df_cust["zip"].cast(StringType()))
Now let's check the data type of the zip column again:

    # Get the data type of the zip column
    output_df.select("zip").dtypes

The data type of the zip column is now string.
Typecast String column to integer column in pyspark:
First, let's confirm the data type of the zip column in output_df:

    # Get the data type of the zip column
    output_df.select("zip").dtypes

The data type of the zip column is string.
Now let's convert the zip column back to an integer by calling cast() with IntegerType() as the argument. The result is again stored in the dataframe named output_df.

    # Typecast a string column to an integer column in pyspark
    from pyspark.sql.types import IntegerType

    output_df = output_df.withColumn("zip", output_df["zip"].cast(IntegerType()))
Now let's check the data type of the zip column once more:

    # Get the data type of the zip column
    output_df.select("zip").dtypes

The data type of the zip column is integer again.