To remove leading zeros from a column in pyspark, we use the regexp_replace() function with a regular expression that matches the consecutive zeros at the start of each value. Let's see an example of how to remove leading zeros from a column in pyspark.
- Remove Leading Zeros of column in pyspark
We will be using dataframe df.
Remove leading zero of column in pyspark
We call the regexp_replace() function with the column name, a regular expression and a replacement string as arguments. The regular expression ^[0]* matches all consecutive zeros at the start of the value, and they are replaced with the empty string ''. The result is stored in a new column, grad_Score_new.
### Remove leading zero of column in pyspark
import pyspark.sql.functions as F
# '^[0]*' matches the run of zeros at the start of the string; replace it with ''
df = df.withColumn('grad_Score_new', F.regexp_replace('grad_Score', r'^[0]*', ''))
The resultant dataframe has a new column, grad_Score_new, which holds the grad_Score values with the leading zeros removed.
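To see what the pattern does without starting Spark, the same regular expression can be tried in plain Python. This is only a sketch under a few assumptions: it uses Python's re module rather than the Java regex engine that regexp_replace() relies on (the two should agree for a pattern this simple), and the helper name strip_leading_zeros and the sample values are made up for illustration, not taken from the dataframe df.

```python
import re

# Same pattern as in the regexp_replace() call:
# zero or more '0' characters anchored at the start of the string.
LEADING_ZEROS = re.compile(r'^[0]*')


def strip_leading_zeros(value):
    """Hypothetical helper: return value with its leading zeros removed."""
    return LEADING_ZEROS.sub('', value)


# Made-up sample values (assumption: the column holds strings).
samples = ['00123', '0450', '789', '1002', '0000']
cleaned = [strip_leading_zeros(s) for s in samples]

print(cleaned)  # ['123', '450', '789', '1002', '']
```

Only zeros at the start are touched, so '1002' is left unchanged. A value made up entirely of zeros becomes an empty string; if a single '0' should be kept instead, one possible variant is the pattern ^0+(?!$), which stops before the last character.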