Pandas series.str.replace('.0', '') replaces string preceding decimal point

Code Sample

import pandas as pd
pd.Series(['41.0', '40.0', '35.0', '30.0']).str.replace('.0', '')
0    41
1      
2    35
3      
dtype: object

Problem description

When the character immediately before the decimal point is a 0, pandas replaces not only '.0' but also the text preceding it with the string passed as repl, so '40.0' and '30.0' become empty strings. This appears to be inconsistent with Python's str.replace:

[s.replace('.0', '') for s in ['41.0', '40.0', '35.0', '30.0']]
['41', '40', '35', '30']
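
One way to narrow this down, sketched here with only the standard library, is to run the same pattern through re.sub, on the assumption that pandas may be treating the first argument as a regular expression rather than as literal text. The variable names below are illustrative only.

```python
import re

values = ['41.0', '40.0', '35.0', '30.0']

# Literal replacement: only the two-character text '.0' is removed.
literal = [v.replace('.0', '') for v in values]

# Regex replacement: an unescaped '.' matches any character, so in
# '40.0' both '40' and '.0' match and the whole string is removed.
as_regex = [re.sub('.0', '', v) for v in values]

# Every non-overlapping match of the regex '.0' in '40.0'.
matches = re.findall('.0', '40.0')

print(literal)
print(as_regex)
print(matches)
```

If as_regex shows the same empty strings as the pandas output above, the difference comes from regex matching rather than from anything specific to Series.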

Expected Output

pd.Series(['41', '40', '35', '30'])
0    41
1    40
2    35
3    30
dtype: object

Output of pd.show_versions()

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0
pytest: None
pip: 9.0.3
setuptools: 39.0.1
Cython: None
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 5.0.0
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 1.0.2
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
None

Author: Fantashit

1 thought on "Pandas series.str.replace('.0', '') replaces string preceding decimal point"

  1. str.replace works off of a regular expression pattern. As a regex, '.0' means "match any character followed by a 0", which is why your second and fourth entries are replaced entirely: each one matches the pattern twice (in '40.0', both '40' and '.0' match).

    If you want to match a literal period, you need to escape it in your regex:

    In [8]: pd.Series(['41.0', '40.0', '35.0', '30.0']).str.replace(r'\.0', '')
    Out[8]: 
    0    41
    1    40
    2    35
    3    30
    dtype: object
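
As a hedged sketch of two alternatives to hand-escaping the period: the first call asks for literal matching through the regex keyword, which was added in pandas 0.23, so it is not available on the 0.22.0 install in the report. The second keeps regex matching but builds the pattern with re.escape and anchors it to the end of the string, so a value such as '10.05' would be left alone. The names literal and anchored are illustrative only.

```python
import re

import pandas as pd

s = pd.Series(['41.0', '40.0', '35.0', '30.0'])

# Literal replacement: '.0' is treated as plain text, not as a pattern.
# Assumes pandas 0.23 or later, where the regex keyword exists.
literal = s.str.replace('.0', '', regex=False)

# Regex replacement with the period escaped by re.escape and the match
# anchored with '$', so only a trailing '.0' is removed.
anchored = s.str.replace(re.escape('.0') + '$', '', regex=True)

print(literal.tolist())
print(anchored.tolist())
```

Passing regex explicitly in both calls avoids relying on the default, which has differed between pandas releases.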
