Code Sample
pd.Series(['41.0', '40.0', '35.0', '30.0']).str.replace('.0', '')
0 41
1
2 35
3
dtype: object
Problem description
When the character immediately before the decimal point is a 0, pandas replaces the text preceding '.0', as well as '.0' itself, with the string assigned to repl. This behavior appears to be inconsistent with Python's str.replace:
[s.replace('.0', '') for s in ['41.0', '40.0', '35.0', '30.0']]
['41', '40', '35', '30']
Expected Output
pd.Series(['41', '40', '35', '30'])
0 41
1 40
2 35
3 30
Output of pd.show_versions()
pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.22.0
pytest: None
pip: 9.0.3
setuptools: 39.0.1
Cython: None
numpy: 1.14.2
scipy: 1.0.1
pyarrow: None
xarray: None
IPython: 5.0.0
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 1.0.2
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
None
Replace works off of a pattern / regex. Keep in mind that '.0' as a regex means "match any character followed by a 0", which is why your second and fourth entries are getting fully replaced: they match the pattern twice (once on the two leading digits and once on the trailing '.0'). If you want to match a literal period, you need to escape it in your regex, e.g. r'\.0'.
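To make the difference concrete, here is a minimal sketch using Python's standard `re` module rather than pandas itself. It assumes the regex semantics applied to each element are the same as `re.sub`; the variable names are illustrative only.

```python
import re

values = ['41.0', '40.0', '35.0', '30.0']

# Unescaped: '.' matches any character, so in '40.0' both '40' and '.0' match.
unescaped = [re.sub('.0', '', s) for s in values]

# Escaped: r'\.0' matches only a literal period followed by a 0.
escaped = [re.sub(r'\.0', '', s) for s in values]

# re.escape builds the escaped pattern from a literal string for you.
auto_escaped = [re.sub(re.escape('.0'), '', s) for s in values]

print(unescaped)     # ['41', '', '35', '']
print(escaped)       # ['41', '40', '35', '30']
print(auto_escaped)  # ['41', '40', '35', '30']
```

In pandas the equivalent would be something like `pd.Series(values).str.replace(r'\.0', '')`. Releases newer than the 0.22.0 reported above also accept a `regex` keyword on `str.replace` (so `regex=False` gives literal matching), but check the documentation for your installed version, since that keyword and its default have changed over time.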