BUG: read_excel with Workbook and engine="openpyxl" raises ValueError

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample

from openpyxl import load_workbook
from pandas import read_excel

wb = load_workbook("testfile.xlsx")
read_excel(wb, engine="openpyxl")

Problem description

In pandas 1.1.5, the above code completes with no problems.

In pandas 1.2.1, it causes the following exception:

Traceback (most recent call last):
  File "c:/Users/akaijanaho/scratch/pandas-openpyxl-bug/bug.py", line 5, in <module>
    read_excel(wb, engine="openpyxl")
  File "C:\Users\akaijanaho\scratch\pandas-openpyxl-bug\venv\lib\site-packages\pandas\util\_decorators.py", line 299, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\akaijanaho\scratch\pandas-openpyxl-bug\venv\lib\site-packages\pandas\io\excel\_base.py", line 336, in read_excel
    io = ExcelFile(io, storage_options=storage_options, engine=engine)
  File "C:\Users\akaijanaho\scratch\pandas-openpyxl-bug\venv\lib\site-packages\pandas\io\excel\_base.py", line 1057, in __init__
    ext = inspect_excel_format(
  File "C:\Users\akaijanaho\scratch\pandas-openpyxl-bug\venv\lib\site-packages\pandas\io\excel\_base.py", line 938, in inspect_excel_format
    with get_handle(
  File "C:\Users\akaijanaho\scratch\pandas-openpyxl-bug\venv\lib\site-packages\pandas\io\common.py", line 558, in get_handle
    ioargs = _get_filepath_or_buffer(
  File "C:\Users\akaijanaho\scratch\pandas-openpyxl-bug\venv\lib\site-packages\pandas\io\common.py", line 371, in _get_filepath_or_buffer
    raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'openpyxl.workbook.workbook.Workbook'>

The documentation does not specify Workbook as an acceptable value type for io, but supporting it seems reasonable and accords with the 1.1.5 behavior.

In my use case, we mainly parse an Excel file with openpyxl but use pandas for one specific sub-problem. We would like to reuse the same Workbook instead of having pandas re-read the file.
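Until read_excel accepts a Workbook again, one possible workaround is to build the DataFrame directly from a worksheet's cell values, which avoids re-reading the file. This is only a minimal sketch: the helper name sheet_to_frame is hypothetical, it assumes the first row of the sheet holds the column headers, and it does none of the type inference or option handling that read_excel performs. A small in-memory workbook stands in for load_workbook("testfile.xlsx").

```python
from openpyxl import Workbook
from pandas import DataFrame


def sheet_to_frame(ws):
    """Build a DataFrame from a worksheet (hypothetical helper).

    Assumes the first row contains the column headers.
    """
    rows = ws.values          # generator yielding one tuple of cell values per row
    header = next(rows)       # consume the first row as the header
    return DataFrame(list(rows), columns=list(header))


# In-memory workbook standing in for load_workbook("testfile.xlsx")
wb = Workbook()
ws = wb.active
ws.append(["a", "b"])
ws.append([1, 2])
ws.append([3, 4])

df = sheet_to_frame(ws)
print(df)
```

The same Workbook object stays available for the rest of the openpyxl-based parsing.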

1 possible answer on “BUG: read_excel with Workbook and engine="openpyxl" raises ValueError”

  1. Depending on which behavior is expected, this simple elif-patch is probably not enough. If a user provides a workbook compatible with one of the engines but does not specify an engine explicitly, do we need to auto-detect the engine from the workbook type? If we need that, it should probably go into inspect_excel_format.

    The read_excel documentation is quite explicit about this point: “engine : str, default None. If io is not a buffer or path, this must be set to identify io.”
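To make the auto-detection question above concrete, here is a rough sketch of how an engine could be inferred from the type of the object passed as io. It is not pandas code: guess_engine and the mapping are hypothetical names, and the mapping assumes that the top-level package of the workbook type (openpyxl, xlrd) is enough to pick an engine. A stand-in class is used so the sketch runs without openpyxl installed.

```python
from typing import Optional

# Hypothetical mapping: top-level package of the workbook type -> engine name.
_ENGINE_BY_PACKAGE = {
    "openpyxl": "openpyxl",
    "xlrd": "xlrd",
}


def guess_engine(io: object, engine: Optional[str] = None) -> Optional[str]:
    """Return the explicit engine if given, otherwise infer one from type(io).

    Returns None when the type does not belong to a known package.
    """
    if engine is not None:
        return engine
    package = type(io).__module__.split(".")[0]
    return _ENGINE_BY_PACKAGE.get(package)


# Stand-in for openpyxl.workbook.workbook.Workbook (the type named in the traceback).
FakeWorkbook = type("Workbook", (), {"__module__": "openpyxl.workbook.workbook"})

print(guess_engine(FakeWorkbook()))                 # inferred from the type
print(guess_engine(FakeWorkbook(), engine="xlrd"))  # explicit engine wins
print(guess_engine(object()))                       # unknown type, nothing inferred
```

Whether such inference belongs in inspect_excel_format or should remain the caller's job, as the documentation quoted above implies, is the open design question.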