[jit] Can not pickle torch.futures.Future

Hello, I was trying to test the serialization of torch.futures.Future and got an error.

To reproduce the error:

  1. My local master branch information: git sl: c44b1de54e peterghost86 (master*, HEAD)
  2. I’m running the following:

    import torch
    from torch.futures import Future
    from torch.testing._internal.common_utils import TemporaryFileName

    # Try to serialize a Future that has not been completed.
    fut = Future[int]()
    with TemporaryFileName() as fname:
        torch.save(fut, fname)

  3. I get the following error:

    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "/home/sinannasir/local/pytorch/torch/serialization.py", line 372, in save
        _save(obj, opened_zipfile, pickle_module, pickle_protocol)
      File "/home/sinannasir/local/pytorch/torch/serialization.py", line 476, in _save
        pickler.dump(obj)
    RuntimeError: Can not pickle torch.futures.Future

Am I doing something wrong, or could this be a bug? Thanks,

cc @suo @gmagogsfm @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @xush6528 @osalpekar @jiayisuse @agolynski

1 possible answer(s) on “[jit] Can not pickle torch.futures.Future”

  1. @sinannasir, that error message is actually part of the test and is expected when the test runs. If you look at the log, the test passed and simply logged that message. It is a bit confusing because Dr. CI picks it up as the failure reason. The actual CI error in that PR appears to come from:

    Aug 17 18:06:29 ======================================================================
    Aug 17 18:06:29 ERROR [61.826s]: test_backward_ddp_inside (__main__.ProcessGroupDdpUnderDistAutogradTestWithSpawn)
    Aug 17 18:06:29 ----------------------------------------------------------------------
    Aug 17 18:06:29 Traceback (most recent call last):
    Aug 17 18:06:29   File "/Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/testing/_internal/common_distributed.py", line 223, in wrapper
    Aug 17 18:06:29     self._join_processes(fn)
    Aug 17 18:06:29   File "/Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/testing/_internal/common_distributed.py", line 330, in _join_processes
    Aug 17 18:06:29     self._check_return_codes(elapsed_time)
    Aug 17 18:06:29   File "/Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/testing/_internal/common_distributed.py", line 363, in _check_return_codes
    Aug 17 18:06:29     raise RuntimeError(error)
    Aug 17 18:06:29 RuntimeError: Processes 5 exited with error code 10
    Aug 17 18:06:29 
    Aug 17 18:06:29 --------------------------------------------------------------------
    

    This looks like a known flaky test: #40434
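
For what it's worth, here is a minimal sketch of how a test can deliberately trigger an error like this one and still pass. It uses only the standard library. `UnpicklableFuture`, `TestFuturePickle`, and `run_tests` are hypothetical stand-ins made up for illustration, not PyTorch's actual classes or test code; the only thing taken from this thread is the error message text.

```python
import pickle
import unittest


class UnpicklableFuture:
    """Hypothetical stand-in for a future type that refuses to be pickled."""

    def __reduce__(self):
        # Assumption for illustration: the type rejects pickling by raising
        # from its reduce hook, so every pickle attempt fails the same way.
        raise RuntimeError("Can not pickle torch.futures.Future")


class TestFuturePickle(unittest.TestCase):
    def test_pickle_is_rejected(self):
        # The test passes only if pickling raises the expected error,
        # so the error message is a sign of success here, not failure.
        with self.assertRaisesRegex(RuntimeError, "Can not pickle"):
            pickle.dumps(UnpicklableFuture())


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFuturePickle)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` runs the single test. If a harness also logs the exception text while the test runs, a log scraper can mistake that line for the failure reason even though the test passed, which seems to be what happened here.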