Environment info

  • transformers version: 4.3.3

  • Platform: Linux-4.15.0-29-generic-x86_64-with-glibc2.10

  • Python version: 3.8.8

  • PyTorch version (GPU?): 1.8.0 (False)

  • Tensorflow version (GPU?): not installed (NA)



Model I am using: Wav2Vec2 (facebook/wav2vec2-large-xlsr-53)

The problem arises when using:

import soundfile as sf
import torch
from transformers import AutoTokenizer, AutoModel, Wav2Vec2ForCTC, Wav2Vec2Tokenizer

tokenizer4 = AutoTokenizer.from_pretrained("facebook/wav2vec2-large-xlsr-53")
model4 = AutoModel.from_pretrained("facebook/wav2vec2-large-xlsr-53")

OSError: Can't load tokenizer for 'facebook/wav2vec2-large-xlsr-53'. Make sure that:

  • 'facebook/wav2vec2-large-xlsr-53' is a correct model identifier listed on 'https://huggingface.co/models'

  • or 'facebook/wav2vec2-large-xlsr-53' is the correct path to a directory containing relevant tokenizer files

The task I am working on is:

  • an official wav2vec task: facebook/wav2vec2-large-xlsr-53

To reproduce

Steps to reproduce the behavior:
Run the snippet above in the environment listed; the AutoTokenizer.from_pretrained call raises the OSError.

Expected behavior

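I want to use the XLSR model as the pre-trained checkpoint for fine-tuning my own ASR model. I expected both the tokenizer and the model to load from "facebook/wav2vec2-large-xlsr-53", but the tokenizer in particular fails to load with the error above. Could you tell me how to change the code so that it works? Thank you very much!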
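
One possibility I am considering, assuming the checkpoint is pretraining-only and ships no vocabulary file (an assumption on my part, not something confirmed above), is to build a character vocabulary from my own fine-tuning transcripts. A minimal sketch using only the standard library follows; `build_vocab`, the sample transcripts, the "|" word delimiter, and the special-token names are all hypothetical choices for illustration:

```python
import json


def build_vocab(transcripts, specials=("<pad>", "<unk>")):
    """Map every character seen in the transcripts to an integer id."""
    # Lowercase the text and use "|" in place of spaces as a word delimiter
    # (an assumed convention, not taken from the checkpoint).
    chars = sorted(
        {c for text in transcripts for c in text.lower().replace(" ", "|")}
    )
    # Special tokens come first so they receive the lowest ids.
    tokens = list(specials) + [c for c in chars if c not in specials]
    return {tok: idx for idx, tok in enumerate(tokens)}


# Hypothetical transcripts standing in for my fine-tuning data.
vocab = build_vocab(["hello world", "good day"])

# Serialized form that could be written to a vocabulary file.
vocab_json = json.dumps(vocab, ensure_ascii=False, indent=2)
```

The resulting JSON could then be saved and handed to a CTC tokenizer class. Which class and arguments apply depends on the installed transformers version, so I have left that step out.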

