Adds the tokenizer configuration file
#80
by lysandre (HF Staff) · opened
The tokenizer configuration file is missing, which leads to unforeseen errors after the migration of the canonical models.
Refer to the following issue for more information: transformers#29050
The currently failing code is the following:
>>> previous_tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> current_tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
>>> print(previous_tokenizer.model_max_length, current_tokenizer.model_max_length)
1000000000000000019884624838656 1024
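The huge number in the failing output is not arbitrary: when no `model_max_length` is found in the tokenizer configuration, `transformers` falls back to a sentinel value, `VERY_LARGE_INTEGER`, defined as `int(1e30)`. A minimal sketch (no `transformers` install needed) of why that exact integer appears:

```python
# transformers uses VERY_LARGE_INTEGER = int(1e30) as the fallback
# model_max_length when the tokenizer config does not specify one.
# Converting the float 1e30 to int yields this exact value, because
# 1e30 is not exactly representable in binary floating point.
VERY_LARGE_INTEGER = int(1e30)
print(VERY_LARGE_INTEGER)  # 1000000000000000019884624838656
```

So a `model_max_length` of 1000000000000000019884624838656 is a reliable sign that the tokenizer was loaded without a usable `tokenizer_config.json`.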
This is the result after the fix:
>>> previous_tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> current_tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
>>> print(previous_tokenizer.model_max_length, current_tokenizer.model_max_length)
1024 1024
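The fix adds a `tokenizer_config.json` to the legacy `gpt2` repository. A minimal sketch of the relevant field (the actual file may contain additional keys, such as the tokenizer class):

```json
{
  "model_max_length": 1024
}
```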
lysandre changed pull request status to open
lysandre changed pull request status to merged

