Hyperparameter optimization for fine-tuning pre-trained transformer models from Hugging Face

Large attention-based transformer models have obtained massive gains on natural language pr
