Repost from: Karim Iskakov - channel
GPT-3 is the new SOTA in translation and question answering, and it doesn't require fine-tuning. The biggest version has 175 billion parameters (~350 GB of fp16 weights, over 100 times bigger than GPT-2!)
📝 arxiv.org/abs/2005.14165
📉 @loss_function_porn
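A quick back-of-the-envelope check of the numbers quoted above; the GPT-2 size of 1.5B parameters is an assumption (its largest released model), not stated in the post:

```python
# Sanity-check the sizes mentioned in the post.
GPT3_PARAMS = 175e9      # 175 billion parameters (from the post)
GPT2_PARAMS = 1.5e9      # assumption: GPT-2's largest model, 1.5B params
BYTES_PER_FP16 = 2       # fp16 stores each weight in 2 bytes

weights_gb = GPT3_PARAMS * BYTES_PER_FP16 / 1e9   # bytes -> GB
ratio = GPT3_PARAMS / GPT2_PARAMS

print(f"fp16 weights: ~{weights_gb:.0f} GB")       # ~350 GB
print(f"parameter ratio vs GPT-2: ~{ratio:.0f}x")  # ~117x, i.e. over 100x
```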