📕 AI Models Get Smarter by Teaching Other Models
Educators deepen their own knowledge while teaching children. This is because teaching forces a person to structure information clearly and identify gaps. This principle also works with large language models (LLMs).
Researchers at Tsinghua University have adapted the Learning by Teaching (LBT) method for training AI models. In their experiment, the powerful GPT-4 model transferred knowledge to the simpler GPT-3.5.
How It Works
➡️ The Strong Trains the Weak: During training, the "teacher" answers the "student's" questions and explains complex concepts that the junior model cannot yet grasp on its own.
➡️ Knowledge Generalization: The strong model is forced to formulate answers in a way that the weaker model can understand. This "generalization" prompts the "teacher" to look at its knowledge in a new light, simplifying and restructuring it.
➡️ Improvement of the Strong Model: While teaching, the "teacher" re-examines its own knowledge. This helps the strong model identify and eliminate its weaknesses or find new ways to solve problems. As a result, it sharpens its reasoning, answers more accurately, and improves overall performance (a minimal sketch of this loop follows below).
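The loop below is a minimal sketch of a single LBT round, not the authors' actual pipeline. The model names, the `chat()` and `grade()` helpers, and the prompts are all illustrative placeholders standing in for real chat-completion API calls and a real scoring function.

```python
# Sketch of one Learning-by-Teaching (LBT) round: the teacher explains,
# the student attempts the problem, and the student's score feeds back
# into an improved explanation. All names here are hypothetical.

TEACHER = "gpt-4"    # strong "teacher" model (assumed)
STUDENT = "gpt-3.5"  # weaker "student" model (assumed)

def chat(model: str, prompt: str) -> str:
    """Placeholder for a real chat-completion API call."""
    return f"[{model} reply to: {prompt[:40]}...]"

def grade(answer: str, reference: str) -> float:
    """Placeholder scorer: 1.0 if the student matched the reference answer."""
    return float(answer == reference)

def lbt_round(problem: str, reference: str) -> tuple[str, float]:
    # 1. The teacher writes an explanation pitched at the student's level.
    explanation = chat(TEACHER, f"Explain step by step how to solve: {problem}")
    # 2. The student attempts the problem using only that explanation.
    attempt = chat(STUDENT, f"{explanation}\n\nNow solve: {problem}")
    # 3. The student's score is feedback on the explanation itself:
    #    a low score signals a gap in the teacher's own understanding.
    score = grade(attempt, reference)
    # 4. The teacher revises its explanation in light of the feedback,
    #    the step credited with improving the strong model.
    revised = chat(TEACHER, f"Your explanation scored {score}. Improve it:\n{explanation}")
    return revised, score

if __name__ == "__main__":
    explanation, score = lbt_round("What is 17 * 24?", "408")
    print(score, explanation)
```

The key design point the sketch captures is that the gradient of improvement runs toward the teacher: the student exists mainly to expose where the teacher's explanations fail.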
❔ Why It Matters
LBT opens up new prospects for AI development. According to insider reports, OpenAI is already using a powerful new model, codenamed Strawberry, to train Orion, the model slated to replace GPT-4, and the training is reportedly going well. Moreover, one "teacher" can train several "students" at once. Using this approach to improve LLMs helps reduce dependence on human-created training data.
📱 You can watch a detailed lecture on the work of the scientists from Tsinghua here.
Related Topics:
➡️ The Best Explanation of What's Happening Inside ChatGPT
➡️ When Will Data for Training LLMs Run Out?
#news #ChatGPT @hiaimediaeen