Few-Shot Learning
Few-Shot Learning involves teaching a machine learning model to perform a task from a minimal amount of data: the model makes predictions from just a few examples supplied at inference time, leveraging the knowledge that Large Language Models acquire during pre-training on extensive text datasets. This allows the model to generalise and handle new, related tasks with only a small number of examples.
Few-Shot NLP examples consist of three key components:
- The task description, which defines what the model should do (e.g., “Translate English to French”)
- The examples that demonstrate the expected predictions (e.g., “sea otter => loutre de mer”)
- The prompt, which is an incomplete example that the model completes by generating the missing text (e.g., “cheese => ”), as in the sketch after this list.
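The following is a minimal sketch of how these three components can be assembled into a single few-shot prompt and passed to a text-generation model. The use of GPT-2 via the Hugging Face `transformers` pipeline and the extra translation pairs are illustrative assumptions, not a prescribed setup; any causal language model could be substituted.

```python
# Assemble a few-shot prompt from the three components: task description,
# demonstration examples, and an incomplete example for the model to finish.
from transformers import pipeline

task_description = "Translate English to French:"
examples = [
    "sea otter => loutre de mer",
    "peppermint => menthe poivrée",   # assumed extra demonstrations
    "plush giraffe => girafe en peluche",
]
prompt = "cheese =>"  # incomplete example the model should complete

few_shot_prompt = "\n".join([task_description, *examples, prompt])

# Illustrative model choice; larger models generally complete this far better.
generator = pipeline("text-generation", model="gpt2")
completion = generator(few_shot_prompt, max_new_tokens=5, do_sample=False)
print(completion[0]["generated_text"])
```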
Creating effective few-shot examples can be challenging, as the formulation and wording of the examples can significantly impact the model’s performance. Models, especially smaller ones, are sensitive to the specifics of how the examples are written.
To optimise Few-Shot Learning in production, a common approach is to learn a shared representation for a task and then train task-specific classifiers on top of this representation.
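As a hedged sketch of that pattern, the snippet below uses a frozen sentence-embedding model as the shared representation and fits a lightweight task-specific classifier on a handful of labelled examples. The model name and the toy sentiment data are assumptions made for illustration only.

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Shared, task-agnostic representation (frozen encoder).
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# A few labelled examples for one downstream task (toy sentiment data).
texts = [
    "I loved this film",
    "Absolutely fantastic",
    "Terrible, a waste of time",
    "I hated every minute",
]
labels = [1, 1, 0, 0]

# Task-specific classifier trained on top of the shared representation.
clf = LogisticRegression().fit(encoder.encode(texts), labels)

print(clf.predict(encoder.encode(["What a wonderful movie"])))  # expected: [1]
```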
OpenAI’s research, as presented in the GPT-3 paper, indicates that few-shot prompting performance improves as the number of parameters in the language model increases; larger models tend to be better few-shot learners.