Self-Generated In-Context Learning Explained

Nikolay Donets

Updated: 13 February, 2025

3 minutes read

Traditional in-context learning for LLMs relies on carefully curated examples, but what if the model could generate its own demonstrations? An idea called Self-Generated In-Context Learning (SG-ICL) explores this possibility.

How SG-ICL addresses the limitations of traditional in-context learning

Typically, when we use LLMs, we rely on a process called in-context learning. We feed the model examples to demonstrate the task we want it to do. These examples are critical – bad ones mean poor results. It can be a lot of work to find the right data, and you never know if what you choose will ultimately lead to the best outcome.
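To make the baseline concrete, here is a minimal sketch of traditional in-context learning for a sentiment task: hand-picked demonstrations are prepended to the query in the prompt. The `demos` pairs and the `Review:`/`Sentiment:` template are hypothetical, for illustration only.

```python
def build_icl_prompt(demos, query):
    """Format hand-curated (input, label) pairs as few-shot demonstrations."""
    lines = []
    for text, label in demos:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The query goes last; the model completes the final "Sentiment:" line.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

# Manually curated examples -- exactly the step SG-ICL wants to remove.
demos = [
    ("The plot was gripping from start to finish.", "POSITIVE"),
    ("A dull script and wooden acting.", "NEGATIVE"),
]
prompt = build_icl_prompt(demos, "An unforgettable performance.")
print(prompt)
```

The quality of `demos` directly shapes the result, which is why curation is the bottleneck.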

Instead of looking for examples, SG-ICL prompts the LLM to create its own. Here's how it works:

  1. Guided generation. We give the LLM a bit of context: the specific task instructions and the kind of output we expect, basically telling it what it should write.
  2. LLM writes the examples. The LLM generates text samples that fit the task.
  3. Learn from within. Self-generated examples go straight back into the context, removing the need for any external data.
```mermaid
graph TD
    Input@{ shape: lean-r, label: Input }
    Result@{ shape: lean-r, label: Result }
    subgraph SG[**Self Generation**]
        PR@{ shape: lean-r, label: "Here is an input: {{input}} Generate a **POSITIVE** review for this." }
        PRG@{ shape: lin-rect, label: "Generation of positive examples" }
        NR@{ shape: lean-r, label: "Here is an input: {{input}} Generate a **NEGATIVE** review for this." }
        NRG@{ shape: lin-rect, label: "Generation of negative examples" }
        subgraph Examples[**Examples**]
            PRR@{ shape: lean-r, label: "Positive Examples" }
            NRR@{ shape: lean-r, label: "Negative Examples" }
        end
        PR --> PRG --> PRR
        NR --> NRG --> NRR
    end
    subgraph Inference[**Inference**]
        Task@{ shape: lean-r, label: "Your task is {{task}} Examples: {{positive examples}} {{negative examples}}" }
        TaskResultGeneration@{ shape: lin-rect, label: "Generation of task result" }
    end
    Input --> PR
    Input --> NR
    PRR --> Task
    NRR --> Task
    Task --> TaskResultGeneration --> Result
```
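The flow above can be sketched in code. This is a minimal illustration, assuming a hypothetical `call_llm(prompt) -> str` wrapper around whatever LLM API you use; it is stubbed here with canned outputs so the sketch is self-contained.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call an actual LLM API here.
    if "POSITIVE" in prompt:
        return "A wonderful film, I loved every minute."
    return "A tedious film, I regret watching it."

def self_generate_examples(task_input: str, n: int = 2):
    """Steps 1-2: prompt the model to write its own demonstrations."""
    positives = [
        call_llm(f"Here is an input: {task_input}\nGenerate a POSITIVE review for this.")
        for _ in range(n)
    ]
    negatives = [
        call_llm(f"Here is an input: {task_input}\nGenerate a NEGATIVE review for this.")
        for _ in range(n)
    ]
    return positives, negatives

def build_inference_prompt(task: str, positives, negatives) -> str:
    """Step 3: feed the self-generated examples back into the context."""
    pos = "\n".join(f"POSITIVE: {p}" for p in positives)
    neg = "\n".join(f"NEGATIVE: {x}" for x in negatives)
    return f"Your task is {task}\nExamples:\n{pos}\n{neg}"

positives, negatives = self_generate_examples("a review of the film 'Example'")
prompt = build_inference_prompt("sentiment classification", positives, negatives)
print(prompt)
```

No external dataset appears anywhere: the demonstrations in the final prompt come entirely from the model's own generations.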

Results

  • Improved performance. Get better results without having to find or create additional training data.
  • Greater consistency. Because the LLM creates the examples itself, performance is more consistent across runs than with arbitrarily chosen external demonstrations.

Could SG-ICL work for you?

This approach has great potential:

  • Custom solutions. Adapting language models for specific tasks.
  • Solve data scarcity issues. Particularly helpful when a large data set is not readily available.
