Finetuning (Phi-2)
This page details our finetuning approach for Phi-2
As detailed in the Challenges and Objectives page, we find that:
Phi-2 is not very responsive to instructions, particularly regarding how the input should be formatted
The model does not seem to make good use of its full context window: adding more chunks of context degrades performance.
We finetune the model to help overcome both of these challenges.
We use LoRA (Low-Rank Adaptation) to keep the finetuning cheap in terms of the computational resources needed.
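As a minimal sketch of such a setup with Hugging Face PEFT, the configuration below wraps Phi-2 with LoRA adapters. The rank, alpha, dropout, and target modules are assumptions for illustration, not the exact values we used; the `bias="none"` setting matches the "no bias" entry in the hyperparameter list further down.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=16,                      # rank of the low-rank update (assumed)
    lora_alpha=32,             # scaling factor (assumed)
    lora_dropout=0.05,         # dropout on the LoRA layers (assumed)
    bias="none",               # matches the "no bias" setting listed below
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj"],  # assumed attention projections
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trained
```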
We select 1400 of the 1461 questions provided in the training txt file and hold out the remaining 61 as unseen data for validation.
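Concretely, the split could look like the sketch below; the file name, the one-question-per-line format, and the random seed are assumptions about how the data is stored, not our exact pipeline.

```python
import random

with open("train.txt") as f:
    questions = [line.strip() for line in f if line.strip()]

random.seed(42)                 # assumed seed, for reproducibility
random.shuffle(questions)

train_set = questions[:1400]    # used for finetuning
val_set = questions[1400:]      # 61 held-out questions for validation
```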
We use our RAG approach with chunk size 150 and k = 7 to generate static context, which we add to the training txt file; this can be seen in our repo.
We use this context for finetuning.
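As a rough, self-contained illustration of this step, the sketch below splits documents into 150-word chunks and keeps the k = 7 chunks most similar to each question. TF-IDF cosine similarity is used here as a stand-in for our actual retriever, and the chunking is word-based rather than token-based.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def chunk(text, size=150):
    """Split a document into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def build_static_context(question, chunks, k=7):
    """Return the k chunks most similar to the question, joined into one context string."""
    vectorizer = TfidfVectorizer().fit(chunks + [question])
    chunk_vecs = vectorizer.transform(chunks)
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, chunk_vecs)[0]
    top = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
    return "\n\n".join(chunks[i] for i in top)
```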
The prompt includes:
instructions for the QA task and the expected answer format
context
abbreviations
question
options
The objective is simple: given the instruction, context, abbreviations, question, and options, generate the correct option and an explanation. A sketch of how these pieces could be assembled is shown below.
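The following is a minimal sketch of such a prompt; the exact instruction wording and field labels are assumptions for illustration, not our exact template.

```python
PROMPT_TEMPLATE = """### Instruction:
Answer the multiple-choice question using the context below.
Reply with the correct option followed by a short explanation.

### Context:
{context}

### Abbreviations:
{abbreviations}

### Question:
{question}

### Options:
{options}

### Answer:
"""

def build_prompt(context, abbreviations, question, options):
    """Assemble the instruction, context, abbreviations, question, and options into one prompt."""
    return PROMPT_TEMPLATE.format(
        context=context,
        abbreviations=abbreviations,
        question=question,
        options="\n".join(options),
    )
```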
We encourage you to refer to the LoRA paper for further clarification on the following hyperparameters (a configuration sketch follows the list):
no bias
batch size = 1
gradient accumulation steps = 4
epochs = 2 (700 steps)
max learning rate =
lr scheduler: linear
warmup steps = 100
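Expressed as Hugging Face TrainingArguments, this configuration could look like the sketch below. The learning-rate value and the output directory are assumptions, since the actual values are not recorded here; everything else mirrors the list above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi2-lora-qa",          # assumed output directory
    per_device_train_batch_size=1,      # batch size = 1
    gradient_accumulation_steps=4,      # effective batch size of 4
    num_train_epochs=2,                 # 1400 samples / 4 = 350 steps per epoch, 700 total
    learning_rate=2e-4,                 # assumed; the actual max learning rate is not recorded here
    lr_scheduler_type="linear",
    warmup_steps=100,
    logging_steps=10,                   # assumed logging frequency
)
```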
We make available our finetuned models (a loading sketch follows the list):
(Best):
k=3:
no context:
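A finetuned LoRA adapter of this kind can be loaded on top of the base model with PEFT, as in the sketch below. The adapter repo id is a hypothetical placeholder, not one of our actual model links.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", torch_dtype=torch.float16, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# "our-org/phi2-lora-best" is a placeholder for the actual adapter repo.
model = PeftModel.from_pretrained(base, "our-org/phi2-lora-best")
model.eval()
```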