🛞Phi-2

As described in the METHOD section, our final system for the Phi-2 track is shown below.

Figure: System architecture for our Phi-2 solution

Answer Generation

Our pipeline uses a fine-tuned Phi-2 model, as described in the Finetuning (Phi-2) section.
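For completeness, the sketch below shows one way the fine-tuned model and tokenizer could be loaded with Hugging Face transformers; the checkpoint path phi2-finetuned is a placeholder, not our actual artifact name.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "phi2-finetuned" is a placeholder path for the fine-tuned checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "phi2-finetuned",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).to("cuda")
model.eval()

tokenizer = AutoTokenizer.from_pretrained("phi2-finetuned", trust_remote_code=True)
# Phi-2's tokenizer ships without a pad token; reusing EOS keeps the
# attention-mask computation in generate_answer below well-defined.
tokenizer.pad_token = tokenizer.eos_token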

Answer Generation Process

Using the retrieved context and the constructed prompt, we generate an answer to the question.
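The generate_answer function below relies on a create_prompt helper from our prompt-construction step. Its exact template is not reproduced here; the following is a hypothetical sketch, assuming the prompt simply lists the context, abbreviation expansions, question, and numbered options:

def create_prompt(question, options, context, abbreviations):
    # Hypothetical sketch only: the real prompt template may differ.
    abbrev_lines = "\n".join(f"{short}: {full}" for short, full in abbreviations.items())
    option_lines = "\n".join(
        f"Option {i}: {text}" for i, text in enumerate(options, start=1)
    )
    return (
        f"Context:\n{context}\n\n"
        f"Abbreviations:\n{abbrev_lines}\n\n"
        f"Question: {question}\n"
        f"{option_lines}\n"
        "Answer:"
    )

With the prompt assembled, the answer is generated as follows: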

def generate_answer(question, options, context, abbreviations, model, tokenizer):
    # Build the full prompt from the question, answer options,
    # retrieved context, and abbreviation expansions.
    prompt = create_prompt(question, options, context, abbreviations)
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to("cuda")

    # Mask out padding tokens so attention ignores them.
    attention_mask = input_ids.ne(tokenizer.pad_token_id).long().to("cuda")

    # Greedy decoding, capped at 10 new tokens for a concise answer.
    outputs = model.generate(
        input_ids,
        attention_mask=attention_mask,
        max_new_tokens=10,
        pad_token_id=tokenizer.eos_token_id,
        num_beams=1,
        early_stopping=True,
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    answer = tokenizer.decode(
        outputs[0][input_ids.shape[1]:], skip_special_tokens=True
    )
    return answer

The generation parameters are set to produce a concise answer (at most 10 new tokens), which makes it straightforward to parse the specific answer choice out of the response.
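As an illustration, a call could look like the following; the inputs are made-up placeholders:

# Placeholder inputs for illustration only.
question = "What does the term XYZ refer to?"
options = ["First choice", "Second choice", "Third choice", "Fourth choice"]
context = "Retrieved passages relevant to the question ..."
abbreviations = {"XYZ": "an example abbreviation expansion"}

response = generate_answer(question, options, context, abbreviations, model, tokenizer)
print(response)  # expected to contain something like "Option 2"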

Answer Parsing

We parse the specific answer choice out of the model's response.

import re


def parse_answer(response):
    # Prefer an explicit "Answer: Option <n>" pattern.
    match = re.search(r"Answer:\s*Option\s*(\d+)", response, re.IGNORECASE)
    if match:
        answer = f"Option {match.group(1)}"
    else:
        # Fall back to the first bare digit anywhere in the response.
        match = re.search(r"(\d+)", response)
        if match:
            answer = f"Option {match.group(1)}"
        else:
            answer = "Error"
    return answer
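A few example inputs illustrate the fallback behaviour:

print(parse_answer("Answer: Option 3"))  # -> "Option 3"
print(parse_answer("The answer is 2."))  # -> "Option 2"
print(parse_answer("No idea."))          # -> "Error"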
