# Finetuning
NOTE: this document is a work in progress!
This document aims to provide a step-by-step guide to fine-tuning a model on conversations from gptme.
The goal of fine-tuning a model for gptme is to:

- Teach it the tools available in gptme
- Update out-of-date knowledge and conventions
- Improve its ability to recover from errors
## Step 1: Gather the data
To fine-tune, we first need data to fine-tune on.
We will fine-tune on our own conversation history, combined with a subset of the [OpenAssistant dataset][oa-dataset], to extend the training data with relevant examples.
We collect our own conversation history by running the following command:
```sh
./train/collect.py --model "HuggingFaceH4/zephyr-7b-beta"  # or whatever model you intend to fine-tune
```
This will create the files `train.csv` and `train.jsonl` in the `train` directory.
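To sanity-check the collected data, you can load the JSONL file directly. A minimal sketch; the exact record schema depends on what `collect.py` emits:

```python
import json

# Load the collected conversations for inspection.
with open("train/train.jsonl") as f:
    rows = [json.loads(line) for line in f]

print(f"{len(rows)} examples collected")
print(rows[0])  # inspect the fields of the first record
```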
TODO: describe how to get the OpenAssistant dataset

TODO: describe how to use exported ChatGPT conversations
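Until the first TODO is written up properly, here is a rough sketch of one way to fetch the OpenAssistant data, assuming the `OpenAssistant/oasst1` release on the Hugging Face Hub is the subset we want:

```python
from datasets import load_dataset

# Fetch the oasst1 release from the Hugging Face Hub.
oasst = load_dataset("OpenAssistant/oasst1", split="train")

# Keep only English assistant replies as candidate training examples.
oasst_en = oasst.filter(lambda row: row["lang"] == "en" and row["role"] == "assistant")
print(oasst_en[0]["text"])
```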
## Step 2: Prepare the data
We need to prepare the data for fine-tuning. This involves:

- Extending the data with examples from the OpenAssistant dataset
- Splitting the data into train and validation sets

We might want to make sure that the validation set consists only of examples from gptme, and not from the OpenAssistant dataset, so that the validation loss reflects performance on our own conversations.
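A sketch of such a split, assuming the merged data carries a `source` field added when combining the datasets (both the field name and the `combined.jsonl` file name are hypothetical):

```python
import json
import random

# Load the merged dataset (hypothetical file produced in the extend step).
with open("train/combined.jsonl") as f:
    rows = [json.loads(line) for line in f]

gptme_rows = [r for r in rows if r["source"] == "gptme"]
oa_rows = [r for r in rows if r["source"] == "openassistant"]

# Hold out ~10% of the gptme examples for validation;
# all OpenAssistant examples go into the training set.
random.seed(42)
random.shuffle(gptme_rows)
n_val = max(1, len(gptme_rows) // 10)
val_set = gptme_rows[:n_val]
train_set = gptme_rows[n_val:] + oa_rows
```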
TODO…
## Step 3: Fine-tune the model
Options:

-
  - Does it support Mistral? (and by extension Zephyr)
- [Hugging Face transformers][hf-transformers] (see the sketch after this list)
  - Examples for Llama2 by Meta
- [OpenPipe][openpipe]?
  - Looks interesting, but not sure if it's relevant for us.
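As a starting point, a minimal supervised fine-tuning sketch with Hugging Face transformers. Assumptions: `train.jsonl` has a `text` field per example, the hyperparameters are placeholders, and a 7B model will need a large GPU or a parameter-efficient method like LoRA on top of this:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "HuggingFaceH4/zephyr-7b-beta"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for padding in the collator

# Load the collected conversations (assumes a "text" field per record).
dataset = load_dataset("json", data_files={"train": "train/train.jsonl"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset["train"].map(
    tokenize, batched=True, remove_columns=dataset["train"].column_names
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="train/out",
        num_train_epochs=1,
        per_device_train_batch_size=1,
    ),
    train_dataset=tokenized,
    # Causal LM objective: labels are the input ids, no masked-LM masking.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```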
TODO…
## Model suggestions
- HuggingFaceH4/zephyr-7b-beta
- teknium/Replit-v2-CodeInstruct-3B
  - I had issues with this one on M2, but it would be good to have some 3B model as an example to use in testing/debugging.