A paradigm of Git for models, and interpretable model diffs
What if we started thinking about fine-tuned models and training checkpoints through the language of software version control?
The Git analogy for neural networks
If we map standard machine learning workflows to Git terminology, a few interesting concepts emerge:
- Forks: Fine-tuning a base model on a new, specific dataset is essentially forking the original model.
- Diffs: The change in weights between two saved checkpoints (say, one per epoch), or a parameter-efficient adapter, acts as the diff.
- Merges: Could we merge branches? Imagine taking two forks of a language model, each fine-tuned on an entirely separate dataset, and merging their capabilities back into a single unified model.
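To make the mapping concrete, here is a minimal sketch that treats a model as a dictionary of named weight lists. The helper names (diff, apply_diffs) and the toy weights are hypothetical, and summing both diffs onto the base is only one naive merge strategy, assumed here for illustration. Nothing guarantees that it preserves either fork's capabilities in a real network.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flat list of values


def diff(base: Weights, tuned: Weights) -> Weights:
    """The 'diff' of a fork: tuned weights minus base weights, per parameter."""
    return {
        name: [t - b for b, t in zip(base[name], tuned[name])]
        for name in base
    }


def apply_diffs(base: Weights, *deltas: Weights) -> Weights:
    """Naive 'merge': add every diff onto a copy of the base weights."""
    merged = {name: list(values) for name, values in base.items()}
    for delta in deltas:
        for name, values in delta.items():
            merged[name] = [m + d for m, d in zip(merged[name], values)]
    return merged


# Toy weights, made up for illustration.
base = {"layer.weight": [1.0, 2.0, 3.0]}
fork_a = {"layer.weight": [1.5, 2.0, 3.0]}  # fine-tuned on dataset A
fork_b = {"layer.weight": [1.0, 2.0, 2.0]}  # fine-tuned on dataset B

delta_a = diff(base, fork_a)
delta_b = diff(base, fork_b)
merged = apply_diffs(base, delta_a, delta_b)
print(merged)
```

In this toy case the two forks touch different entries, so the merge is clean. When both diffs change the same entry, summing them is exactly where a merge conflict would appear.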
The interpretability angle
One goal is to apply and develop interpretability techniques that explain the diff between two models.
Currently, when a model outperforms a previous state-of-the-art checkpoint by a small margin on a leaderboard, we look at the accuracy bump and call it a day. But is that increase actually semantically meaningful? We'd like a way to move beyond global accuracy stats and find a method that explains the difference between two models in human-understandable terms.
If we can isolate these model diffs, we can start asking functional questions:
- Can we map a specific loss curve to newly learned concepts?
- If we look at two models that have the same validation accuracy, how do their underlying belief states differ?
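As a crude starting point for the second question, we can at least check where two models disagree, even when their aggregate accuracy is identical. The labels and predictions below are made up for illustration, and per-example disagreement is only a behavioral proxy, not an explanation of what either model has learned.

```python
from typing import List

# Toy validation set and predictions from two hypothetical models.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds_a = [1, 0, 1, 0, 0, 0, 1, 1]
preds_b = [1, 1, 1, 1, 0, 0, 0, 0]


def accuracy(preds: List[int], targets: List[int]) -> float:
    """Fraction of examples where the prediction matches the label."""
    return sum(p == y for p, y in zip(preds, targets)) / len(targets)


def disagreements(a: List[int], b: List[int]) -> List[int]:
    """Indices of examples on which the two models predict differently."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


acc_a = accuracy(preds_a, labels)
acc_b = accuracy(preds_b, labels)
differing = disagreements(preds_a, preds_b)

print(acc_a, acc_b)  # same accuracy for both models
print(differing)     # but they differ on half of the examples
```

The leaderboard number hides that these two models make entirely different mistakes. The examples in the disagreement set are the natural place to point interpretability tools.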
Handling merge conflicts in continual learning
This line of thinking was originally prompted by the idea of a continual-learning language model: specifically, one fine-tuned daily on ingested news data. In that setup, each day (or, more granularly, each news outlet) could be represented as a separate adapter module.
For example, as new information flows in, why should a model's predicted probability for "convex" change in the prompt "A Platonic solid is a [MASK] regular polyhedron."? The model shouldn't unlearn static facts just because it read today's headlines.
This is where merge conflicts come into play. If we use a technique like diff pruning, which learns a sparse, masked update to the base parameters, detecting parameter overlaps becomes relatively straightforward: two updates conflict wherever both masks are active. The harder question is the desired outcome: what should we do when there is a conflict?
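The detection half can be sketched in a few lines, assuming each adapter exposes a binary mask per parameter tensor (1 where the update touches an entry, 0 elsewhere). The mask layout, the parameter names, and the find_conflicts helper are all hypothetical, and this is not the API of any diff pruning implementation.

```python
from typing import Dict, List

Mask = Dict[str, List[int]]  # parameter name -> flat binary mask


def find_conflicts(mask_a: Mask, mask_b: Mask) -> Dict[str, List[int]]:
    """Return, per shared parameter, the indices that both updates modify."""
    conflicts = {}
    for name in mask_a.keys() & mask_b.keys():
        overlap = [
            i for i, (a, b) in enumerate(zip(mask_a[name], mask_b[name]))
            if a and b
        ]
        if overlap:
            conflicts[name] = overlap
    return conflicts


# Toy masks for two daily adapters, made up for illustration.
monday = {"attn.weight": [1, 0, 1, 0, 0], "mlp.weight": [0, 1, 0]}
tuesday = {"attn.weight": [0, 0, 1, 1, 0], "mlp.weight": [1, 0, 0]}

conflicts = find_conflicts(monday, tuesday)
print(conflicts)
```

Here only index 2 of attn.weight is touched by both days. How to resolve it (keep the later update, average the two, or flag it for review) is the open question raised above.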
Tooling for such systems
This Git-based paradigm naturally lends itself to thinking about GitHub equivalents, and about how we might treat the development of deep learning models like open-source software.
Edit: Colin Raffel has expanded on this idea significantly in his blog post.