Visualization of embeddings with PCA during fine-tuning of a Vision Transformer



Fine-tuning reshapes the embeddings of an image-classification model. Before fine-tuning, the embeddings are general-purpose representations learned during pre-training; after fine-tuning, they encode task-specific features. This shift can change the outcome of downstream tasks such as outlier detection. Because each stage has its own strengths, pre- and post-fine-tuning embeddings are best used together for a comprehensive analysis.
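The sketch below shows one way such a visualization can be produced. It assumes the Hugging Face transformers and datasets libraries plus scikit-learn and matplotlib, extracts embeddings from the pre-trained google/vit-base-patch16-224-in21k model for a small CIFAR-10 sample, and projects them to 2D with PCA. Using the [CLS] token as the image embedding and the sample/batch sizes are illustrative assumptions, not a fixed recipe.

```python
# Minimal sketch (assumptions: Hugging Face transformers/datasets, scikit-learn,
# matplotlib; the [CLS] token serves as the image embedding).
import matplotlib.pyplot as plt
import numpy as np
import torch
from datasets import load_dataset
from sklearn.decomposition import PCA
from transformers import ViTImageProcessor, ViTModel

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pre-trained (not yet fine-tuned) Vision Transformer backbone.
name = "google/vit-base-patch16-224-in21k"
processor = ViTImageProcessor.from_pretrained(name)
model = ViTModel.from_pretrained(name).to(device).eval()

# A small CIFAR-10 sample keeps the example fast; scale up as needed.
dataset = load_dataset("cifar10", split="test[:256]")
images, labels = dataset["img"], np.array(dataset["label"])

# Extract the [CLS] token of the last hidden state as the embedding.
embeddings = []
with torch.no_grad():
    for i in range(0, len(images), 32):
        batch = processor(images=images[i : i + 32], return_tensors="pt").to(device)
        out = model(**batch)
        embeddings.append(out.last_hidden_state[:, 0, :].cpu().numpy())
embeddings = np.concatenate(embeddings)

# Project the 768-dimensional embeddings to 2D with PCA and plot.
points = PCA(n_components=2).fit_transform(embeddings)
plt.scatter(points[:, 0], points[:, 1], c=labels, cmap="tab10", s=8)
plt.colorbar(label="CIFAR-10 class")
plt.title("PCA of ViT [CLS] embeddings (before fine-tuning)")
plt.savefig("embeddings_pca.png")
```

Re-running the extraction with a fine-tuned checkpoint in place of the pre-trained one gives the post-fine-tuning view; one option for keeping the two plots comparable is to fit the PCA once on the pre-fine-tuning embeddings and reuse that projection for both.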

Blog: https://medium.com/@markus.stoll/changes-of-embeddings-during-fine-tuning-c22aa1615921

References:
Machine Learning Model: google/vit-base-patch16-224-in21k: Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby: An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale (2020), arXiv https://arxiv.org/abs/2010.11929

Dataset: CIFAR10: Alex Krizhevsky, Learning Multiple Layers of Features from Tiny Images (2009), University of Toronto https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf https://www.cs.toronto.edu/~kriz/cifar.html
