Introduction to Few-Shot Learning#

Example: Siamese Network#

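Constructing positive and negative samples (training data): a positive sample consists of two images drawn from the same class, forming a triple \((class1, class1, 1)\). A negative sample consists of one image from each of two different classes, also forming a triple \((class1, class2, 0)\).
Building the model: a feature extractor maps each image to a feature vector. Feeding two images into the same network yields two feature vectors; subtracting one from the other gives a vector that describes how the two images differ. A fully connected layer then maps this difference vector to a scalar, and a Sigmoid function turns the scalar into a similarity score between 0 and 1.
Updating the parameters: the ground truth \(y\) is the binary label from the sample triple (1 for the same class, 0 for different classes). Compute the cross entropy between the predicted value \(\hat{y}\) and the ground truth \(y\) and call it the Loss. To minimize the Loss, update the parameters with backpropagation.
Triplet Loss: first pick an anchor from some class, then pick a positive sample from the same class, and finally pick a negative sample from a different class, forming a triple \((pos, anchor, neg)\). Feeding all three images into the network yields three feature vectors \(f(x^+), f(x^a), f(x^-)\). Compute the Euclidean distance from the anchor to each of the other two: \(d^+ = \lVert f(x^+) - f(x^a) \rVert_2\) and \(d^- = \lVert f(x^-) - f(x^a) \rVert_2\). The goal is to make \(d^+\) as small as possible and \(d^-\) as large as possible. \(Loss(x^a, x^+, x^-) = \max\left\{0, d^+ + \alpha - d^-\right\}\), where \(\alpha\) is a hyperparameter (the margin). Update the network parameters to minimize the Loss.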
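
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names (`make_pairs`, `similarity`, `binary_cross_entropy`, `triplet_loss`) are hypothetical, feature vectors are plain lists assumed to come from some feature extractor, and the "difference" of two feature vectors is assumed to be the element-wise absolute difference.

```python
import math
import random


def make_pairs(data, n_pairs, rng):
    """Build (sample_a, sample_b, label) triples from {class: [samples]}.

    Even indices give positive pairs (same class, label 1),
    odd indices give negative pairs (different classes, label 0).
    Every class is assumed to hold at least two samples.
    """
    classes = sorted(data)
    pairs = []
    for i in range(n_pairs):
        if i % 2 == 0:
            c = rng.choice(classes)
            a, b = rng.sample(data[c], 2)
            pairs.append((a, b, 1))
        else:
            c1, c2 = rng.sample(classes, 2)
            pairs.append((rng.choice(data[c1]), rng.choice(data[c2]), 0))
    return pairs


def similarity(f_a, f_b, w, b):
    """Absolute difference -> linear layer -> Sigmoid (assumed head)."""
    z = [abs(x - y) for x, y in zip(f_a, f_b)]
    logit = sum(w_i * z_i for w_i, z_i in zip(w, z)) + b
    return 1.0 / (1.0 + math.exp(-logit))


def binary_cross_entropy(y, y_hat, eps=1e-12):
    """Cross entropy between a 0/1 label and a predicted similarity."""
    y_hat = min(max(y_hat, eps), 1.0 - eps)  # avoid log(0)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(f_anchor, f_pos, f_neg, alpha=1.0):
    """max{0, d+ + alpha - d-} with Euclidean distances to the anchor."""
    d_pos = euclidean(f_pos, f_anchor)
    d_neg = euclidean(f_neg, f_anchor)
    return max(0.0, d_pos + alpha - d_neg)
```

With an anchor at `[0, 0]`, a positive at `[0, 1]` and a negative at `[3, 4]`, the distances are 1 and 5, so with `alpha=1.0` the margin is already satisfied and the loss is 0. Swapping the positive and the negative gives a loss of 5. In a real setup the feature extractor and the head weights `w`, `b` would be trained by backpropagation in a deep learning framework; this sketch only shows the loss computations.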

Note

In recent years, the most accurate methods have been embedding-based: they map images to feature vectors, an idea similar to the Siamese Network.

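Pretraining and Fine Tuning#

The idea is simple: pretrain on a large dataset (the Train Set), then fine-tune on a small dataset (the Support Set). Despite its simplicity, this approach achieves relatively high accuracy. For a code implementation, see the hands-on section of 《迁移学习简明手册》 (王晋东等. 迁移学习简明手册. 2018. URL: https://kdocs.cn/l/ch4VStxYGjpp).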

Step1: Pretraining
- Pretrain a CNN on large-scale training data.
- Use the CNN for feature extraction.
Step2: Fine Tuning
- Train a classifier on the support set.
- Tricks:
  - Use \(\mathbf{M}\) to initialize \(\mathbf{W}\).
  - Entropy regularization.
  - Cosine similarity + Softmax classifier.
Step3: Few Shot Prediction
- Map images in the support set to feature vectors.
- Obtain the mean feature vector of each class, \(\mu_1, \mu_2, \dots, \mu_k\).
- Compare the feature vector of the query with \(\mu_1, \mu_2, \dots, \mu_k\).
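
Step 3 can be illustrated with a small standard-library sketch. It assumes the feature vectors have already been produced by the pretrained CNN and are given as plain lists. The function names and the toy `support` data are hypothetical, and the comparison is assumed to be cosine similarity against the normalized class means.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def normalize(v):
    """Scale v to unit L2 norm (v is assumed to be non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(u, v):
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def class_means(support_features):
    """Map {label: [feature vectors]} to {label: normalized mean vector}."""
    return {label: normalize(mean_vector(vs))
            for label, vs in support_features.items()}


def predict(query_feature, means):
    """Return the label whose mean is most similar to the query, plus all scores."""
    scores = {label: cosine_similarity(query_feature, mu)
              for label, mu in means.items()}
    return max(scores, key=scores.get), scores


# Toy 2-way 2-shot support set with 2-dimensional features (made-up numbers).
support = {
    "cat": [[1.0, 0.1], [0.9, 0.0]],
    "dog": [[0.0, 1.0], [0.2, 0.8]],
}
means = class_means(support)
label, scores = predict([0.8, 0.1], means)
print(label)  # cat
```

The query `[0.8, 0.1]` points almost in the same direction as the "cat" mean, so its cosine similarity with "cat" is close to 1 and much larger than with "dog".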

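Cosine similarity: measures how similar two vectors are. In general \(\cos\theta = \frac{\mathbf{x}^T\mathbf{w}}{\lVert\mathbf{x}\rVert_2 \lVert\mathbf{w}\rVert_2}\); when both vectors have unit length, the inner product itself equals the cosine, \(\cos\theta=\mathbf{x}^T\mathbf{w}\).
Softmax Function: maps a vector to a probability distribution and is typically used in the output layer. It first exponentiates every element of the vector and then normalizes, \(p_i = e^{x_i} / \sum_j e^{x_j}\). Each probability expresses the Confidence for the corresponding class. Softmax accentuates differences: the largest entry receives a larger share of the probability mass and the smallest entries a smaller share.
Fine Tuning: learning \(\mathbf{W}\) and \(\mathbf{b}\) on the Support Set is the Fine Tuning step. Without it, \(\mathbf{W}\) and \(\mathbf{b}\) are not learned but simply set to \(\mathbf{b} = 0\) and \(\mathbf{W} = \mathbf{M}\), where \(\mathbf{M}\) is the matrix formed from the mean feature vector of each class.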
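
As a rough illustration of these three definitions, the sketch below builds a Softmax classifier initialized with \(\mathbf{W} = \mathbf{M}\), \(\mathbf{b} = 0\) and applies one gradient step on a single support example. Several things here are assumptions rather than part of the notes: the classifier form \(\mathbf{p} = \mathrm{Softmax}(\mathbf{W} f(x) + \mathbf{b})\), storing the class means as rows of \(\mathbf{M}\), the plain SGD update, and all names and numbers. Entropy regularization is left out.

```python
import math


def softmax(z):
    """Exponentiate and normalize; subtracting max(z) avoids overflow."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, b, f):
    """p = Softmax(W f + b), with W stored as a list of rows."""
    logits = [sum(w_j * f_j for w_j, f_j in zip(row, f)) + b_k
              for row, b_k in zip(W, b)]
    return softmax(logits)


def cross_entropy(p, label):
    return -math.log(p[label])


def sgd_step(W, b, f, label, lr=0.1):
    """One in-place SGD step on the cross-entropy loss for one example.

    The gradient with respect to logit k is p_k - y_k, where y is the
    one-hot encoding of the label.
    """
    p = predict_proba(W, b, f)
    for k in range(len(W)):
        g = p[k] - (1.0 if k == label else 0.0)
        b[k] -= lr * g
        for j in range(len(f)):
            W[k][j] -= lr * g * f[j]
    return cross_entropy(p, label)  # loss before the update


# Hypothetical unit-norm class means, one row per class.
M = [[1.0, 0.0], [0.0, 1.0]]
W = [row[:] for row in M]   # initialize W with M
b = [0.0, 0.0]              # initialize b with zeros

f, label = [0.6, 0.8], 0    # a unit-norm support feature labeled class 0
loss_before = cross_entropy(predict_proba(W, b, f), label)
sgd_step(W, b, f, label, lr=0.5)
loss_after = cross_entropy(predict_proba(W, b, f), label)
print(loss_after < loss_before)  # True
```

Because `f` and the rows of `M` all have unit length, the initial logits are exactly the cosine similarities between the feature and each class mean, which is why starting from \(\mathbf{W} = \mathbf{M}\) reproduces the prediction made without fine-tuning.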

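Application Scenarios#

Resources#

Introductory Videos#

Meta-Learning and Few-Shot Learning, by 王树森 (Shusen Wang), on Bilibili. Slides: Introduction / Siamese Network / Pretraining & Fine Tuning
Deep Reinforcement Learning, by 王树森 (Shusen Wang), on YouTube. Slides: Intro / Value-Based / Policy-Based / Actor-Critic Methods / Model-Based
王树森 (Shusen Wang) course notes: Deep Reinforcement Learning (PDF)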

Introductory Blog Posts#

Surveys#

Generalizing from a Few Examples: A Survey on Few-Shot Learning (notes and paper walkthrough)
Meta-Learning in Neural Networks: A Survey
A Closer Look at Few-Shot Classification
A Baseline for Few-Shot Image Classification

Lecture Videos#

CS 330: Deep Multi-Task and Meta Learning (course homepage or Bilibili, 17.75 hours)
Chelsea Finn: Meta-Learning: from Few-Shot Learning to Rapid Reinforcement Learning (homepage or Bilibili)
Chelsea Finn: Building Unsupervised Versatile Agents with Meta-Learning (YouTube, 1 hour)
李宏毅 (Hung-yi Lee): Meta Learning (YouTube or Bilibili)

Invited Talks#

Generalizing from Few Examples with Meta-Learning by Hugo Larochelle (Video and Slides)
Workshop on Meta-Learning (MetaLearn 2021) Video
Deep Learning: Bridging Theory and Practice Video
Challenges in Multi-Task Learning and Meta-Learning (Video and Slides)
The Big Problem with Meta-Learning and How Bayesians Can Fix It (Video and Slides)

Implementations#

Papers With Code: Few-Shot Learning

Datasets#

Omniglot data set for one-shot learning (and Paper)
Tools for mini-ImageNet Dataset
ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
FGVC-Aircraft Benchmark
Caltech-UCSD Birds-200-2011
google-research/meta-dataset
relevant-awesome-datasets-repo - Few shot
Tool for evaluating reinforcement learning models: OpenAI Gym

Researchers#

Chelsea Finn, UC Berkeley
Pieter Abbeel, UC Berkeley
Erin Grant, UC Berkeley
Raia Hadsell, DeepMind
Misha Denil, DeepMind
Adam Santoro, DeepMind
Sachin Ravi, Princeton University
David Abel, Brown University
Brenden Lake, Facebook AI Research