Prompt Learning

Posted by cjj on 2022-07-08
Words 692 and Reading Time 3 Minutes

What is a prompt?

Background reading (blog post): https://wmathor.com/index.php/archives/1587/

Prompt learning has been one of the hottest research directions in NLP over the past year.

Background

Foundations: the Transformer and self-supervision. Prompt learning is a training paradigm, pre-train + prompt, that now stands alongside pre-train + fine-tune.

Closely related concepts

Language models: BERT (cloze-style masked-word prediction) and GPT (given an input, predict the probability of the next word)
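To make the GPT side concrete, the next-word probabilities can be inspected directly. A minimal sketch (my illustration, assuming the publicly available gpt2 checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# Probability distribution over the next word after a partial sentence.
ids = tok("The movie was", return_tensors="pt").input_ids
with torch.no_grad():
    probs = torch.softmax(model(ids).logits[0, -1], dim=-1)

top = probs.topk(3)
print([tok.decode(int(i)) for i in top.indices], top.values.tolist())
```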

Prompt for LM: the prompt is a textual cue supplied to the language model

Example: to design a sentiment classifier for movie reviews, we build the model input as: review + "in a word, this movie is ____". The hand-designed template "in a word …" it contains is where prompts originated.
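A minimal sketch of this cloze-style classifier (my illustration, not from the post: it assumes bert-base-uncased and the label words "great"/"terrible" as the verbalizer):

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

review = "The plot was gripping and the acting superb."
# Hand-written template turns classification into masked-word prediction.
text = f"{review} In a word, this movie is {tokenizer.mask_token}."

inputs = tokenizer(text, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Verbalizer: compare the scores of two label words at the [MASK] slot.
good_id = tokenizer.convert_tokens_to_ids("great")
bad_id = tokenizer.convert_tokens_to_ids("terrible")
print("positive" if logits[good_id] > logits[bad_id] else "negative")
```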

Advantages of prompting over fine-tuning: 1. lower compute cost; 2. lower storage cost (mainly because far fewer parameters need to be trained and stored per task).

Prompt learning

The template need not be hand-designed; the model can learn it on its own. Method: prepend some trainable embeddings to the input and train them (a sketch follows the list below).

  • Essence: shift the downstream task's data distribution toward the distribution of the (pre-)training data
  • Why it works: a prompt can change the Transformer's attention, which suits tasks relying on global features, but not extractive question answering
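A minimal PyTorch sketch of such a learnable (soft) prompt; the class name and hyperparameters here are illustrative, not from the post:

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prepend n_tokens trainable 'virtual token' embeddings to the input."""

    def __init__(self, n_tokens: int, embed_dim: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim)
        batch_size = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch_size, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Usage with a frozen PLM (names are illustrative); the attention mask
# must also be extended by n_tokens:
#   embeds = model.get_input_embeddings()(input_ids)
#   outputs = model(inputs_embeds=SoftPrompt(20, 768)(embeds))
```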

prompt learning 和 CV

  • Prompting in NLP is still immature; many problems remain unsolved
  • BERT-style pre-training in CV has not seen major breakthroughs either

Conclusion: prompting is unlikely to see major progress in CV in the near term.


This resembles a QA task, with the question playing the role of the prompt. Prompts divide into discrete ones (human-readable text) and continuous ones (learned vectors).

Hands-on demo: https://huggingface.co/hfl/chinese-roberta-wwm-ext?text=%E5%90%8E%E9%9D%A2%E7%9A%84%E4%BE%8B%E5%AD%90%E6%83%85%E6%84%9F%E9%9D%9E%E5%B8%B8%5BMASK%5D%EF%BC%8C%E4%BB%8A%E5%A4%A9%E5%A4%A9%E6%B0%94%E7%9C%9F%E5%A5%BD%EF%BC%81
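The same demo can be reproduced locally; the model name and the probe sentence below are taken from the link above, and the fill-mask pipeline is standard transformers usage:

```python
from transformers import pipeline

# Discrete prompt probed with a Chinese masked LM.
fill_mask = pipeline("fill-mask", model="hfl/chinese-roberta-wwm-ext")
for pred in fill_mask("后面的例子情感非常[MASK],今天天气真好!"):
    print(pred["token_str"], round(pred["score"], 3))
```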

Latest research trends

EMNLP 2021 paper preview session

https://www.bilibili.com/video/BV1zf4y1u7og?spm_id_from=333.337.search-card.all.click&vd_source=75b4a2d423c495540711366f06d73d56


Research dimensions


Publication trends of prompt papers, seen from different perspectives:

Perspective 1: Pretrained Language Models

  1. GPT came earliest
  2. BERT is the most widely used
    • classification tasks
    • knowledge probing
  3. T5 has become more popular recently
    • QA tasks
    • generation-based tasks
    • supports PLMs at richer (larger) scales
    • supports cross-lingual (multilingual) PLMs

Perspective 2: Tuning Strategies

  1. Tuning-free prompting (no parameters are updated) came first and remains popular
  2. Prompt-only tuning (only the prompt's parameters are updated) is gradually attracting more attention
  3. Tuning all parameters (both the PLM's and the prompt's) is less investigated (see the sketch after this list)
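A sketch of how the three strategies differ in which parameters receive gradients; the model choice, prompt length, and learning rates are illustrative assumptions:

```python
import torch
import torch.nn as nn
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
# 20 trainable soft-prompt vectors, as in the SoftPrompt sketch above.
soft_prompt = nn.Parameter(torch.randn(20, model.config.hidden_size) * 0.02)

# 1. Tuning-free: freeze everything; behaviour changes only via prompt text.
for p in model.parameters():
    p.requires_grad = False

# 2. Prompt-only tuning: the PLM stays frozen; only the prompt vectors learn.
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)

# 3. Tuning-all: unfreeze the PLM and optimize both parameter sets.
# for p in model.parameters():
#     p.requires_grad = True
# optimizer = torch.optim.AdamW([soft_prompt, *model.parameters()], lr=2e-5)
```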

Perspective 3: Tasks

  1. Commonsense reasoning came earliest but is growing slowly
  2. Classification, generation, and knowledge probing are the top-3 tasks
  3. Classification started late but is growing the fastest
  4. Machine translation is relatively under-explored
    • cross-lingual + encoder-decoder
  5. Vision (computer vision) arrived late but is growing now

Perspective 4: Training Samples

  1. Zero-shot started earliest; few-shot followed
  2. Few-shot is growing fastest; full-data settings are also popular

Future directions

What can we do?

  1. New tasks: apply prompt learning to new tasks (is the task itself worth solving?)
  2. New strategies: explore different tuning strategies
  3. New explorations: gain a deeper understanding
    • multi-task prompt learning
    • prompt-based data generation / bias analysis
    • pre-training
