Era 4 · Foundation Models (2020-2022)
From GPT-3 igniting belief in scaling to ChatGPT reshaping the product form — the three critical years in which large models went from a research paradigm to a commercial one.
Collected Notes
- GPT-3 — When 175B Parameters Made Prompting the New Programming Paradigm · 2020 · Brown et al. (OpenAI)
- DDPM — Recasting Generative Modeling as a Denoising Process · 2020 · Ho, Jain, Abbeel (UC Berkeley)
- ViT — An Image Is Worth 16×16 Words: Transformers Enter Vision · 2020 · Dosovitskiy et al. (Google Brain)
- CLIP — Learning Visual Models from 400M Image-Text Pairs · 2021 · Radford et al. (OpenAI)
- AlphaFold 2 — Cracking the 50-Year Protein Folding Problem with Attention · 2021 · Jumper et al. (DeepMind)
- Stable Diffusion — Democratizing T2I with Latent Diffusion · 2022 · Rombach et al. (CompVis)
- InstructGPT — RLHF Turns GPT-3 into ChatGPT's Direct Predecessor · 2022 · Ouyang et al. (OpenAI)
- Chain-of-Thought Prompting — Spelling Out Intermediate Reasoning Steps Unlocks Reasoning · 2022 · Wei et al. (Google Brain)
- Chinchilla — Rewriting Scaling Laws with Compute-Optimal Training · 2022 · Hoffmann et al. (DeepMind)
- Flamingo — Frozen LM + Perceiver Resampler + Gated Cross-Attention: The Multimodal Few-Shot Starting Line · 2022 · Alayrac et al. (DeepMind)
- LoRA — Low-Rank Adaptation Lets Anyone Fine-tune a 175B Model · 2021 · Hu et al. (Microsoft Research)
- MAE — 75% Masking Brings ViT Its Own BERT Moment · 2022 · He et al. (FAIR)
- NeRF — An 8-Layer MLP Compresses a Scene into a Differentiable 5D Radiance Field · 2020 · Mildenhall et al. (UC Berkeley)
- Scaling Laws — 7 Orders of Magnitude Make LLM Training Predictable Engineering · 2020 · Kaplan et al. (OpenAI)
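The Chinchilla entry above turns on a simple compute-allocation rule. A minimal sketch, assuming the standard approximation that training cost is C ≈ 6·N·D FLOPs (N parameters, D tokens) and the paper's headline finding that parameters and tokens should scale roughly equally, at about 20 training tokens per parameter; the function name and default ratio here are illustrative, not from any library:

```python
def compute_optimal(flops: float, tokens_per_param: float = 20.0):
    """Split a FLOP budget into model size N and token count D under C = 6*N*D.

    With D = tokens_per_param * N, the budget gives C = 6 * tokens_per_param * N**2,
    so N = sqrt(C / (6 * tokens_per_param)).
    """
    n = (flops / (6.0 * tokens_per_param)) ** 0.5  # parameters
    d = tokens_per_param * n                       # training tokens
    return n, d

# Plugging in a Chinchilla-scale budget of ~5.76e23 FLOPs recovers the
# paper's regime of roughly 70B parameters trained on ~1.4T tokens.
n, d = compute_optimal(5.76e23)
```

Under this rule, model size and data grow together as C^0.5, which is why GPT-3-era models (large N, comparatively small D) were judged undertrained in hindsight.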
Candidate Papers (Selection)
- PaLM (2022) — The Pathways system plus a 540B-parameter validation of scaling
- DALL-E 2 (2022) — Hierarchical text-to-image generation