Era 4 · Foundation Models (2020-2022)
From GPT-3 igniting belief in scaling to ChatGPT reshaping the product form — the three critical years in which large models went from a research paradigm to a commercial one.
Collected Notes
- GPT-3 — When 175B Parameters Made Prompting the New Programming Paradigm · 2020 · Brown et al. (OpenAI)
- DDPM — Recasting Generative Modeling as a Denoising Process · 2020 · Ho, Jain, Abbeel (UC Berkeley)
- ViT — An Image Is Worth 16×16 Words: Transformers Enter Vision · 2020 · Dosovitskiy et al. (Google Brain)
- CLIP — Learning Visual Models from 400M Image-Text Pairs · 2021 · Radford et al. (OpenAI)
- AlphaFold 2 — Cracking the 50-Year Protein Folding Problem with Attention · 2021 · Jumper et al. (DeepMind)
- Stable Diffusion — Democratizing T2I with Latent Diffusion · 2022 · Rombach et al. (CompVis)
- InstructGPT — RLHF Turns GPT-3 into ChatGPT's Direct Predecessor · 2022 · Ouyang et al. (OpenAI)
- Chain-of-Thought Prompting — Spelling Out Intermediate Reasoning Steps Unlocks Reasoning · 2022 · Wei et al. (Google Brain)
- Chinchilla — Rewriting Scaling Laws with Compute-Optimal Training · 2022 · Hoffmann et al. (DeepMind)
- Flamingo — Frozen LM + Perceiver Resampler + Gated Cross-Attention: The Multimodal Few-Shot Starting Line · 2022 · Alayrac et al. (DeepMind)
- LoRA — Low-Rank Adaptation Lets Anyone Fine-tune a 175B Model · 2021 · Hu et al. (Microsoft Research)
- MAE — 75% Masking Brings ViT Its Own BERT Moment · 2022 · He et al. (FAIR)
- NeRF — An 8-Layer MLP Compresses a Scene into a Differentiable 5D Radiance Field · 2020 · Mildenhall et al. (UC Berkeley)
- Scaling Laws — 7 Orders of Magnitude Make LLM Training Predictable Engineering · 2020 · Kaplan et al. (OpenAI)
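The Chinchilla entry above turns on a simple compute-allocation rule. A minimal sketch, assuming the standard approximation that training cost is C ≈ 6·N·D FLOPs (N parameters, D tokens) and the paper's headline finding that parameters and tokens should scale roughly equally, at about 20 training tokens per parameter; the function name and default ratio here are illustrative, not from any library:

```python
def compute_optimal(flops: float, tokens_per_param: float = 20.0):
    """Split a FLOP budget into model size N and token count D under C = 6*N*D.

    With D = tokens_per_param * N, the budget gives C = 6 * tokens_per_param * N**2,
    so N = sqrt(C / (6 * tokens_per_param)).
    """
    n = (flops / (6.0 * tokens_per_param)) ** 0.5  # parameters
    d = tokens_per_param * n                       # training tokens
    return n, d

# Plugging in a Chinchilla-scale budget of ~5.76e23 FLOPs recovers the
# paper's regime of roughly 70B parameters trained on ~1.4T tokens.
n, d = compute_optimal(5.76e23)
```

Under this rule, model size and data grow together as C^0.5, which is why GPT-3-era models (large N, comparatively small D) were judged undertrained in hindsight.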
Candidate Papers (Selection)
- PaLM (2022) — The Pathways system plus a 540B-parameter validation of scaling
- DALL-E 2 (2022) — Hierarchical text-to-image generation