Era 5 · Large Model Era (2023-present)
From LLaMA going open-source and SAM reshaping vision, to o1 igniting the "thinking era" and DeepSeek-R1 open-sourcing reasoning itself: in three years, AI went from "can chat" to "can reason."
Notes in this era (28)
- 3DGS — Bringing NeRF-Quality Radiance Fields into Real-Time Interaction · 2023 · Kerbl et al.
- AudioLM - Turning Raw Audio into a Language Modeling Problem · 2023 · Borsos et al.
- DINOv2 - Robust Visual Features without Supervision · 2023 · Oquab et al.
- DPO — Aligning LLMs Directly from Preferences without a Reward Model or PPO · 2023 · Rafailov et al.
- GPT-4 Technical Report - Capability Leap and the Black-Box Technical Report · 2023 · OpenAI
- LLaMA — How Smaller Parameters + More Tokens Helped Open-Source LLMs Catch Up to GPT-3 · 2023 · Touvron et al.
- Llama 2: Open Foundation and Fine-Tuned Chat Models · 2023 · Touvron et al.
- LLaVA - Turning GPT-4-Generated Visual Instructions into an Open Multimodal Assistant · 2023 · Liu et al.
- Mamba — How Selective State Spaces Became the First Credible Transformer Challenger in a Decade · 2023 · Gu & Dao
- Mixtral 8x7B — Open-Weight LLMs Enter the Sparse Expert Era · 2023 · Mistral AI
- QLoRA — Bringing 65B LLM Fine-Tuning onto a Single 48GB GPU · 2023 · Dettmers et al.
- RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control · 2023 · Brohan et al.
- RWKV - Reinventing RNNs for the Transformer Era · 2023 · Peng et al.
- SAM — One Prompt + 11M Images + 1B Masks: Turning Segmentation into a Foundation Model Problem · 2023 · Kirillov et al.
- Toolformer - Letting Language Models Teach Themselves When to Use Tools · 2023 · Schick et al.
- Tree of Thoughts — Deliberate Search as a Reasoning Interface for LLMs · 2023 · Yao et al.
- vLLM / PagedAttention — Rescuing LLM Serving from KV-Cache Fragmentation · 2023 · Kwon et al.
- DeepSeek-V2 / V3 - How MLA and MoE Pushed Open Models to the Frontier · 2024 · DeepSeek-AI
- Gemini 1.5 - Multimodal Understanding Across Million-Token Contexts · 2024 · Google DeepMind
- Genie: Generative Interactive Environments · 2024 · Bruce et al.
- Llama 3 Herd - An Engineering Blueprint for Open Frontier Models · 2024 · Meta AI
- Mamba-2 - When Transformers and SSMs Meet in the Same Algebra · 2024 · Dao & Gu
- OpenAI o1 - Reinforcement Learning for Deep LLM Reasoning · 2024 · OpenAI
- Sora Technical Report - Video Generation Models as World Simulators · 2024 · OpenAI
- Stable Diffusion 3 / Rectified Flow — Moving Text-to-Image from U-Net Diffusion to Scalable MMDiT · 2024 · Esser et al.
- Claude 3.5/3.7 Sonnet - Turning Frontier Models into Controllable Engineering Collaborators · 2025 · Anthropic
- DeepSeek-R1 — How Pure Reinforcement Learning Taught an Open LLM to Reason · 2025 · DeepSeek-AI
- Qwen2.5 / Qwen3 - How Alibaba Turned Open LLMs into a Full-Stack Model Family · 2025 · Alibaba