Era 5 · Large Model Era (2023-present)

From LLaMA going open-source and SAM reshaping vision, to o1 igniting the "thinking era," and DeepSeek-R1 open-sourcing reasoning itself — in three years, AI went from "can chat" to "can reason."

Notes in this era (28)

3DGS — Bringing NeRF-Quality Radiance Fields into Real-Time Interaction · 2023 · Kerbl et al.
AudioLM - Turning Raw Audio into a Language Modeling Problem · 2023 · Borsos et al.
DINOv2 - Robust Visual Features without Supervision · 2023 · Oquab et al.
DPO — Aligning LLMs Directly from Preferences without a Reward Model or PPO · 2023 · Rafailov et al.
GPT-4 Technical Report - Capability Leap and the Black-Box Technical Report · 2023 · OpenAI
LLaMA — How Smaller Parameters + More Tokens Helped Open-Source LLMs Catch Up to GPT-3 · 2023 · Touvron et al.
Llama 2: Open Foundation and Fine-Tuned Chat Models · 2023 · Touvron et al.
LLaVA - Turning GPT-4-Generated Visual Instructions into an Open Multimodal Assistant · 2023 · Liu et al.
Mamba — How Selective State Spaces Became the First Credible Transformer Challenger in a Decade · 2023 · Gu & Dao
Mixtral 8x7B — Open-Weight LLMs Enter the Sparse Expert Era · 2023 · Mistral AI
QLoRA — Bringing 65B LLM Fine-Tuning onto a Single 48GB GPU · 2023 · Dettmers et al.
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control · 2023 · Brohan et al.
RWKV - Reinventing RNNs for the Transformer Era · 2023 · Peng et al.
SAM — One Prompt + 11M Images + 1B Masks: Turning Segmentation into a Foundation Model Problem · 2023 · Kirillov et al.
Toolformer - Letting Language Models Teach Themselves When to Use Tools · 2023 · Schick et al.
Tree of Thoughts — Deliberate Search as a Reasoning Interface for LLMs · 2023 · Yao et al.
vLLM / PagedAttention — Rescuing LLM Serving from KV-Cache Fragmentation · 2023 · Kwon et al.
DeepSeek-V2 / V3 - How MLA and MoE Pushed Open Models to the Frontier · 2024 · DeepSeek-AI
Gemini 1.5 - Multimodal Understanding Across Million-Token Contexts · 2024 · Google DeepMind
Genie: Generative Interactive Environments · 2024 · Bruce et al.
Llama 3 Herd - An Engineering Blueprint for Open Frontier Models · 2024 · Meta AI
Mamba-2 - When Transformers and SSMs Meet in the Same Algebra · 2024 · Dao & Gu
OpenAI o1 - Reinforcement Learning for Deep LLM Reasoning · 2024 · OpenAI
Sora Technical Report - Video Generation Models as World Simulators · 2024 · OpenAI
Stable Diffusion 3 / Rectified Flow — Moving Text-to-Image from U-Net Diffusion to Scalable MMDiT · 2024 · Esser et al.
Claude 3.5/3.7 Sonnet - Turning Frontier Models into Controllable Engineering Collaborators · 2025 · Anthropic
DeepSeek-R1 — How Pure Reinforcement Learning Taught an Open LLM to Reason · 2025 · DeepSeek-AI
Qwen2.5 / Qwen3 - How Alibaba Turned Open LLMs into a Full-Stack Model Family · 2025 · Alibaba