AI Research Hub
Curated breakthroughs in artificial intelligence, machine learning, and AI safety — updated weekly.
Scaling Laws for Neural Language Models
Empirical study of how language-model loss scales as a power law in model size, dataset size, and training compute. Its findings motivated the compute-optimal training approaches now widely used for frontier models.
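The paper's headline result can be summarized by its power-law fits; the exponents below are the paper's reported values, quoted approximately:

```latex
% Power-law scaling of test loss with model size N, dataset size D,
% and compute budget C (approximate fits from the paper):
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \quad \alpha_N \approx 0.076
\qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \quad \alpha_D \approx 0.095
\qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}, \quad \alpha_C \approx 0.050
```

Each fit holds when the other two quantities are not the bottleneck; the compute fit is what drives the compute-optimal allocation of parameters versus tokens.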
Constitutional AI: Harmlessness from AI Feedback
Anthropic introduces a method for training a harmless AI assistant without human labels on harms, leveraging AI-generated critiques and revisions grounded in a set of principles.
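The supervised phase described above is a critique-and-revision loop. A minimal sketch, assuming a single model call behind a `generate` function (stubbed here so the sketch runs standalone; a real system would query a model API):

```python
# Sketch of the Constitutional AI supervised phase: critique -> revision.
# `generate` and the principle text are illustrative placeholders, not
# Anthropic's actual prompts or API.

PRINCIPLES = [
    "Choose the response that is least harmful and most honest.",
]

def generate(prompt: str) -> str:
    # Stub standing in for an LLM call.
    return f"[model output for: {prompt[:40]}]"

def constitutional_revision(user_prompt: str) -> str:
    response = generate(user_prompt)
    for principle in PRINCIPLES:
        # Ask the model to critique its own response against a principle...
        critique = generate(
            f"Critique this response against the principle '{principle}':\n{response}"
        )
        # ...then to revise the response in light of that critique.
        response = generate(
            f"Revise the response to address this critique:\n{critique}"
        )
    # Revised responses become the fine-tuning targets for the harmless model.
    return response
```

The key point is that the harm signal comes from the model's own critiques grounded in the written principles, not from human labels.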
GPT-4 Technical Report
OpenAI's multimodal model achieving human-level performance on professional exams. Describes safety mitigations, evaluation methodology, and training process at scale.
Gemini: A Family of Highly Capable Multimodal Models
Google DeepMind presents Gemini Ultra, Pro, and Nano — models that are natively multimodal, pretrained on text, image, audio, and video data, with context windows of 32K tokens.
Apple Intelligence: On-Device and Server Foundation Models
Technical description of Apple's foundation model stack: a 3B on-device model and larger server models, trained on carefully curated licensed and synthetic data with a focus on privacy-preserving inference.
The Llama 3 Herd of Models
Meta releases open-weight models from 8B to 405B parameters, trained on 15T tokens, with multilingual capability, long context, and tool use — setting a new state of the art among open-weight models.
Data Provenance and Copyright in AI Training: Legal and Technical Survey
Comprehensive survey of 40+ AI companies' data sourcing practices, robots.txt compliance, licensing terms, and the emerging legal landscape around training data.
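One concrete, checkable piece of the compliance question is robots.txt parsing, which Python's standard library handles directly. A minimal check with `urllib.robotparser` (the agent names and rules below are illustrative, not taken from the survey):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of the kind the survey examines: one group
# blocking a named AI crawler, a catch-all group allowing everyone else.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# The named crawler is blocked site-wide; other agents are allowed.
print(rp.can_fetch("GPTBot", "https://example.com/article"))       # False
print(rp.can_fetch("ResearchBot", "https://example.com/article"))  # True
```

Whether crawlers actually honor such rules — and whether robots.txt is the right legal instrument at all — is precisely the gap the survey documents.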