API Documentation

REST API v2.1 — Access datasets, model cards, and paper metadata programmatically. Base URL: https://www.aimegacity.xyz/v2

GET /datasets

List all available datasets with metadata. Supports filtering, sorting, and pagination.

?limit    Number of results (default: 20, max: 100)
?type     Filter by type: text, image, code, multimodal
?license  Filter by license: open, cc-by, mit, apache
GET /datasets?type=text&limit=5

{
  "total": 847,
  "results": [
    {
      "id": "fineweb-2024",
      "name": "FineWeb 2024",
      "size_tokens": 15000000000000,
      "license": "odc-by",
      "language": "en",
      "url": "/data/fineweb"
    }
  ]
}
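
A minimal sketch of how a client might assemble the request above with Python's standard library. The helper name `build_datasets_url` is hypothetical and not part of the API; the base URL and parameter names come from this page. Capping `limit` at 100 on the client side is an assumption about sensible client behavior, not documented server behavior.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.aimegacity.xyz/v2"  # base URL stated on this page
ALLOWED_TYPES = {"text", "image", "code", "multimodal"}  # values listed for ?type


def build_datasets_url(type_=None, license_=None, limit=20):
    """Build a GET /datasets URL (hypothetical helper, not part of the API)."""
    if type_ is not None and type_ not in ALLOWED_TYPES:
        raise ValueError(f"unknown dataset type: {type_!r}")
    params = {}
    if type_ is not None:
        params["type"] = type_
    if license_ is not None:
        params["license"] = license_
    # Assumption: cap the value client-side at the documented maximum of 100.
    params["limit"] = min(int(limit), 100)
    return f"{BASE_URL}/datasets?{urlencode(params)}"


print(build_datasets_url(type_="text", limit=5))
```

The helper only builds the URL; sending it is left to whichever HTTP client the caller prefers.
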
GET /models/{id}

Retrieve full model card including benchmarks, training data provenance, and API access info.

GET /models/gpt-4o

{
  "id": "gpt-4o",
  "name": "GPT-4o",
  "creator": "OpenAI",
  "params": "~1.8T (MoE estimate)",
  "context_length": 128000,
  "benchmarks": {
    "MMLU": 88.7,
    "HumanEval": 90.2,
    "MATH": 76.6,
    "MT-Bench": 9.1
  },
  "training_data": ["WebText2", "Common Crawl", "Books", "Code"]
}
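
A short sketch of reading a model card like the one above, assuming the response is plain JSON with the field names shown in the sample. The helpers `model_url` and `benchmark_score` are hypothetical names used for illustration, and the embedded sample is a trimmed copy of the response above rather than live data.

```python
import json
from urllib.parse import quote

BASE_URL = "https://www.aimegacity.xyz/v2"  # base URL stated on this page

# Trimmed copy of the sample response shown above.
SAMPLE_CARD = """
{
  "id": "gpt-4o",
  "name": "GPT-4o",
  "context_length": 128000,
  "benchmarks": {"MMLU": 88.7, "HumanEval": 90.2}
}
"""


def model_url(model_id):
    """Build a GET /models/{id} URL, percent-encoding the id as one path segment."""
    return f"{BASE_URL}/models/{quote(model_id, safe='')}"


def benchmark_score(card, name):
    """Return a benchmark score, or None if the card does not report it."""
    return card.get("benchmarks", {}).get(name)


card = json.loads(SAMPLE_CARD)
print(model_url(card["id"]))
print(benchmark_score(card, "MMLU"))
```

Returning `None` for a missing benchmark keeps callers from assuming every card reports the same set of scores.
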
GET /papers/search

Full-text search across 18,400 papers. Returns structured metadata with abstracts.

?q      Search query string
?venue  Filter by venue (NeurIPS, ICML, ICLR, arXiv...)
?year   Publication year or range (e.g. 2023-2024)
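
As an illustration of the parameters above, here is a small sketch that builds a search URL, including the `start-end` form of the year range. The helper name `build_search_url` and the choice to accept either an integer or a `(start, end)` tuple are assumptions made for this example; only the parameter names and the range format come from this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.aimegacity.xyz/v2"  # base URL stated on this page


def build_search_url(q, venue=None, year=None):
    """Build a GET /papers/search URL (hypothetical helper).

    `year` may be a single int (2023) or a (start, end) tuple (2023, 2024).
    """
    params = {"q": q}
    if venue is not None:
        params["venue"] = venue
    if isinstance(year, tuple):
        start, end = year
        if start > end:
            raise ValueError("year range must be (start, end) with start <= end")
        params["year"] = f"{start}-{end}"  # range form, e.g. 2023-2024
    elif year is not None:
        params["year"] = str(year)
    return f"{BASE_URL}/papers/search?{urlencode(params)}"


print(build_search_url("sparse attention", venue="NeurIPS", year=(2023, 2024)))
```

`urlencode` handles the escaping, so spaces and other reserved characters in the query string do not need manual treatment.
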
POST /crawl/submit

Submit a URL for metadata extraction and inclusion in the AI Megacity index. Data becomes publicly accessible once it is indexed.

POST /crawl/submit
Content-Type: application/json

{
  "url": "https://example.com/research-paper",
  "type": "paper",
  "tags": ["LLM", "alignment"]
}
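
The request above could be prepared in Python roughly as follows. This is a sketch under stated assumptions: `build_crawl_request` is a hypothetical helper, the body fields mirror the sample, and the request object is built but not sent. This page does not say whether authentication is required, so none is added here.

```python
import json
from urllib.request import Request

BASE_URL = "https://www.aimegacity.xyz/v2"  # base URL stated on this page


def build_crawl_request(url, type_, tags):
    """Prepare (but do not send) a POST /crawl/submit request."""
    body = json.dumps({"url": url, "type": type_, "tags": list(tags)})
    return Request(
        f"{BASE_URL}/crawl/submit",
        data=body.encode("utf-8"),  # request body must be bytes
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_crawl_request(
    "https://example.com/research-paper", "paper", ["LLM", "alignment"]
)
print(req.get_method(), req.full_url)
# To actually submit: urllib.request.urlopen(req). Keep in mind that
# submitted data becomes publicly accessible once it is indexed.
```
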