Ingest Works

Batch Ingest

Submit one or more identifiers for ingestion and optionally ask the API to return artifacts, either as presigned URLs or as files saved directly to disk. The request options mirror the /works/ingest API.

const res = await client.works.ingest({
    works: [
      {
        externalIdType: "doi", // required; identifier type (doi|pmid|pmcid|biorxivId|medrxivId|arxivId|openalexId)
        id: "10.1234/example", // required; identifier value matching externalIdType
      },
    ], // required; at least one record to ingest
    
    expand: ["sections", "blocks"],   // optional: extra payload data (all|sections|blocks|assets|citations)
    
    download: "raw",                  // optional: raw|minxml|plain; enables artifact return

    idempotencyKey: "dedupe-123",     // optional: reuse to make ingest idempotent
    downloadExpiresIn: 900,           // optional: presigned TTL (seconds 30–3600); ignored when saveTo is used
    saveTo: "tmp/batch-ingest/",      // optional: relative directory; each work is saved there under an
                                      // auto-generated filename (format given in the notes below)
  });

Notes on the usage of the download options:

  1. If you set download but omit saveTo, res.results[i].download contains { mode: "presigned", url, expires_in }.

  2. When saveTo is present, the SDK writes files locally and returns { mode: "saved", path, size, content_type }.

  3. In batch mode, saveTo is always treated as a directory prefix. Filenames are auto-generated and of the format work-{WORK_ID}-{VERSION_LABEL}-{raw|minxml|plain}.xml.gz.
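
The two return shapes in the notes above can be handled as a discriminated union on mode. The following is a minimal sketch, not the SDK's own typings: the type names and the describeDownload function are hypothetical, and the field types (string, number) are assumptions inferred from the field names.

```typescript
// Shapes follow the notes above; the type names are hypothetical and the
// field types (string/number) are assumed from the field names.
type PresignedDownload = { mode: "presigned"; url: string; expires_in: number };
type SavedDownload = { mode: "saved"; path: string; size: number; content_type: string };
type DownloadResult = PresignedDownload | SavedDownload;

// Returns a short description of where the artifact can be found.
function describeDownload(d: DownloadResult): string {
  switch (d.mode) {
    case "presigned":
      // No saveTo was given: the artifact must be fetched before the URL expires.
      return `fetch ${d.url} within ${d.expires_in}s`;
    case "saved":
      // saveTo was given: the file is already on disk.
      return `saved ${d.size} bytes (${d.content_type}) to ${d.path}`;
  }
}
```

Switching on mode lets the compiler narrow each branch, so url is only reachable for presigned results and path only for saved ones.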

Ingest Single Work (compact shorthand)

A compact form is also supported for ingesting a single work. We recommend the batch form even for one item, but the compact form works and honours the same download options. When saveTo ends with /, the canonical filename is appended.

 const res = await client.works.ingest({
    externalIdType: "medrxivId", // required; identifier type (same options as batch)
    id: "2025.08.07.25333034", // required; identifier value
    include: ["sections"], // optional; legacy alias for expand (string or string[])
    idempotencyKey: "dedupe-123", // optional; reuse to make ingest idempotent
    
    download: "minxml",
    saveTo: "tmp/single-ingest/",   // trailing slash → treat as directory
  });
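
The trailing-slash rule can be sketched as below. This is an illustration rather than the SDK's implementation: canonicalFilename and resolveSavePath are hypothetical names, the filename follows the hyphenated format given in the download notes, and treating a saveTo without a trailing slash as the full file path is an assumption.

```typescript
type ArtifactKind = "raw" | "minxml" | "plain";

// Builds the filename in the format from the download notes:
// work-{WORK_ID}-{VERSION_LABEL}-{raw|minxml|plain}.xml.gz
function canonicalFilename(workId: string, versionLabel: string, kind: ArtifactKind): string {
  return `work-${workId}-${versionLabel}-${kind}.xml.gz`;
}

// Trailing slash: treat saveTo as a directory and append the canonical filename.
// Assumption: without a trailing slash, saveTo is used as the file path itself.
function resolveSavePath(
  saveTo: string,
  workId: string,
  versionLabel: string,
  kind: ArtifactKind,
): string {
  return saveTo.endsWith("/")
    ? saveTo + canonicalFilename(workId, versionLabel, kind)
    : saveTo;
}
```

The work ID and version label are assigned by the service, so in practice the final path is only known once the response comes back.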

The supported identifier types are listed in the Identifier reference section below.

Identifier Shorthand Helpers

Shorthand helpers ingest a single work of the named identifier type. They accept the include and idempotencyKey options, as well as expand, download, saveTo, and downloadExpiresIn.

const res = await client.works.ingest.medrxiv(
    "2025.08.07.25333034", // required; identifier value (externalIdType is fixed by the helper)
    {
      include: ["sections"], // optional; alias for expand (string or string[])
      idempotencyKey: "dedupe-123", // optional; reuse to make ingest idempotent
    }
  );

Available helpers:

client.works.ingest.doi(...)
client.works.ingest.biorxiv(...)
client.works.ingest.medrxiv(...)
client.works.ingest.pmid(...)
client.works.ingest.pmcid(...)
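
Conceptually, each helper fixes externalIdType and forwards the rest. The sketch below shows that relationship as a plain function that builds the equivalent compact request. The names HELPER_ID_TYPES, SingleOptions, and toCompactRequest are hypothetical, and the helper-to-type mapping is an assumption based on the identifier types listed in the batch example.

```typescript
type ExternalIdType = "doi" | "pmid" | "pmcid" | "biorxivId" | "medrxivId";
type HelperName = "doi" | "biorxiv" | "medrxiv" | "pmid" | "pmcid";

// Assumed mapping from helper name to the externalIdType it fixes.
const HELPER_ID_TYPES: Record<HelperName, ExternalIdType> = {
  doi: "doi",
  biorxiv: "biorxivId",
  medrxiv: "medrxivId",
  pmid: "pmid",
  pmcid: "pmcid",
};

// Options named in this section; all are optional.
interface SingleOptions {
  include?: string | string[];
  expand?: string[];
  idempotencyKey?: string;
  download?: "raw" | "minxml" | "plain";
  saveTo?: string;
  downloadExpiresIn?: number;
}

// Builds the compact single-work request a helper call would correspond to.
function toCompactRequest(helper: HelperName, id: string, options: SingleOptions = {}) {
  return { externalIdType: HELPER_ID_TYPES[helper], id, ...options };
}
```

Under these assumptions, toCompactRequest("medrxiv", "2025.08.07.25333034", { include: ["sections"] }) yields the same fields as the compact form shown earlier.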

Identifier reference

The full list of supported identifier types is documented here.

Batch status values

Status      Description
completed   All works processed successfully
partial     Some works succeeded, some failed
failed      All works failed to process
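
The relationship between per-work outcomes and the batch status can be sketched as follows. This illustrates the definitions above, not the service's code: the succeeded flag is a hypothetical per-work field, and rejecting an empty list reflects the requirement that a batch contains at least one record.

```typescript
type BatchStatus = "completed" | "partial" | "failed";

// `succeeded` is a hypothetical per-work flag used only for this illustration.
function batchStatus(results: { succeeded: boolean }[]): BatchStatus {
  if (results.length === 0) {
    // A batch request must contain at least one work.
    throw new Error("batch must contain at least one work");
  }
  const okCount = results.filter((r) => r.succeeded).length;
  if (okCount === results.length) return "completed"; // every work succeeded
  if (okCount === 0) return "failed"; // every work failed
  return "partial"; // mixed outcome
}
```

A caller would typically retry only the failed works when the status is partial, reusing the same idempotencyKey where appropriate.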

Notes

  • downloadExpiresIn has no effect when saveTo is provided because no presigned URL is returned.

  • Saved artifacts are written relative to your process’s working directory; ensure the target folder exists or can be created.
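
If you prefer to create the target folder yourself before calling ingest, a minimal Node.js sketch is shown below. ensureSaveDir is a hypothetical helper, not part of the SDK. It resolves saveTo against the working directory, as the note above describes, and creates any missing folders.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Resolves a relative saveTo against the working directory and makes sure
// the directory exists. Returns the absolute directory path.
function ensureSaveDir(saveTo: string, cwd: string = process.cwd()): string {
  const dir = path.resolve(cwd, saveTo);
  // recursive: true creates missing parents and does not fail if dir already exists.
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}
```

Calling ensureSaveDir("tmp/batch-ingest/") before the ingest call means a missing directory or a permissions problem shows up before any work is submitted.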
