Downloads & API

The entire corpus is freely available for download and programmatic access. Every work carries its own license; check the /api/v1/works endpoint or the works index for per-work license details.

Bulk downloads

SQLite database

A self-contained SQLite database with the full corpus: all works, verses, cross-references, word data, and notes. Regenerated periodically after new imports.
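The table layout of the database is not described here, so a reasonable first step after downloading is to ask SQLite itself which tables exist. The sketch below does that with Python's standard sqlite3 module; the helper name and the file name in the usage note are placeholders, not names taken from the download.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables("corpus.sqlite") (a hypothetical file name) returns the table names, which can then be inspected one at a time with PRAGMA table_info.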

CSV exports

Per-work verse exports (one CSV per work) plus a combined cross-references file. Useful for spreadsheets, R, Python, or any tool that reads CSV.

REST API

All endpoints return JSON. No authentication required. Rate limited to 60 requests per minute. This server has finite bandwidth, so please be considerate. The base URL is:

https://openscriptorium.org/api/v1/
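One way to stay under the stated limit of 60 requests per minute is to space requests at least one second apart on the client side. The following is a minimal sketch using only the Python standard library; the Throttle class and fetch_json helper are illustrative names, and even spacing is a conservative reading of the limit rather than something the API requires.

```python
import json
import time
import urllib.request

BASE_URL = "https://openscriptorium.org/api/v1/"


class Throttle:
    """Space successive calls to wait() at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


_throttle = Throttle()


def fetch_json(path):
    """GET BASE_URL + path and decode the JSON body, pausing first if needed."""
    _throttle.wait()
    with urllib.request.urlopen(BASE_URL + path) as response:
        return json.load(response)
```

With this in place, fetch_json("works") would request the works listing, and a loop over many paths would never exceed one request per second.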

List all works

GET /api/v1/works

Returns every work in the corpus with its metadata, license, language, traditions, and feature flags.

Work detail with book listing

GET /api/v1/works/:slug

Example: /api/v1/works/bsb returns the BSB metadata plus a list of all books (canonical works) with verse counts.

Verses (chapter or single verse)

GET /api/v1/works/:work/:book/:chapter
GET /api/v1/works/:work/:book/:chapter/:verse

Returns verse text, notes, and markup for a chapter or single verse. Add ?words=true to include word-level data (surface form, morphology, Strong’s number, and linked lemma with gloss). Examples:

  • /api/v1/works/bsb/genesis/1 — BSB Genesis 1
  • /api/v1/works/kjv/matthew/5/3 — KJV Matthew 5:3
  • /api/v1/works/wlc/genesis/1?words=true — WLC Genesis 1 with word-level data
  • /api/v1/works/bavli/berakhot/2 — Gemara Berakhot daf 2
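The path patterns above can be assembled with a small helper. This is a sketch, not an official client: verse_url is a hypothetical name, and it simply joins the work, book, chapter, and optional verse into the documented path, appending ?words=true on request.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://openscriptorium.org/api/v1/"


def verse_url(work, book, chapter, verse=None, words=False):
    """Build the URL for a chapter, or for a single verse when verse is given."""
    parts = ["works", work, book, str(chapter)]
    if verse is not None:
        parts.append(str(verse))
    # Percent-encode each path segment so unusual slugs cannot break the path.
    url = BASE_URL + "/".join(quote(part, safe="") for part in parts)
    if words:
        url += "?" + urlencode({"words": "true"})
    return url
```

For instance, verse_url("kjv", "matthew", 5, 3) yields the KJV Matthew 5:3 URL listed above, and verse_url("wlc", "genesis", 1, words=True) yields the WLC Genesis 1 URL with word-level data.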

Cross-references

GET /api/v1/cross-references?book=:book&chapter=:ch&verse=:v

Returns all cross-references whose source overlaps the given passage. The verse parameter is optional; omitting it returns cross-references for the entire chapter. Limited to 500 results.

  • /api/v1/cross-references?book=genesis&chapter=1&verse=1
  • /api/v1/cross-references?book=berakhot&chapter=2&verse=1 — Talmud xrefs (daf=chapter, segment=verse)
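Because verse is optional, a client can leave it out of the query string entirely to get chapter-level results. A minimal sketch (cross_references_url is a hypothetical helper name):

```python
from urllib.parse import urlencode

BASE_URL = "https://openscriptorium.org/api/v1/"


def cross_references_url(book, chapter, verse=None):
    """Build a cross-reference query; omit verse to cover the whole chapter."""
    params = {"book": book, "chapter": chapter}
    if verse is not None:
        params["verse"] = verse
    return BASE_URL + "cross-references?" + urlencode(params)
```

For Talmud passages the same helper applies, with the daf passed as chapter and the segment as verse, as in the second example above. Since results are capped at 500, a chapter-level query on a heavily cross-referenced chapter may be incomplete; querying verse by verse is one way around that.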

Full-text search

GET /api/v1/search?q=:query

Case-insensitive text search with automatic query classification (text, citation, Strong’s, Hebrew/Greek script, lemma, morphological, proximity, regex). Returns 25 results per page by default (maximum 100), along with pagination metadata.

Optional parameters:

  • work — limit to a specific work slug
  • book — limit to a canonical book slug
  • section[] — filter by section: ot, nt, deuterocanon, etc.
  • tradition[] — filter by tradition: ש, χ-protestant, academic, etc.
  • language — filter by language code (en, grc, he, etc.)
  • work_types — pass all to include commentary and treatises (scripture-only by default)
  • limit — results per page (max 100, default 25)
  • page — page number (default 1)

Examples:

  • /api/v1/search?q=in+the+beginning
  • /api/v1/search?q=logos&work=nestle1904
  • /api/v1/search?q=H7225 — Strong’s number search
  • /api/v1/search?q=בראשׁית — Hebrew script search
  • /api/v1/search?q=faith+NEAR/3+works — proximity search
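Search queries often contain spaces, slashes, or Hebrew and Greek script, so it is safer to let a library do the URL encoding. The sketch below builds a search URL from the parameters listed above. The helper name is hypothetical, and sending section[] and tradition[] as repeated keys is an assumption based on the bracketed parameter names, not something confirmed here.

```python
from urllib.parse import urlencode

BASE_URL = "https://openscriptorium.org/api/v1/"
MAX_LIMIT = 100  # documented maximum results per page


def search_url(q, *, work=None, book=None, section=(), tradition=(),
               language=None, work_types=None, limit=None, page=None):
    """Build a search URL, percent-encoding the query and any filters."""
    params = [("q", q)]
    for name, value in (("work", work), ("book", book),
                        ("language", language), ("work_types", work_types)):
        if value is not None:
            params.append((name, value))
    # Assumption: array filters are sent as repeated "name[]" keys.
    params += [("section[]", s) for s in section]
    params += [("tradition[]", t) for t in tradition]
    if limit is not None:
        params.append(("limit", min(int(limit), MAX_LIMIT)))
    if page is not None:
        params.append(("page", int(page)))
    return BASE_URL + "search?" + urlencode(params)
```

For example, search_url("in the beginning") produces the first example URL above, and search_url("בראשׁית", section=["ot"]) percent-encodes the Hebrew text as UTF-8. The slash in a proximity query such as faith NEAR/3 works is encoded as %2F, which decodes to the same query string on the server side.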

Search completions (type-ahead)

GET /api/v1/search/completions?q=:prefix

Returns book and work suggestions for type-ahead. Minimum 2 characters.
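Given the two-character minimum and the rate limit, a type-ahead client may want to skip the request for shorter input. A small sketch under that assumption (completions_url is a hypothetical helper, and counting characters with len() after trimming whitespace is a client-side guess at how the minimum is measured):

```python
from urllib.parse import urlencode

BASE_URL = "https://openscriptorium.org/api/v1/"
MIN_PREFIX_LENGTH = 2  # documented minimum for completions


def completions_url(prefix):
    """Return the completions URL, or None when the prefix is too short to send."""
    prefix = prefix.strip()
    if len(prefix) < MIN_PREFIX_LENGTH:
        return None
    return BASE_URL + "search/completions?" + urlencode({"q": prefix})
```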

Lexicon (lemma data)

GET /api/v1/lemmas?q=:query
GET /api/v1/lemmas/:id
GET /api/v1/lemmas/strong/:number

Search lemmas (dictionary headwords), retrieve a single lemma by ID, or look up the lemma(s) for a Strong’s number. The search endpoint matches on gloss, transliteration, normalized form, or Strong’s number. Each lemma includes part of speech, glosses, senses, etymology, frequency, and frequency-by-book distribution.

  • /api/v1/lemmas?q=beginning&language=grc — Greek lemmas glossed as “beginning”
  • /api/v1/lemmas/strong/G746 — resolve Strong’s G746 to its lemma(s)
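The three lemma routes can be wrapped as follows. This is a sketch with hypothetical helper names; the language filter is taken from the first example above, and the form of a lemma ID is not specified here, so the helper just converts whatever it is given to a string.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://openscriptorium.org/api/v1/"


def lemma_search_url(q, language=None):
    """Search lemmas by gloss, transliteration, normalized form, or Strong's number."""
    params = {"q": q}
    if language is not None:
        params["language"] = language
    return BASE_URL + "lemmas?" + urlencode(params)


def lemma_url(lemma_id):
    """Address a single lemma by its ID."""
    return BASE_URL + "lemmas/" + quote(str(lemma_id), safe="")


def strongs_url(number):
    """Address the lemma(s) for a Strong's number such as G746 or H7225."""
    return BASE_URL + "lemmas/strong/" + quote(number, safe="")
```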

Critical apparatus

GET /api/v1/apparatus?book=:book&chapter=:ch&verse=:v

Returns variant units, readings, and manuscript witnesses for a passage. The verse parameter is optional; omitting it returns apparatus for the entire chapter. Each variant unit includes lemma text, context, and all readings with their supporting manuscripts.

Manuscripts

GET /api/v1/manuscripts
GET /api/v1/manuscripts/:siglum

List and retrieve manuscript metadata: siglum, Gregory-Aland number, INTF Liste number, text family, material, date, shelfmark, and lacunae. Filter by family (alexandrian, byzantine, western, caesarean).

Licensing

Each work in the corpus has its own license (Public Domain, CC BY-SA, etc.). The /api/v1/works endpoint includes license details for every work. If you redistribute data from this API, you must comply with the license of each work you include. Most works are Public Domain; those that require attribution have attribution_required: true in their license object.
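Before redistributing data, it can help to separate out the works that need attribution. The sketch below filters a list of work records that has already been decoded from the /api/v1/works response. It assumes each record is a dictionary whose license object is stored under a "license" key; the overall shape of the response is not specified here, so the list may need to be extracted from a wrapper object first.

```python
def works_requiring_attribution(works):
    """Return the work records whose license object sets attribution_required to true."""
    selected = []
    for work in works:
        # Assumption: the license object lives under the "license" key.
        license_info = work.get("license") or {}
        if license_info.get("attribution_required") is True:
            selected.append(work)
    return selected
```

Works that are missing a license object are left out of the result rather than treated as attribution-free, so they should be checked by hand.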

The API itself and the Open Scriptorium source code are available under the ISC License.

Feedback

Bug reports, feature requests, and API suggestions are welcome on the issue tracker or the mailing list.