Schema Reference

The schema follows a three-object model: ProtocolSpec (the plan) → ProtocolExecution (the run) → ProtocolOutcome (the result). It is specified in JSON Schema Draft 2020-12 and implemented as Pydantic v2 models. The JSON Schema is available at GET /v1/schema.
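As a rough sketch of retrieving the published schema, the snippet below builds the URL for GET /v1/schema and fetches it with the Python standard library. The base URL is a placeholder and the helper names are hypothetical. Any authentication the API may require is not covered on this page and is not handled here.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def schema_url(base_url: str) -> str:
    """Return the schema endpoint URL for a given API base URL."""
    # Normalise to exactly one trailing slash so urljoin keeps any path prefix.
    return urljoin(base_url.rstrip("/") + "/", "v1/schema")


def fetch_schema(base_url: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON Schema document (performs a network call)."""
    with urlopen(schema_url(base_url), timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_schema with the base URL of your own deployment should return the schema as a dictionary, assuming the endpoint is reachable without extra headers.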

ProtocolSpec

The canonical representation of a biological protocol as a causal program. All Mishale packages produce and consume this format.

| Field | Type | Req | Description |
| --- | --- | --- | --- |
| protocol_id | string | yes | Deterministic UUID derived from content hash. Stable across re-ingestion. |
| title | string | yes | Short descriptive name. |
| description | string | | Free-text abstract or summary. |
| domain | string (enum) | yes | Research domain identifier. See Domain Taxonomy below. |
| steps | Intervention[] | yes | Ordered list of intervention steps (the causal program). |
| reagents | Reagent[] | | Named reagents with concentration, vendor, and ChEBI ID. |
| equipment | string[] | | Equipment identifiers (OBI-anchored where possible). |
| cell_type_initial | string | | Initial cell type. CL ontology ID or free text. |
| cell_type_target | string | | Target cell type. CL ontology ID or free text. |
| species | string | | Model organism. NCBI Taxon ID or common name. |
| efficiency_mean | float \| null | | Welford running mean of measured efficiency (0–1). |
| n_measurements | int | | Number of wet-lab measurements contributing to the mean. |
| tags | string[] | | Free-form keyword tags. |
| source | string | | Originating connector ID (e.g. 'benchling', 'pubmed', 'opentrons'). |
| external_id | string | | Source-system primary key. |
| doi | string | | DOI for literature-derived protocols. |
| paper_id | string | | Internal paper identifier (used for contrastive pair construction). |
| metadata | object | | Connector-specific key-value pairs. Not used for model training. |
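The efficiency_mean and n_measurements fields together describe a running mean. A minimal sketch of the mean-update step of Welford's algorithm is shown below. The function name and the range check are assumptions made for illustration, not the package's actual implementation.

```python
from typing import Optional, Tuple


def update_efficiency(
    mean: Optional[float], n: int, measurement: float
) -> Tuple[float, int]:
    """Fold one new efficiency measurement into the running mean.

    Mean-update step of Welford's algorithm:
        mean_new = mean_old + (x - mean_old) / n_new
    """
    if not 0.0 <= measurement <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    if mean is None or n == 0:
        # First measurement: the mean is the measurement itself.
        return measurement, 1
    n_new = n + 1
    return mean + (measurement - mean) / n_new, n_new


# Usage with made-up measurements: start from "no data" and fold values in.
mean, n = None, 0
for x in (0.30, 0.40, 0.32):
    mean, n = update_efficiency(mean, n, x)
```

The update needs only the stored mean and count, so a new wet-lab measurement can be folded in without keeping the earlier measurements.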

Intervention (step)

Each element of steps is an Intervention — the atomic unit of a biological causal program.

| Field | Type | Description |
| --- | --- | --- |
| action | string | Verb describing what is done (OBI-normalised where possible). |
| target | string | Biological target: gene (HGNC), cell type (CL), molecule (ChEBI). |
| target_class | string | One of: transcription_factor, small_molecule, protein_factor, viral_vector, crispr_component, other. |
| dose | string | Dose with units. Normalised to µM (small molecules) or MOI (viral). |
| delivery | string | Delivery method: lentiviral, retroviral, aav, mrna, plasmid, protein, electroporation, crispr. |
| timing_days | float | Day relative to protocol start when this intervention occurs. |
| duration_days | float | Duration of the intervention in days. |
| is_pioneer | boolean | Whether the TF is a pioneer factor (can bind closed chromatin). KB-derived. |
| is_inducible | boolean | Whether this is a dox-inducible transgene. |
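To illustrate how the enumerated values of target_class and delivery constrain a step, here is a minimal standard-library sketch. It is not the Pydantic v2 model the packages ship. Treating only action and target as mandatory is an assumption, since this table does not mark required fields, and end_day is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional

TARGET_CLASSES = {
    "transcription_factor", "small_molecule", "protein_factor",
    "viral_vector", "crispr_component", "other",
}
DELIVERY_METHODS = {
    "lentiviral", "retroviral", "aav", "mrna",
    "plasmid", "protein", "electroporation", "crispr",
}


@dataclass
class Intervention:
    action: str
    target: str
    # Assumption: every other field is optional and defaults to None.
    target_class: Optional[str] = None
    dose: Optional[str] = None
    delivery: Optional[str] = None
    timing_days: Optional[float] = None
    duration_days: Optional[float] = None
    is_pioneer: Optional[bool] = None
    is_inducible: Optional[bool] = None

    def __post_init__(self) -> None:
        # Reject values outside the enumerations listed in the field table.
        if self.target_class is not None and self.target_class not in TARGET_CLASSES:
            raise ValueError(f"unknown target_class: {self.target_class!r}")
        if self.delivery is not None and self.delivery not in DELIVERY_METHODS:
            raise ValueError(f"unknown delivery: {self.delivery!r}")

    def end_day(self) -> Optional[float]:
        """Day on which the intervention ends, if timing and duration are known."""
        if self.timing_days is None or self.duration_days is None:
            return None
        return self.timing_days + self.duration_days
```

Because the field names match the JSON keys, a step dictionary can be unpacked directly with Intervention(**step).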

Data Tier Model

| Tier | Name | Description |
| --- | --- | --- |
| 0 | Public | Published protocols. No restrictions. Apache 2.0. |
| 1 | Anonymised | Internal procedures without outcomes. Feature vectors only. |
| 2 | Outcomes | Procedures + efficiency measurements. Requires DPA / IRB. |
| 3 | Proprietary | Full IP records. Never transmitted. Local extraction only. |
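The four tiers can be modelled as an integer enumeration. The sketch below is one possible reading, not a documented API. The may_transmit helper and its has_dpa_or_irb flag are hypothetical, and interpreting "Requires DPA / IRB" as a condition on transmission is an assumption.

```python
from enum import IntEnum


class DataTier(IntEnum):
    PUBLIC = 0       # Published protocols, no restrictions
    ANONYMISED = 1   # Internal procedures without outcomes, feature vectors only
    OUTCOMES = 2     # Procedures plus efficiency measurements, requires DPA / IRB
    PROPRIETARY = 3  # Full IP records, local extraction only


def may_transmit(tier: int, has_dpa_or_irb: bool = False) -> bool:
    """Return True if a record of this tier may leave the local environment."""
    tier = DataTier(tier)  # accepts plain ints; raises ValueError if unknown
    if tier is DataTier.PROPRIETARY:
        return False  # tier 3 is never transmitted
    if tier is DataTier.OUTCOMES:
        return has_dpa_or_irb  # assumed gate: an agreement must be in place
    return True
```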

Domain Taxonomy

direct_reprogramming, ipsc_derivation, ipsc_differentiation, cell_culture, crispr_ko, crispr_activation, crispr_interference, base_editing, prime_editing, organoid, flow_cytometry, western_blot, sequencing, cloning, viral_transduction, microbial_fermentation, t_cell_engineering, car_t
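A lightweight pre-check of a ProtocolSpec dictionary against the required fields and the domain values listed above might look like the following. This is a sketch only. The function name is hypothetical, and the JSON Schema served by the API remains the authoritative validator.

```python
DOMAINS = frozenset({
    "direct_reprogramming", "ipsc_derivation", "ipsc_differentiation",
    "cell_culture", "crispr_ko", "crispr_activation", "crispr_interference",
    "base_editing", "prime_editing", "organoid", "flow_cytometry",
    "western_blot", "sequencing", "cloning", "viral_transduction",
    "microbial_fermentation", "t_cell_engineering", "car_t",
})

# Fields marked "yes" in the Req column of the ProtocolSpec table.
REQUIRED_FIELDS = ("protocol_id", "title", "domain", "steps")


def validate_spec(spec: dict) -> list:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in spec
    ]
    domain = spec.get("domain")
    if domain is not None and domain not in DOMAINS:
        problems.append(f"unknown domain: {domain!r}")
    if "steps" in spec and not isinstance(spec["steps"], list):
        problems.append("steps must be a list")
    return problems
```

Returning a list of problems rather than raising on the first one lets a connector report every issue with a record in a single pass.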

Example ProtocolSpec

{
  "protocol_id": "prot_sha256_a1b2c3",
  "title": "BAM Factor Direct Reprogramming — Fibroblast to Neuron",
  "domain": "direct_reprogramming",
  "cell_type_initial": "CL:0000057",
  "cell_type_target":  "CL:0000540",
  "species": "9606",
  "steps": [
    {
      "action": "transduce",
      "target": "ASCL1",
      "target_class": "transcription_factor",
      "delivery": "lentiviral",
      "dose": "MOI 5",
      "timing_days": 0,
      "duration_days": 1,
      "is_pioneer": true,
      "is_inducible": false
    },
    {
      "action": "transduce",
      "target": "BRN2",
      "target_class": "transcription_factor",
      "delivery": "lentiviral",
      "dose": "MOI 5",
      "timing_days": 0,
      "duration_days": 1,
      "is_pioneer": false,
      "is_inducible": false
    }
  ],
  "reagents": [
    { "name": "doxycycline", "concentration": "2 µg/mL", "chebi_id": "CHEBI:50845" }
  ],
  "efficiency_mean": 0.34,
  "n_measurements": 3,
  "source": "pubmed",
  "doi": "10.1038/s41593-019-0548-1",
  "paper_id": "PMID:31768042"
}
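
The protocol_id field is described as a deterministic UUID derived from a content hash, but the exact derivation is not given on this page. One way such an identifier could be computed is sketched below. The namespace, the choice of hashed fields, and the function name are all assumptions made for illustration.

```python
import hashlib
import json
import uuid

# Hypothetical namespace; a real deployment would define its own.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/protocol")

# Assumption: only these fields count as "content" for hashing purposes.
CONTENT_FIELDS = ("title", "domain", "steps")


def content_id(spec: dict) -> str:
    """Derive a stable UUID from the content fields of a spec dictionary."""
    content = {key: spec[key] for key in CONTENT_FIELDS if key in spec}
    # Canonical JSON: sorted keys and fixed separators, so that key order
    # and whitespace in the source do not affect the hash.
    canonical = json.dumps(
        content, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return str(uuid.uuid5(NAMESPACE, digest))
```

Under these assumptions, two ingestions of the same protocol produce the same identifier regardless of key order, which is the kind of stability across re-ingestion that the field description calls for.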