Documentation
Reference documentation for all Mishale packages — organized by layer: data, intelligence, and platform. All 12 packages are built. Phase 0 software is complete; the next action is running the extraction pipeline.
Data Layer
Schema Reference
mshale-schema v1.0.0
ProtocolSpec, ProtocolExecution, ProtocolOutcome JSON Schema. Every entity anchored to CL, HGNC, ChEBI, GO, OBI, UBERON, UO, NCBI Taxon.
Knowledge Base
mshale-kb v1.0.0
Biological constraint checking, delivery compatibility, pioneer factor awareness, cell type transition feasibility. 7 entity types.
Extraction Pipeline
mshale-extract v0.1.0
PubMed corpus builder → PMC XML parser → Claude tool-use extractor → ontology resolver → deterministic converter → QC report.
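The stage order above can be pictured as a chain of functions, each taking the previous stage's output. This is a minimal sketch only: the stage names, the `make_stage` helper, and the dictionary record are hypothetical placeholders, not the mshale-extract API. Each placeholder stage just records that it ran.

```python
from typing import Callable

Record = dict
Stage = Callable[[Record], Record]

# Hypothetical names mirroring the six stages listed above.
STAGE_NAMES = [
    "corpus_builder",
    "pmc_xml_parser",
    "tool_use_extractor",
    "ontology_resolver",
    "deterministic_converter",
    "qc_report",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that appends its name to the record's trace."""
    def stage(record: Record) -> Record:
        return {**record, "trace": record.get("trace", []) + [name]}
    return stage


def run_pipeline(record: Record, stages: list[Stage]) -> Record:
    """Pass the record through every stage in order."""
    for stage in stages:
        record = stage(record)
    return record


result = run_pipeline({"pmid": "placeholder"}, [make_stage(n) for n in STAGE_NAMES])
print(result["trace"])
```

Keeping each stage a plain function of one record makes it straightforward to test stages in isolation and to rerun only the later stages.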
Intelligence Layer
Model & Ranking
mshale-model v0.1.0
223-dim feature engineering, contrastive pairs, XGBoost LambdaRank (mshale-1 v0), C-index eval, Geneformer embeddings, generalizability harness.
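For readers unfamiliar with C-index evaluation, here is a small sketch of a concordance index for ranked outcomes. It assumes the uncensored pairwise definition (the fraction of comparable pairs whose predicted order matches the observed order, with tied scores counted as half). The function name and signature are illustrative, and mshale-model may compute the metric differently.

```python
from itertools import combinations


def concordance_index(outcomes: list[float], scores: list[float]) -> float:
    """Fraction of comparable pairs ranked in the same order as their outcomes."""
    concordant = 0.0
    comparable = 0
    for (y_i, s_i), (y_j, s_j) in combinations(zip(outcomes, scores), 2):
        if y_i == y_j:
            continue  # equal outcomes carry no ordering information
        comparable += 1
        if s_i == s_j:
            concordant += 0.5  # tied prediction counts as half
        elif (s_i > s_j) == (y_i > y_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


print(concordance_index([1, 2, 3], [0.1, 0.2, 0.3]))
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 is what a constant score produces, and 0.0 means every pair is inverted.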
Federated Learning
mshale-fl v0.1.0
Local extraction (Tier 1) → feature federation (Tier 2) → gradient federation with FedProx + Opacus DP + SecAgg + audit chain (Tier 3).
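The FedProx part of Tier 3 can be illustrated with a single local update. FedProx adds a proximal term (mu / 2) * ||w - w_global||^2 to each client's loss, so the local gradient gains an extra mu * (w - w_global) that pulls the weights back toward the global model. The sketch below uses plain Python lists and a hypothetical `fedprox_step` function. It leaves out Opacus DP, SecAgg, and the audit chain, and does not reflect the mshale-fl implementation.

```python
def fedprox_step(
    weights: list[float],
    global_weights: list[float],
    grad: list[float],
    lr: float,
    mu: float,
) -> list[float]:
    """One gradient step on local_loss + (mu / 2) * ||w - w_global||^2."""
    return [
        w - lr * (g + mu * (w - w_g))  # loss gradient plus proximal pull
        for w, w_g, g in zip(weights, global_weights, grad)
    ]


# With a zero loss gradient, the proximal term alone moves w toward w_global.
print(fedprox_step([1.0], [0.0], [0.0], lr=0.1, mu=1.0))
```

Setting `mu` to 0 recovers an ordinary SGD step; larger values keep heterogeneous clients closer to the global model.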
Platform
API Reference
mshale-api v0.1.0
FastAPI: corpus search, ranking with SHAP, Claude AI agent (sync + SSE streaming), Welford closed-loop, active learning queue. 42 tests.
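Assuming "Welford" here refers to Welford's online algorithm for running mean and variance, the sketch below shows how statistics can be updated one observation at a time without storing past values. The `RunningStats` class is illustrative; how mshale-api wires these statistics into its closed loop is not described in this page.

```python
class RunningStats:
    """Welford's online algorithm for mean and sample variance."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # uses the already-updated mean

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two observations exist."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
print(stats.mean, stats.variance)
```

The single-pass update is numerically stabler than accumulating a sum of squares, which suits a loop where outcomes arrive one experiment at a time.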
Closed-Loop CLI
mshale-loop v0.1.0
recommend · submit · campaign · history · retrain. Manages optimization campaigns end-to-end from API recommendation to model retraining.
Connectors
mshale-connectors v0.1.0
Benchling, PubMed, CSV, Addgene, S3, Slack, Webhook, Opentrons OT-2. ConnectorRuntime with incremental pull. 91 tests.
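An incremental pull is commonly implemented with a cursor that remembers the last record seen, so each run fetches only what is new. The sketch below shows that pattern with an in-memory source. `InMemorySource`, `IncrementalPuller`, and `fetch_since` are hypothetical names chosen for illustration and are not the ConnectorRuntime interface.

```python
from dataclasses import dataclass, field


@dataclass
class InMemorySource:
    """Hypothetical source holding (sequence_number, payload) pairs."""
    rows: list = field(default_factory=list)

    def fetch_since(self, cursor: int) -> list:
        """Return rows whose sequence number is greater than the cursor."""
        return [row for row in self.rows if row[0] > cursor]


@dataclass
class IncrementalPuller:
    """Tracks a cursor so repeated pulls return only unseen rows."""
    source: InMemorySource
    cursor: int = 0

    def pull(self) -> list:
        new_rows = self.source.fetch_since(self.cursor)
        if new_rows:
            # Advance the cursor to the highest sequence number just seen.
            self.cursor = max(seq for seq, _ in new_rows)
        return [payload for _, payload in new_rows]


source = InMemorySource([(1, "a"), (2, "b")])
puller = IncrementalPuller(source)
print(puller.pull())  # first pull sees everything
print(puller.pull())  # nothing new since the last pull
```

In a real connector the cursor would need to be persisted between runs, and it could be a timestamp or page token rather than an integer.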
Build Order
The build follows a strict upstream dependency chain, and every package in it is now complete: mshale-schema → mshale-kb → mshale-extract → mshale-model → mshale-api → mshale-studio. The next unblocked action is running mshale-extract corpus to produce mshale Data v0 (≥150 records).
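The chain above can be checked mechanically with a topological sort. The sketch below assumes each package depends only on the one immediately to its left in the chain; the source states the order but does not enumerate the dependency edges. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Maps each package to the packages it depends on (assumed: direct predecessor only).
DEPENDS_ON = {
    "mshale-studio": {"mshale-api"},
    "mshale-api": {"mshale-model"},
    "mshale-model": {"mshale-extract"},
    "mshale-extract": {"mshale-kb"},
    "mshale-kb": {"mshale-schema"},
    "mshale-schema": set(),
}


def build_order(depends_on: dict) -> list:
    """Return the packages ordered so every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


print(" -> ".join(build_order(DEPENDS_ON)))
```

`TopologicalSorter` raises `graphlib.CycleError` if the map contains a cycle, which makes it a cheap guard against accidental circular dependencies.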