System Detection Based on Models#

The systems command enables the detection of biological systems in pangenomes using predefined functional models.

This detection relies on gene family annotations, and a model file defining the presence/absence of a specific function and the genomic organization.

Model Detection Workflow#

The detection process runs as follows:

Load Pangenomes

Based on a .tsv file, .h5 pangenomes are loaded with the required annotation and metadata sources.
Load System Models

The models are parsed from a list provided by --models.
Search for System Units

For each functional unit of each model:
1. Gene families are matched based on annotation metadata.
2. A context graph is built based on the gene neighborhood (window, transitivity).
3. Jaccard similarity filters edges in the graph.
4. Connected components are checked for necessary and forbidden families.
Assemble Systems

Functional units are grouped into systems if they satisfy:
- presence/absence rules
- constrained distance
Write Systems to File

Detected systems are saved back into the pangenome .h5 file, under the given source name.

Command Line Usage#

System detection command is used as such:

panorama systems \
--pangenomes pangenomes.tsv \
--models models.tsv \
--source defense_finder \
--annotation_sources  defense_finder CasFinder\
--jaccard 0.8 \
--sensitivity 3 \
--threads 8

Key Options#

Shortcut	Argument	Description
-p	–pangenomes	TSV file listing .h5 pangenomes
-m	–models	Path to the models list file (see panorama utils –models)
-s	–source	Name of the source used to annotate and detect systems
—	–annotation_sources	List of annotation sources to use (defaults to the one from –source)
—	–jaccard	Minimum Jaccard similarity to keep an edge in context graph (default: 0.8)
—	–sensitivity	Sensitivity mode: 1, 2, or 3. Higher = more precise, slower (default: 3)
—	–threads	Number of threads to use for parallel model evaluation

Output#

PANORAMA integrates multiple outputs: textual, graph-based representations and figures to summarize results.

See the write_systems