Pangenome Comparison Analysis#

PANORAMA provides two specialized commands for comparing biological features across multiple pangenomes using Gene Family Relatedness Relationship (GFRR) metrics.


Compare spots across pangenomes#

The compare_spots command identifies conserved genomic spots across multiple pangenomes by comparing gene family composition at spot borders.

Compare spots command-line usage#

# Basic conserved spots comparison
panorama compare_spots \
    --pangenomes pangenomes.tsv \
    --output conserved_spots_results \
    --gfrr_cutoff 0.8 0.8 \
    --threads 8

# With systems analysis
panorama compare_spots \
    --pangenomes pangenomes.tsv \
    --output conserved_spots_results \
    --systems \
    --models defense_systems.tsv \
    --sources defense_finder \
    --gfrr_cutoff 0.8 0.8 \
    --graph_formats gexf graphml \
    --threads 8

Compare spots key Arguments#

Argument

Description

--pangenomes

TSV file listing .h5 pangenomes with computed spots

--output

Output directory for conserved spots results

--gfrr_cutoff

Two thresholds for min_gfrr and max_gfrr (default: 0.8 0.8)

--gfrr_metrics

GFRR metric for clustering: min_gfrr or max_gfrr

--systems

Enable systems analysis within conserved spots

--models

Path(s) to system model files (required with –systems)

--sources

System source names (required with –systems)

For complete parameter details and GFRR metrics explanation, see the compare_spots documentation.


Compare systems across pangenomes#

The compare_systems command identifies conserved biological systems across multiple pangenomes and generates distribution heatmaps.

Quick Usage#

# Basic systems comparison with heatmaps
panorama compare_systems \
    --pangenomes pangenomes.tsv \
    --models defense_systems.tsv \
    --sources defense_finder \
    --output systems_comparison_results \
    --heatmap \
    --threads 8

# Full analysis with conserved systems clustering
panorama compare_systems \
    --pangenomes pangenomes.tsv \
    --models defense_systems.tsv cas_systems.tsv \
    --sources defense_finder CasFinder \
    --output systems_comparison_results \
    --heatmap \
    --gfrr_metrics min_gfrr_models \
    --gfrr_cutoff 0.8 0.8 \
    --gfrr_models_cutoff 0.2 0.2 \
    --graph_formats gexf graphml \
    --threads 8

Compare systems key Arguments#

Argument

Description

--pangenomes

TSV file listing .h5 pangenomes with detected systems

--models

Path(s) to system model files

--sources

Name(s) of systems sources

--output

Output directory for comparison results

--heatmap

Generate distribution heatmaps

--gfrr_cutoff

Two thresholds for min_gfrr and max_gfrr (default: 0.5 0.8)

--gfrr_models_cutoff

GFRR thresholds for model gene families (default: 0.4 0.6)

--gfrr_metrics

GFRR metric for clustering conserved systems

For complete parameter details and GFRR metrics explanation, see the compare_systems documentation.


GFRR Metrics Overview#

Both commands use Gene Family Relatedness Relationship metrics to assess conservation:

Metric

Formula

Usage

min_gfrr

shared_families / min(families_A, families_B)

Conservative: high overlap required

max_gfrr

shared_families / max(families_A, families_B)

Liberal: partial overlap accepted

Sensitivity Control#

Cutoff Level

min_gfrr

max_gfrr

Behavior

Strict

0.8

0.8

High-confidence conservation only

Moderate

0.6

0.7

Balanced sensitivity and specificity

Permissive

0.4

0.5

Detects distant conservation patterns


Output Summary#

Compare Spots Outputs#

  • Individual conserved spot files (conserved_spot_X.tsv)

  • Summary file (all_conserved_spots.tsv)

  • Optional graph files (GEXF, GraphML)

  • Optional systems linkage graphs

Compare Systems Outputs#

  • Interactive heatmaps showing system distribution

  • Optional network graphs of conserved system clusters

  • Optional summary tables of conserved systems

For detailed output descriptions, see the respective documentation:


Prerequisites#

For Compare Spots#

For Compare Systems#

  • Pangenomes must have detected systems for the specified sources

  • Models files corresponding to the systems sources