panorama.workflow package#

Submodules#

panorama.workflow.pansystems module#

This module provides functions to detect biological systems in pangenomes with a single workflow command.

panorama.workflow.pansystems.check_input_files(models_path: Path, table: Path = None, hmm: Path = None, disable_bar: bool = False) → Tuple[DataFrame, Dict[str, List[HMM]], DataFrame, Models]#

Check the metadata table, the HMM file, and the models so that the program can stop before reading the pangenomes if an input is invalid.

Parameters:
  • models_path (Path) – The path to the models list file.

  • table (Path, optional) – The path to a table with annotation information. Defaults to None.

  • hmm (Path, optional) – The path to a tab-separated file with HMM information and paths. Defaults to None.

  • disable_bar (bool, optional) – Whether to disable the progress bar. Defaults to False.

Returns:
  • pd.DataFrame, optional – DataFrame containing, for each pangenome, the path to a table with annotation information.

  • Dict[str, List[HMM]] – A dictionary identifying which cutoff to use when aligning the HMMs.

  • pd.DataFrame – DataFrame with HMM metadata.

  • Models – The models to detect systems.

Raises:

AssertionError – If neither table nor hmm is provided.
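The fail-fast rule above can be illustrated with a minimal sketch. It does not call PANORAMA: `require_annotation_input` is a hypothetical stand-in that only mirrors the documented requirement that at least one of `table` or `hmm` be given.

```python
from pathlib import Path
from typing import List, Optional


def require_annotation_input(table: Optional[Path] = None,
                             hmm: Optional[Path] = None) -> List[str]:
    """Hypothetical stand-in, not PANORAMA code.

    Mirrors the documented rule: an AssertionError is raised when
    neither ``table`` nor ``hmm`` is provided.
    """
    assert table is not None or hmm is not None, "Provide at least one of table or hmm"
    # Report which inputs were supplied, in a fixed order.
    provided = []
    if table is not None:
        provided.append("table")
    if hmm is not None:
        provided.append("hmm")
    return provided
```

Checking inputs this way before any pangenome is loaded means an invalid call fails within moments rather than after a long read.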

panorama.workflow.pansystems.check_pangenome_pansystems(pangenome: Pangenome, source: str, force: bool = False) → None#

Checks the annotation of a pangenome and its systems.

Parameters:
  • pangenome (Pangenome) – The pangenome to check.

  • source (str) – The source of the annotation.

  • force (bool, optional) – Whether to force erasing already computed systems. Defaults to False.

Raises:

ValueError – If systems have already been detected for the given source and force is False.
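The interaction between `source` and `force` can be sketched as a small guard. This is an illustration under assumptions, not the library's implementation: `check_existing_systems` and the `detected_sources` set are hypothetical names standing in for whatever the pangenome records about previously computed systems.

```python
from typing import Set


def check_existing_systems(detected_sources: Set[str], source: str,
                           force: bool = False) -> bool:
    """Hypothetical guard, not PANORAMA code.

    Returns True when previously computed systems for ``source`` must be
    erased before detection, and False when there is nothing to erase.
    """
    if source in detected_sources:
        if not force:
            # Documented behavior: refuse to overwrite unless force is set.
            raise ValueError(
                f"Systems already detected for source '{source}'; use force to overwrite."
            )
        return True
    return False
```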

panorama.workflow.pansystems.check_pansystems_parameters(args: Namespace) → Tuple[Dict[str, Any], Dict[str, Any]]#

Checks and validates the parameters for the pansystems function.

Parameters:

args (argparse.Namespace) – The arguments passed to the function.

Returns:

Tuple[Dict[str, Any], Dict[str, Any]] – A tuple containing the necessary information and HMM keyword arguments.

Raises:

argparse.ArgumentError – If no type of systems output to write is chosen.
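A rough sketch of this kind of check is shown below. It assumes that the writing options correspond to the `projection`, `association`, `partition`, and `proksee` parameters of `pansystems()`; that mapping is an assumption, and `require_writing_option` is a hypothetical helper rather than part of PANORAMA.

```python
import argparse
from typing import Any, Dict


def require_writing_option(args: argparse.Namespace) -> Dict[str, Any]:
    """Hypothetical check, not PANORAMA code."""
    # Assumed option names, borrowed from the pansystems() signature.
    chosen = {
        "projection": getattr(args, "projection", False),
        "association": getattr(args, "association", None),
        "partition": getattr(args, "partition", False),
        "proksee": getattr(args, "proksee", None),
    }
    if not any(chosen.values()):
        # ArgumentError accepts None when no specific argument is at fault.
        raise argparse.ArgumentError(None, "Choose at least one type of systems output to write")
    return chosen
```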

panorama.workflow.pansystems.launch(args)#

Launch the functions that detect systems in pangenomes.

Parameters:

args – Arguments given on the command line.

panorama.workflow.pansystems.pansystems(pangenomes: Pangenomes, source: str, models: Models, hmm: Dict[str, List[HMM]] = None, table: DataFrame = None, k_best_hit: int = None, jaccard_threshold: float = 0.8, sensitivity: int = 1, projection: bool = False, association: List[str] = None, partition: bool = False, proksee: str = None, threads: int = 1, force: bool = False, disable_bar: bool = False, **hmm_kwgs: Any) → None#

Detects systems in multiple pangenomes.

Parameters:
  • pangenomes (Pangenomes) – The pangenomes to analyze.

  • source (str) – The source of the annotation.

  • models (Models) – The models to detect systems.

  • table (pd.DataFrame, optional) – DataFrame containing, for each pangenome, the path to a table with annotation information. Defaults to None.

  • hmm (Dict[str, List[HMM]], optional) – A dictionary identifying which cutoff to use when aligning the HMMs. Defaults to None.

  • k_best_hit (int, optional) – The number of best annotation hits to keep per gene family. Defaults to None.

  • jaccard_threshold (float, optional) – The minimum Jaccard similarity used to filter edges between gene families. Defaults to 0.8.

  • sensitivity (int, optional) – Sensitivity level for detection. Defaults to 1.
      - 1: global Jaccard filtering on the context, without looking at all the combinations.
      - 2: global Jaccard filtering on the specific context of each combination.
      - 3: local Jaccard filtering on the specific context of each combination.

  • projection (bool, optional) – Whether to project the systems on organisms. Defaults to False.

  • association (List[str], optional) – The type of association to write between systems and other pangenome elements. Defaults to None.

  • partition (bool, optional) – Whether to write a heatmap file showing, for each organism, the partition of the systems. Defaults to False.

  • proksee (str, optional) – Whether to write a proksee file with systems. Defaults to None.

  • threads (int, optional) – The number of available threads. Defaults to 1.

  • force (bool, optional) – Whether to force erasing already computed systems. Defaults to False.

  • disable_bar (bool, optional) – Whether to disable the progress bar. Defaults to False.

  • **hmm_kwgs (Any) – Additional keyword arguments for HMM annotation.

Raises:

AssertionError – If neither table nor hmm is provided.
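To make the role of `jaccard_threshold` concrete, the sketch below computes the Jaccard similarity of two sets and keeps only the edges that reach the threshold. How PANORAMA builds the sets compared on each edge is not specified here, so the sets of genome names, the `filter_edges` helper, and the use of `>=` for "minimum" are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_edges(edges: Dict[Edge, Tuple[Set[str], Set[str]]],
                 threshold: float = 0.8) -> Dict[Edge, float]:
    """Keep edges whose Jaccard similarity reaches the threshold (hypothetical helper)."""
    kept = {}
    for edge, (left, right) in edges.items():
        score = jaccard(left, right)
        if score >= threshold:
            kept[edge] = score
    return kept


# Hypothetical data: each edge links two gene families, described by the genomes they occur in.
genomes = {f"g{i}" for i in range(10)}
edges = {
    ("famA", "famB"): (genomes, genomes - {"g0"}),      # 9 shared out of 10
    ("famA", "famC"): ({"g1", "g2"}, {"g2", "g3"}),     # 1 shared out of 3
}
kept = filter_edges(edges, threshold=0.8)
```

With the default threshold of 0.8, only the first edge survives in this example; lowering the threshold would keep weaker co-occurrence links as well.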

panorama.workflow.pansystems.parser_pansystems(parser)#

Add arguments to the parser for the systems command.

Parameters:

parser – Parser for the systems command arguments.

panorama.workflow.pansystems.subparser(sub_parser) → ArgumentParser#

Subparser to launch PANORAMA from the command line.

Parameters:

sub_parser – sub_parser for systems command

Returns:

argparse.ArgumentParser – parser arguments for align command
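The `subparser`, `parser_pansystems`, and `launch` functions follow a common `argparse` layout: one function registers the subcommand, one adds its arguments, and one runs it. The sketch below shows that layout in isolation. The command name `demo`, the `--threads` flag, and the bodies of all three functions are hypothetical and are not taken from PANORAMA's command-line interface.

```python
import argparse


def parser_demo(parser: argparse.ArgumentParser) -> None:
    """Add arguments for the hypothetical 'demo' command."""
    # Hypothetical option, for illustration only.
    parser.add_argument("--threads", type=int, default=1, help="Number of threads")


def subparser(sub_parser) -> argparse.ArgumentParser:
    """Register the hypothetical 'demo' command on a subparsers object."""
    parser = sub_parser.add_parser("demo", help="Hypothetical demo command")
    parser_demo(parser)
    return parser


def launch(args: argparse.Namespace) -> int:
    """Run the command; here it only returns the parsed thread count."""
    return args.threads


main_parser = argparse.ArgumentParser(prog="tool")
commands = main_parser.add_subparsers(dest="subcommand")
subparser(commands)
args = main_parser.parse_args(["demo", "--threads", "4"])
```

Keeping argument registration separate from the launch function lets the same argument definitions be reused by other entry points.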

Module contents#