panorama.workflow package
Submodules
panorama.workflow.pansystems module
This module provides functions to detect biological systems in pangenomes with a single workflow command.
- panorama.workflow.pansystems.check_input_files(models_path: Path, table: Path = None, hmm: Path = None, disable_bar: bool = False) → Tuple[DataFrame, Dict[str, List[HMM]], DataFrame, Models]
Checks the metadata table, the HMMs, and the models so that the program can stop before reading the pangenome if an input is invalid.
- Parameters:
models_path (Path) – The path to the models list file.
table (Path, optional) – The path to a table with annotation information. Defaults to None.
hmm (Path, optional) – The path to a tab-separated file with HMM information and path. Defaults to None.
disable_bar (bool, optional) – Whether to disable the progress bar. Defaults to False.
- Returns:
pd.DataFrame, optional – DataFrame giving, for each pangenome, the path to a table with annotation information.
Dict[str, List[HMM]] – A dictionary identifying which cutoff to use when aligning the HMMs.
pd.DataFrame – DataFrame with HMM metadata information.
Models – The models to detect systems.
- Raises:
AssertionError – If neither table nor hmm is provided.
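The fail-fast precondition documented above can be sketched as follows. This is a minimal illustration, not PANORAMA's implementation: `require_annotation_source` is a hypothetical helper, and the only rule taken from the reference is that at least one of table or hmm must be provided.

```python
from pathlib import Path
from typing import List, Optional


def require_annotation_source(table: Optional[Path] = None,
                              hmm: Optional[Path] = None) -> List[str]:
    """Hypothetical helper mirroring the documented precondition."""
    # Documented rule: at least one annotation source must be given.
    assert table is not None or hmm is not None, "Provide either table or hmm"
    # Assumption for illustration only: report which sources were supplied.
    return [name for name, value in (("table", table), ("hmm", hmm))
            if value is not None]


provided = require_annotation_source(table=Path("annotations.tsv"))
```

Running the check before any pangenome is opened means a missing input is reported immediately rather than after a long load.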
- panorama.workflow.pansystems.check_pangenome_pansystems(pangenome: Pangenome, source: str, force: bool = False) → None
Checks the annotation of a pangenome and its systems.
- Parameters:
pangenome (Pangenome) – The pangenome to check.
source (str) – The source of the annotation.
force (bool, optional) – Whether to force erasing already computed systems. Defaults to False.
- Raises:
ValueError – If systems are already detected based on the source and force is False.
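The interaction between source and force described above could look like the sketch below. It assumes, purely for illustration, that the sources with already detected systems are available as a set of strings; `check_existing_systems` is a hypothetical name and does not reflect how the Pangenome object stores this state.

```python
from typing import Set


def check_existing_systems(detected_sources: Set[str], source: str,
                           force: bool = False) -> Set[str]:
    """Hypothetical guard against silently overwriting detected systems."""
    if source in detected_sources:
        if not force:
            # Documented behavior: refuse to continue without force.
            raise ValueError(
                f"Systems already detected for source '{source}'. "
                "Use force to erase them."
            )
        # Assumption: with force, the previous results for this source are dropped.
        return detected_sources - {source}
    return detected_sources


remaining = check_existing_systems({"defense", "secretion"}, "defense", force=True)
```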
- panorama.workflow.pansystems.check_pansystems_parameters(args: Namespace) → Tuple[Dict[str, Any], Dict[str, Any]]
Checks and validates the parameters for the pansystems function.
- Parameters:
args (argparse.Namespace) – The arguments passed to the function.
- Returns:
Tuple[Dict[str, Any], Dict[str, Any]] – A tuple containing the necessary information and HMM keyword arguments.
- Raises:
argparse.ArgumentError – If no type of systems writing is chosen.
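As a rough sketch of the validation described above, the snippet below raises `argparse.ArgumentError` when no output is requested. It assumes that the "types of systems writing" correspond to the projection, partition, association and proksee options of `pansystems`; the helper name and the attribute names on the namespace are hypothetical.

```python
import argparse


def check_write_options(args: argparse.Namespace) -> None:
    """Hypothetical check: at least one output type must be requested."""
    # Assumption: these four attributes are the available writing options.
    chosen = [
        args.projection,
        args.partition,
        bool(args.association),
        args.proksee is not None,
    ]
    if not any(chosen):
        # ArgumentError accepts None when no specific argument is at fault.
        raise argparse.ArgumentError(
            None, "Choose at least one type of systems writing"
        )


ok_args = argparse.Namespace(projection=True, partition=False,
                             association=None, proksee=None)
check_write_options(ok_args)  # passes silently
```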
- panorama.workflow.pansystems.launch(args)
Launches the functions that detect systems in pangenomes.
- Parameters:
args – The arguments given on the command line.
- panorama.workflow.pansystems.pansystems(pangenomes: Pangenomes, source: str, models: Models, hmm: Dict[str, List[HMM]] = None, table: DataFrame = None, k_best_hit: int = None, jaccard_threshold: float = 0.8, sensitivity: int = 1, projection: bool = False, association: List[str] = None, partition: bool = False, proksee: str = None, threads: int = 1, force: bool = False, disable_bar: bool = False, **hmm_kwgs: Any) → None
Detects systems in multiple pangenomes.
- Parameters:
pangenomes (Pangenomes) – The pangenomes to analyze.
source (str) – The source of the annotation.
models (Models) – The models to detect systems.
table (pd.DataFrame, optional) – DataFrame giving, for each pangenome, the path to a table with annotation information. Defaults to None.
hmm (Dict[str, List[HMM]], optional) – A dictionary identifying which cutoff to use when aligning the HMMs. Defaults to None.
k_best_hit (int, optional) – The number of best annotation hits to keep per gene family. Defaults to None.
jaccard_threshold (float, optional) – The minimum Jaccard similarity used to filter edges between gene families. Defaults to 0.8.
sensitivity (int, optional) – Sensitivity level for detection. Defaults to 1.
  - 1 corresponds to a global Jaccard filtering on the context, without looking at all the combinations.
  - 2 corresponds to a global Jaccard filtering on the specific context of each combination.
  - 3 corresponds to a local Jaccard filtering on the specific context of each combination.
projection (bool, optional) – Whether to project the systems on organisms. Defaults to False.
association (List[str], optional) – The type of association to write between systems and other pangenome elements. Defaults to None.
partition (bool, optional) – Whether to write a heatmap file showing, for each organism, the partition of the systems. Defaults to False.
proksee (str, optional) – Whether to write a proksee file with systems. Defaults to None.
threads (int, optional) – The number of available threads. Defaults to 1.
force (bool, optional) – Whether to force erasing already computed systems. Defaults to False.
disable_bar (bool, optional) – Whether to disable the progress bar. Defaults to False.
**hmm_kwgs (Any) – Additional keyword arguments for HMM annotation.
- Raises:
AssertionError – If neither table nor hmm is provided.
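To make the role of `jaccard_threshold` more concrete, here is a minimal sketch of Jaccard-based edge filtering. It assumes, for illustration only, that each gene family is represented by the set of genomes in which it occurs and that an edge is kept when the similarity of its two families reaches the threshold; the actual quantities PANORAMA compares may differ, and `jaccard` and `filter_edges` are hypothetical names.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_edges(genomes: Dict[str, Set[str]],
                 edges: List[Tuple[str, str]],
                 threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Keep edges whose two families have a similarity of at least threshold."""
    return [(u, v) for u, v in edges
            if jaccard(genomes[u], genomes[v]) >= threshold]


# Hypothetical toy data: family name -> genomes containing it.
genomes = {
    "famA": {"g1", "g2", "g3", "g4", "g5"},
    "famB": {"g1", "g2", "g3", "g4", "g5"},
    "famC": {"g1", "g2"},
}
kept = filter_edges(genomes, [("famA", "famB"), ("famA", "famC")])
```

In this toy data, famA and famB occur in the same five genomes (similarity 1.0) and their edge is kept, while famA and famC share only two of five genomes (similarity 0.4) and their edge is removed at the default threshold of 0.8.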
- panorama.workflow.pansystems.parser_pansystems(parser)
Adds arguments to the parser for the systems command.
- Parameters:
parser – The parser for the systems arguments.
- panorama.workflow.pansystems.subparser(sub_parser) → ArgumentParser
Subparser to launch PANORAMA from the command line.
- Parameters:
sub_parser – The sub-parser for the systems command.
- Returns:
argparse.ArgumentParser – The parser arguments for the systems command.
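The way `subparser` and `parser_pansystems` fit together can be sketched with the standard `argparse` sub-command pattern. Everything specific in this example is an assumption made for illustration: the command name `pansystems`, the flags `--models`, `--table` and `--hmm`, and the function bodies are hypothetical and are not taken from PANORAMA's source.

```python
import argparse
from pathlib import Path


def parser_pansystems(parser: argparse.ArgumentParser) -> None:
    """Add illustrative arguments (hypothetical flag names)."""
    parser.add_argument("--models", type=Path, required=True,
                        help="Path to the models list file")
    parser.add_argument("--table", type=Path, default=None,
                        help="Table with annotation information")
    parser.add_argument("--hmm", type=Path, default=None,
                        help="Tab-separated file with HMM information")


def subparser(sub_parser) -> argparse.ArgumentParser:
    """Register the command on a sub-parsers object and return its parser."""
    parser = sub_parser.add_parser(
        "pansystems", description="Detect systems in pangenomes"
    )
    parser_pansystems(parser)
    return parser


main_parser = argparse.ArgumentParser(prog="panorama")
commands = main_parser.add_subparsers(dest="command")
pansystems_parser = subparser(commands)
args = main_parser.parse_args(["pansystems", "--models", "models.tsv"])
```

Keeping argument registration in its own function lets the same arguments be attached either to a sub-command or to a standalone parser.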