Software used in the reports module

general_json_report

Convert input files to JSON for the general report.

Rule

rule general_json_report:
    input:
        files=[f'{filedef["input"]}' for filedef in general_report["files"]],
        output_files=config.get("general_report", {}),
    output:
        json="reports/general_json_report/{sample}_{type}.general.json",
    params:
        sample="{sample}_{type}",
        pipeline_version=pipeline_version,
        pipeline_name=pipeline_name,
        tc=get_tc_general_report,
        units=units,
        reference_genome=config.get("reference", {}).get("fasta", ""),
    log:
        "reports/general_json_report/{sample}_{type}.general_report.log",
    benchmark:
        repeat(
            "reports/general_json_report/{sample}_{type}.output.benchmark.tsv",
            config.get("general_json_report", {}).get("benchmark_repeats", 1),
        )
    threads: config.get("general_json_report", {}).get("threads", config["default_resources"]["threads"])
    resources:
        mem_mb=config.get("general_json_report", {}).get("mem_mb", config["default_resources"]["mem_mb"]),
        mem_per_cpu=config.get("general_json_report", {}).get("mem_per_cpu", config["default_resources"]["mem_per_cpu"]),
        partition=config.get("general_json_report", {}).get("partition", config["default_resources"]["partition"]),
        threads=config.get("general_json_report", {}).get("threads", config["default_resources"]["threads"]),
        time=config.get("general_json_report", {}).get("time", config["default_resources"]["time"]),
    container:
        config.get("general_json_report", {}).get("container", config["default_container"])
    message:
        "{rule}: generate general html report from json config"
    script:
        "../scripts/general_json_report.py"

input / output files

Rule parameters	Key	Value	Description
input	files	`[f'{filedef["input"]}' for filedef in general_report["files"]]`	Files that should be compiled into JSON.
input	output_files	`config.get("general_report", {})`	Path to the YAML file with definitions of the input files.
output	json	`"reports/general_json_report/{sample}_{type}.general.json"`	JSON file with data to be presented in the general report.
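
To illustrate how the `files` input is derived, here is a minimal sketch of the list comprehension from the rule applied to a made-up `general_report` dictionary. Only the `files` list and the `input` key appear in the rule; the `name` key and the paths are assumptions for illustration.

```python
# Hypothetical contents of the YAML file referenced by config["general_report"],
# shown as the dictionary it would be loaded into. Only "files" and "input"
# are taken from the rule; "name" and the paths are made up.
general_report = {
    "files": [
        {"name": "QC summary", "input": "qc/summary/summary.tsv"},
        {"name": "Coverage", "input": "qc/coverage/coverage.tsv"},
    ]
}

# Same list comprehension as in the rule's input section.
files = [f'{filedef["input"]}' for filedef in general_report["files"]]
print(files)
```

Each entry's `input` value becomes one input file of the rule, in the order given in the YAML file.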

Configuration

Resources settings (`resources.yaml`)

Key	Type	Description
mem_mb	integer	max memory in MB to be available
mem_per_cpu	integer	memory in MB used per cpu
partition	string	partition to use on cluster
threads	integer	number of threads to be available
time	string	max execution time
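
Every rule in this module resolves its resources the same way: a rule-specific value is used if present, otherwise the value under `default_resources`. The following sketch shows that lookup on a made-up `config` dictionary. The helper name `rule_resource` and all values are assumptions for illustration, and it assumes the resource settings end up in the same `config` dictionary, as the rule code suggests.

```python
# Hypothetical configuration; real values come from the pipeline's own files.
config = {
    "default_resources": {
        "threads": 1,
        "mem_mb": 6144,
        "mem_per_cpu": 6144,
        "partition": "core",
        "time": "4:00:00",
    },
    # Only mem_mb is overridden for this rule.
    "general_json_report": {"mem_mb": 2048},
}


def rule_resource(config, rule, key):
    """Return the rule-specific value for key, else the default value."""
    return config.get(rule, {}).get(key, config["default_resources"][key])


print(rule_resource(config, "general_json_report", "mem_mb"))   # rule-specific value
print(rule_resource(config, "general_json_report", "threads"))  # default value
```

A rule without its own entry, such as `cnv_json` in this made-up config, gets every value from `default_resources`.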

general_html_report

Generate a general HTML report for a sample.

Rule

rule general_html_report:
    input:
        config_schema=workflow.source_path("../schemas/general_html_report_json.schema.yaml"),
        html_template=workflow.source_path("../templates/general_html_report/index.html"),
        json="reports/general_json_report/{sample}_{type}.general.json",
        css_files=[
            workflow.source_path("../templates/general_html_report/style.css"),
            workflow.source_path("../templates/assets/css/datatables.min.css"),
        ],
        js_files=[workflow.source_path("../templates/assets/js/datatables.min.js")],
        additional_json={},
    output:
        html="reports/general_html_report/{sample}_{type}.general_report.html",
    params:
        final_directory_depth=config.get("general_html_report", {}).get("final_directory_depth", 1),
        multiqc_config=config.get("general_html_report", {}).get("multiqc_config", ""),
        units=units,
        extra=config.get("general_html_report", {}).get("extra", ""),
    log:
        "reports/general_html_report/{sample}_{type}.general_report.log",
    benchmark:
        repeat(
            "reports/general_html_report/{sample}_{type}.output.benchmark.tsv",
            config.get("general_html_report", {}).get("benchmark_repeats", 1),
        )
    threads: config.get("general_html_report", {}).get("threads", config["default_resources"]["threads"])
    resources:
        mem_mb=config.get("general_html_report", {}).get("mem_mb", config["default_resources"]["mem_mb"]),
        mem_per_cpu=config.get("general_html_report", {}).get("mem_per_cpu", config["default_resources"]["mem_per_cpu"]),
        partition=config.get("general_html_report", {}).get("partition", config["default_resources"]["partition"]),
        threads=config.get("general_html_report", {}).get("threads", config["default_resources"]["threads"]),
        time=config.get("general_html_report", {}).get("time", config["default_resources"]["time"]),
    container:
        config.get("general_html_report", {}).get("container", config["default_container"])
    message:
        "{rule}: generate general html report from json config {input.json}"
    script:
        "../scripts/general_html_report.py"

input / output files

Rule parameters	Key	Value	Description
input	config_schema	`workflow.source_path("../schemas/general_html_report_json.schema.yaml")`	Validation schema for the JSON input.
	html_template	`workflow.source_path("../templates/general_html_report/index.html")`	HTML template that the report should be based on.
	json	`"reports/general_json_report/{sample}_{type}.general.json"`	JSON that should be rendered in the report as produced by `general_json_report`.
	css_files	`[ workflow.source_path("../templates/general_html_report/style.css"), workflow.source_path("../templates/assets/css/datatables.min.css"), ]`	CSS files that should be included in the template.
	js_files	`[workflow.source_path("../templates/assets/js/datatables.min.js")]`	JavaScript files that should be included in the template.
	additional_json	`{}`	Additional JSON that should be included in the report.
output	html	`"reports/general_html_report/{sample}_{type}.general_report.html"`	Interactive HTML report for a sample.

Configuration

Software settings (`config.yaml`)

Key	Type	Description
final_directory_depth	integer	How deep in the final results directory the report will be. This will be used to correctly resolve relative paths in the JSON config. For example, if the report is located in the directory `results/reports`, the depth would be 2.
multiqc_config	string	Path to a MultiQC config file. Use this if the MultiQC report contains custom content, or if general statistics columns that are hidden in the MultiQC report should also be hidden in the general report.
benchmark_repeats	integer	set number of times benchmark should be repeated
container	string	name or path to docker/singularity container
extra	string	parameters that should be forwarded
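
The `final_directory_depth` example above can be reproduced by counting path components. The sketch below does that and also builds a `../` prefix of the same depth. How the report script actually applies the depth is not shown in this section, so the prefix and both helper names are assumptions.

```python
from pathlib import PurePosixPath


def directory_depth(report_dir):
    """Count the directory levels from the results root down to the report."""
    return len(PurePosixPath(report_dir).parts)


def relative_prefix(depth):
    """Build a prefix that climbs `depth` levels up from the report's directory."""
    return "../" * depth


depth = directory_depth("results/reports")
print(depth)                   # 2, matching the example in the table
print(relative_prefix(depth))  # ../../
```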

Resources settings (`resources.yaml`)

Key	Type	Description
mem_mb	integer	max memory in MB to be available
mem_per_cpu	integer	memory in MB used per cpu
partition	string	partition to use on cluster
threads	integer	number of threads to be available
time	string	max execution time

cnv_html_report

Generate an HTML report for CNVs.

Rule

rule cnv_html_report:
    input:
        json="reports/cnv_html_report/{sample}_{type}.{tc_method}.merged.json",
        html_template=workflow.source_path("../templates/cnv_html_report/index.html"),
        js_files=[
            workflow.source_path("../templates/assets/js/d3.v7.min.js"),
            workflow.source_path("../templates/cnv_html_report/01-chromosome-plot.js"),
            workflow.source_path("../templates/cnv_html_report/02-genome-plot.js"),
            workflow.source_path("../templates/cnv_html_report/03-results-table.js"),
            workflow.source_path("../templates/cnv_html_report/04-window-summary.js"),
            workflow.source_path("../templates/cnv_html_report/05-main.js"),
        ],
        css_files=[
            workflow.source_path("../templates/assets/css/icons.css"),
            workflow.source_path("../templates/cnv_html_report/style.css"),
        ],
        tc_file=get_tc_file,
        extra_table_files=[t["path"] for t in config.get("cnv_html_report", {}).get("extra_tables", [])],
    output:
        html=temp("reports/cnv_html_report/{sample}_{type}.{tc_method}.cnv_report.html"),
    params:
        include_table=config.get("cnv_html_report", {}).get("show_table", True),
        extra_tables=config.get("cnv_html_report", {}).get("extra_tables", []),
        tc=get_tc,
        tc_method=lambda wildcards: wildcards.tc_method,
        include_cytobands=config.get("cnv_html_report", {}).get("cytobands", False),
    log:
        "reports/cnv_html_report/{sample}_{type}.{tc_method}.cnv_report.html.log",
    benchmark:
        repeat(
            "reports/cnv_html_report/{sample}_{type}.{tc_method}.cnv_report.html.benchmark.tsv",
            config.get("cnv_html_report", {}).get("benchmark_repeats", 1),
        )
    threads: config.get("cnv_html_report", {}).get("threads", config["default_resources"]["threads"])
    resources:
        mem_mb=config.get("cnv_html_report", {}).get("mem_mb", config["default_resources"]["mem_mb"]),
        mem_per_cpu=config.get("cnv_html_report", {}).get("mem_per_cpu", config["default_resources"]["mem_per_cpu"]),
        partition=config.get("cnv_html_report", {}).get("partition", config["default_resources"]["partition"]),
        threads=config.get("cnv_html_report", {}).get("threads", config["default_resources"]["threads"]),
        time=config.get("cnv_html_report", {}).get("time", config["default_resources"]["time"]),
    container:
        config.get("cnv_html_report", {}).get("container", config["default_container"])
    message:
        "{rule}: Compile a CNV HTML report for {wildcards.sample}_{wildcards.type}"
    script:
        "../scripts/cnv_html_report.py"

input / output files

Rule parameters	Key	Value	Description
input	json	`"reports/cnv_html_report/{sample}_{type}.{tc_method}.merged.json"`	Merged JSON file with CNV calls and other sample information.
	html_template	`workflow.source_path("../templates/cnv_html_report/index.html")`	Path to the HTML template.
	js_files	`[ workflow.source_path("../templates/assets/js/d3.v7.min.js"), workflow.source_path("../templates/cnv_html_report/01-chromosome-plot.js"), workflow.source_path("../templates/cnv_html_report/02-genome-plot.js"), workflow.source_path("../templates/cnv_html_report/03-results-table.js"), workflow.source_path("../templates/cnv_html_report/04-window-summary.js"), workflow.source_path("../templates/cnv_html_report/05-main.js"), ]`	List of JavaScript files that should be included in the report. The order of the files is significant: if one file depends on another, the dependency must be listed before the script(s) that depend on it.
	css_files	`[ workflow.source_path("../templates/assets/css/icons.css"), workflow.source_path("../templates/cnv_html_report/style.css"), ]`	List of CSS files that should be included in the report. Files are included in the order given.
	tc_file	`get_tc_file`	Path to a text file containing the tumor cell content estimated by the method `tc_method`.
output	html	`"reports/cnv_html_report/{sample}_{type}.{tc_method}.cnv_report.html"`	Interactive HTML report for CNVs.
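
The order of `js_files` matters because a script can only use what has been defined before it. The sketch below shows the idea with two made-up script bodies that are embedded in the order given. Whether the report script inlines or links the files is not stated here, so `inline_scripts` is an assumption for illustration only.

```python
def inline_scripts(sources):
    """Wrap each script in a <script> tag, preserving the given order."""
    return "\n".join(f"<script>{src}</script>" for src in sources)


# Hypothetical stand-ins for file contents: the library must precede its user.
library = "const lib = { answer: 42 };"
main = "console.log(lib.answer);"

html = inline_scripts([library, main])
print(html)
```

Swapping the two entries would place `main` before the definition of `lib`, and the browser would raise an error when the page loads.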

Configuration

Software settings (`config.yaml`)

Key	Type	Description
benchmark_repeats	integer	set number of times benchmark should be repeated
container	string	name or path to docker/singularity container
cytobands	boolean	Whether or not to add cytoband information to the plots and the CNV table.
show_table	boolean	Whether or not to display a table of called CNVs in the report. If this is true, then the attributes `filtered_cnv_vcfs` and `unfiltered_cnv_vcfs` under `merge_cnv_json` are required.
extra_tables	array	Additional tables that should be added to the report. Each table is built from the columns of the listed TSV file, which must include column names.
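
As a sketch of how `extra_tables` is consumed, the code below collects the table paths with the same expression as the rule's `extra_table_files` input and then reads a TSV with a header line. Only the `path` key is visible in the rule; the file name, the column names and the use of `csv.DictReader` are assumptions for illustration.

```python
import csv
import io

# Hypothetical config entry; only the "path" key is taken from the rule.
config = {"cnv_html_report": {"extra_tables": [{"path": "extra/genes.tsv"}]}}

# Same expression the rule uses to collect the table files as inputs.
extra_table_files = [t["path"] for t in config.get("cnv_html_report", {}).get("extra_tables", [])]
print(extra_table_files)

# Stand-in for the TSV contents; the first line holds the column names.
tsv = io.StringIO("gene\tchromosome\tcopy_number\nERBB2\tchr17\t6\n")
rows = list(csv.DictReader(tsv, delimiter="\t"))
print(rows[0]["gene"], rows[0]["copy_number"])
```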

Resources settings (`resources.yaml`)

Key	Type	Description
mem_mb	integer	max memory in MB to be available
mem_per_cpu	integer	memory in MB used per cpu
partition	string	partition to use on cluster
threads	integer	number of threads to be available
time	string	max execution time

cnv_json

Convert CNV results from a particular CNV caller to JSON that is compatible with the final report.

Rule

rule cnv_json:
    input:
        ratios=get_cnv_ratios,
        segments=get_cnv_segments,
    output:
        json=temp("reports/cnv_html_report/{sample}_{type}.{caller}.{tc_method}.json"),
    params:
        skip_chromosomes=config.get("reference", {}).get("skip_chrs"),
    log:
        "reports/cnv_html_report/{sample}_{type}.{caller}.{tc_method}.json.log",
    benchmark:
        repeat(
            "reports/cnv_html_report/{sample}_{type}.{caller}.{tc_method}.json.benchmark.tsv",
            config.get("cnv_json", {}).get("benchmark_repeats", 1),
        )
    threads: config.get("cnv_json", {}).get("threads", config["default_resources"]["threads"])
    resources:
        mem_mb=config.get("cnv_json", {}).get("mem_mb", config["default_resources"]["mem_mb"]),
        mem_per_cpu=config.get("cnv_json", {}).get("mem_per_cpu", config["default_resources"]["mem_per_cpu"]),
        partition=config.get("cnv_json", {}).get("partition", config["default_resources"]["partition"]),
        threads=config.get("cnv_json", {}).get("threads", config["default_resources"]["threads"]),
        time=config.get("cnv_json", {}).get("time", config["default_resources"]["time"]),
    container:
        config.get("cnv_json", {}).get("container", config["default_container"])
    message:
        "{rule}: Create JSON representation for CNV results from {wildcards.caller} "
        "for {wildcards.sample}_{wildcards.type}"
    script:
        "../scripts/cnv_json.py"

input / output files

Rule parameters	Key	Value	Description
input	ratios	`get_cnv_ratios`	Path to a file with log₂ ratios for a specific caller. Determined by an input function that returns a path to the correct file based on which caller was used.
input	segments	`get_cnv_segments`	Path to a file with CNV segments for a specific caller. Determined by an input function that returns a path to the correct file based on which caller was used.
output	json	`"reports/cnv_html_report/{sample}_{type}.{caller}.{tc_method}.json"`	A JSON representation of the CNV results from a specific caller.
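
Both inputs are resolved by input functions that pick a file based on the `caller` wildcard. A minimal sketch of that pattern follows. The caller names, the path templates and the function name `get_cnv_ratios_sketch` are hypothetical; the real paths are defined by the module's own input functions.

```python
from types import SimpleNamespace

# Hypothetical mapping from caller name to path template.
RATIO_TEMPLATES = {
    "caller_a": "cnv/caller_a/{sample}_{type}.ratios.tsv",
    "caller_b": "cnv/caller_b/{sample}_{type}.ratios.txt",
}


def get_cnv_ratios_sketch(wildcards):
    """Pick the ratio file for the caller named in the wildcards."""
    try:
        template = RATIO_TEMPLATES[wildcards.caller]
    except KeyError:
        raise ValueError(f"unsupported CNV caller: {wildcards.caller}") from None
    return template.format(sample=wildcards.sample, type=wildcards.type)


# SimpleNamespace stands in for the wildcards object that Snakemake passes.
wildcards = SimpleNamespace(sample="sample1", type="T", caller="caller_a")
print(get_cnv_ratios_sketch(wildcards))
```

Raising an error for an unknown caller makes a misconfigured caller name fail early instead of producing a missing-input message later.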

Configuration

Software settings (`config.yaml`)

Key	Type	Description
benchmark_repeats	integer	set number of times benchmark should be repeated
container	string	name or path to docker/singularity container

Resources settings (`resources.yaml`)

Key	Type	Description
mem_mb	integer	max memory in MB to be available
mem_per_cpu	integer	memory in MB used per cpu
partition	string	partition to use on cluster
threads	integer	number of threads to be available
time	string	max execution time

merge_cnv_json

Merge JSON files from multiple CNV callers and add annotations and other sample specific data.

Rule

rule merge_cnv_json:
    input:
        json=get_json_for_merge_cnv_json,
        fai=config.get("reference", {}).get("fai", ""),
        annotation_bed=config.get("merge_cnv_json", {}).get("annotations", []),
        germline_vcf=get_germline_vcf,
        filtered_cnv_vcfs=get_filtered_cnv_vcf,
        cnv_vcfs=get_unfiltered_cnv_vcf,
        cytobands=config.get("merge_cnv_json", {}).get("cytobands", []),
    output:
        json=temp("reports/cnv_html_report/{sample}_{type}.{tc_method}.merged.json"),
    params:
        skip_chromosomes=config.get("reference", {}).get("skip_chrs", []),
        cytobands=config.get("cnv_html_report", {}).get("cytobands", False),
    log:
        "reports/cnv_html_report/{sample}_{type}.{tc_method}.merged.json.log",
    benchmark:
        repeat(
            "reports/cnv_html_report/{sample}_{type}.{tc_method}.merged.json.benchmark.tsv",
            config.get("merge_cnv_json", {}).get("benchmark_repeats", 1),
        )
    threads: config.get("merge_cnv_json", {}).get("threads", config["default_resources"]["threads"])
    resources:
        mem_mb=config.get("merge_cnv_json", {}).get("mem_mb", config["default_resources"]["mem_mb"]),
        mem_per_cpu=config.get("merge_cnv_json", {}).get("mem_per_cpu", config["default_resources"]["mem_per_cpu"]),
        partition=config.get("merge_cnv_json", {}).get("partition", config["default_resources"]["partition"]),
        threads=config.get("merge_cnv_json", {}).get("threads", config["default_resources"]["threads"]),
        time=config.get("merge_cnv_json", {}).get("time", config["default_resources"]["time"]),
    container:
        config.get("merge_cnv_json", {}).get("container", config["default_container"])
    message:
        "{rule}: Merge CNV JSON data for {wildcards.sample}_{wildcards.type}"
    script:
        "../scripts/merge_cnv_json.py"

input / output files

Rule parameters	Key	Value	Description
input	json	`get_json_for_merge_cnv_json`	One or more JSON files containing CNV information, one file per caller. Supplied by an input function that determines the paths from the included callers, which in turn are determined by the `svdb_merge` config. The paths to the files are based on the cnv_sv module.
	fai	`config.get("reference", {}).get("fai", "")`	Reference genome FASTA index.
	annotation_bed	`config.get("merge_cnv_json", {}).get("annotations", [])`	Zero or more BED files with regions that should be annotated in the chromosome plot.
	germline_vcf	`get_germline_vcf`	Optional VCF file with germline variants. Supplied by input function that takes it from the config.
	filtered_cnv_vcfs	`get_filtered_cnv_vcf`	Zero or more VCF files containing filtered CNV calls that should be displayed in the results table. Supplied by input function that takes these from the config.
	cnv_vcfs	`get_unfiltered_cnv_vcf`	Zero or more VCF files containing unfiltered CNV calls that should be displayed in the results table. Supplied by input function that takes these from the config.
	cytobands	`config.get("merge_cnv_json", {}).get("cytobands", [])`	Optional path to a file with cytoband definitions. This file should conform to the UCSC schema for cytobands. See https://www.genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=map&hgta_track=cytoBand&hgta_table=cytoBand&hgta_doSchema=describe+table+schema
output	json	`"reports/cnv_html_report/{sample}_{type}.{tc_method}.merged.json"`	Merged JSON file with CNV calls and other sample information that is ready to be included in the final report.
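
The `germline_vcf` input is optional and comes from an input function that reads the path from the config. The sketch below shows one way such a function could look, assuming the configured path contains `{sample}` and `{type}` placeholders. The function name `get_germline_vcf_sketch` and the path layout are hypothetical.

```python
from types import SimpleNamespace

# Hypothetical config value; the path layout is made up for illustration.
config = {"merge_cnv_json": {"germline_vcf": "snv_indels/{sample}_{type}.germline.vcf.gz"}}


def get_germline_vcf_sketch(wildcards):
    """Return the configured germline VCF with wildcards filled in, or [] if unset."""
    path = config.get("merge_cnv_json", {}).get("germline_vcf")
    if not path:
        return []  # an empty list tells Snakemake there is no such input
    return path.format(sample=wildcards.sample, type=wildcards.type)


# SimpleNamespace stands in for the wildcards object that Snakemake passes.
print(get_germline_vcf_sketch(SimpleNamespace(sample="sample1", type="N")))
```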

Configuration

Software settings (`config.yaml`)

Key	Type	Description
benchmark_repeats	integer	set number of times benchmark should be repeated
container	string	name or path to docker/singularity container
annotations	array	List of BED files with custom annotations.
cytoband_config	object	Cytoband visualisation configuration.
cytobands	string	Tab-separated file containing cytoband information. The file should have five columns: chromosome name, start position (0-based), end position (exclusive), name, and Giemsa stain result. The Giemsa stain results should match the config under `cnv_html_report.cytoband_config`.
filtered_cnv_vcfs	array	VCF files containing filtered CNV calls that should be displayed in the results table. Will only have an effect if `show_table` is true, and if `show_table` is true, this attribute is required.
germline_vcf	string	Path to a germline VCF file that will be used to display VAFs in the plots of the report. The path supports the `sample` and `type` wildcards.
unfiltered_cnv_vcfs	array	VCF files containing unfiltered CNV calls that should be displayed in the results table. Will only have an effect if `show_table` is true, and if `show_table` is true, this attribute is required.
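
The five-column cytoband layout described above can be read with the standard `csv` module. This is a sketch only: the column names in `COLUMNS` and the function name `parse_cytobands` are chosen for the example, and the module's own parsing may differ.

```python
import csv
import io

# Names for the five columns described above; chosen for this sketch.
COLUMNS = ["chrom", "start", "end", "name", "giemsa"]

# Two example lines in the five-column layout (tab-separated, no header).
cytoband_text = "chr1\t0\t2300000\tp36.33\tgneg\nchr1\t2300000\t5400000\tp36.32\tgpos25\n"


def parse_cytobands(handle):
    """Parse cytoband lines into dictionaries with integer coordinates."""
    bands = []
    for row in csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t"):
        row["start"] = int(row["start"])  # 0-based start
        row["end"] = int(row["end"])      # exclusive end
        bands.append(row)
    return bands


bands = parse_cytobands(io.StringIO(cytoband_text))
print(bands[0]["name"], bands[0]["end"] - bands[0]["start"])
```

Because the start is 0-based and the end is exclusive, the length of a band is simply `end - start`.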

Resources settings (`resources.yaml`)

Key	Type	Description
mem_mb	integer	max memory in MB to be available
mem_per_cpu	integer	memory in MB used per cpu
partition	string	partition to use on cluster
threads	integer	number of threads to be available
time	string	max execution time
