
create_target_indices

create_target_indices(sample_sheet_file, output_json_file)

Reads the sample sheet CSV and creates a dictionary mapping Sample_ID to a list of four target index strings. Writes the dictionary to a JSON file.

The sample sheet is assumed to have the following columns: Sample_ID, Sample_Name, Sample_Plate, Sample_Well, Index_Plate_Well, I7_Index_ID, index, I5_Index_ID, index2, Sample_Project, Description

For each sample, it creates:
- i7_variants: [original I7 index, 'N' + I7[1:]]
- i5_variants: [original I5 index, 'N' + I5[1:]]

It then combines every i7 variant with every i5 variant as i7_variant + '+' + i5_variant, which gives four target index strings per sample.
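The variant construction can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper names `build_target_indices` and `target_indices_from_csv` are hypothetical, the order of the four strings is an assumption, and it assumes the `index` and `index2` columns hold the I7 and I5 sequences.

```python
import csv
import io
import json
from itertools import product


def build_target_indices(i7: str, i5: str) -> list[str]:
    """Return the four 'i7+i5' strings for one sample (hypothetical helper)."""
    # Each index is used as-is and with its first base replaced by 'N'.
    i7_variants = [i7, "N" + i7[1:]]
    i5_variants = [i5, "N" + i5[1:]]
    # 2 x 2 combinations; the ordering here is an assumption.
    return [f"{a}+{b}" for a, b in product(i7_variants, i5_variants)]


def target_indices_from_csv(csv_text: str) -> dict[str, list[str]]:
    """Map Sample_ID to its target index strings (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumption: 'index' is the I7 sequence and 'index2' is the I5 sequence.
    return {
        row["Sample_ID"]: build_target_indices(row["index"], row["index2"])
        for row in reader
    }


# Made-up one-sample sheet with only the columns this sketch needs.
sheet = "Sample_ID,index,index2\nS1,ATCACG,TTAGGC\n"
mapping = target_indices_from_csv(sheet)
print(json.dumps(mapping, indent=2))
```

For the made-up sample above, the mapping holds `ATCACG+TTAGGC`, `ATCACG+NTAGGC`, `NTCACG+TTAGGC`, and `NTCACG+NTAGGC` under `S1`.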

filter_reads

filter_reads(input_file, output_file, target_indices)

Filters a gzip-compressed FASTQ file, keeping the reads whose header contains any of the target index strings.
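A header-based filter of this kind might look like the sketch below. The function name `filter_fastq_by_index` is hypothetical, and the sketch assumes four-line FASTQ records, a gzip-compressed output file, and a plain substring match against the header line; the actual function may differ on any of these points.

```python
import gzip
from itertools import islice


def filter_fastq_by_index(input_file, output_file, target_indices):
    """Copy records whose header contains any target index; return the count kept."""
    kept = 0
    with gzip.open(input_file, "rt") as src, gzip.open(output_file, "wt") as dst:
        while True:
            # One FASTQ record: header, sequence, '+' separator, quality.
            record = list(islice(src, 4))
            if len(record) < 4:
                break  # end of file (or a truncated trailing record)
            header = record[0]
            # Substring match, so 'ATCACG+TTAGGC' matches a header ending in it.
            if any(index in header for index in target_indices):
                dst.writelines(record)
                kept += 1
    return kept
```

Reading the file as a stream keeps memory use flat, which matters because undetermined FASTQ files can be large.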

process_sample

process_sample(
    sample_id,
    target_indices,
    input_r1,
    input_r2,
    output_dir,
)

For a given sample, filter reads from the undetermined R1 and R2 FASTQ files using its target indices. Writes output files into output_dir.
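The per-sample step can be pictured roughly as below. This is only a sketch: `process_sample_sketch` and its `filter_fn` parameter are hypothetical, and the output file naming (`<sample_id>_R1.fastq.gz`, `<sample_id>_R2.fastq.gz`) is an assumption, since the naming scheme is not documented here.

```python
from pathlib import Path


def process_sample_sketch(sample_id, target_indices, input_r1, input_r2,
                          output_dir, filter_fn):
    """Filter both read files for one sample and return the output paths."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for read, source in (("R1", input_r1), ("R2", input_r2)):
        # Hypothetical naming scheme; the real output names may differ.
        destination = out_dir / f"{sample_id}_{read}.fastq.gz"
        # filter_fn stands in for a filter taking (input, output, indices).
        filter_fn(source, str(destination), target_indices)
        outputs.append(destination)
    return outputs
```

Passing the filter in as a callable keeps the sketch independent of how the filtering itself is implemented.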

undetermined_demultiplexer

undetermined_demultiplexer(
    sample_sheet: str,
    input_r1: str,
    input_r2: str,
    *,
    output_dir: str,
    json_output: str,
    threads: int = 4,
) -> None

Filter undetermined FASTQ files for multiple samples using index information.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| sample_sheet | str | Path to the sample sheet CSV file. | required |
| input_r1 | str | Path to the undetermined R1 FASTQ.gz file. | required |
| input_r2 | str | Path to the undetermined R2 FASTQ.gz file. | required |
| output_dir | str | Directory to store the filtered FASTQ files. | required |
| json_output | str | Path to output JSON file for sample target indices. | required |
| threads | int | Number of threads to use, default is 4. | 4 |

Returns:

Type Description
None
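One way the `threads` parameter could be used is to process samples concurrently. The sketch below is an assumption about the design, not a description of the actual implementation: it uses `concurrent.futures.ThreadPoolExecutor` from the standard library, and `run_per_sample` and the `worker` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_per_sample(target_indices_by_sample, worker, threads=4):
    """Call worker(sample_id, target_indices) for each sample on a thread pool."""
    results = {}
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = {
            pool.submit(worker, sample_id, indices): sample_id
            for sample_id, indices in target_indices_by_sample.items()
        }
        for future in as_completed(futures):
            # result() re-raises any exception raised inside the worker.
            results[futures[future]] = future.result()
    return results


# Made-up data; the stand-in worker just counts the index strings per sample.
demo = {"S1": ["ATCACG+TTAGGC"], "S2": ["CGATGT+ACAGTG"]}
counts = run_per_sample(demo, lambda sample_id, indices: len(indices), threads=2)
```

In a real run the worker would filter the R1 and R2 files for one sample, so each sample rescans the same undetermined files independently of the others.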