create_target_indices ¶

create_target_indices(sample_sheet_file, output_json_file)

Reads the sample sheet CSV and creates a dictionary mapping Sample_ID to a list of four target index strings. Writes the dictionary to a JSON file.

The sample sheet is assumed to have the following columns: Sample_ID, Sample_Name, Sample_Plate, Sample_Well, Index_Plate_Well, I7_Index_ID, index, I5_Index_ID, index2, Sample_Project, Description

For each sample, it creates:

- i7_variants: [original I7 index, 'N' + I7[1:]]
- i5_variants: [original I5 index, 'N' + I5[1:]]

Each i7 variant is then paired with each i5 variant as "i7_variant + '+' + i5_variant", which gives the four target index strings per sample.
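The variant construction described above can be sketched as follows. This is a minimal illustration, not the package's actual implementation: the helper names `index_variants` and `target_indices_from_csv` are hypothetical, the ordering of the four strings is an assumption, and the sheet is reduced to the three columns the mapping needs (`Sample_ID`, `index`, `index2`).

```python
import csv
import io
import json
from itertools import product


def index_variants(index: str) -> list[str]:
    """Return the original index and a copy whose first base is replaced by 'N'."""
    return [index, "N" + index[1:]]


def target_indices_from_csv(csv_text: str) -> dict[str, list[str]]:
    """Map each Sample_ID to its four 'i7+i5' strings (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    mapping = {}
    for row in reader:
        # 'index' holds the I7 sequence, 'index2' the I5 sequence.
        pairs = product(index_variants(row["index"]), index_variants(row["index2"]))
        mapping[row["Sample_ID"]] = [f"{i7}+{i5}" for i7, i5 in pairs]
    return mapping


# A made-up, reduced sample sheet used only for illustration.
SHEET = "Sample_ID,index,index2\nS1,ATTACTCG,TATAGCCT\n"
EXAMPLE_MAPPING = target_indices_from_csv(SHEET)
print(json.dumps(EXAMPLE_MAPPING, indent=2))
```

For the made-up sample `S1`, the mapping holds the exact pair plus the three combinations in which the first base of either index is `N`.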

filter_reads ¶

filter_reads(input_file, output_file, target_indices)

Filters a gzip-compressed FASTQ file, keeping the reads whose header contains any of the target index strings.
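Header-based filtering of this kind could look roughly like the sketch below. It assumes standard four-line FASTQ records and a plain substring match against the header line. The function name `filter_fastq_by_index` and its return value (the number of reads kept) are illustrative choices, not part of the documented API.

```python
import gzip
from itertools import islice


def filter_fastq_by_index(input_file: str, output_file: str, target_indices: list[str]) -> int:
    """Copy records whose header contains any target index; return how many were kept."""
    kept = 0
    with gzip.open(input_file, "rt") as fin, gzip.open(output_file, "wt") as fout:
        while True:
            # A FASTQ record is four lines: header, sequence, '+', qualities.
            record = list(islice(fin, 4))
            if len(record) < 4:
                break  # end of file (or a truncated trailing record)
            header = record[0]
            if any(index in header for index in target_indices):
                fout.writelines(record)
                kept += 1
    return kept
```

Reading one record at a time keeps memory use flat, which matters because undetermined FASTQ files are often large.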

process_sample ¶

process_sample(
    sample_id,
    target_indices,
    input_r1,
    input_r2,
    output_dir,
)

For a given sample, filter reads from the undetermined R1 and R2 FASTQ files using its target indices. Writes output files into output_dir.

undetermined_demultiplexer ¶

undetermined_demultiplexer(
    sample_sheet: str,
    input_r1: str,
    input_r2: str,
    *,
    output_dir: str,
    json_output: str,
    threads: int = 4,
) -> None

Filter undetermined FASTQ files for multiple samples using index information.

Parameters:

Name	Type	Description	Default
`sample_sheet`	`str`	Path to the sample sheet CSV file.	required
`input_r1`	`str`	Path to the undetermined R1 FASTQ.gz file.	required
`input_r2`	`str`	Path to the undetermined R2 FASTQ.gz file.	required
`output_dir`	`str`	Directory to store the filtered FASTQ files.	required
`json_output`	`str`	Path to output JSON file for sample target indices.	required
`threads`	`int`	Number of threads to use.	`4`

Returns:

Type	Description
`None`
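To illustrate how a `threads` setting might be used to process several samples concurrently, here is a hedged sketch built on `concurrent.futures.ThreadPoolExecutor`. The source does not say how the function schedules its work, so the thread pool, the helper `run_for_all_samples`, and the stand-in worker `fake_worker` are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_for_all_samples(mapping, worker, threads=4):
    """Call worker(sample_id, target_indices) for every sample on a thread pool."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = {
            sample_id: pool.submit(worker, sample_id, indices)
            for sample_id, indices in mapping.items()
        }
        # result() re-raises any exception raised inside the worker.
        return {sample_id: future.result() for sample_id, future in futures.items()}


def fake_worker(sample_id, target_indices):
    """Stand-in for per-sample filtering; just reports how many indices it received."""
    return len(target_indices)


RESULTS = run_for_all_samples({"S1": ["A+B", "N+B"], "S2": ["C+D"]}, fake_worker, threads=2)
print(RESULTS)
```

In a real run the worker would be a per-sample filtering step such as `process_sample`, with the R1 and R2 paths and the output directory bound in advance, for example via `functools.partial`.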