Parameters

Input/output options

Define where the pipeline should find input data and save output data.

| Parameter | Description | Default |
| dataset_name | The name of this dataset, used in output files and within visualisations | |
| input | Path to a comma-separated file containing information about the samples in the experiment. See the Samplesheet section for the expected columns. | |
| outdir | The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure. | results |

Pipeline parameters

| Parameter | Description | Default |
| mod_method | The method used to call the RNA modification (accepted: dorado, m6anet) | dorado |
| diff_method | The method used to call differentially modified sites (accepted: lmer, modkit) | lmer |
| genome | The reference genome, in .fasta format | |
| transcriptome | The reference transcriptome, in .fasta format | |
| min_reads | Minimum reads needed at a site (across all samples) to be called | 20 |
| prob_threshold | Minimum probability detected at a site to be called | 0 |
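
The two thresholds can be read as a per-site filter. The sketch below shows one possible interpretation, not the pipeline's actual code. It assumes that min_reads is compared against the read count summed over all samples, and that prob_threshold is compared against a single modification probability per site. The Site class and the passes_filters function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Site:
    """Hypothetical container for one candidate modification site."""
    position: int
    reads_per_sample: Dict[str, int]  # sample name -> reads covering the site
    probability: float                # assumed per-site modification probability


def passes_filters(site, min_reads=20, prob_threshold=0.0):
    """Return True if the site meets both thresholds (defaults as in the table)."""
    # Assumption: coverage is summed across all samples before comparison.
    total_reads = sum(site.reads_per_sample.values())
    return total_reads >= min_reads and site.probability >= prob_threshold


site = Site(position=1042, reads_per_sample={"sample1": 12, "sample2": 9}, probability=0.5)
print(passes_filters(site))                      # 21 reads in total, so coverage passes
print(passes_filters(site, prob_threshold=0.8))  # probability 0.5 is below 0.8
```

With the default prob_threshold of 0, only the coverage check can exclude a site under this reading.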

Samplesheet

The samplesheet is a CSV file that contains information about the samples to be analysed in the pipeline. A header row is required, and the file should have the following columns. Leave any optional column that you do not use empty.

  • name: a unique name for each sample
  • group: the experimental group or condition for each sample
  • path_dorado: (optional) path to pre-basecalled Dorado modification data for each sample
  • path_m6anet: (optional) path to precomputed m6Anet modification data for each sample
  • path_pod5: (optional) path to raw POD5 data for each sample

If you call modifications using Dorado, you must provide either path_dorado or path_pod5 for each sample. If you call modifications using m6Anet, you must provide either path_m6anet or path_pod5 for each sample.
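
These rules can be checked before launching a run. The following is a minimal sketch, assuming the samplesheet is available as text and that the column names are exactly those listed above. The check_samplesheet function is a hypothetical helper and is not part of the pipeline.

```python
import csv
import io

COLUMNS = ["name", "group", "path_dorado", "path_m6anet", "path_pod5"]


def check_samplesheet(text, mod_method="dorado"):
    """Return a list of problems found; an empty list means no problem was detected."""
    if mod_method not in ("dorado", "m6anet"):
        raise ValueError("mod_method must be 'dorado' or 'm6anet'")

    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing header column(s): " + ", ".join(missing)]

    # The precomputed column that matches the chosen calling method.
    precomputed = "path_" + mod_method
    problems = []
    seen = set()
    for row in reader:
        name = row["name"]
        if name in seen:
            problems.append(f"duplicate sample name: {name}")
        seen.add(name)
        # Each sample needs either precomputed calls or raw POD5 data.
        if not (row[precomputed] or row["path_pod5"]):
            problems.append(f"{name}: provide {precomputed} or path_pod5")
    return problems


sheet = (
    "name,group,path_dorado,path_m6anet,path_pod5\n"
    "sample1,group1,,,pod5_path\n"
    "sample2,group2,dorado_path,,\n"
)
print(check_samplesheet(sheet, mod_method="dorado"))
print(check_samplesheet(sheet, mod_method="m6anet"))
```

In this example, sample2 has only Dorado output, so it is accepted for dorado but reported for m6anet.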

Two groups should be provided to call differential modifications between conditions. Group names should be alphanumeric and without spaces. The underlying model takes the first group alphabetically as the reference level, and the second group alphabetically as the treatment level.
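
To see in advance which group will act as the reference, the ordering rule can be sketched as follows. This is an illustration under assumptions: it treats "alphanumeric" as ASCII letters and digits, requires exactly two groups, and uses Python's default string ordering, which places uppercase before lowercase and may differ from the ordering used by the underlying model. The contrast_levels function is a hypothetical name.

```python
import re


def contrast_levels(groups):
    """Return (reference, treatment) from the group column of a samplesheet."""
    names = sorted(set(groups))

    # Assumption: only ASCII letters and digits count as alphanumeric.
    invalid = [g for g in names if not re.fullmatch(r"[A-Za-z0-9]+", g)]
    if invalid:
        raise ValueError(f"group names must be alphanumeric without spaces: {invalid}")
    if len(names) != 2:
        raise ValueError(f"expected exactly two groups, found {len(names)}")

    reference, treatment = names
    return reference, treatment


reference, treatment = contrast_levels(["treated", "control", "control", "treated"])
print(reference, treatment)
```

Naming the groups so that the intended baseline sorts first (for example control and treated) avoids an inverted contrast.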

An example samplesheet is shown below:

name,group,path_dorado,path_m6anet,path_pod5
sample1,group1,,,pod5_path

Mako is maintained by the Shim Lab @ the University of Melbourne.