Overview
This section explains how to execute the workflow.
The following topics will be covered:
- (Optional) Setting up Screen
- Using Snakemake’s dry run
- Executing the workflow
(Optional) Using Screen
Snakemake does not offer a way to close the terminal while keeping its jobs running, because the main `snakemake --profile cluster` command is tied directly to the terminal process. To work around this, we will start a Screen session, which lets us close the terminal window while the session, and the workflow inside it, keeps running.
Alternatively, you can run snakemake in a bash script submitted to SLURM, as explained in this Google Doc
First, set a large scrollback for Screen, so we can view more lines after we have detached from the terminal. Execute the following:
echo "defscrollback 10000" >> ~/.screenrc
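Because `>>` appends on every run, repeating the command above leaves duplicate lines in `~/.screenrc`. The following is a minimal sketch of an idempotent alternative; the helper name `ensure_line` is hypothetical and is not part of Screen.

```shell
# ensure_line FILE LINE
# Append LINE to FILE only if an identical line is not already present.
# (Hypothetical helper for illustration; not a Screen command.)
ensure_line() {
    # Create the file if it does not exist yet
    touch "$1"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -e "$2" "$1"; then
        printf '%s\n' "$2" >> "$1"
    fi
}

# Usage:
#   ensure_line ~/.screenrc "defscrollback 10000"
```

Running the usage line twice leaves a single `defscrollback 10000` entry in the file.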
Once this is done, we can start a screen session with the following command:
screen -S snakemake
To leave the screen session while keeping it running, do the following:
- Press and hold the `control` key
- Press `a`. **Continue holding `control`**
- Press `d`
- The session will detach and return you to the original terminal. Verify that the session is alive but detached by executing `screen -ls`
  - It should say `(Detached)` next to the session name
To re-enter a screen session, execute the following:
# View all screen sessions
screen -ls
# This will show the following output (if a screen session is running)
> There is a screen on:
> 184700.snakemake (Detached)
> 1 Socket in /run/screen/S-joshl.
# Pick the session you would like to enter (we are going to re-enter the `snakemake` session)
screen -r snakemake
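If you want to check the state of a session from a script rather than by eye, the `screen -ls` output shown above can be parsed. This is only a sketch: the function name `session_state` is hypothetical, and it assumes each session line starts with `PID.NAME` and ends with the state in parentheses, as in the sample output.

```shell
# session_state NAME
# Reads `screen -ls` output on stdin and prints the state of the session
# called NAME (e.g. "Detached" or "Attached"), or "missing" if it is not listed.
# (Hypothetical helper for illustration; not a Screen command.)
session_state() {
    awk -v name="$1" '
        {
            # Session lines begin with PID.NAME, e.g. 184700.snakemake
            i = index($1, ".")
            if (i > 1 && substr($1, 1, i - 1) ~ /^[0-9]+$/ && substr($1, i + 1) == name) {
                state = $NF              # last field, e.g. (Detached)
                gsub(/[()]/, "", state)  # strip the parentheses
                print state
                found = 1
                exit
            }
        }
        END { if (!found) print "missing" }
    '
}

# Usage:
#   screen -ls | session_state snakemake
```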
Snakemake Dry Run
It is highly recommended to run the workflow in dry run mode first to ensure that the workflow will run as expected. A dry run does several things:
- Checks the syntax of the Snakefile
- Allows you to see what steps in the workflow will be executed
- Ensures preliminary configuration is set up properly
A dry run does not execute any component of the pipeline, so no results will be generated.
Execute the following to perform a dry run:
# Activate our conda environment
module load mamba
mamba activate snakemake
# Change to the FastqToGeneCounts directory
cd /work/helikarlab/joshl/FastqToGeneCounts
# Perform a dry run
snakemake --profile cluster --dry-run
If you changed the name of the `cluster` directory to something else, replace `cluster` in `--profile cluster` with the name of your directory.
If `snakemake --profile cluster --dry-run` cannot find the profile, replace `cluster` with `./cluster`.
After several seconds, many lines should scroll through the terminal.
The output should end with `This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.`
If it does not, an error has occurred and needs to be investigated before continuing. If you are having trouble, please open an issue.
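The check described above can also be automated by saving the dry-run output and searching it for the closing message. The sketch below assumes the output was redirected to a file; the function name `dry_run_ok` and the file name `dry_run.log` are hypothetical.

```shell
# dry_run_ok LOGFILE
# Succeeds only if LOGFILE contains the closing dry-run message.
# (Hypothetical helper for illustration; not a Snakemake command.)
dry_run_ok() {
    # -q: quiet, -F: fixed string (the parentheses are literal)
    grep -qF "This was a dry-run (flag -n)." "$1"
}

# Usage (assumes the `cluster` profile used in this guide):
#   snakemake --profile cluster --dry-run > dry_run.log 2>&1
#   if dry_run_ok dry_run.log; then
#       echo "Dry run completed"
#   else
#       echo "Dry run failed; inspect dry_run.log" >&2
#   fi
```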
Execution
Once you have confirmed that a dry-run will execute successfully, it is time to start a real run of the workflow.
To see what sessions are available:
screen -ls
To re-enter a session:
screen -r SESSION_NAME
The following steps will start the workflow:
# Activate the snakemake environment
module load mamba
mamba activate snakemake
# Make sure you are in the FastqToGeneCounts directory!
cd /work/helikarlab/joshl/FastqToGeneCounts
# Start the workflow
snakemake --profile cluster
To detach from the Screen session while the workflow runs, press CTRL+a, then d.
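Before starting a long run, it can help to confirm that you are in the right place and inside Screen. The sketch below relies on the `STY` environment variable, which Screen sets inside its sessions. The function names `in_screen` and `preflight` are hypothetical, and the check assumes the profile directory lives inside the project directory.

```shell
# in_screen
# Succeeds when the current shell is running inside a Screen session.
in_screen() {
    # Screen exports STY (e.g. 184700.snakemake) inside its sessions
    [ -n "${STY:-}" ]
}

# preflight PROJECT_DIR PROFILE_NAME
# Checks that the project and profile directories exist and warns
# when the shell is not inside Screen.
# (Hypothetical helper; assumes the profile directory sits inside the project.)
preflight() {
    [ -d "$1" ] || { echo "Project directory not found: $1" >&2; return 1; }
    [ -d "$1/$2" ] || { echo "Profile directory not found: $1/$2" >&2; return 1; }
    in_screen || echo "Warning: not inside a Screen session" >&2
    return 0
}

# Usage:
#   preflight /work/helikarlab/joshl/FastqToGeneCounts cluster && snakemake --profile cluster
```

The Screen check only warns rather than fails, since running without Screen is still valid if you keep the terminal open.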
Any log files will be found in the `logs` directory of the project directory.
Each rule has its own output folder, with output files that include the information they were run on (tissue name, run number, etc.).