NBorthLab/CHO-coding-transcriptome

CHO Coding Transcriptomes

This work can be reproduced by installing conda and creating the Snakemake environment defined in workflow/envs. For the analysis, install the conda environment from workflow/envs/r.yaml or from the corresponding "pinned", i.e. explicit, environment definition file.
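A minimal setup sketch, assuming conda is already installed and the workflow is run from the repository root. workflow/envs/r.yaml is named above; the Snakemake environment file name, environment name, and core count are assumptions to adapt to the actual files in workflow/envs/:

```shell
# Create the workflow environment (file name under workflow/envs/ is assumed)
conda env create -f workflow/envs/snakemake.yaml

# Create the R analysis environment named in this README
conda env create -f workflow/envs/r.yaml

# Activate the workflow environment and run the pipeline
conda activate snakemake
snakemake --cores 8 --use-conda
```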

File structure

The most important/interesting content is contained in a handful of directories: the workflow/ directory, which includes all data-processing scripts as well as the workflow definitions for Snakemake, and the analysis/ and results/analysis/ directories, which contain the analysis of the data in raw R Markdown and rendered HTML, respectively.

./
├── analysis/                 # (!!!) Analysis conducted in R; rendered to HTML
├── logs/                     # Logs from the data processing steps
├── plots/                    # (!!!) Plots produced in the workflow
├── reports/                  # Snakemake reports and rulegraphs
├── resources/                # Raw-data and other resources
│   ├── adapters/             # Adapters for trimming
│   └── raw_data/             # Raw-data downloaded in here
├── results/                  # Results produced in the workflow
│   └── analysis/             # (!!!) Rendered analysis reports
├── workflow/                 # (!!!) Workflow definitions
│   ├── envs/                 # Conda environments
│   ├── profile/              # Snakemake profile
│   ├── rules/                # Snakemake rules
│   ├── scripts/              # Scripts (Python, R, Bash)
│   ├── config.yaml           # Workflow config
│   └── Snakefile             # Main Snakefile
└── README.md

Pipeline

![Pipeline rule graph](reports/pipeline.svg)

NCBI-specific pre-processing workflow steps

  1. Prefetch SRA files using the prefetch utility (01_preprocessing.smk).
  2. Convert SRA reads to .fastq.gz using fastq-dump (01_preprocessing.smk).
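The two steps above correspond roughly to the following sra-tools invocations; the accession and output paths below are placeholders, not values from the workflow:

```shell
# Step 1: download the .sra archive for an accession (SRRXXXXXXX is a placeholder)
prefetch --output-directory resources/raw_data SRRXXXXXXX

# Step 2: extract gzipped FASTQ files; --split-3 writes paired-end mates
# to separate files
fastq-dump --gzip --split-3 \
    --outdir resources/raw_data \
    resources/raw_data/SRRXXXXXXX/SRRXXXXXXX.sra
```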

In-house-specific pre-processing workflow steps

  1. Gather all file paths for raw data as well as metadata from in-house datasets using the inhouse_data.R script.
  2. Convert BAM files to .fastq.gz using bamtofastq (01_preprocessing.smk).
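Step 2 can be sketched as follows, assuming the 10x Genomics bamtofastq binary (if the workflow uses a different bamtofastq implementation, e.g. bedtools, the invocation differs); the sample name and paths are placeholders:

```shell
# Convert an in-house BAM back to gzipped FASTQ
# (10x Genomics bamtofastq assumed; paths are placeholders)
bamtofastq --nthreads 4 \
    resources/raw_data/sample.bam \
    resources/raw_data/sample_fastq/
```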

General Workflow

Data processing

  1. Quality check of reads using fastqc. Gather all results for each dataset in a multiqc report (qc.smk).
  2. Trimming reads using trimmomatic (02_trimming.smk).
  3. Quality check of trimmed reads using fastqc. Again, gather results for each dataset with multiqc (02_trimming.smk, qc.smk).
  4. Align reads to the PICRH genome using STAR (03_alignment.smk).
  5. Quantify mapped reads using featureCounts (04_quantification.smk).
  6. Chromatin-state enrichment analysis with ChromHMM (05_chromatin-states.smk).
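For orientation, steps 1–5 above chain together per sample roughly as follows when run outside Snakemake (single-end case shown; the adapter file, STAR index, and annotation paths are placeholders, and the actual rule parameters live in workflow/config.yaml and the .smk files):

```shell
# 1. QC of raw reads
fastqc -o qc/raw sample.fastq.gz

# 2. Adapter and quality trimming (single-end mode; adapter file is a placeholder)
trimmomatic SE sample.fastq.gz sample.trimmed.fastq.gz \
    ILLUMINACLIP:resources/adapters/TruSeq3-SE.fa:2:30:10 SLIDINGWINDOW:4:20

# 3. QC of trimmed reads
fastqc -o qc/trimmed sample.trimmed.fastq.gz

# 4. Align to the PICRH genome with STAR (index path is a placeholder)
STAR --runThreadN 8 --genomeDir star_index/ \
    --readFilesIn sample.trimmed.fastq.gz --readFilesCommand zcat \
    --outSAMtype BAM SortedByCoordinate --outFileNamePrefix sample.

# 5. Count reads per gene (annotation path is a placeholder)
featureCounts -T 8 -a annotation.gtf -o counts.txt \
    sample.Aligned.sortedByCoord.out.bam
```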

Analysis

The analysis is mostly done in Quarto files. See the analysis/ directory for the .qmd files and results/analysis/ for the corresponding HTML reports. The corresponding Snakemake rules are defined in 06_analysis.smk.
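Outside Snakemake, an individual report can be rendered with the Quarto CLI; the file name below is a placeholder, and rendering into results/analysis/ is an assumption based on the layout described above:

```shell
# Render one analysis report into the rendered-reports directory
quarto render analysis/some_report.qmd --output-dir results/analysis
```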

Figures
