clusteredLamentationSequence-24/SSB-MRes-2025

Scripts breakdown


  • hg_intronGrabbing.py Grabs introns from the UCSC Genome Browser via API calls

  • TF_assess_RE-removal.py Mutates out restriction enzyme (RE) site(s) while preserving transcription factor binding strength across categorized PWMs from the HOCOMOCO database, scored with SARUS.jar

  • codon_optimization.py Fancy, greedy codon optimizer. For an input amino acid sequence, matches the fraction of each codon per amino acid to a web-scraped host genome, while scoring and sorting by:

    • Rényi entropy (a measure of how evenly A, T, C, and G are distributed)
    • CpG/ApU scores (calculated from count and density)
    • a chosen list of enzymes which do/don't cut in specified region(s)
  • BioNeutral_RandDNAseq.py Structurally similar to the codon optimizer, but instead creates random sequences of a given input length, matching the base ratio of a web-scraped host genome and sorting against known structures and sites

  • SpliceAl-multi-seq_inContext.py Takes the many-sequence output of the codon optimizer and scores the top x sequences for forward (and optionally reverse-orientation) splice sites within an input sequence context. Averages results from 5 AI models trained on splicing in the human genome, courtesy of Illumina.

  • sites_grapher.py Interactively graphs the accuracy of splice prediction against experimentally derived sites. Uses merged_sites.csv
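
As a rough illustration of the Rényi entropy score listed under codon_optimization.py, here is a minimal sketch. It is not taken from the script itself: the function name `renyi_entropy` and the default order `alpha=2.0` are assumptions made for this example, and the script may use a different order or normalization.

```python
import math
from collections import Counter


def renyi_entropy(seq, alpha=2.0):
    """Rényi entropy (in bits) of the base distribution of a sequence.

    Hypothetical helper for illustration; alpha=2.0 is an assumed default.
    Higher values mean the bases are more evenly distributed
    (maximum is 2 bits for four equally frequent bases).
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq.upper())
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    if math.isclose(alpha, 1.0):
        # The limit alpha -> 1 is the Shannon entropy.
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)
```

A perfectly even sequence such as `"ACGT"` gives 2 bits, while a homopolymer such as `"AAAA"` gives 0, so sorting by this value in descending order would favor candidates with balanced base composition.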

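The "count and density" inputs to the CpG score could be computed along these lines. This is only a sketch under assumptions: `dinucleotide_stats` is a hypothetical name, density is taken here as the count divided by the number of dinucleotide positions, and how the script combines the two numbers into a final score is not shown in this README.

```python
def dinucleotide_stats(seq, dinuc="CG"):
    """Return (count, density) of a dinucleotide in a sequence.

    Hypothetical helper for illustration. Density is assumed to be
    count / (len(seq) - 1), i.e. the fraction of dinucleotide
    positions occupied by the motif.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    positions = len(seq) - 1
    if positions <= 0:
        return 0, 0.0
    # Slide a 2-base window so overlapping occurrences are all counted.
    count = sum(1 for i in range(positions) if seq[i:i + 2] == dinuc)
    return count, count / positions
```

For example, `dinucleotide_stats("ACGCGT")` finds two CpG dinucleotides across five positions. Passing a different `dinuc` value reuses the same function for other dinucleotides.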