Hello, amazing work!
I have a question regarding the annotation of non-model species that are not present in the OMA Database. To obtain orthologous protein evidence, I first ran OMA Standalone using protein sequences with default parameters, and I confirmed that the EstimatedSpeciesTree.nwk aligns with my expectations. This yielded an OrthologousGroups.orthoxml file. However, the Extract Consensus Annotation step failed with the following error:
INFO:pyham.ham:Parse Orthoxml: 21740 top level hogs and 86018 extant genes extract.
INFO:pyham.ham:Set up Ham analysis: ready to go with 21740 hogs founded within 2 species.
Traceback (most recent call last):
  File "/strage_151/home_187/wuyh/soft/OMAnnotator/OMAnnotation/OMAnnotation.py", line 494, in <module>
    args.func(args)
  File "/strage_151/home_187/wuyh/soft/OMAnnotator/OMAnnotation/OMAnnotation.py", line 92, in extract_consensus
    cons_fasta, cons_gff = select_consensus_sequence(protids, gff_corr, fasta_corr)
  File "/strage_151/home_187/wuyh/soft/OMAnnotator/OMAnnotation/OMAnnotation.py", line 261, in select_consensus_sequence
    hog_record_list = [corr_fasta_map["_".join([hid.split(' ')[0], hif])] for hid, hif in hog_prot_id]
  File "/strage_151/home_187/wuyh/soft/OMAnnotator/OMAnnotation/OMAnnotation.py", line 261, in <listcomp>
    hog_record_list = [corr_fasta_map["_".join([hid.split(' ')[0], hif])] for hid, hif in hog_prot_id]
KeyError: 'evm.model.Hic_asm_11.2212_china_protein'
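To narrow this down, here is a minimal sketch that rebuilds the lookup key the same way as the expression on line 261 of the traceback and lists which keys are missing from the map. The helper names `build_key` and `find_missing_keys` and the sample data are hypothetical and not part of OMAnnotator. How the failing key splits into `hid` and `hif`, and the `_china` suffix in the sample map, are assumptions made only for illustration.

```python
def build_key(hid, hif):
    """Mirror the key expression from the traceback: first token of hid, then hif."""
    return "_".join([hid.split(" ")[0], hif])


def find_missing_keys(hog_prot_id, corr_fasta_map):
    """Return every constructed key that is absent from corr_fasta_map."""
    return [
        build_key(hid, hif)
        for hid, hif in hog_prot_id
        if build_key(hid, hif) not in corr_fasta_map
    ]


# Hypothetical sample data (assumption): the real values come from the
# OrthoXML file and from the FASTA files passed to OMAnnotator.
hog_prot_id = [("evm.model.Hic_asm_11.2212", "china_protein")]
corr_fasta_map = {"evm.model.Hic_asm_11.2212_china": "<record>"}

missing = find_missing_keys(hog_prot_id, corr_fasta_map)
print("missing keys:", missing)
print("sample of available keys:", sorted(corr_fasta_map)[:5])
```

Comparing the missing keys with a sample of the available ones should show whether the mismatch is in the protein ID part or in the suffix appended after it.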
My question is: could this error occur because I ran OMA Standalone on protein sequences but gave OMAnnotator genome and annotation files? Or is it because I did not run the "Preparing the data" step?
In other words, when working with species that are not present in the OMA Database, what is the recommended workflow for using OMAnnotator? Do you have any suggestions for an improved process?
Any help or advice would be greatly appreciated!
Best regards