@ryanghunter
Contributor

Purpose

This PR adds DSWx-NI triggering logic support to the daac_data_subscriber system. It supports time-based queries and native-id queries directly through the daac_data_subscriber top-level script.

Proposed Changes

  • [ADD] support for querying, cataloging, and triggering DSWx-NI jobs based on L2 NISAR GCOV products
  • [ADD] --query-replacement-file option to daac_data_subscriber, which uses pre-canned JSON instead of a live query for testing
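
A minimal sketch of how a --query-replacement-file option could short-circuit the query, assuming the file holds a JSON list of granule records. The function names build_parser and load_granules, the record shape, and the run_query callable are hypothetical and are not taken from the PR's code:

```python
import argparse
import json
import os
import tempfile


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the option discussed here."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--query-replacement-file",
        default=None,
        help="path to pre-canned JSON to use instead of running the query",
    )
    return parser


def load_granules(args: argparse.Namespace, run_query) -> list:
    """Return canned results when a replacement file is given, else query."""
    if args.query_replacement_file:
        with open(args.query_replacement_file) as handle:
            return json.load(handle)
    return run_query()


# Demo: write a canned file, then confirm the query callable is bypassed.
with tempfile.TemporaryDirectory() as tmp_dir:
    canned_path = os.path.join(tmp_dir, "canned.json")
    with open(canned_path, "w") as handle:
        json.dump([{"granule_id": "example-granule"}], handle)
    demo_args = build_parser().parse_args(["--query-replacement-file", canned_path])
    granules = load_granules(demo_args, run_query=lambda: [])
```

Injecting run_query as a callable keeps the canned path testable without any network access.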

Issues

Testing

  • end-to-end manually tested triggering from mozart on dev cluster using UAT query results for L2 GCOV
    • validated that the expected 6 DSWx-NI jobs are triggered
  • added unit tests for MGRS database logic and interactions

@ryanghunter
Contributor Author

ryanghunter commented Aug 5, 2025

Proposed TODO before this is merged:

  • fix jobs failing on submission to PCM
  • properly stage MGRS database, it's currently hardcoded to read from test data on mozart
  • test end-to-end as query job running on mozart
  • create scenario tests for more complicated cases
  • use queue argument instead of hardcoded queue

here's the stack trace I'm getting from failing jobs:

Traceback (most recent call last):
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/run_sciflo.py", line 121, in <module>
    sys.exit(main(args.sfl_file, args.context_file, args.output_folder))
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/run_sciflo.py", line 107, in main
    accountability = get_accountability_class(context_file)
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/run_sciflo.py", line 91, in get_accountability_class
    cls_object = cls(context, work_dir)
  File "/home/ops/verdi/ops/opera-pcm/opera_chimera/accountability.py", line 105, in __init__
    self.product_paths = metadata["product_paths"]["L2_NISAR_GCOV"]
KeyError: 'L2_NISAR_GCOV'
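
The KeyError comes from indexing metadata["product_paths"] with a product type that is not present. A minimal sketch of a lookup that fails with a more descriptive message, assuming the metadata is a plain dict shaped like the one in the traceback. The helper name get_product_paths and the sample data are hypothetical, and this only makes the failure easier to diagnose; it does not supply the missing entry:

```python
def get_product_paths(metadata: dict, product_type: str) -> list:
    """Return the paths for product_type, or raise an error listing known types."""
    product_paths = metadata.get("product_paths", {})
    if product_type not in product_paths:
        known = ", ".join(sorted(product_paths)) or "<none>"
        raise KeyError(
            f"no product_paths entry for {product_type!r}; available types: {known}"
        )
    return product_paths[product_type]


# Hypothetical metadata that lacks the GCOV entry, mirroring the failure above.
sample_metadata = {"product_paths": {"SOME_OTHER_PRODUCT": ["s3://example-bucket/file"]}}

try:
    get_product_paths(sample_metadata, "L2_NISAR_GCOV")
except KeyError as err:
    error_message = str(err)
```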

@ryanghunter
Contributor Author

ryanghunter commented Aug 5, 2025

for MGRS db staging -
add the database to the opera-ancillaries s3 bucket under the dswx_ni/ directory, fetch it with get_ancillary_from_s3, set the s3 path in settings.yaml, and keep a filesystem location as a fallback
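
A rough sketch of the S3-first, filesystem-fallback lookup described above. The signature of get_ancillary_from_s3 is not shown in this thread, so the fetcher is injected as a callable; resolve_mgrs_db, failing_fetch, and the <mgrs-db-file> placeholder are hypothetical:

```python
import os
from typing import Callable


def resolve_mgrs_db(
    s3_uri: str,
    fallback_path: str,
    fetch_from_s3: Callable[[str], str],
) -> str:
    """Return a local path to the MGRS database, preferring the S3 copy."""
    try:
        # fetch_from_s3 stands in for get_ancillary_from_s3 (assumed to
        # download the object and return a local path).
        return fetch_from_s3(s3_uri)
    except Exception as err:
        # Any fetch failure falls through to the filesystem location.
        if os.path.exists(fallback_path):
            return fallback_path
        raise FileNotFoundError(
            f"could not fetch {s3_uri} and fallback {fallback_path} does not exist"
        ) from err


def failing_fetch(uri: str) -> str:
    """Demo stand-in that simulates an unreachable bucket."""
    raise ConnectionError(f"cannot reach {uri}")


# os.curdir is used as the fallback only because it always exists.
local_path = resolve_mgrs_db(
    s3_uri="s3://opera-ancillaries/dswx_ni/<mgrs-db-file>",
    fallback_path=os.curdir,
    fetch_from_s3=failing_fetch,
)
```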

for testing lambda triggered as query job -
trigger via event-bridge timer for dswx-ni

@hhlee445 added the bug, enhancement, and pcm.r04 (PCM Release 4 - DSWx-NI) labels Aug 6, 2025
@ryanghunter requested a review from chrisjrd August 19, 2025 17:31
@ryanghunter marked this pull request as ready for review August 19, 2025 18:16
@chrisjrd
Contributor

can you flatten the commits? (reset the branch to the parent develop commit, then force push the changes.) This will simplify the commit history.

@chrisjrd left a comment

changes look good.
you may want to move the _extract* methods outside of the CmrQuery class, or just leave them in. I would organize them separately since they are low level; that keeps the class methods at a higher level of abstraction than parsing strings/granules.
Some of the docstrings are redundant and simply repeat the function name, so you could omit them without a loss in readability.
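
As an illustration of the suggestion, a low-level parser can live as a module-level function rather than a method. This sketch assumes the track and frame are the sixth and eighth underscore-separated fields of the native-id, a guess based only on the sample ID used in this thread; extract_track_frame and TrackFrame are hypothetical names, not the PR's actual _extract* methods:

```python
from typing import NamedTuple


class TrackFrame(NamedTuple):
    track: int
    frame: int


def extract_track_frame(native_id: str) -> TrackFrame:
    """Parse track and frame numbers out of a GCOV native-id string."""
    fields = native_id.split("_")
    if len(fields) < 8:
        raise ValueError(f"unexpected native-id format: {native_id!r}")
    # Assumed positions: fields[5] is the track, fields[7] is the frame.
    return TrackFrame(track=int(fields[5]), frame=int(fields[7]))


sample_id = (
    "NISAR_L2_PR_GCOV_016_156_A_011_2005_DVDV_A_"
    "20230701T000816_20230701T000835_T00408_N_P_J_001"
)
parsed = extract_track_frame(sample_id)
```

A query class can then call the helper without carrying string-parsing details itself.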

@chrisjrd self-requested a review August 25, 2025 20:36
@ryanghunter
Contributor Author

ryanghunter commented Aug 26, 2025

Hyun and I had a conversation offline and talked about some changes:

  • gcov should also have a download job in case we need to use https and stage in s3; this also keeps it consistent with other data_subscriber jobs
  • misnamed timer event rule still says dswx-ni instead of gcov
  • sciflo job queues CAN'T use public VPC
  • if chunk-size>1, a download job should create/submit N chunk-size sciflo jobs
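
One possible reading of the last bullet is that the download job splits its granules into batches of chunk-size and submits one sciflo job per batch. A minimal sketch of that batching, under that assumption; the chunked helper and the granule IDs are hypothetical:

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


# Each batch would become the input of one submitted sciflo job.
granule_ids = ["g1", "g2", "g3", "g4", "g5"]
batches = list(chunked(granule_ids, 2))
```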

@hhlee445
Contributor

Sample command to download data from an HTTPS url:

python3 ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query \
  -c NISAR_L2_GCOV_BETA_V1 \
  --job-queue=opera-job_worker-gcov_query \
  --chunk-size 1 \
  --native-id=NISAR_L2_PR_GCOV_016_156_A_011_2005_DVDV_A_20230701T000816_20230701T000835_T00408_N_P_J_001 \
  --endpoint UAT \
  --transfer-protocol https

@hhlee445
Contributor

@ryanghunter I'm getting this error:

Traceback (most recent call last):
  File "/home/ops/verdi/ops/opera-pcm/util/exec_util.py", line 35, in wrapper
    status = func(*args, **kwargs)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 45, in main
    run(sys.argv)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 78, in run
    results["download"] = run_download(args, token, es_conn, netloc, username, password, cmr, job_id)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/daac_data_subscriber.py", line 134, in run_download
    downloader.run_download(args, token, es_conn, netloc, username, password, cmr, job_id)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/asf_gcov_download.py", line 31, in run_download
    return self.submit_dswx_ni_job_submission_handler(sets_to_process)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/asf_gcov_download.py", line 35, in submit_dswx_ni_job_submission_handler
    jobs = self.trigger_dswx_ni_jobs(sets_to_process)
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/asf_gcov_download.py", line 84, in trigger_dswx_ni_jobs
    return [submit_dswx_ni_job(
  File "/home/ops/verdi/ops/opera-pcm/data_subscriber/asf_gcov_download.py", line 86, in <listcomp>
    job_queue=self.args.job_queue,
AttributeError: 'AsfDaacGcovDownload' object has no attribute 'args'
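
The traceback shows run_download receiving args as a parameter while trigger_dswx_ni_jobs reads self.args, which was never set on the instance. One way to avoid that mismatch is to pass args down explicitly. A minimal sketch using a hypothetical stand-in class, not the real AsfDaacGcovDownload, with job submission replaced by building plain dicts:

```python
from argparse import Namespace


class GcovDownloadSketch:
    """Hypothetical stand-in for the downloader class in the traceback."""

    def run_download(self, args: Namespace, sets_to_process: list) -> list:
        # Pass args down explicitly instead of relying on self.args.
        return self.trigger_jobs(args, sets_to_process)

    def trigger_jobs(self, args: Namespace, sets_to_process: list) -> list:
        # A real implementation would submit jobs; this only records the inputs.
        return [
            {"job_queue": args.job_queue, "product_set": product_set}
            for product_set in sets_to_process
        ]


demo_args = Namespace(job_queue="opera-job_worker-gcov_query")
jobs = GcovDownloadSketch().run_download(demo_args, ["set-a", "set-b"])
```

Storing args on the instance in __init__ would work as well; the point is that the attribute must exist before the list comprehension reads it.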

@hhlee445
Contributor

The following command should trigger three DSWx-NI jobs, since the track number 155 and frame number 67 are used by MS_155_64, MS_155_65, and MS_155_66:

python3 ~/mozart/ops/opera-pcm/data_subscriber/daac_data_subscriber.py query \
  -c NISAR_L2_GCOV_BETA_V1 \
  --job-queue=opera-job_worker-gcov_query \
  --chunk-size 1 \
  --native-id=NISAR_L2_PR_GCOV_016_156_A_011_2005_DVDV_A_20230701T000816_20230701T000835_T00408_N_P_J_001 \
  --endpoint UAT
