Pipeline logic #29

@jeremyestein

How the constituent operations (de-id, ftps, etc.) are turned into a coherent pipeline.

This is arguably an epic.

Such concerns as:

  • How are the operations run/scheduled?
  • How does one step know the previous step has finished before proceeding?
  • How do we know each step succeeded? This includes detecting and re-running failures.
  • How do we deal with upstream data changing (i.e. new data suddenly arriving because it was late or is intentionally being replayed)?
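
One possible way to address the last three concerns together is a per-step marker file that records a fingerprint of the step's inputs. The sketch below is only an illustration, not a proposed design: the function names, the `state_dir` layout and the `.done.json` suffix are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(paths):
    """Hash input file names and contents so late or replayed data is noticed."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in paths):
        digest.update(path.encode())
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


def run_step(name, inputs, action, state_dir):
    """Run `action` unless a marker shows it already succeeded on identical inputs.

    Returns True if the step ran, False if it was skipped.
    `name`, `state_dir` and the marker format are hypothetical.
    """
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    marker = state_dir / f"{name}.done.json"
    current = fingerprint(inputs)

    if marker.exists() and json.loads(marker.read_text())["inputs"] == current:
        return False  # previous run succeeded and the inputs are unchanged

    marker.unlink(missing_ok=True)  # discard a stale marker before re-running
    action()  # if this raises, no marker is written, so the next run retries
    marker.write_text(json.dumps({"inputs": current}))
    return True
```

A downstream step can then treat the existence of the upstream marker as the signal that the previous step has finished, and a changed fingerprint as the signal that upstream data has changed.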

Careful naming of files will be necessary throughout this process.

Operations will need to be logged, especially a record of what we have uploaded to the DSH.
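
As a rough illustration of what an upload record could look like, the sketch below appends one JSON line per uploaded file. The field names, the log file format and the `destination` argument are assumptions for the example; nothing here reflects an agreed format or any real DSH interface.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_upload(log_path, uploaded_file, destination):
    """Append one JSON line describing a file that has been uploaded.

    All field names are hypothetical; `destination` is a free-text label.
    """
    data = Path(uploaded_file).read_bytes()
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "file": Path(uploaded_file).name,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "destination": destination,
    }
    # Append-only, one record per line, so the log is easy to audit or grep.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

Storing a checksum alongside the file name would make it possible to tell later whether a replayed file differs from the one originally uploaded.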

Using snakemake is an option, as it already solves the dependency graph problem. But an ad hoc implementation may also do just fine.
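
To give a feel for how small an ad hoc version could be, here is a minimal sketch that orders steps with the standard library's `graphlib` (Python 3.9+). This is not snakemake and does not replicate its features; the step names and the `actions` mapping are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each maps to the set of steps it depends on.
DEPENDENCIES = {
    "export": set(),
    "deid": {"export"},
    "ftps_upload": {"deid"},
    "log_upload": {"ftps_upload"},
}


def run_pipeline(dependencies, actions):
    """Run each step only after everything it depends on has run.

    `actions` maps a step name to a zero-argument callable. An exception
    from any step propagates, so later steps are not run after a failure.
    """
    completed = []
    for step in TopologicalSorter(dependencies).static_order():
        actions[step]()
        completed.append(step)
    return completed
```

This covers ordering only. Skipping up-to-date steps, retries and parallelism would need to be added by hand, which is the part snakemake already provides.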
