Add params to event gather pipeline to allow long-runnable and log errors / skipped events #195

@evamaxfield

Feature Description

Add parameters:

  • batch-size: an optional integer used to slice the gathered events and run the pipeline on that many events at a time. For example, if the gather for the specified time range finds 50 events and the batch size is 10, the pipeline runs 5 independent times, each processing 10 events.
  • skip-errored-events-during-processing: ignore events that raise an error during processing. Enough debug info should be kept that the log printed after the pipeline finishes contains the event details and "the thing that errored".
  • skip-errored-events-during-gather: ignore events that fail to scrape / gather. As with the parameter above, enough debug info should be printed after scraping, for example "Found 20 events, skipping 2 due to errors".
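
A minimal sketch of how the batching and skip behavior described above could fit together. Every name here (`iter_batches`, `run_in_batches`, `process_event`, `skip_errored_events`) is hypothetical and chosen only for illustration; none of it is the pipeline's existing API.

```python
import logging
from typing import Callable, Iterator, List, Optional, Sequence, Tuple, TypeVar

log = logging.getLogger(__name__)

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: Optional[int]) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items (everything if None)."""
    if batch_size is None:
        yield items
        return
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


def run_in_batches(
    events: Sequence[T],
    process_event: Callable[[T], None],  # hypothetical per-event processing step
    batch_size: Optional[int] = None,
    skip_errored_events: bool = False,
) -> Tuple[int, List[Tuple[T, Exception]]]:
    """Process events batch by batch; return (processed count, skipped events)."""
    processed = 0
    skipped: List[Tuple[T, Exception]] = []
    for batch in iter_batches(events, batch_size):
        for event in batch:
            try:
                process_event(event)
                processed += 1
            except Exception as error:
                if not skip_errored_events:
                    raise
                # Keep the event together with "the thing that errored".
                skipped.append((event, error))

    # Summary printed once, after the whole run has finished.
    log.info("Found %d events, skipping %d due to errors", len(events), len(skipped))
    for event, error in skipped:
        log.error("Skipped event %r: %r", event, error)
    return processed, skipped
```

In this sketch an error aborts the run unless `skip_errored_events` is set, so the current fail-fast behavior would stay the default. A real implementation would presumably also clean up downloaded files between batches, since disk space is the reason for batching.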

It would also be interesting to see if I can allow certain errors to be retried, e.g. retry-errors=[ConnectionError].
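
One possible shape for the retry idea, as a rough sketch only. The helper name `call_with_retries` and its parameters (`retry_errors`, `max_attempts`, `wait_seconds`) are assumptions made for illustration, not existing options.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_errors: Tuple[Type[BaseException], ...] = (ConnectionError,),
    max_attempts: int = 3,
    wait_seconds: float = 1.0,
) -> T:
    """Call func, retrying only when it raises one of retry_errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be a positive integer")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_errors:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            time.sleep(wait_seconds)
    raise AssertionError("unreachable")
```

Errors outside `retry_errors` propagate immediately, so a parsing failure would still fail (or be skipped) on the first attempt, while a transient connection error gets another try.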

Use Case

I am backfilling a lot of data for certain instances and it is becoming annoying to process week by week. This is generally required for a couple of reasons:

  • Storage space on the machine: GHA runners only have 16 GB of disk, so I can't download and process more than ~4 meeting videos at a time, hence the batch size.
  • Less than 1% of events hit errors that aren't random connection errors. These are things like the video page being parsed incorrectly.

Labels: enhancement (New feature or request), event gather pipeline (A feature or bugfix relating to event processing)
