[DRAFT][PRMP-188] Refactor bulk upload #763

PedroSoaresNHS · 2025-08-26T07:07:01Z

Created a new class called bulk_upload_service_v2
Refactored this class to better follow good practices

https://nhsd-jira.digital.nhs.uk/browse/PRMT-580

NogaNHS · 2025-08-26T08:07:56Z

lambdas/services/bulk_upload_service_v2.py

+                logger.info(
+                    "Cannot validate patient due to PDS responded with Too Many Requests"
+                )
+                logger.info("Cannot process for now due to PDS rate limit reached.")
+                logger.info(
+                    "All remaining messages in this batch will be returned to sqs queue to retry later."
+                )


This is not necessarily true; not every PDS failure is a rate-limit failure.

This is a refactor of bulk upload, so no logic was or can be changed in this ticket.

Can't you still remove/edit the logs? That is not a logic change.
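
For illustration, one way to keep the log wording tied to the failure actually seen, without touching control flow. This is only a sketch: `PdsError`, `PdsRateLimitError` and `describe_pds_failure` are hypothetical names, not taken from this PR.

```python
class PdsError(Exception):
    """Hypothetical base class for any PDS failure."""


class PdsRateLimitError(PdsError):
    """Hypothetical: PDS responded with Too Many Requests."""


def describe_pds_failure(error: PdsError) -> str:
    """Return a log message that matches the failure actually seen."""
    if isinstance(error, PdsRateLimitError):
        return "Cannot process for now due to PDS rate limit reached."
    # Any other PDS failure gets a neutral message instead of a rate-limit one.
    return f"Cannot validate patient due to PDS failure: {error}"
```

Only the wording changes here; whether the message goes back to the queue is still decided by the existing code.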

NogaNHS · 2025-08-26T08:14:16Z

lambdas/services/bulk_upload_service_v2.py

+            return StagingMetadata.model_validate_json(staging_metadata_json)
+        except (pydantic.ValidationError, KeyError) as e:
+            logger.error(f"Got incomprehensible message: {message}")
+            logger.error(e)


You log the error here and then raise it and log it again.

Trying to keep the bulk upload logic unchanged as much as possible.
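
A minimal sketch of the "log once" pattern the comment points at: the parser raises with context and only the caller logs. `InvalidMessageError`, `parse_body` and `handle` are hypothetical names, and `json` stands in for the pydantic validation used in the PR.

```python
import json
import logging

logger = logging.getLogger("bulk_upload_sketch")


class InvalidMessageError(Exception):
    """Hypothetical error raised when a queue message cannot be parsed."""


def parse_body(message: dict) -> dict:
    try:
        return json.loads(message["body"])
    except (KeyError, json.JSONDecodeError) as e:
        # No logging here: attach context and let the caller log once.
        raise InvalidMessageError(f"Could not parse message: {e!r}") from e


def handle(message: dict) -> bool:
    try:
        parse_body(message)
        return True
    except InvalidMessageError as e:
        logger.error(e)  # the single place this failure is logged
        return False
```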

NogaNHS · 2025-08-26T08:16:02Z

lambdas/services/bulk_upload_service_v2.py

+            staging_metadata_json = message["body"]
+            return StagingMetadata.model_validate_json(staging_metadata_json)
+        except (pydantic.ValidationError, KeyError) as e:
+            logger.error(f"Got incomprehensible message: {message}")


This would error when body does exist, so this log does not reflect the error.

Trying to keep the bulk upload logic unchanged as much as possible.

NogaNHS · 2025-08-26T08:20:32Z

lambdas/services/bulk_upload_service_v2.py

+        file_names = [
+            os.path.basename(metadata.file_path) for metadata in staging_metadata.files
+        ]
+        request_context.patient_nhs_no = staging_metadata.nhs_number


When you set this, every log will include this NHS number until the value is reset, which means it will appear in the logs for the next message, even though that message belongs to a different NHS number. You need to reset it to avoid confusing information.

This is a refactor, so we can't change logic in this ticket.
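
As a possible follow-up, a context manager could scope the value to a single message and restore it afterwards. A minimal sketch, assuming `request_context` is a plain attribute holder; the `SimpleNamespace` stand-in and `patient_log_context` are hypothetical, not part of this PR.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for the shared request_context object referenced in the diff.
request_context = SimpleNamespace(patient_nhs_no=None)


@contextmanager
def patient_log_context(nhs_number: str):
    """Set the NHS number for the duration of one message, then restore it."""
    previous = getattr(request_context, "patient_nhs_no", None)
    request_context.patient_nhs_no = nhs_number
    try:
        yield
    finally:
        # Runs on success and on error, so the next message starts clean.
        request_context.patient_nhs_no = previous
```

Wrapping the per-message handling in `with patient_log_context(staging_metadata.nhs_number):` would then keep one patient's number out of the next message's logs.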

steph-torres-nhs · 2025-08-26T15:00:12Z

lambdas/services/bulk_upload_service_v2.py

+        if self.unhandled_messages:
+            logger.info("Unable to process the following messages:")
+            for message in self.unhandled_messages:
+                message_body = json.loads(message.get("body", "{}"))
+                request_context.patient_nhs_no = message_body.get(
+                    "NHS-NO", "no number found"
+                )
+                logger.info(message_body)


What happens to these messages? Do we have a DLQ set up for them?

Ticket created for it.

NogaNHS · 2025-08-28T13:44:51Z

lambdas/services/bulk_upload_service_v2.py

+        This method performs the following steps:
+        1. Parses the message and constructs staging metadata.
+        2. Validates the NHS number and file names.
+        3. Performs additional validation checks such as patient access conditions
+           (e.g., deceased, restricted) and virus scan results.
+        4. Initiates transactional operations and transfers the validated files.
+        5. Removes the ingested files from the staging bucket.
+        6. Logs the completion of ingestion and writes the report to DynamoDB.
+        7. Sends metadata to the stitching queue for further processing.


If this method is doing 7 steps then it is doing too much.
Functions Should Do One Thing
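
One possible shape for splitting the method, sketched under assumptions: each step becomes its own small function and a thin runner calls them in order. `IngestionState`, `run_steps` and the step bodies are hypothetical placeholders; only the step names loosely follow the docstring above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class IngestionState:
    """Hypothetical carrier for data passed between steps."""
    message: dict
    completed_steps: List[str] = field(default_factory=list)


Step = Callable[[IngestionState], None]


def parse_message(state: IngestionState) -> None:
    state.completed_steps.append("parse_message")  # placeholder body


def validate_entry(state: IngestionState) -> None:
    state.completed_steps.append("validate_entry")  # placeholder body


def transfer_files(state: IngestionState) -> None:
    state.completed_steps.append("transfer_files")  # placeholder body


def run_steps(state: IngestionState, steps: List[Step]) -> IngestionState:
    """Run each single-purpose step in order; an exception stops the run."""
    for step in steps:
        step(state)
    return state


state = run_steps(
    IngestionState(message={}),
    [parse_message, validate_entry, transfer_files],
)
```

Each step can then be unit tested on its own, and the runner stays short enough to read at a glance.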

NogaNHS · 2025-08-29T08:52:23Z

lambdas/services/bulk_upload_service_v2.py

I think this service needs breaking down into smaller services.

For now we will keep it as is, but that is something that needs to be done.

Co-authored-by: Mohammad Iqbal <[email protected]>

MohammadIqbalAD-NHS · 2025-09-01T10:06:51Z

Unit tests are failing.

ghost · 2025-09-01T10:54:44Z

lambdas/services/bulk_upload_service_v2.py

+
+        logger.info(
+            "NHS Number and filename validation complete."
+            "Validated strict mode, and patient information is accessible (e.g. patient not deceased/restricted)


Missing closing quotes
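
A related detail worth checking once the quotes are closed: Python joins adjacent string literals with no separator, so the two sentences would run together unless the first literal ends with a space. A small sketch using shortened stand-in strings, not the full message from the PR:

```python
# Shortened stand-ins for the two literals in the diff above.
without_space = (
    "NHS Number and filename validation complete."
    "Validated strict mode."
)

# A trailing space on the first literal keeps the sentences apart.
with_space = (
    "NHS Number and filename validation complete. "
    "Validated strict mode."
)
```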

MohammadIqbalAD-NHS

Unit tests failing.

bethany-kish-nhs · 2025-09-03T13:38:01Z

I thought the V2 code was going to be added before this refactor. This way we won't have visibility/traceability of what's been refactored once it all gets squashed.

bethany-kish-nhs

If you compare with bulk upload on main, it looks like changes have been made there and the V2 code here is out of date.

E.g. custodian logic has been changed on main here:
f9941b0

We need to make sure this ticket only refactors what's there on main and doesn't miss anything. We'll also need a process for keeping both up to date.

robg-nhs

I'm concerned that you have a PR titled "Refactor Bulk Upload", but I'm unsure what the intention is.

I have seen the JIRA ticket, which does have information on what you're attempting to accomplish, but I think you're trying to do too much at once.

I suggest you break down the work further so that each PR has a clear intent, e.g. "I'm doing XYZ".

PedroSoaresNHS added 8 commits August 21, 2025 11:08

[PRMT-580]- Created a copy of bulk upload service

f60eeed

[PRMT-580]- refactored building staging metadata

55fda20

[PRMT-580]- refactored validate entry

8ab0fdf

[PRMT-580]- refactored virus scan

1017cf3

[PRMT-580]- refactored initiate transaction

a737487

[PRMT-580]- refactored transfer files

78a2f25

[PRMT-580]- refactored add information to stiching queue and updated …

60ed752

…tests

Merge remote-tracking branch 'origin/main' into PRMT-580

c438e53

PedroSoaresNHS temporarily deployed to development August 26, 2025 07:07 — with GitHub Actions Inactive

NogaNHS reviewed Aug 26, 2025

NogaNHS reviewed Aug 26, 2025

NogaNHS reviewed Aug 26, 2025

NogaNHS reviewed Aug 26, 2025

NogaNHS reviewed Aug 26, 2025

NogaNHS reviewed Aug 26, 2025

NogaNHS reviewed Aug 26, 2025

NogaNHS reviewed Aug 26, 2025

steph-torres-nhs reviewed Aug 26, 2025

steph-torres-nhs reviewed Aug 26, 2025

steph-torres-nhs reviewed Aug 26, 2025

steph-torres-nhs reviewed Aug 26, 2025

chrisbloe-nhse changed the title ~~[PRMT-580]- Refactor bulk upload~~ [PRMT-580] Refactor bulk upload Aug 27, 2025

[PRMT-580]- refactored process_message_queue and addressed comments

336b9cc

PedroSoaresNHS temporarily deployed to development August 27, 2025 15:51 — with GitHub Actions Inactive

Merge remote-tracking branch 'origin/main' into PRMT-580

b7d35cf

PedroSoaresNHS temporarily deployed to development August 27, 2025 15:52 — with GitHub Actions Inactive

[PRMT-580]- added a bit of docstring

17b87f3

PedroSoaresNHS temporarily deployed to development August 28, 2025 08:08 — with GitHub Actions Inactive

NogaNHS reviewed Aug 28, 2025

NogaNHS reviewed Aug 28, 2025

[PRMT-580]- fixed comments

58b4371

PedroSoaresNHS temporarily deployed to development August 29, 2025 07:54 — with GitHub Actions Inactive

NogaNHS reviewed Aug 29, 2025

MohammadIqbalAD-NHS reviewed Aug 29, 2025

Update lambdas/services/bulk_upload_service_v2.py

8e65da9

Co-authored-by: Mohammad Iqbal <[email protected]>

PedroSoaresNHS temporarily deployed to development August 29, 2025 13:35 — with GitHub Actions Inactive

ghost reviewed Sep 1, 2025

[PRMT-580]- added closing quotes

b781389

PedroSoaresNHS temporarily deployed to development September 2, 2025 07:41 — with GitHub Actions Inactive

MohammadIqbalAD-NHS approved these changes Sep 2, 2025

ghost approved these changes Sep 2, 2025

bethany-kish-nhs requested changes Sep 3, 2025

robg-nhs suggested changes Sep 3, 2025

PedroSoaresNHS changed the title ~~[PRMT-580] Refactor bulk upload~~ [DRAFT][PRMT-580] Refactor bulk upload Sep 11, 2025

PedroSoaresNHS marked this pull request as draft September 11, 2025 13:45

adamwhitingnhs changed the title ~~[DRAFT][PRMT-580] Refactor bulk upload~~ [DRAFT][PRMP-188] Refactor bulk upload Sep 25, 2025


6 participants
