@PedroSoaresNHS
Contributor

Created a new class called bulk_upload_service_v2
Refactored this class to better follow good practices

https://nhsd-jira.digital.nhs.uk/browse/PRMT-580

Comment on lines 72 to 78
logger.info(
    "Cannot validate patient due to PDS responded with Too Many Requests"
)
logger.info("Cannot process for now due to PDS rate limit reached.")
logger.info(
    "All remaining messages in this batch will be returned to sqs queue to retry later."
)
Contributor

This is not necessarily true; not every PDS failure is a rate limit failure.

Contributor Author

This is a refactor of bulk upload, so no logic was changed, or can be changed, in this ticket.

Contributor

Can't you still remove or edit the logs? That is not a logic change.
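
A minimal sketch of how the log text could follow the failure type without touching control flow. The exception classes and function names here are hypothetical stand-ins, not the ones used in the PR:

```python
import logging

logger = logging.getLogger("bulk_upload_sketch")


class PdsError(Exception):
    """Hypothetical base class for any PDS failure."""


class PdsRateLimitError(PdsError):
    """Hypothetical subclass for a Too Many Requests response."""


def describe_pds_failure(error: PdsError) -> str:
    """Return a log message that matches the actual failure type."""
    if isinstance(error, PdsRateLimitError):
        return (
            "Cannot process for now due to PDS rate limit reached. "
            "All remaining messages in this batch will be returned "
            "to sqs queue to retry later."
        )
    # Any other PDS failure gets a message that does not mention rate limits.
    return f"Cannot validate patient due to a PDS failure: {error}"


def log_pds_failure(error: PdsError) -> None:
    """Log a single line describing the failure."""
    logger.info(describe_pds_failure(error))
```

Only the wording of the log depends on the exception type, so the retry behaviour would stay as it is.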

    return StagingMetadata.model_validate_json(staging_metadata_json)
except (pydantic.ValidationError, KeyError) as e:
    logger.error(f"Got incomprehensible message: {message}")
    logger.error(e)
Contributor

You log the error here and then raise it and log it again.

Contributor Author

Trying to keep the existing bulk upload logic as much as possible.
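
For illustration, one way to avoid the double log is to chain the exception where it happens and log it once where it is handled. This sketch uses `json` in place of pydantic so that it is self-contained; `InvalidMessageError`, `parse_body` and `handle` are hypothetical names:

```python
import json
import logging

logger = logging.getLogger("bulk_upload_sketch")


class InvalidMessageError(Exception):
    """Hypothetical error raised when an SQS message cannot be parsed."""


def parse_body(message: dict) -> dict:
    """Parse the message body, leaving logging to the caller."""
    try:
        return json.loads(message["body"])
    except (KeyError, json.JSONDecodeError) as e:
        # Chain the original exception instead of logging it here.
        raise InvalidMessageError("Got incomprehensible message") from e


def handle(message: dict) -> bool:
    """The single place where the failure is logged, with its traceback."""
    try:
        parse_body(message)
        return True
    except InvalidMessageError:
        # logger.exception records the message and the chained traceback.
        logger.exception("Failed to parse message: %s", message)
        return False
```

Because the original error is attached with `raise ... from e`, the one log entry still carries the root cause.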

    staging_metadata_json = message["body"]
    return StagingMetadata.model_validate_json(staging_metadata_json)
except (pydantic.ValidationError, KeyError) as e:
    logger.error(f"Got incomprehensible message: {message}")
Contributor

This would error when body does exist, so this log does not reflect the error.

Contributor Author

Trying to keep the existing bulk upload logic as much as possible.
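
A rough sketch of separating the two failure cases so that the log text names the one that actually occurred. It again uses `json` rather than pydantic, and `explain_parse_failure` is a hypothetical helper:

```python
import json
from typing import Optional


def explain_parse_failure(message: dict) -> Optional[str]:
    """Return a log message naming the actual failure, or None on success."""
    if "body" not in message:
        # Corresponds to the KeyError branch in the reviewed code.
        return "Message has no 'body' key."
    try:
        json.loads(message["body"])
    except json.JSONDecodeError as e:
        # Corresponds to the validation branch: the body exists but is invalid.
        return f"Message has a body, but it failed validation: {e.msg}"
    return None
```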

file_names = [
    os.path.basename(metadata.file_path) for metadata in staging_metadata.files
]
request_context.patient_nhs_no = staging_metadata.nhs_number
Contributor

When you set this, every log will include this NHS number until the value is reset. That means it will appear in the logs of the next message, which may belong to a different NHS number. You need to reset it to avoid confusing log output.

Contributor Author

This is a refactor, so I can't change logic in this ticket.
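
As a sketch of the reset being asked for, a small context manager could restore the previous value once a message has been handled. The `request_context` object below is a stand-in built with `SimpleNamespace`, and `patient_log_context` is a hypothetical name:

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for the shared request_context object used in the PR (assumption).
request_context = SimpleNamespace(patient_nhs_no=None)


@contextmanager
def patient_log_context(nhs_number: str):
    """Set the NHS number for one message, then restore the previous value."""
    previous = getattr(request_context, "patient_nhs_no", None)
    request_context.patient_nhs_no = nhs_number
    try:
        yield
    finally:
        # Runs even if handling the message raises.
        request_context.patient_nhs_no = previous
```

Wrapping the per-message handling in `with patient_log_context(staging_metadata.nhs_number):` would keep the number out of the logs for the next message.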

Comment on lines 101 to 108
if self.unhandled_messages:
    logger.info("Unable to process the following messages:")
    for message in self.unhandled_messages:
        message_body = json.loads(message.get("body", "{}"))
        request_context.patient_nhs_no = message_body.get(
            "NHS-NO", "no number found"
        )
        logger.info(message_body)
Contributor

What happens to these messages? Do we have a DLQ set up for them?

Contributor Author

A ticket has been created for it.

@chrisbloe-nhse changed the title from "[PRMT-580]- Refactor bulk upload" to "[PRMT-580] Refactor bulk upload" on Aug 27, 2025
Comment on lines +108 to +116
This method performs the following steps:
1. Parses the message and constructs staging metadata.
2. Validates the NHS number and file names.
3. Performs additional validation checks such as patient access conditions
   (e.g., deceased, restricted) and virus scan results.
4. Initiates transactional operations and transfers the validated files.
5. Removes the ingested files from the staging bucket.
6. Logs the completion of ingestion and writes the report to DynamoDB.
7. Sends metadata to the stitching queue for further processing.
Contributor

If this method is doing 7 steps then it is doing too much.
Functions Should Do One Thing

Contributor

I think this service needs breaking down into smaller services.

Contributor Author

For now we will keep it as is, but that is something that needs to be done.
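
To illustrate the kind of split being discussed, the steps from the docstring could become small functions run by a thin orchestrator. This is only a sketch: every name below is hypothetical, and the step bodies are placeholders rather than the real bulk upload logic:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class IngestionState:
    """Hypothetical state object passed from step to step."""
    message: dict
    completed_steps: List[str] = field(default_factory=list)


def parse_message(state: IngestionState) -> None:
    # Placeholder: the real step would build the staging metadata.
    state.completed_steps.append("parse_message")


def validate_patient_and_files(state: IngestionState) -> None:
    # Placeholder: NHS number, file name, access and virus scan checks.
    state.completed_steps.append("validate_patient_and_files")


def transfer_files(state: IngestionState) -> None:
    # Placeholder: transactional transfer, then staging bucket clean-up.
    state.completed_steps.append("transfer_files")


def report_and_publish(state: IngestionState) -> None:
    # Placeholder: write the report and send to the stitching queue.
    state.completed_steps.append("report_and_publish")


DEFAULT_STEPS: List[Callable[[IngestionState], None]] = [
    parse_message,
    validate_patient_and_files,
    transfer_files,
    report_and_publish,
]


def handle_message(message: dict, steps=DEFAULT_STEPS) -> IngestionState:
    """Run each step in order; each step can be unit tested on its own."""
    state = IngestionState(message=message)
    for step in steps:
        step(state)
    return state
```

Keeping the orchestrator this thin means the ordering of steps lives in one list, while each step stays small enough to test in isolation.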

@MohammadIqbalAD-NHS
Contributor

Unit tests are failing.


logger.info(
    "NHS Number and filename validation complete."
    "Validated strict mode, and patient information is accessible (e.g. patient not deceased/restricted)

Missing closing quotes

Contributor Author

Fixed.
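
A related detail worth checking in the same log call: Python joins adjacent string literals with nothing between them, so a first literal that ends in a full stop with no trailing space runs straight into the next one. A short sketch, using shortened versions of the strings above:

```python
# Adjacent string literals are joined with nothing between them.
without_space = (
    "NHS Number and filename validation complete."
    "Validated strict mode"
)

# A trailing space inside the first literal keeps the sentences apart.
with_space = (
    "NHS Number and filename validation complete. "
    "Validated strict mode"
)
```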

Contributor

@MohammadIqbalAD-NHS left a comment

Unit tests failing.

@bethany-kish-nhs
Contributor

I thought the V2 code was going to be added before this refactor. Done this way, we won't have visibility or traceability of what's been refactored once it all gets squashed.

Contributor

@bethany-kish-nhs left a comment

If you compare with bulk upload on main, it looks like changes have been made there and the V2 code here is out of date.

E.g. custodian logic has been changed on main here:
f9941b0

We need to make sure this ticket only refactors what's there on main and doesn't miss anything. We'll also need a process for keeping both up to date.

Contributor

@robg-nhs left a comment

I'm concerned that you have a PR with a title of Refactor Bulk Upload, but I'm unsure what the intention is.

While I have seen the JIRA ticket, which does have information on what you're attempting to accomplish, I think you're trying to do too much at once.

I suggest you break down the work further so the PRs have intent, e.g. "I'm doing XYZ".

@PedroSoaresNHS changed the title from "[PRMT-580] Refactor bulk upload" to "[DRAFT][PRMT-580] Refactor bulk upload" on Sep 11, 2025
@PedroSoaresNHS marked this pull request as draft on September 11, 2025 13:45
@adamwhitingnhs changed the title from "[DRAFT][PRMT-580] Refactor bulk upload" to "[DRAFT][PRMP-188] Refactor bulk upload" on Sep 25, 2025

6 participants