-
Notifications
You must be signed in to change notification settings - Fork 2
Open
Description
Integration of Department Metadata into EMAP for Efficient Inpatient Location Filtering
Background
We need department level data in EMAP. This is necessary so that we can efficiently identify patients who are occupying inpatient beds at a point in time, or patients in the ED at a point in time.
Useful metadata includes:
- which site a department is on (Grafton Way, Queen's Square etc)
- whether a department is a ward or flow area (ie relevant to those managing patient flow)
- whether a department that is a ward or flow area is classified as an inpatient overnight location (from RPT_GRP_NINE field in Clarity's CLARITY_DEP table)
Current State
- EMAP stores only location strings without department metadata
- Current location filtering uses hardcoded list of ~500 location strings converted to location IDs
- Performance bottleneck: large array filtering in SQL queries
- Missing ~3,500 additional inpatient ward locations identified in Clarity analysis
Problem Statement
Need efficient method to filter EMAP data by ward/flow area classification for:
- Large-scale model training data extraction
- Real-time inpatient prediction applications
Current query pattern:
-- Slow: filtering by large array of location IDs
SELECT * FROM [Join of hospital_visit and location_visit]
WHERE location_id = ANY(ARRAY[123, 456, 789, ...]) -- 500+ IDsDesired query pattern:
-- Fast: filtering by metadata flags
SELECT * FROM emap_table e
WHERE e.is_ward_or_flow_area = true
AND e.site_location_id IN ('10201', '10218')
AND e.classification in ('2', '20') -- Inpatient - Overnight Ward or ICU locationProposed Solutions
Option 1: Extend EMAP Hoover
Description: Modify the EMAP data ingestion pipeline to populate department metadata from Clarity during hoover process.
Implementation:
- Add columns to EMAP location/department tables:
rpt_grp_nine,is_ward_or_flow_area,site_location_id, etc. - Modify hoover to lookup metadata from Clarity during ingestion
- Create indexes on metadata columns
Pros:
- Best query performance (single table, no joins)
- Self-contained EMAP database
- Scales efficiently with location growth
- Optimal for real-time applications
Cons:
- Requires hoover development resources
- Testing overhead for data pipeline changes
- One-time implementation effort
Option 2: Cross-Database Views/Federated Queries
Description: Create views in EMAP that join to Clarity metadata tables using foreign data wrappers.
Implementation:
CREATE FOREIGN TABLE clarity_dept_metadata (...)
SERVER clarity_server OPTIONS (...);
CREATE VIEW emap_with_metadata AS
SELECT e.*, cmd.is_ward_or_flow_area, cmd.site_location_id
FROM emap_locations e
JOIN clarity_dept_metadata cmd ON e.location_string = cmd.location_string;Pros:
- No changes to core EMAP pipeline
- Always current metadata
- Quick to implement
Cons:
- Cross-database query performance issues
- Network latency impact
- System dependencies
- Poor for real-time applications
Option 3: Periodic Metadata Sync Table
Description: Create lookup table in EMAP refreshed periodically from Clarity.
Implementation:
CREATE TABLE location_metadata (
location_string VARCHAR(255) PRIMARY KEY,
department_id VARCHAR(50),
department_name VARCHAR(255),
rpt_grp_nine VARCHAR(10),
is_ward_or_flow_area BOOLEAN,
site_location_id VARCHAR(20),
last_updated TIMESTAMP
);Pros:
- Good query performance (local joins)
- No hoover changes required
- Quick implementation
- Immediate performance gains
Cons:
- Metadata staleness between syncs
- Additional sync process maintenance
- Scheduling/monitoring overhead
Metadata
Metadata
Assignees
Labels
No labels
Type
Projects
Status
In Review