Data Flow Name: df_transform_hospital_admissions
Pipeline Steps:
- Source (HospitalAdmissionSource):
  - Pulls data from ds_raw_hospital_admission.
- SelectReqdFields:
  - Renames or selects specific fields: country, indicator, etc.
- LookupCountry:
  - Performs a lookup using CountrySource (likely from ds_country_lookup) to enrich the data.
- SelectReqdFields2:
  - Refines the result further with a new set of selected or renamed fields.
- Split into Weekly and Daily:
  - A Conditional Split divides the data into two branches:
    - Weekly (9 columns total)
    - Daily (filtering on the indicator column, likely conditional logic)
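To make the LookupCountry step more concrete, here is a minimal plain-Python sketch of a left outer lookup. It is not the data flow's actual definition: the `lookup_country` helper, the `country_code` column, and the sample rows are hypothetical, since the post does not show the columns of the lookup dataset.

```python
def lookup_country(rows, lookup_rows, key="country"):
    """Left outer lookup: keep every input row and add the lookup columns.

    Rows without a match get None in the added columns.
    """
    index = {r[key]: r for r in lookup_rows}
    # Every lookup column except the join key is added to the output.
    extra_cols = sorted({c for r in lookup_rows for c in r if c != key})
    enriched_rows = []
    for row in rows:
        match = index.get(row[key])
        enriched = dict(row)
        for col in extra_cols:
            enriched[col] = match.get(col) if match else None
        enriched_rows.append(enriched)
    return enriched_rows


# Illustrative data only; the real lookup columns are an assumption.
admissions = [
    {"country": "France", "indicator": "Daily hospital occupancy"},
    {"country": "Nowhere", "indicator": "Daily hospital occupancy"},
]
countries = [{"country": "France", "country_code": "FR"}]
enriched = lookup_country(admissions, countries)
```

Keeping unmatched rows (with empty lookup columns) makes missing reference data visible downstream instead of silently dropping records.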
Right Panel:
- Shows general properties.
- Name: df_transform_hospital_admissions.
- Description: Empty.
Bottom Panel (Data preview):
- Currently loading: “Fetching data…”.
- Status: Data flow debug is enabled (green).
- Operation counts like INSERT, UPDATE, DELETE, etc., are N/A, meaning this is likely a preview run or the data hasn’t loaded yet.
🔁 Complete Transformation Breakdown
🟦 1. Source (ds_raw_hospital_admission)
- What it does:
  - Reads raw hospital admission data from a source dataset (e.g., CSV, database).
- Fields: country, reported_date, hospital_occupancy_count, icu_occupancy_count, etc.
🟨 2. fields2 (Conditional Split)
- What it does:
  - Splits incoming data into two branches: Weekly and Daily.
  - Based on a condition, likely a flag or pattern in the data (for example, a value in the indicator column).
- Why:
  - Enables separate transformation logic for weekly and daily reporting formats.
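The split can be sketched in plain Python as routing each row to one of two lists. The actual split condition is not shown in the post, so the rule below (indicator text starting with "Weekly") and the sample indicator values are assumptions for illustration only.

```python
def split_weekly_daily(rows, is_weekly):
    """Route each row to the weekly or the daily branch."""
    weekly, daily = [], []
    for row in rows:
        (weekly if is_weekly(row) else daily).append(row)
    return weekly, daily


def indicator_is_weekly(row):
    # Assumed rule: weekly indicators are labelled with a "Weekly" prefix.
    return row["indicator"].strip().lower().startswith("weekly")


# Illustrative rows; the indicator values are hypothetical.
sample = [
    {"country": "France", "indicator": "Weekly new hospital admissions", "value": 10},
    {"country": "France", "indicator": "Daily hospital occupancy", "value": 3},
]
weekly_rows, daily_rows = split_weekly_daily(sample, indicator_is_weekly)
```

Passing the condition in as a function keeps the routing logic separate from the rule itself, which mirrors how a conditional split takes an expression per output stream.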
🟩 3. Weekly Branch
🔷 a. JoinWithDate (Join)
- What it does:
  - Joins raw data with a Date Dimension (likely AggDimDate).
  - Join keys: reported_date from source and date from the dimension.
- Why:
  - Enriches records with derived values like year_week, week_start_date, etc.
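As a rough illustration of this join, the sketch below builds a small date dimension and joins it to the source rows on the date. It assumes ISO week numbering, weeks that start on Monday, and a `YYYY-Www` format for year_week; the post does not say how the real dimension defines these, and both helper functions are hypothetical.

```python
from datetime import date, timedelta


def build_date_dimension(start, days):
    """One row per calendar day with ISO week attributes (assumed format)."""
    dimension = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        iso_year, iso_week, iso_weekday = day.isocalendar()
        monday = day - timedelta(days=iso_weekday - 1)
        dimension.append({
            "date": day.isoformat(),
            "year_week": f"{iso_year}-W{iso_week:02d}",
            "week_start_date": monday.isoformat(),
        })
    return dimension


def join_with_date(rows, dimension):
    """Inner join: reported_date (source) = date (dimension)."""
    by_date = {d["date"]: d for d in dimension}
    joined = []
    for row in rows:
        dim = by_date.get(row["reported_date"])
        if dim is not None:  # rows with no matching date are dropped
            joined.append({
                **row,
                "year_week": dim["year_week"],
                "week_start_date": dim["week_start_date"],
            })
    return joined


dimension = build_date_dimension(date(2021, 1, 4), 7)
joined = join_with_date(
    [
        {"country": "France", "reported_date": "2021-01-06"},
        {"country": "France", "reported_date": "2022-01-01"},  # outside the dimension
    ],
    dimension,
)
```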
🔷 b. PivotWeekly (Pivot)
- What it does:
  - Pivots indicators (like hospital and ICU occupancy counts) into separate columns.
- Group by:
  - Likely year_week, country
- Values:
  - Transforms rows into a wider format with columns like:
    - hospital_occupancy_count
    - icu_occupancy_count
- Why:
  - Aggregates and reshapes data for weekly reporting.
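A pivot of this kind can be sketched in plain Python as a group-and-aggregate followed by spreading the indicator values into columns. The `value` column, the indicator labels, and the choice of sum as the aggregate are assumptions; the post does not show the pivot expression.

```python
from collections import defaultdict


def pivot(rows, group_keys, pivot_key, value_key, column_names):
    """Sum value_key per group and spread pivot_key values into columns.

    column_names maps each pivot_key value to its output column name.
    """
    totals_by_group = defaultdict(lambda: defaultdict(int))
    for row in rows:
        group = tuple(row[k] for k in group_keys)
        totals_by_group[group][row[pivot_key]] += row[value_key]

    wide_rows = []
    for group, totals in totals_by_group.items():
        wide = dict(zip(group_keys, group))
        for indicator, column in column_names.items():
            # None when the group has no rows for this indicator.
            wide[column] = totals.get(indicator)
        wide_rows.append(wide)
    return wide_rows


# Hypothetical indicator labels mapped to the output columns named in the post.
COLUMNS = {
    "hospital occupancy": "hospital_occupancy_count",
    "ICU occupancy": "icu_occupancy_count",
}
long_rows = [
    {"year_week": "2021-W01", "country": "France", "indicator": "hospital occupancy", "value": 10},
    {"year_week": "2021-W01", "country": "France", "indicator": "hospital occupancy", "value": 5},
    {"year_week": "2021-W01", "country": "France", "indicator": "ICU occupancy", "value": 2},
]
weekly_wide = pivot(long_rows, ["year_week", "country"], "indicator", "value", COLUMNS)
```

The same helper would cover the daily case by grouping on reported_date and country instead.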
🔷 c. SortWeekly (Sort)
- What it does:
  - Sorts the data by reported_year_week and country.
- Why:
  - Ensures data is consistently ordered before writing to sink.
🔷 d. SelectWeekly (Select)
- What it does:
  - Keeps only required columns and renames as needed.
  - Final schema might include:
    - country, reported_year_week, hospital_occupancy_count, icu_occupancy_count
- Why:
  - Cleans and prepares data for export.
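Taken together, the SortWeekly and SelectWeekly steps amount to an ordering followed by a projection. The sketch below assumes an ascending sort and uses made-up sample values; the `sort_rows` and `select_columns` helpers are hypothetical names, not part of the data flow.

```python
def sort_rows(rows, keys):
    """Sort ascending by the given keys, in order of priority."""
    return sorted(rows, key=lambda row: tuple(row[k] for k in keys))


def select_columns(rows, mapping):
    """Keep only mapped columns; mapping is output name -> input name."""
    return [{out: row[src] for out, src in mapping.items()} for row in rows]


# Illustrative rows; week_start_date stands in for a column that is dropped.
wide_rows = [
    {"country": "Germany", "reported_year_week": "2021-W02",
     "hospital_occupancy_count": 7, "icu_occupancy_count": 1,
     "week_start_date": "2021-01-11"},
    {"country": "France", "reported_year_week": "2021-W01",
     "hospital_occupancy_count": 15, "icu_occupancy_count": 2,
     "week_start_date": "2021-01-04"},
    {"country": "Belgium", "reported_year_week": "2021-W02",
     "hospital_occupancy_count": 4, "icu_occupancy_count": 0,
     "week_start_date": "2021-01-11"},
]
FINAL_COLUMNS = {
    "country": "country",
    "reported_year_week": "reported_year_week",
    "hospital_occupancy_count": "hospital_occupancy_count",
    "icu_occupancy_count": "icu_occupancy_count",
}
ordered = sort_rows(wide_rows, ["reported_year_week", "country"])
final_rows = select_columns(ordered, FINAL_COLUMNS)
```

Using an output-to-input mapping lets the same helper rename a column by giving it a different key than its source name.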
🔷 e. WeeklySink (Sink)
- What it does:
  - Writes the transformed weekly data to a target dataset.
  - Sink: ds_processed_hospital_admission_weekly
- Why:
  - Makes weekly data available for reporting/analytics.
🟩 4. Daily Branch
🔷 a. PivotDaily (Pivot)
- What it does:
  - Similar to PivotWeekly, but operates on daily granularity.
- Group by:
  - reported_date, country
- Why:
  - Converts long-format daily data into a wide format for daily analysis.
🔷 b. SortDaily (Sort)
- What it does:
  - Sorts by reported_date and country.
- Why:
  - Ensures the final output is consistently ordered.
🔷 c. SelectDaily (Select)
- What it does:
  - Selects relevant fields like:
    - country, reported_date, hospital_occupancy_count, icu_occupancy_count, population, source
- Why:
  - Aligns with target schema and ensures only meaningful data is exported.
🔷 d. DailySink (Sink)
- What it does:
  - Writes the final daily data to ds_processed_hospital_admission_daily.
- Why:
  - Makes daily data available for downstream use (dashboards, exports).
Transformation | Type | Description |
---|---|---|
ds_raw_hospital_admission | Source | Loads raw hospital admission data |
fields2 | Conditional Split | Splits data into Daily and Weekly pipelines |
JoinWithDate | Join | Adds weekly context by joining with date dimension |
PivotWeekly | Pivot | Converts indicator rows into columns (weekly) |
SortWeekly | Sort | Sorts by week and country |
SelectWeekly | Select | Keeps/renames columns for export |
WeeklySink | Sink | Outputs to weekly processed dataset |
PivotDaily | Pivot | Converts indicator rows into columns (daily) |
SortDaily | Sort | Sorts by date and country |
SelectDaily | Select | Keeps/renames columns for export |
DailySink | Sink | Outputs to daily processed dataset |