
Posts

Creating the pipeline in Azure. To log in to the Azure portal we need a Microsoft account; use it to create an Azure account. After navigating to the Azure portal, create a resource group, and under it create the resources needed for the pipeline. 1. Azure Blob Storage (enable hierarchical namespace to make it Data Lake Storage Gen2): inside it, create the containers (blob containers); subfolders can be created inside a container using Add Directory. I created a storage account named sdmm; under sdmm: gold, silver, bronze. Inside those we create sd and mm folders, except in gold, where we create procurement and sales. 2. Data Factory: inside the data factory, navigate to Managed identities, enable System assigned, then save. Then navigate back to the storage account, open Access control (IAM), add the Storage Blob Data Contributor role, and assign the data factory as the member. Then launch the studio from the data factory, go to Manage, then Linked services to L...
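The folder layout described above can be written down as a small sketch. It assumes that bronze, silver, and gold are containers in the sdmm storage account (the excerpt does not say whether they are containers or folders), and the `layout_paths` helper is hypothetical. It only builds `abfss://` URIs for the directories and does not call Azure.

```python
# Assumption: bronze/silver/gold are containers in the "sdmm" storage account.
# Folder names are taken from the post; the helper itself is illustrative only.
LAYOUT = {
    "bronze": ["sd", "mm"],
    "silver": ["sd", "mm"],
    "gold": ["procurement", "sales"],
}


def layout_paths(account: str, layout: dict[str, list[str]]) -> list[str]:
    """Build the ADLS Gen2 (abfss) URI of every directory in the layout."""
    paths = []
    for container, folders in layout.items():
        for folder in folders:
            paths.append(
                f"abfss://{container}@{account}.dfs.core.windows.net/{folder}"
            )
    return paths


paths = layout_paths("sdmm", LAYOUT)
for p in paths:
    print(p)
```

Listing the layout in one place like this makes it easy to check that every layer has the expected folders before wiring up linked services and datasets.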

Session 7 data flow part 2

Data Flow Name: df_transform_hospital_admissions. Pipeline Steps: Source (HospitalAdmissionSource): pulls data from ds_raw_hospital_admission. SelectReqdFields: renames or selects specific fields: country, indicator, etc. LookupCountry: performs a lookup using CountrySource (likely from ds_country_lookup) to enrich the data. SelectReqdFields2: refines the result further with a new set of selected or renamed fields. Split into Weekly and Daily: a Conditional Split divides the data into two branches: Weekly (9 columns total) and Daily (filtering on the indicator column, likely conditional logic). Right Panel: shows general properties. Name: df_transform_hospital_admissions. Description: empty. Bottom Panel (Data preview): currently loading: “Fetching data…”. Status: data flow debug is enabled (green). Operation counts like INSERT, UPDATE, DELETE, etc., are N/A, meaning this is likely a preview r...
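To make the select, lookup, and conditional split steps concrete, here is a minimal plain-Python sketch of the same shape of logic. It is not ADF code: the sample rows, the `reported_value` and `country_code_2_digit` column names, and the "indicator starts with Weekly" split condition are all assumptions made for illustration, since the excerpt does not show the real schema or expression.

```python
from typing import Callable

Row = dict[str, object]


def select_fields(rows: list[Row], mapping: dict[str, str]) -> list[Row]:
    """Keep only the mapped columns, renaming old -> new (like a Select step)."""
    return [{new: row[old] for old, new in mapping.items()} for row in rows]


def lookup(rows: list[Row], lookup_rows: list[Row], key: str) -> list[Row]:
    """Left outer join on `key` (like a Lookup step).

    Unmatched rows are kept; here they simply lack the lookup columns.
    """
    index = {r[key]: r for r in lookup_rows}
    return [{**index.get(row[key], {}), **row} for row in rows]


def conditional_split(
    rows: list[Row], predicate: Callable[[Row], bool]
) -> tuple[list[Row], list[Row]]:
    """Route rows that satisfy `predicate` to one stream, the rest to a default."""
    matched: list[Row] = []
    default: list[Row] = []
    for row in rows:
        (matched if predicate(row) else default).append(row)
    return matched, default


# Hypothetical sample data, not taken from the real dataset.
raw = [
    {"country": "France", "indicator": "Weekly new hospital admissions", "value": 12.5},
    {"country": "France", "indicator": "Daily hospital occupancy", "value": 300},
    {"country": "Atlantis", "indicator": "Daily hospital occupancy", "value": 7},
]
countries = [{"country": "France", "country_code_2_digit": "FR"}]

selected = select_fields(
    raw, {"country": "country", "indicator": "indicator", "value": "reported_value"}
)
enriched = lookup(selected, countries, key="country")
# Assumed split condition: weekly rows have an indicator that starts with "Weekly".
weekly, daily = conditional_split(
    enriched, lambda r: str(r["indicator"]).startswith("Weekly")
)
print(len(weekly), len(daily))
```

The point of the sketch is the ordering: fields are trimmed first, enrichment happens once on the combined stream, and the split comes last so both branches carry the looked-up country columns.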

Transformation - section 6 - data flow

Feature from Slide / Explanation. ✅ Code-free data transformations: Data Flows in ADF allow you to build transformations using a drag-and-drop visual interface, with no need for writing Spark or SQL code. ✅ Executed on Data Factory-managed Databricks Spark clusters: internally, ADF uses Azure Integration Runtimes backed by Apache Spark clusters, managed by ADF, not Databricks itself. While it's similar in concept, this is not the same as your own Databricks workspace. ✅ Benefits from ADF scheduling and monitoring: Data Flows are fully integrated into ADF pipelines, so you get all the orchestration, parameterization, logging, and alerting features of ADF natively. ⚠️ Important Clarification: although it says "executed on Data Factory managed Databricks Spark clusters," this does not mean you're using your own Azure Databricks workspace. Rather: ADF Data Flows run on ADF-managed Spark clusters. Azure Databricks notebooks (which you trigger via an "Exe...