DATAENG 05 (Builder): Transform Projects in Pipeline Builder

Create a transform project and process non-linear data formats in Pipeline Builder.

About this course

For those on the “Builder” track of the Data Engineering (DATAENG) learning path, this tutorial offers additional practice implementing the project and transform best practices you’ve learned up to this point.

You’ve set up a Datasource project and pipeline for your flight alerts data, and in this tutorial you’ll create one for passengers. You’ll then move on to the next stage of your pipeline by creating a Transform project and generating a series of outputs that enable specific downstream workflows. Along the way, you’ll also get a feel for how Pipeline Builder processes non-linear data formats (JSON in this case).


⚠️ Course Prerequisites

DATAENG 04: Scheduling Data Pipelines: If you have not completed the previous course in this track, please do so now.


🥅 Learning Objectives

  1. Gain additional practice with Pipeline Builder and project structure primitives.
  2. Process non-linear data formats in Pipeline Builder.
  3. Create a Transform project and associated outputs.

💪 Foundry Skills

  • Use Pipeline Builder’s JSON parser transform.
  • Generate multiple outputs from a Pipeline Builder transform.
  • Generate a Data Lineage graph as documentation for the Datasource project segment of your production pipeline.
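
To make the first two skills more concrete, here is a minimal Python sketch of the general idea: parse rows that each hold one JSON document, then split the result into two tabular outputs. This is not how Pipeline Builder is implemented, and it is not the course dataset. Pipeline Builder is a point-and-click tool, and the field names (`passenger_id`, `name`, `flights`, `flight_id`, `seat`), the sample rows, and the `parse_passengers` helper are all assumptions made up for illustration.

```python
import json

# Hypothetical raw input: each row is one JSON document stored as a string.
raw_rows = [
    '{"passenger_id": "P1", "name": "Ada", '
    '"flights": [{"flight_id": "F10", "seat": "12A"}, '
    '{"flight_id": "F22", "seat": "3C"}]}',
    '{"passenger_id": "P2", "name": "Lin", "flights": []}',
]


def parse_passengers(rows):
    """Parse JSON rows and return two flat outputs: passengers and bookings."""
    passengers = []  # one row per passenger
    bookings = []    # one row per (passenger, flight) pair
    for row in rows:
        doc = json.loads(row)
        passengers.append(
            {"passenger_id": doc["passenger_id"], "name": doc["name"]}
        )
        # The nested "flights" array is flattened into its own output.
        for flight in doc.get("flights", []):
            bookings.append(
                {
                    "passenger_id": doc["passenger_id"],
                    "flight_id": flight["flight_id"],
                    "seat": flight["seat"],
                }
            )
    return passengers, bookings


passengers, bookings = parse_passengers(raw_rows)
```

In this sketch a single parsing step feeds two outputs, which mirrors the idea of generating multiple outputs from one transform: the nested array becomes its own table, keyed back to the parent record by `passenger_id`.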

Curriculum

  • About this course
  • Create a Datasource Project for Passenger Data
      • Create and hydrate your passenger datasource project, part 1
      • Create and hydrate your passenger datasource project, part 2
      • Create a cleaned output
      • Document the passengers pipeline
      • Schedule the passengers pipeline
      • Exercise Summary
  • Create a Transform Project for Passenger Data
      • Create a transform project
      • Join flight alerts and passengers
      • Generating multiple outputs
      • More practice with multiple outputs
      • Exercise Summary
  • Document and Schedule Your Transform Pipeline
      • Document Your Pipeline with Notepad
      • Add a Data Lineage graph as documentation
      • Configure a Connecting Build Schedule
      • Exercise Summary
  • Conclusion
      • Key Takeaways
      • Next Steps
