DATAENG 03 (Builder): Creating a Project Output in Pipeline Builder

Engineer a clean output for your project to be consumed by downstream pipelines and use cases.

About this course

Foundry applications and project structures that support data pipelines provide ample opportunities for you to let your current and future team know the relevant facts about your data transformations. Now that you have preprocessed your data, it’s time to clean it and prepare it for use downstream. This means writing airtight transform logic and documenting the scope every step of the way.

In this tutorial, you’ll create “clean” outputs for your project to be consumed by downstream pipelines and use cases. In doing so, you’ll get additional practice within a recommended pipeline project structure.

Course Prerequisites
  • DATAENG 02 (Builder): If you have not completed the previous course in this track, please do so now.
 
Learning Objectives
  1. Distinguish between preprocessing and cleaning steps.
  2. Document the datasource stage of your pipeline.
  3. Gain additional practice transforming data in Pipeline Builder.
 
Foundry Skills
  • Generate multiple outputs from a Pipeline Builder transform.
  • Implement branching and pipeline documentation best practices.
  • Generate a Data Lineage graph as documentation for the Datasource project segment of your production pipeline.

Curriculum

  • Introduction
    • About this course
  • Cleaning your Data
    • Create your cleaning pipeline
    • Add your cleaning logic
    • Create a clean output
    • Making changes to your pipeline logic with branching
    • Exercise Summary
  • Documenting your Pipeline
    • Documenting your Pipeline with a Data Lineage Graph
    • Documenting your Pipeline with Notepad
    • Exercise Summary
  • Conclusion
    • Key Takeaways
    • Next Steps
