Data Engineer Basics - Integrate

Learn the basics of integrating your data with Cognite Data Fusion.


Learn how to get data reliably from various siloed systems into CDF using its interfaces, extractors, and SDKs. This learning path consists of several comprehensive courses and a final assessment.


The courses are split into three groups:

  • Integrating Data from Custom Solutions
  • Integrating Data from Cloud Services
  • Integrating Data from PI Historian

In this learning path, you will:
  • Build custom extractors to ingest data from source systems.
  • Extract data present in cloud storages or databases.
  • Ingest data into CDF using ETL tools.
  • Set up the Cognite PI extractor (demonstration).
  • Before you start this learning path, we suggest that you complete Cognite Data Fusion Fundamentals first. It also helps to have some prior experience with Python, REST APIs, ETL tools (e.g., Azure Data Factory, Informatica PowerCenter), PostgreSQL, Apache Spark, and OSIsoft PI Server.
  • This is the first of two learning paths for data engineers. You'll earn a badge for each and a certificate upon completion of both learning paths and passing the final assessment.
Courses and learning goals
Learn to Use the Cognite Python SDK
  • Use the most common functions of Cognite Python SDK and available documentation as a reference.
  • Set up authentication to connect to Cognite Data Fusion.
  • List, search, and retrieve various resource types and data.
  • Create, update, insert, and delete various resource types and data.
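The read pattern this course covers can be sketched as follows. This is a minimal illustration, not the course material itself: the function, parameter values, and exact class names (`CogniteClient`, `ClientConfig`, `OAuthClientCredentials`) follow recent versions of the `cognite-sdk` package and should be verified against the version you install.

```python
def list_root_assets(project: str, cluster: str, token_url: str,
                     client_id: str, client_secret: str):
    """Connect to CDF with OAuth client credentials and list root assets.

    All parameter values are placeholders -- substitute your own project,
    cluster, and identity-provider details.
    """
    # Imported inside the function so the sketch reads (and is importable)
    # without cognite-sdk installed.
    from cognite.client import CogniteClient, ClientConfig
    from cognite.client.credentials import OAuthClientCredentials

    base_url = f"https://{cluster}.cognitedata.com"
    credentials = OAuthClientCredentials(
        token_url=token_url,
        client_id=client_id,
        client_secret=client_secret,
        scopes=[f"{base_url}/.default"],
    )
    client = CogniteClient(
        ClientConfig(
            client_name="learning-path-demo",  # illustrative client name
            project=project,
            base_url=base_url,
            credentials=credentials,
        )
    )
    # "List" is one of the read patterns covered in the course; search and
    # retrieve follow the same client.<resource>.<method> shape.
    roots = client.assets.list(root=True, limit=10)
    return [asset.name for asset in roots]
```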
Introduction to API v1 using Postman
  • Set environment variables.
  • Make a test API request.
  • Find information about time series, assets, and events.
  • Download data.
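The same request you would build in Postman can be sketched with the Python standard library. The endpoint path (`/api/v1/projects/{project}/timeseries`) is the documented CDF API v1 route for listing time series; the environment-variable names are illustrative, not required by CDF.

```python
import os
import urllib.request

# Environment variables play the role of a Postman environment; these
# names and default values are placeholders.
CLUSTER = os.environ.get("CDF_CLUSTER", "api")
PROJECT = os.environ.get("CDF_PROJECT", "my-project")
TOKEN = os.environ.get("CDF_TOKEN", "<bearer-token>")

# API v1 route for listing time series, limited to 10 results.
url = (
    f"https://{CLUSTER}.cognitedata.com"
    f"/api/v1/projects/{PROJECT}/timeseries?limit=10"
)
request = urllib.request.Request(
    url,
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(request) would actually send it; here we only
# inspect what Postman shows in its request pane.
print(request.full_url)
print(request.get_header("Authorization"))
```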
Extractor-utils Library for Cognite Python SDK
  • Articulate the advantages of using the Cognite extractor-utils package for Python.
  • Describe the modules of the library.
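The shape of an extractor built on this library can be sketched as below. The `Extractor` class and its `run_handle` callback signature follow the `cognite-extractor-utils` documentation, but treat the exact names and arguments as assumptions to check against the installed package version.

```python
def run_minimal_extractor():
    """Sketch of a minimal extractor built on cognite-extractor-utils.

    Everything here -- the extractor name, description, and callback
    signature -- is illustrative; verify against the library's docs.
    """
    # Imported inside the function so the sketch reads without the
    # extractor-utils package installed.
    from cognite.extractorutils import Extractor

    def extract(cognite_client, states, config, stop_event):
        # A real extractor would read from its source system here and
        # upload through the client or one of the library's upload queues.
        pass

    with Extractor(
        name="demo-extractor",
        description="Minimal extractor-utils sketch",
        run_handle=extract,
    ) as extractor:
        extractor.run()
```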
PostgreSQL Gateway
  • Explain when to use PostgreSQL gateway to integrate data into Cognite Data Fusion.
  • Create a data factory in Azure.
  • Add the source and the sink, which define where data is read from and where it is delivered.
  • Do the mapping between the source and the sink.
  • Run the gateway and check that you have the data in Cognite Data Fusion.
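From the ETL tool's point of view, the gateway is an ordinary PostgreSQL sink: the sink side of the pipeline amounts to plain SQL inserts. The sketch below shows that shape with `psycopg2`; the connection details and the `datapoints` table and column names are placeholders, not the gateway's actual schema, which you should look up for your deployment.

```python
def write_datapoints(host, port, user, password, rows):
    """Insert rows through the CDF PostgreSQL gateway.

    Connection details and the table/column names are placeholders --
    consult the gateway documentation for the schema exposed to your
    project.
    """
    # Imported inside the function so the sketch reads without the
    # psycopg2 driver installed.
    import psycopg2

    with psycopg2.connect(host=host, port=port, user=user,
                          password=password, dbname="postgres") as conn:
        with conn.cursor() as cur:
            # rows: iterable of (external_id, timestamp, value) tuples.
            cur.executemany(
                "INSERT INTO datapoints (external_id, timestamp, value) "
                "VALUES (%s, %s, %s)",
                rows,
            )
```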
Cognite Spark Data Source
  • Describe what Spark and Databricks are and when to use them.
  • Interact through a Databricks notebook (import a notebook, set up secret scope).
  • Use different features of Cognite Spark Data Source.
  • Display, aggregate, and analyze the Open Industrial Data (OID) in Cognite Data Fusion (CDF) with Cognite Spark Data Source.
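A read through the connector can be sketched in PySpark as below. The format name `"cognite.spark.v1"` and the option names are assumptions based on the Cognite Spark Data Source documentation and may differ between connector versions; verify them against the version installed on your Databricks cluster.

```python
def read_oid_assets(spark, project: str, bearer_token: str):
    """Read assets from CDF through the Cognite Spark Data Source.

    `spark` is an existing SparkSession (e.g., the one Databricks
    provides). The option names below are assumptions to verify against
    the connector's docs for your version.
    """
    df = (
        spark.read.format("cognite.spark.v1")
        .option("type", "assets")          # CDF resource type to read
        .option("project", project)
        .option("bearerToken", bearer_token)
        .load()
    )
    # Display/aggregate as with any Spark DataFrame.
    return df.select("name", "externalId")
```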
PI Extractor
  • Find requirements for your host machine to be able to run the extractor.
  • Install, configure, and verify the PI extractor.
  • View data points from PI tags being ingested into CDF time series.
  • Upgrade an extractor that is already running.
  • Export metrics that can be used to monitor the extractor health and operation over time.
Data Engineer Basics - Integrate Assessment
  • The final assessment covers all six courses in this learning path and takes around 30 minutes. Earn a badge upon successful completion. Once finished, you can move on to the next learning path for data engineers.

Connect with other learners and the rest of Cognite's Community on Cognite Hub.