The Open Source Data Engineering Landscape

20/09/2021 22 min


Episode Synopsis

Open source solutions are booming. Open source projects are easy to clone from GitHub or GitLab, and their licenses mostly allow you to use the code in your own product in return for some form of contribution. The data engineering space in particular is exploding within the open source landscape, since open source software is reliable and many contributors extend the solutions. In this episode, we give a brief overview of the data science workflow. The projects we talk about are available at the following links:

Airbyte | Python (Data Integration): https://github.com/airbytehq/airbyte
Great Expectations | Python (Data Governance): https://github.com/great-expectations/great_expectations
Marquez | Java (Data Lineage): https://github.com/MarquezProject/marquez
DataExplorer | R (Data Exploration): https://github.com/boxuancui/DataExplorer
dbt | SQL (Data Analytics): https://github.com/dbt-labs/dbt