The open source high performance ELT framework powered by Apache Arrow
Updated Jun 26, 2024 - Go
A modern data marketplace that makes collaboration among diverse users (such as business users, analysts, and engineers) easier, increasing efficiency and agility in data projects on AWS.
The Frank!Framework is an easy-to-use, stateless integration framework which allows (transactional) messages to be modified and exchanged between different systems.
Logstash - transport and process your logs, events, or other data
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
Airflow DAGs for the Stellar ETL project
Stellar ETL will enable real-time analytics on the Stellar network
Documentation for the TriplyDB and TriplyETL products
(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
Global Biotic Interactions provides access to existing species interaction datasets
This project demonstrates a complete ETL (Extract, Transform, Load) pipeline that integrates data extraction from an API, data transformation and normalization, and data exposure via an API endpoint. The solution uses Meltano for ETL orchestration, FastAPI for the API endpoint, and Docker for containerization.
DataForge helps data teams write functional transformation pipelines by leveraging software engineering principles
A tool for building feature stores.
A high-performance alternative to Airbyte, Singer, and Meltano
DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics control framework that can be used to monitor, log, audit and control data integration / ETL processes.
Bridging the gap between IEnumerable and IDataReader for dealing with unstructured and loosely-structured data, plus fast ETL + SQL Bulk Copy.
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
ETL (Extract, Transform and Load) library for .NET