apache-beam
Here are 249 public repositories matching this topic...
GCP Dataflow pipeline with BigQuery as source and side input
Updated Aug 9, 2018 - Python
Serverless data ingest pipeline on Google Cloud Platform
Updated Dec 5, 2023 - Java
Pipeline for data ingestion and processing using Apache Beam
Updated Sep 28, 2021 - Python
Development of a data pipeline using Apache Beam to orchestrate the flow and Python to capture and process the data. Once the data was refined, the Pandas and Matplotlib libraries were used to carry out an exploratory data analysis.
Updated Feb 26, 2023 - Jupyter Notebook
Implementing K-means clustering in sequential, streaming, and distributed formats using Apache Beam.
Updated Jan 7, 2024 - Jupyter Notebook
This is a data scraping project that sources data from the Houzz e-commerce platform, the CNN YouTube channel, and the TedTalk official website. The implementation uses the Apache Beam framework to build an ETL pipeline and write the results into an Elasticsearch database. The final step visualizes the crawler results using Kibana.
Updated Feb 11, 2024 - HTML
🍦 Serve doddle-model in a pipeline implemented with Apache Beam
Updated Nov 19, 2018 - Scala
GCP Dataflow pipeline with MapReduce in Python
Updated Aug 11, 2018 - Python
Big Data knowledge system.
Updated Sep 16, 2020
Adtech log processing pipeline with Apache Beam, Cloud Dataflow, Java, and Protocol Buffers. | Data analysis with BigQuery
Updated Jun 11, 2021 - Java
A data pipeline to ingest, process, and store storm event datasets so they can be accessed through different means.
Updated Apr 7, 2021 - Jupyter Notebook
This video presents a real-world use case developed with the Apache Beam Java SDK and launched with the serverless Dataflow runner on Google Cloud Platform. The job reads a JSON file from Cloud Storage, applies some transformations, and writes the result to a BigQuery table.
Updated Apr 27, 2023 - Java
The scripts in this repo will build the Apache Beam Java SDK packages, using Cloud Build and Artifact Registry, for a personal Beam fork.
Updated Feb 20, 2024 - HCL
Efficient Python data pipeline leveraging Apache Beam and Google Cloud Dataflow to update a bucket with daily instrument prices extracted from the BMF website, serving as input for other data pipelines. The code generates a Dataflow template, which is then scheduled to run periodically using Cloud Scheduler and Cloud Functions.
Updated Feb 28, 2024
Mirror of Apache Beam
Updated Sep 26, 2017 - Java
Evaluating Apache Beam batch, streaming, SQL, etc.
Updated Jan 27, 2019 - Java