
Low-code ETL for structured and unstructured data. Generates Python code you can deploy anywhere.


amphi-ai/amphi-etl


Low-code Python-Based ETL for structured and unstructured data.



English · Changelog · Report Bug · Request Feature


📦 Installation

To install amphi-etl, run the following command:

pip install amphi-etl

Note

If you prefer to install Amphi's JupyterLab extension, run pip install jupyterlab-amphi in your environment. More information here.


🔨 Usage

To start Amphi, simply run:

amphi start

Use the following parameters to specify your workspace (the directory where you can access files and create pipelines on your system) and the port to use:

amphi start -w /your/workspace/path -p 8888
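If you want to launch Amphi from a script rather than typing the command by hand, the invocation above can be wrapped in a small helper. This is a minimal sketch: the function names `build_amphi_command` and `start_amphi` are hypothetical, and only the `amphi start`, `-w` and `-p` parts come from the command shown above.

```python
import subprocess


def build_amphi_command(workspace: str, port: int = 8888) -> list[str]:
    """Build the argument list for `amphi start -w <workspace> -p <port>`.

    Hypothetical helper: only the `-w` and `-p` options documented in this
    README are used.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return ["amphi", "start", "-w", workspace, "-p", str(port)]


def start_amphi(workspace: str, port: int = 8888) -> subprocess.Popen:
    """Start Amphi as a child process (requires amphi-etl to be installed)."""
    return subprocess.Popen(build_amphi_command(workspace, port))
```

Calling `start_amphi("/your/workspace/path", 8888)` would then be equivalent to running the command above in a terminal.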

✨ Features

Note

Amphi focuses on structured and unstructured data manipulation for data and AI pipelines. It aims to empower data scientists and data engineers to easily develop pipelines with an intuitive low-code interface while generating Python code you can deploy anywhere.


Modern ETL for the AI age:

  • 🧑‍💻 Low-code: Accelerate data and AI pipeline development and reduce maintenance time.
  • 🐍 Python-code Generation: Generate native Python code leveraging common libraries such as pandas, DuckDB and LangChain that you can run anywhere.
  • 🔒 Private and Secure: Self-host Amphi on your laptop or in the cloud for complete privacy and security over your data.
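To give a sense of what plain Python built on pandas can look like for a small pipeline, here is a hand-written sketch. It is not actual Amphi output: the sample data, column names and the `run_pipeline` function are all invented for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for two CSV input files.
ORDERS_CSV = """order_id,customer_id,amount
1,10,25.0
2,11,40.0
3,10,15.5
4,12,5.0
"""
CUSTOMERS_CSV = """customer_id,country
10,FR
11,US
12,FR
"""


def run_pipeline() -> pd.DataFrame:
    # Extract: read both inputs into DataFrames.
    orders = pd.read_csv(io.StringIO(ORDERS_CSV))
    customers = pd.read_csv(io.StringIO(CUSTOMERS_CSV))

    # Transform: filter out small orders, join with customers, aggregate.
    large_orders = orders[orders["amount"] >= 10]
    joined = large_orders.merge(customers, on="customer_id", how="left")
    totals = joined.groupby("country", as_index=False)["amount"].sum()
    return totals.sort_values("country").reset_index(drop=True)


result = run_pipeline()

# Load: serialize the result as CSV text (a file path could be used instead).
csv_output = result.to_csv(index=False)
```

Because the result is ordinary Python, a script like this can run wherever pandas is installed, with no dependency on the visual editor.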

Structured & Unstructured

  • 🔢 Structured - Import data from various sources, including CSV and Parquet files, as well as databases. Transform structured data using aggregation, filters, joins, SQL queries, and more. Export the transformed data into common files or databases.
  • 📝 Unstructured - Extract data from PDFs, Word documents, and websites (HTML). Perform parsing, chunking and embedding processing. Load the processed data into vector stores such as Pinecone and ChromaDB.
  • 🔁 Convert - Easily convert structured data into unstructured documents for vector stores, and vice versa, for RAG pipelines.
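The chunking step mentioned for unstructured data can be illustrated with a simple fixed-size splitter with overlap. This is a minimal sketch in pure Python, not Amphi's implementation: the `chunk_text` function and its parameters are assumptions, and a real pipeline would more likely rely on a text-splitting library.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Consecutive chunks share `overlap` characters so that content cut at a
    chunk boundary still appears whole in one of the two neighbours.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then typically be embedded and written to a vector store. The overlap trades a little redundancy for better retrieval of sentences that straddle a boundary.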

Features In Progress

  • Custom components - Add the ability to develop your own component and wrap configured ones.
  • Connections - Add the ability to securely create connections that can be reused across components.
  • Developer documentation - Write comprehensive documentation to support extension development.

🤝 Contributing

  • Use and Innovate: Try Amphi and share your use case with us. Your real-world usage and feedback help us improve our product.
  • Voice Your Insights: Encounter a glitch? Have a query? Share them by submitting issues and help us enhance the user experience.
  • Shape the Future: Have code enhancements or feature ideas? We invite you to propose pull requests and contribute directly.

Every contribution, big or small, is celebrated. Join us in our mission to refine and elevate the world of ETL for data and AI. 😃


🛣️ Ecosystem

Amphi is available as a JupyterLab extension, and Amphi ETL is itself based on JupyterLab. JupyterLab extensions can therefore be installed on Amphi ETL.


📝 License

Copyright © 2024 - present Amphi Labs.
This project is licensed under the Elastic License 2.0 (ELv2).
