Custom s3 endpoint: Unable to execute HTTP request: Remote host terminated the handshake #10490

Open
samueljackson92 opened this issue Jun 13, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@samueljackson92

Apache Iceberg version

1.5.2 (latest release)

Query engine

None

Please describe the bug 🐞

Hi,

I am experimenting with setting up Iceberg locally and am trying to use a custom S3 endpoint as the storage backend for my project.

I am getting the following HTTP error when trying to create a new table:

Traceback (most recent call last):
  File "/Users/rt2549/miniconda3/envs/iceberg/lib/python3.11/site-packages/pyiceberg/catalog/rest.py", line 470, in create_table
    response.raise_for_status()
  File "/Users/rt2549/miniconda3/envs/iceberg/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Server Error for url: http://localhost:8181/v1/namespaces/default/tables

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/rt2549/projects/test-db/ingest.py", line 29, in <module>
    main()
  File "/Users/rt2549/projects/test-db/ingest.py", line 22, in main
    table = catalog.create_table(
            ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rt2549/miniconda3/envs/iceberg/lib/python3.11/site-packages/pyiceberg/catalog/rest.py", line 472, in create_table
    self._handle_non_200_response(exc, {409: TableAlreadyExistsError})
  File "/Users/rt2549/miniconda3/envs/iceberg/lib/python3.11/site-packages/pyiceberg/catalog/rest.py", line 382, in _handle_non_200_response
    raise exception(response) from exc
pyiceberg.exceptions.ServerError: SdkClientException: Unable to execute HTTP request: Remote host terminated the handshake

My ingestion script looks like the following:

from pyiceberg.catalog import load_catalog
import pyarrow as pa
import pyarrow.parquet as pq


def main():
    s3_config = {
        "uri": "http://localhost:8181",
        "s3.endpoint": "https://s3.echo.stfc.ac.uk",
        "s3.access-key-id": "<my-key>",
        "s3.secret-access-key": "<my-secret>",
        "s3.region": "us-east-1",
        "py-io-impl": "pyiceberg.io.pyarrow.PyArrowFileIO",
    }
    catalog = load_catalog("default", **s3_config)
    df: pa.Table = pq.read_table("signals.parquet")
    df = df.drop_columns("description")

    catalog.create_namespace("default")

    table = catalog.create_table(
        "default.signals",
        schema=df.schema,
    )


if __name__ == "__main__":
    main()

My Docker Compose file is copied from the Iceberg tutorial:

version: "3"

services:
  rest:
    image: tabulario/iceberg-rest
    container_name: iceberg-rest
    networks:
      iceberg_net:
    ports:
      - 8181:8181
    environment:
      - AWS_ACCESS_KEY_ID=<access-key>
      - AWS_SECRET_ACCESS_KEY=<access-secret>
      - AWS_REGION=us-east-1
      - CATALOG_WAREHOUSE=s3://mast/test/warehouse/
      - CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO
      - CATALOG_S3_ENDPOINT=https://s3.echo.stfc.ac.uk
networks:
  iceberg_net:
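
For context on the compose file above: the S3 settings that matter for the failing request are the ones the REST server reads, not the `s3.*` properties passed to `load_catalog` in the script. The `SdkClientException` in the traceback is raised by the Java AWS SDK inside the container. The sketch below shows how the `CATALOG_*` variables are understood to become catalog properties. The mapping rule (strip the prefix, `__` becomes `-`, `_` becomes `.`, lowercase) is an assumption about the `tabulario/iceberg-rest` image's convention, and `catalog_property_from_env` is a hypothetical helper, not part of any library.

```python
from typing import Optional


def catalog_property_from_env(name: str) -> Optional[str]:
    """Translate a CATALOG_* environment variable name into a property key.

    Assumed convention (not verified against the image source here):
    drop the CATALOG_ prefix, turn "__" into "-", turn each remaining
    "_" into ".", and lowercase the result.
    """
    prefix = "CATALOG_"
    if not name.startswith(prefix):
        return None  # e.g. AWS_REGION is read by the AWS SDK, not mapped
    key = name[len(prefix):]
    return key.replace("__", "-").replace("_", ".").lower()


# Variable names taken from the compose file in this issue.
compose_env = [
    "AWS_REGION",
    "CATALOG_WAREHOUSE",
    "CATALOG_IO__IMPL",
    "CATALOG_S3_ENDPOINT",
]

for name in compose_env:
    print(name, "->", catalog_property_from_env(name))
```

Under that assumption, `CATALOG_S3_ENDPOINT` ends up as the server-side `s3.endpoint` property, so the handshake in the error is between the container and `https://s3.echo.stfc.ac.uk`.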

Using other tools with the same S3 credentials, I can create and list files at that endpoint without any problem.
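
Because the error message is about a terminated handshake, it may help to test the TLS handshake against the endpoint on its own, separately from any S3 request. The following is a minimal sketch using only the Python standard library. `endpoint_host_port` and `tls_handshake_info` are hypothetical helper names. A successful handshake from the host's Python does not prove that the Java SDK inside the container can complete one too, so the result is only comparable if the check is run from the same network environment as the REST server.

```python
import socket
import ssl
from urllib.parse import urlsplit


def endpoint_host_port(endpoint: str) -> tuple[str, int]:
    """Split an endpoint URL into (host, port), defaulting the port by scheme."""
    parts = urlsplit(endpoint)
    if parts.hostname is None:
        raise ValueError(f"no host in endpoint: {endpoint!r}")
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def tls_handshake_info(endpoint: str, timeout: float = 10.0) -> dict:
    """Open a TCP connection and perform a TLS handshake with the endpoint.

    Raises ssl.SSLError or OSError if the connection or handshake fails.
    """
    host, port = endpoint_host_port(endpoint)
    context = ssl.create_default_context()  # verifies certificate and hostname
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return {"protocol": tls.version(), "cipher": tls.cipher()[0]}


# Example call (needs network access, so it is left commented out):
# print(tls_handshake_info("https://s3.echo.stfc.ac.uk"))
```

If the call returns a protocol and cipher, the endpoint accepts a handshake from that machine, which would point the investigation at the container's network path or TLS settings rather than at the endpoint itself.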

@samueljackson92 samueljackson92 added the bug Something isn't working label Jun 13, 2024
@nastra
Contributor

nastra commented Jun 14, 2024

It complains because it can't access http://localhost:8181/v1/namespaces/default/tables. Make sure that the REST server is accessible via that URI.

@samueljackson92
Author

Hi @nastra, thanks for your suggestion. I am not sure this is the issue. If I navigate locally to http://localhost:8181/v1/namespaces/default/tables, I see the following output:

{
    "identifiers": []
}

If I navigate to http://localhost:8181/v1/namespaces/default I can see this output:

{
  "namespace": [
    "default"
  ],
  "properties": {
    "location": "s3://mast/test/warehouse/default"
  }
}

So the REST server seems to be accessible?
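
One way to reconcile the working GET requests with the failing `create_table` call: the 500 response is itself an answer from the REST server, and its body names the exception that the server hit. As far as I understand, listing namespaces and tables can be served from the catalog's own metadata store, while creating a table makes the server write a metadata file to the warehouse location on S3. The sketch below pulls the exception type and message out of an error body. It assumes the body follows the Iceberg REST error shape (`{"error": {"message", "type", "code"}}`). The sample payload is reconstructed from the traceback in this issue rather than captured from the server, and `summarize_rest_error` is a hypothetical helper.

```python
import json


def summarize_rest_error(body: str) -> str:
    """Return 'type (code): message' from an Iceberg REST style error body."""
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        return "unparseable error body"
    error_type = error.get("type", "UnknownError")
    code = error.get("code", "?")
    message = error.get("message", "")
    return f"{error_type} ({code}): {message}"


# Hypothetical payload, reconstructed from the traceback above.
sample_body = json.dumps(
    {
        "error": {
            "message": "Unable to execute HTTP request: "
            "Remote host terminated the handshake",
            "type": "SdkClientException",
            "code": 500,
        }
    }
)

print(summarize_rest_error(sample_body))
```

An exception type such as `SdkClientException` comes from the AWS SDK for Java, which suggests the failure happens when the server talks to the S3 endpoint, not when the client talks to the REST server.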
