"413 Request Entity Too Large" when uploading files to ClearML #1291

Open
Bunoviske opened this issue Jun 24, 2024 · 4 comments
Labels
bug Something isn't working

Comments


Bunoviske commented Jun 24, 2024

Describe the bug

Hello,

I always use ClearML to store the latest checkpoint from my model training, and last week I noticed that my experiments were not saving the models correctly. I checked the console logs of my experiments and found this every time I upload an artifact:

2024-06-24 11:46:37,590 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object /cho/Brumas_v2-256-allNormal-allPerClass-2-aiCrowd-food201-allWithDrink-fromPretraining.91735e850ed245449e328149647e0a17/artifacts/latest.ckpt/last.ckpt (413): <!doctype html>
<html lang=en>
<title>413 Request Entity Too Large</title>
<h1>Request Entity Too Large</h1>
<p>The data value transmitted exceeds the capacity limit.</p>
2024-06-24 11:46:37,591 - clearml.metrics - WARNING - Failed uploading to https://files.clear.ml (Failed uploading object /cho/Brumas_v2-256-allNormal-allPerClass-2-aiCrowd-food201-allWithDrink-fromPretraining.91735e850ed245449e328149647e0a17/artifacts/latest.ckpt/last.ckpt (413): <!doctype html>
<html lang=en>
<title>413 Request Entity Too Large</title>
<h1>Request Entity Too Large</h1>
<p>The data value transmitted exceeds the capacity limit.</p>
)
2024-06-24 11:46:37,594 - clearml.metrics - ERROR - Not uploading 1/1 events because the data upload failed
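
To check whether other experiments are affected, the console log can be scanned for the same failure. The following is a minimal sketch that relies only on the "Failed uploading object <path> (<status>)" wording visible in the log above. The function name `find_failed_uploads` is hypothetical and is not part of the ClearML SDK.

```python
import re

# Matches the "Failed uploading object <path> (<status>)" fragment
# as it appears in the clearml.storage / clearml.metrics log lines above.
FAILED_UPLOAD = re.compile(
    r"Failed uploading object (?P<path>\S+) \((?P<status>\d{3})\)"
)


def find_failed_uploads(log_text: str) -> list[tuple[str, int]]:
    """Return unique (object path, HTTP status) pairs for failed uploads."""
    seen: list[tuple[str, int]] = []
    for match in FAILED_UPLOAD.finditer(log_text):
        item = (match.group("path"), int(match.group("status")))
        # The same failure is logged by more than one logger; report it once.
        if item not in seen:
            seen.append(item)
    return seen
```

Filtering the result for status 413 separates size-limit rejections from other upload errors.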

I did not change anything in my code, and it looks like large .ckpt files are not being uploaded. My .ckpt files are around 400 MB, but I also tested files of about 180 MB and they failed too.

Were there any changes in the ClearML server deployment? I am using the cloud free tier option and uploading the files to https://files.clear.ml. (I still have 50 GB free for storage.)

To reproduce

from clearml import Task

Task.add_requirements("requirements.txt")
task = Task.init(project_name="cho", task_name="test",
                 output_uri=None) # set output_uri=True to log all lightning models in clearml
task.upload_artifact(name="adjustedIds.json", artifact_object="adjustedIds.json") # it works (small file)
task.upload_artifact(name="model.ckpt", artifact_object="model.ckpt") # it fails (big file)

Expected behaviour

I expected to be able to download the models from the ClearML UI, but since the upload fails, I receive 404 NOT FOUND.

Environment

  • Server type = app.clear.ml
  • ClearML SDK Version = 1.16.2
  • Python Version = 3.10
  • OS (Windows \ Linux \ Macos) = Linux

Bunoviske added the "bug" (Something isn't working) label Jun 24, 2024

boosterdre commented Jun 24, 2024

I can reproduce this on the SaaS deployment in the PRO tier.
I have had this bug since 19/06 or 20/06.
Nothing changed in my code.

@jkhenning
Member

Hi @boosterdre, @Bunoviske,

This is really strange since these sizes are not close to the limit. We're trying to reproduce now.

@jkhenning
Member

Hi @boosterdre, @Bunoviske,

We've found the issue (a missing factor in the limit check); it has been fixed and redeployed.
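
The thread does not show the server code or say which factor was missing, so the following is only an illustrative sketch of how this class of bug can produce the reported symptoms. It is not the actual ClearML fix. The 10 GiB limit and both function names are hypothetical. Dropping one factor of 1024 from a size limit makes 180 MB and 400 MB files fail while small files still pass.

```python
KIB = 1024

# Hypothetical intended limit of 10 GiB; the real limit is not stated in this thread.
LIMIT_GIB = 10


def is_too_large_buggy(size_bytes: int) -> bool:
    # One factor of 1024 is missing, so the effective limit is 10 MiB, not 10 GiB.
    return size_bytes > LIMIT_GIB * KIB * KIB


def is_too_large_fixed(size_bytes: int) -> bool:
    # All three factors are present: GiB -> MiB -> KiB -> bytes.
    return size_bytes > LIMIT_GIB * KIB * KIB * KIB


if __name__ == "__main__":
    for label, size in [("small JSON", 10 * KIB),
                        ("180 MB checkpoint", 180 * KIB * KIB),
                        ("400 MB checkpoint", 400 * KIB * KIB)]:
        print(label, "buggy:", is_too_large_buggy(size),
              "fixed:", is_too_large_fixed(size))
```

Under these assumed numbers, the buggy check rejects both checkpoints with a 413-style result and the corrected check accepts them, which matches sizes that are "not close to the limit".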

@boosterdre

It's working, thanks @jkhenning
