-
Notifications
You must be signed in to change notification settings - Fork 691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: offload shards to S3 upload/download paths #5119
base: main
Are you sure you want to change the base?
Conversation
8a701d7
to
e7ddc29
Compare
a3ed0e4
to
d7dd6f5
Compare
d7dd6f5
to
f47f779
Compare
defer idx.backupMutex.RUnlock() | ||
|
||
eg := enterrors.NewErrorGroupWrapper(m.logger) | ||
eg.SetLimit(_NUMCPU * 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should not go over _NUMCPU value, wdyt? I have seen that in most of the places we are constraining goroutines to just _NUMCPU
value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, then we have a mix because in the same file we have it eg.SetLimit(_NUMCPU * 2)
, i don't have a hard opinion on it. for sure can adjust it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have gave it a thought and i think it would be beneficial to speed up download/upload 🤔 , given both are I/O-bound ops than CPU heavy ops
d54fcf0
to
785fc26
Compare
3903e92
to
178eacf
Compare
271ffeb
to
d710625
Compare
c67d276
to
4d2198e
Compare
4d2198e
to
a918af8
Compare
|
@@ -58,6 +58,7 @@ func startMinIO(ctx context.Context, networkName string) (*DockerContainer, erro | |||
} | |||
envSettings := make(map[string]string) | |||
envSettings["BACKUP_S3_ENDPOINT"] = fmt.Sprintf("%s:%s", MinIO, port.Port()) | |||
envSettings["S3_ENDPOINT_URL"] = fmt.Sprintf("http://%s:%s", MinIO, port.Port()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the difference between OFFLOAD_S3_ENDPOINT
and S3_ENDPOINT_URL
? can we just use OFFLOAD_S3_ENDPOINT
? this brings a little bit of ambiguity
What's being changed:
this PR implements the tenant offloading feature to cloud
UpdateTenants
endpoint to be able to return the state after updating it byread after write
to handle cases like fromFROZEN
toFREEZING
.Limitations to be addressed later on
State machine for tenant status
HOT
HOT
(this includes pulling from cloud storage if it wasFROZEN
before)HOT
, it’s a no-opCOLD
FROZEN
, the shard is warmed toCOLD
, i.e. pulled from cloud storage <-- newHOT
, the shard is deactivated toCOLD
<--- already existsCOLD
, it’s a no-opFROZEN
FROZEN
, i.e. offloading it to cloud storage <-- newFROZEN
, it’s a no-opUNFREEZING
FROZEN
to target status. going fromCOLD->HOT
is fast enough to skip the status checkFREEZING
missing in current implementation
Links
offload module
Issue
Usage
if we wanna run it locally we need to run minio 1st and then run weaviate cluster
Offload module config env vars source code
OFFLOAD_S3_ENDPOINT
default will ready AWS config from environmentdefault: will read AWS config
S3_ENDPOINT_URL
default will read AWS config
e.g. incase we want to assign local minio, otherwise don’t do anything and it will be read automatically when AWS config is exported
OFFLOAD_S3_BUCKET
: the name of the bucket used to store the offloaded tenantsdefault
weaviate-offload
OFFLOAD_S3_CONCURRENCY
: pass concurrency to s5cmddefault
25
OFFLOAD_TIMEOUT
: this is the context timeout for all the module requests (upload , download, create bucket)default
10 sec
Review checklist