[FR] Config per registered model #12469
Labels
area/model-registry
Model registry, model registry APIs, and the fluent client calls for model registry
enhancement
New feature or request
Willingness to contribute
Yes. I would be willing to contribute this feature with guidance from the MLflow community.
Proposal Summary
Add an optional `config` parameter when registering a model, so that a JSON-serializable configuration can be attached to each model and retrieved through the API when querying the model registry. This would support more sophisticated configurations than model-specific tags are currently suited for.
Motivation
Right now we can set tags on registered models, which satisfies most cases for most people. However, for the project I am working on, my models rely on a rather large JSON configuration that is used to dynamically generate the required inputs for the model. I tried storing this JSON as a string in a tag but ran into a character limit.
My current implementation stores this JSON as an artifact, but artifacts only exist per experiment run. This makes retrieving the config at inference time very clunky. I need to search the model registry for the model I need, read a tag I stored on the model that records which run is associated with it, use that run ID to find the artifact directory containing the JSON, deserialize the JSON, and only then use it to run inference with my model. At scale, with multiple models, this becomes a clear performance bottleneck that could easily be avoided.
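For illustration, the retrieval chain described above might look roughly like the sketch below. The function name `load_config_via_run`, the artifact name `config.json`, and passing the client and download function in as arguments are assumptions made for this example. It reads the run ID from the model version's `run_id` field; if the run ID is kept in a tag instead, it would be read from the tags at that step.

```python
import json


def load_config_via_run(client, download_artifacts, model_name, version,
                        artifact_path="config.json"):
    """Fetch a model's JSON config by going through its originating run.

    `client` is expected to behave like mlflow.MlflowClient, and
    `download_artifacts` like mlflow.artifacts.download_artifacts.
    `artifact_path` is a hypothetical name for the logged config file.
    """
    # Step 1: look up the model version in the registry.
    model_version = client.get_model_version(model_name, version)

    # Step 2: resolve the run that produced it.
    run_id = model_version.run_id

    # Step 3: download the config artifact logged under that run.
    local_path = download_artifacts(run_id=run_id, artifact_path=artifact_path)

    # Step 4: deserialize the JSON so it can drive inference inputs.
    with open(local_path) as fh:
        return json.load(fh)


# Possible usage against a real tracking server (not executed here):
#   import mlflow
#   from mlflow import MlflowClient
#   config = load_config_via_run(
#       MlflowClient(), mlflow.artifacts.download_artifacts, "my-model", "1"
#   )
```

Every call to this helper costs one registry lookup plus one artifact download, which is the overhead a `config` field on the registered model would remove.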
The easiest way to solve this problem is to let us store a dictionary on the registered model directly. I can't see why we can't have a `config` parameter in the API calls that register a model or log a registered model. That way, when we query the model registry via the API, this dictionary can just come back under the `config` key of the API response. If allowing dictionaries with arbitrary value types might break something, we can just require that the dictionary be JSON serializable. We don't need to do anything fancy like indexing the keys of this `config` dictionary to make them searchable via a filter query; tags already solve that problem. This is only meant to make advanced configurations available for each registered model so that they can be used for inference more efficiently if needed.
I don't think this would be a hard feature to implement, but I don't know the MLflow codebase well enough to know where to start.
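As a rough idea of the JSON-serializability check mentioned above, a helper along these lines could run before the config is stored. `validate_config` is a hypothetical name and is not part of MLflow; this is only a sketch of one way to do the check with the standard library.

```python
import json


def validate_config(config):
    """Return a JSON round-tripped copy of `config`, or raise TypeError.

    Hypothetical helper: rejects anything that is not a dict or that
    json.dumps cannot serialize (sets, custom objects, circular references).
    """
    if not isinstance(config, dict):
        raise TypeError("config must be a dict")
    try:
        # Round-tripping normalizes the value to exactly what would be stored,
        # e.g. tuples come back as lists.
        return json.loads(json.dumps(config))
    except (TypeError, ValueError) as exc:
        raise TypeError(f"config is not JSON serializable: {exc}") from exc
```

Returning the round-tripped copy rather than the original object means the caller sees exactly what a later registry query would hand back.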
Details
No response
What component(s) does this affect?
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry