[BUG] Memory leak when using add_coco_labels for instance segmentation with coco_id_field set #4407

h-fernand opened this issue May 22, 2024 · 3 comments
Labels
bug Bug fixes

Comments


h-fernand commented May 22, 2024

Describe the problem

When I add COCO-format instance segmentation predictions to my dataset with add_coco_labels, the program rapidly consumes RAM until it exhausts memory and crashes. This only happens when I set coco_id_field to coco_id so that my annotations are matched to the correct samples. If I omit coco_id_field and let the function run with its default behavior, the annotations get mismatched, but the program uses far less RAM and actually finishes. The same erroneous behavior occurs if I pass add_coco_labels a view containing only the test split instead of the whole dataset.

Code to reproduce issue

import json

import fiftyone as fo
import fiftyone.utils.coco as fouc

dataset_name = "dataset"
splits = ['train', 'val', 'test']
dataset_root = '/path/to/dataset/root'
annotations_dir = 'annotations'
annfile_template = 'instances_{split}.json'

predictions_file = '/path/to/predictions/file.json'

combined_dataset = fo.Dataset(name=dataset_name, persistent=True)

for split in splits:
    print(f"Loading: {split} dataset")

    annfile = f"{dataset_root}/{annotations_dir}/{annfile_template.format(split=split)}"
    data_path = f"{dataset_root}/{split}"
    split_dataset_name = f"ground_truth_{split}"

    split_dataset = fo.Dataset.from_dir(
        data_path=data_path,
        labels_path=annfile,
        dataset_type=fo.types.COCODetectionDataset,
        name=split_dataset_name,
        include_id=True,
        persistent=True
    )
    split_dataset.tag_samples(split)
    combined_dataset.merge_samples(split_dataset)

with open(predictions_file, 'r') as f:
    prediction_data = json.load(f)

predictions = prediction_data['annotations']
classes = prediction_data['categories']
classes = [x['name'] for x in classes]

fouc.add_coco_labels(
    combined_dataset,
    "predictions",
    predictions,
    classes,
    label_type="segmentations",
    coco_id_field="coco_id",
)

System information

  • OS Platform and Distribution: Linux Ubuntu 22.04
  • Python version: Python 3.10.12
  • FiftyOne version (fiftyone --version): v0.23.8
  • FiftyOne installed from (pip or source): pip

Willingness to contribute

The FiftyOne Community encourages bug fix contributions. Would you or another
member of your organization be willing to contribute a fix for this bug to the
FiftyOne codebase?

  • Yes. I can contribute a fix for this bug independently
  • Yes. I would be willing to contribute a fix for this bug with guidance
    from the FiftyOne community
  • No. I cannot contribute a bug fix at this time
@h-fernand h-fernand added the bug Bug fixes label May 22, 2024
h-fernand (Author)

As an update, it appears that this dramatic memory usage occurs whenever the function runs to completion, regardless of how it is called. The only reason it did not eat all of the RAM without coco_id_field set is that the annotations were created in the wrong order; once the annotation order is fixed, the memory leak occurs there too. I'm convinced this is a memory leak because my prediction annotation file is only 2GB, there are only 1000 images in the test set that I'm adding predictions to, and the program still ends up eating all of the RAM on a system with 256GB of RAM.
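Until a fix lands, one possible mitigation is to feed the predictions to add_coco_labels in batches, so peak memory scales with a batch rather than the full annotation list. This is an untested sketch: batches_by_image is a hypothetical helper (not part of FiftyOne), and it assumes that keeping all of an image's annotations in a single call leaves each sample's labels intact.

```python
from collections import defaultdict


def batches_by_image(annotations, images_per_batch):
    """Yield batches of COCO annotation dicts, keeping every annotation
    for a given image_id together in the same batch."""
    groups = defaultdict(list)
    for ann in annotations:
        groups[ann["image_id"]].append(ann)

    image_ids = list(groups)
    for start in range(0, len(image_ids), images_per_batch):
        batch = []
        for image_id in image_ids[start:start + images_per_batch]:
            batch.extend(groups[image_id])
        yield batch


# Hypothetical usage against the repro script above:
# for batch in batches_by_image(predictions, 100):
#     fouc.add_coco_labels(
#         combined_dataset, "predictions", batch, classes,
#         label_type="segmentations", coco_id_field="coco_id",
#     )
```

Since each image appears in exactly one batch, repeated add_coco_labels calls should not need to merge labels across batches for the same sample, but that behavior is worth verifying before relying on it.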

brimoor (Contributor) commented May 23, 2024

@h-fernand this sounds similar to the issue reported in #4293, which was resolved by #4354.

(FYI the above patch will be released in fiftyone==0.24.0 which is scheduled for next week)

h-fernand (Author)

That's great news. I'll try the patch once it's released, and hopefully it resolves the issue; I'll post an update in this thread once I've tested it.
