
Skip polling providers still processing #5181

Merged 16 commits into main on Jun 28, 2024
Conversation

@lcouzens (Contributor) commented Jun 21, 2024

Jira Ticket

COST-5180

Description

This change adds an additional filter to polling batch collection so that providers that have not yet completed processing (XL customers that take longer than 24 hours to process) are not added to the batch.

Testing

  1. Checkout branch
  2. Restart Koku
  3. Make sure you set your POLLING_TIMER appropriately (i.e. NOT 1 second)
  4. Load AWS data (everything should process correctly)
  5. Hit download - nothing new should attempt to be queued
  6. Run the following SQL (change the uuid to your provider's)
  7. Run # update api_provider set polling_timestamp = '2024-06-23 00:00:00+00' where uuid = '9dfc1842-d654-42d8-8e70-efc49b33fa5f';
  8. Hit download again and observe the provider is polled
  9. Run # update api_provider set data_updated_timestamp = '2024-06-17 00:00:00+00' where uuid = '9dfc1842-d654-42d8-8e70-efc49b33fa5f';
  10. Hit download and observe the provider is not added to polling (since data_updated_timestamp is older than the polling time and the polling time is within 24 hours, this simulates a large provider still processing)
  11. Run # update api_provider set polling_timestamp = '2024-06-23 00:00:00+00' where uuid = '9dfc1842-d654-42d8-8e70-efc49b33fa5f';
  12. Hit download (this time we should collect the provider, since the updated timestamp is outside our 7-day window and the polling time is outside 24 hours)
  13. Run # update api_provider set polling_timestamp = null where uuid = '9dfc1842-d654-42d8-8e70-efc49b33fa5f';
  14. Download again and see polling for the provider (this simulates a new provider that has no polling time set yet)
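The timestamp manipulations above all exercise one decision rule. A rough, pure-Python sketch of that rule (the name `should_poll` and the exact 24-hour/7-day constants are my paraphrase of the PR description, not Koku's actual code):

```python
from datetime import timedelta

POLLING_WINDOW = timedelta(hours=24)  # assumed normal re-poll interval
PROCESS_WAIT = timedelta(days=7)      # assumed "something went wrong" deadline

def should_poll(polling_ts, data_updated_ts, now):
    """Decide whether a provider joins the polling batch.

    polling_ts      -- when the provider was last queued (None for new providers)
    data_updated_ts -- when processing last completed for the provider
    """
    if polling_ts is None:
        return True  # step 14: new provider, never polled
    if now - polling_ts < POLLING_WINDOW:
        return False  # polled recently, nothing to do yet
    # Data older than the last poll means that poll's processing never finished.
    still_processing = data_updated_ts is not None and data_updated_ts < polling_ts
    # Escape hatch: if data hasn't updated in over a week, poll anyway (step 12).
    stuck = data_updated_ts is not None and now - data_updated_ts > PROCESS_WAIT
    return not still_processing or stuck
```

Each testing step maps onto one branch: step 10 hits the `still_processing` skip, step 12 the `stuck` escape hatch, step 14 the `polling_ts is None` case.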

Release Notes

* [COST-5180](https://issues.redhat.com/browse/COST-5180) Only attempt to fetch new reports if current processing is complete

Notes:

I left a TODO in here around data_updated_timestamp being older and never getting updated. For example, if our download processing fails for any reason (maybe we kill tasks or something), it's conceivable that data_updated_timestamp won't be updated, meaning we would never poll that provider again without manual intervention.
CORRECTION: I added the following to handle this:

| Q(data_updated_timestamp__lte=process_wait_delta)

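A small demonstration of why that extra clause matters, using plain datetimes (the 7-day value for `process_wait_delta` is an assumption based on the testing steps, not Koku's actual setting):

```python
from datetime import datetime, timedelta, timezone

now = datetime(2024, 6, 28, tzinfo=timezone.utc)
process_wait_delta = now - timedelta(days=7)  # assumed 7-day deadline

# A provider whose processing died: polled long ago, data never updated since.
polling_timestamp = now - timedelta(days=30)
data_updated_timestamp = now - timedelta(days=30, hours=1)

# Without the clause: data older than the last poll reads as "still processing",
# so the provider would be skipped on every batch, forever.
skipped = data_updated_timestamp < polling_timestamp

# With the clause: data is older than the deadline, so poll it anyway.
polled_anyway = data_updated_timestamp <= process_wait_delta

print(skipped, polled_anyway)  # True True
```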
I have some interesting metrics on this if people are interested. We currently have 62 providers in that state; 28 of those are currently processing, though. The rest (34) we just keep queueing up, but they have not completed in years. Probably in bad states.

@lcouzens lcouzens added the smoke-tests pr_check will build the image and run minimal required smokes label Jun 21, 2024
@lcouzens lcouzens requested review from a team as code owners June 21, 2024 10:59

codecov bot commented Jun 21, 2024

Codecov Report

Attention: Patch coverage is 96.15385% with 1 line in your changes missing coverage. Please review.

Project coverage is 94.2%. Comparing base (1810fa1) to head (e6e0bda).

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #5181   +/-   ##
=====================================
  Coverage   94.2%   94.2%           
=====================================
  Files        376     376           
  Lines      31233   31257   +24     
  Branches    3727    3735    +8     
=====================================
+ Hits       29412   29435   +23     
  Misses      1161    1161           
- Partials     660     661    +1     

@lcouzens (Contributor, Author) commented:

/retest

@myersCody (Contributor) left a comment

So, we have unleash short circuits inside of the Orchestrator:

def get_polling_batch(self):
    if self.provider_uuid:
        providers = Provider.objects.filter(uuid=self.provider_uuid)
    else:
        filters = {}
        if self.provider_type:
            filters["type"] = self.provider_type
        providers = Provider.polling_objects.get_polling_batch(settings.POLLING_BATCH_SIZE, filters=filters)
        LOG.info(f"providers: {len(providers)}")

    batch = []
    for provider in providers:
        provider.polling_timestamp = self.dh.now
        provider.save(update_fields=["polling_timestamp"])
        schema_name = provider.account.get("schema_name")
        if is_cloud_source_processing_disabled(schema_name):
            LOG.info(log_json("get_polling_batch", msg="processing disabled for schema", schema=schema_name))
            continue
        if is_source_disabled(provider.uuid):
            LOG.info(
                log_json(
                    "get_polling_batch",
                    msg="processing disabled for source",
                    schema=schema_name,
                    provider_uuid=provider.uuid,
                )
            )
            continue
        batch.append(provider)
    return batch

We may want to consider updating the polling timestamp after the is_cloud_source_processing_disabled and is_source_disabled checks. Otherwise, skipped providers will look like they are "still processing" until we hit our "something went wrong" deadline, which may be confusing behavior.
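The reordering being suggested here, sketched with stand-in stubs (the stub checks and simplified `Provider` class are hypothetical, not Koku's actual models or Unleash flags):

```python
# Hypothetical stand-ins for the Unleash short-circuit checks.
def is_cloud_source_processing_disabled(schema_name):
    return schema_name in {"disabled_schema"}

def is_source_disabled(uuid):
    return False

class Provider:
    """Minimal stand-in for the Django model."""
    def __init__(self, uuid, schema_name):
        self.uuid = uuid
        self.schema_name = schema_name
        self.polling_timestamp = None

def get_polling_batch(providers, now):
    batch = []
    for provider in providers:
        # Run the short-circuit checks FIRST, so a skipped provider's
        # polling_timestamp is never touched and it can't be mistaken
        # for "still processing".
        if is_cloud_source_processing_disabled(provider.schema_name):
            continue
        if is_source_disabled(provider.uuid):
            continue
        provider.polling_timestamp = now  # only stamp providers actually queued
        batch.append(provider)
    return batch
```

The trade-off discussed below is that a disabled provider's timestamp then never advances, so it would be re-collected into every batch.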

Update:
Chatted with Luke about this, and it could be problematic to move it down, because we would keep collecting the disabled providers plus whatever else into every batch. That sounds more problematic than running a download task.

@lcouzens lcouzens enabled auto-merge (squash) June 25, 2024 19:24
myersCody
myersCody previously approved these changes Jun 26, 2024
@lcouzens lcouzens disabled auto-merge June 26, 2024 13:08
myersCody
myersCody previously approved these changes Jun 27, 2024
@lcouzens (Contributor, Author) commented:

/retest

@lcouzens lcouzens enabled auto-merge (squash) June 27, 2024 16:24
@lcouzens lcouzens merged commit 4baa511 into main Jun 28, 2024
11 checks passed
@lcouzens lcouzens deleted the COST-tweak-polling-batch branch June 28, 2024 13:01