Push monitors' PENDING-status not respecting retries #4785

Open
thielj opened this issue May 22, 2024 · 3 comments
Labels
area:monitor Everything related to monitors feature-request Request for new features to be added type:enhance-existing feature wants to enhance existing monitor

Comments

@thielj

thielj commented May 22, 2024

📑 I have found these related issues/pull requests

🏷️ Feature Request Type

API / automation options, Change to existing monitor

🔖 Feature description

Sending status=pending&msg=backup%20started would immediately set the monitor to 'retry mode' without waiting for the usual period to expire.
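
To make the request concrete, here is a minimal Python sketch of how such a push URL could be assembled. The host `kuma.example.com`, the token `abc123`, and the helper name `build_push_url` are hypothetical, and the `/api/push/<token>` path is an assumption based on the usual shape of a push-monitor URL. `status=pending` is the value this request is about, so its effect depends on how the server treats it.

```python
from urllib.parse import quote, urlencode


def build_push_url(base_url: str, token: str, status: str, msg: str) -> str:
    """Build a push URL of the assumed form <base>/api/push/<token>?status=...&msg=..."""
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    query = urlencode({"status": status, "msg": msg}, quote_via=quote)
    return f"{base_url.rstrip('/')}/api/push/{token}?{query}"


if __name__ == "__main__":
    # Hypothetical host and token, for illustration only.
    print(build_push_url("https://kuma.example.com", "abc123", "pending", "backup started"))
```

The resulting URL would then be requested with any HTTP client, for example `curl` or `urllib.request.urlopen`.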

βœ”οΈ Solution

There have been several mentions in the past.

Another example is monitoring processes that occasionally exit or need to be deliberately stopped but are expected to be restarted and become available within the retry period (think systemd unit).

Or a daily job where I generally expect UP once a day. When the job is being started I send PENDING, after which the monitor would go into retry mode and expect the UP to arrive within say 1h instead of waiting a full day before raising a notification (think remote job dying or stalling or blocking somehow).
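
The daily-job case could look roughly like the sketch below. It is only an illustration of the intended call order: `run_with_heartbeat` and the `send` callback are hypothetical names, and `send(status, msg)` stands in for whatever actually performs the push request.

```python
from typing import Callable


def run_with_heartbeat(job: Callable[[], None], send: Callable[[str, str], None]) -> bool:
    """Announce the start with PENDING, then report UP or DOWN once the job returns."""
    send("pending", "job started")
    try:
        job()
    except Exception as exc:
        # The job failed in a way we can observe, so report it right away.
        send("down", f"job failed: {exc}")
        return False
    send("up", "job finished")
    return True
```

If the job stalls or the host dies, neither `up` nor `down` is ever sent. That is the case where the monitor would need to count the retry period from the PENDING push instead of waiting out the full daily interval.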

❓ Alternatives

For the above-mentioned examples, I couldn't find alternatives short of implementing my own "pending logic" somehow. None of these would add a suitable event record to Uptime Kuma either.

πŸ“ Additional Context

No response

@thielj thielj added the feature-request Request for new features to be added label May 22, 2024
@CommanderStorm CommanderStorm changed the title push monitor status=pending Push monitors' PENDING-status not respecting retries May 22, 2024
@CommanderStorm CommanderStorm added the area:monitor Everything related to monitors label May 23, 2024
@CommanderStorm
Collaborator

CommanderStorm commented May 23, 2024

You are entirely correct that the way we currently communicate this, and how the retry logic for this monitor works, is super weird.

Note

For context, PENDING means that a monitor either

  • has not had a push,
  • has failed in the past and is currently retrying or
  • is in some other transitional step between UP and DOWN (such as docker containers starting up).

=> I don't know how setting a monitor to PENDING SHOULD behave. Our accounting around this is a bit messy and the behaviour is entirely undocumented. Likely this should skip one retry but behave as if DOWN for other purposes, but I am unsure.
=> That setting a monitor to DOWN does not trigger the correct retries is definitely a bug.

Tip

If you want to use the retry logic in the current system, you should instead not send a push => let the push monitor time out and go into the PENDING state on its own.

@CommanderStorm CommanderStorm added the type:enhance-existing feature wants to enhance existing monitor label May 23, 2024
@thielj
Author

thielj commented May 23, 2024

@CommanderStorm As I mentioned already, and others have mentioned before, letting something go into the pending state isn't really a solution if you have e.g. a job running once a day or are transitioning through a unit or container restart.

For me, PENDING means (in the context of push notifications at least) that something isn't fully up or completed yet, but is due shortly and long before the regular period expires. Most importantly, if it doesn't come fully UP within the retry period, I want it to be considered DOWN and a notification sent immediately.
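
The semantics described here can be written down as a small toy model. This is a sketch of the requested behaviour only, not of how Uptime Kuma currently works; the function name and the single `retry_window` parameter (the whole retry period, in seconds) are assumptions made for illustration.

```python
from typing import Optional


def status_after_pending(now: float, pending_at: float,
                         up_at: Optional[float], retry_window: float) -> str:
    """Status at time `now` after a PENDING push arrived at `pending_at`.

    `up_at` is when the following UP push arrived, or None if there was none.
    """
    if up_at is not None and pending_at <= up_at <= now:
        return "UP"        # the UP push has arrived by `now`
    if now - pending_at <= retry_window:
        return "PENDING"   # still inside the retry window, keep waiting
    return "DOWN"          # retry window elapsed without UP: notify


if __name__ == "__main__":
    # Daily job that pushed PENDING at t=0, with a one-hour retry window.
    print(status_after_pending(1800, 0, None, 3600))
    print(status_after_pending(3601, 0, None, 3600))
```

The point of the model is that the DOWN decision depends on the retry window measured from the PENDING push, not on the regular heartbeat interval.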

Everything else either delays notifications unnecessarily or creates too many false positives.

It's not that different from something actively monitored by U-K, except that retries and retry periods just pass without actively retrying.

@CommanderStorm
Collaborator

> letting something go into the pending state isn't really a solution

You are explicitly setting it to PENDING, so how can you not want the pending state??
I think something was lost in translation here. Frank is confused ^^

> It's not that different from something actively monitored by U-K, except that retries and retry periods just pass without actively retrying

I am going to repeat myself, as I am 5% unsure whether my last communication was clear (no offense intended, just trying not to miscommunicate ^^)

Tip

If you want to use the retry logic in the current system, you should NOT send a push in the interval.
This lets the push monitor time out and go into the PENDING state. The retry logic is triggered via this path.
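
For comparison, the timeout path described in this tip can be pictured with the simplified model below. It is a sketch under assumptions rather than the actual implementation: it supposes that each interval without a push counts as one retry, and the names `state_after_missed_intervals` and `max_retries` are hypothetical.

```python
def state_after_missed_intervals(missed: int, max_retries: int) -> str:
    """State of a push monitor after `missed` consecutive intervals without a push."""
    if missed <= 0:
        return "UP"        # a push arrived within the current interval
    if missed <= max_retries:
        return "PENDING"   # timed out, but still within the configured retries
    return "DOWN"          # retries exhausted


if __name__ == "__main__":
    for missed in range(5):
        print(missed, state_after_missed_intervals(missed, max_retries=3))
```

In this model the only way into PENDING is a missed interval, which is why an explicit PENDING push does not take the same route.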
