Please confirm the following
I understand that AWX is open source software provided for free and that I might not receive a timely response.
I am NOT reporting a (potential) security vulnerability. (These should be emailed to [email protected] instead.)
Bug Summary
When a job with slices is created and a new jobevent partition is needed, several slices can try to create the same partition at the same time. The fastest slice succeeds; the others can go into the "error" state because the partition already exists.
The issue is timing-dependent, and so far I have only been able to trigger it twice.
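To illustrate why the slices collide, here is a minimal sketch. It assumes the partition name is built from the table name plus the date and hour of the job's creation time, which is what the name `main_jobevent_20240611_13` in the traceback further down suggests. The helper `partition_name` and the timestamps are hypothetical and only serve the illustration.

```python
from datetime import datetime, timezone


def partition_name(table: str, created: datetime) -> str:
    # Assumed naming scheme: <table>_<YYYYMMDD>_<HH>, inferred from the
    # partition name that appears in the traceback.
    return f"{table}_{created.strftime('%Y%m%d_%H')}"


# Three slices of the same job, created a few microseconds apart
# (made-up timestamps).
slice_created = [
    datetime(2024, 6, 11, 13, 0, 0, us, tzinfo=timezone.utc)
    for us in (100, 250, 900)
]

# Every slice resolves to the same partition name, so every slice
# attempts the same CREATE statement.
names = {partition_name("main_jobevent", ts) for ts in slice_created}
print(names)
```

Under that assumption, all slices started within the same hour target one partition, and only the first creation attempt can succeed.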
AWX version
24.4.0
Select the relevant components
UI
UI (tech preview)
API
Docs
Collection
CLI
Other
Installation method
kubernetes
Modifications
no
Ansible version
No response
Operating system
No response
Web browser
No response
Steps to reproduce
Create a job with multiple slices. Make sure the latest jobevent table partition is big enough that a new one has to be created.
Expected results
All jobs are created and started.
Actual results
One or more jobs fail with a DuplicateObject / ProgrammingError error.
Additional information
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
    raise ex.with_traceback(None)
psycopg.errors.DuplicateObject: type "main_jobevent_20240611_13" already exists

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/jobs.py", line 499, in run
    self.pre_run_hook(self.instance, private_data_dir)
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/jobs.py", line 1073, in pre_run_hook
    super(RunJob, self).pre_run_hook(job, private_data_dir)
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/jobs.py", line 427, in pre_run_hook
    create_partition(instance.event_class._meta.db_table, start=instance.created)
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/utils/common.py", line 1154, in create_partition
    cursor.execute(
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
    raise ex.with_traceback(None)
django.db.utils.ProgrammingError: type "main_jobevent_20240611_13" already exists
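One possible way to make the creation step tolerant of this race is to treat "already exists" as success, since another slice has then done the work. The following is only a sketch of that idea and is not taken from the AWX code base: it uses an in-memory `sqlite3` database as a stand-in for PostgreSQL, a plain table as a stand-in for a partition, and a hypothetical helper `create_partition_tolerant`. With Django on PostgreSQL, the error to handle would be the `ProgrammingError` shown in the traceback above.

```python
import sqlite3


def create_partition_tolerant(conn: sqlite3.Connection, name: str) -> bool:
    """Create table `name`.

    Returns True if this call created it, False if it already existed.
    Any other database error is re-raised.
    """
    try:
        conn.execute(f'CREATE TABLE "{name}" (id INTEGER)')
        return True
    except sqlite3.OperationalError as exc:
        # Another caller created the table first; that is fine for us.
        if "already exists" in str(exc):
            return False
        raise


conn = sqlite3.connect(":memory:")

# Two slices asking for the same partition, one after the other.
first = create_partition_tolerant(conn, "main_jobevent_20240611_13")
second = create_partition_tolerant(conn, "main_jobevent_20240611_13")
print(first, second)
```

The second call does not fail, so a slice that loses the race could continue instead of ending in the "error" state. Whether matching on the error text, a lock, or some other approach is appropriate for AWX is an open question.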