Change psql concurrency from autocommit to serializable. #1190
Open
acceso wants to merge 2 commits into SpriteLink:master from acceso:concurrentdeletes
Conversation
Autocommit is giving concurrency errors in PostgreSQL when operations
are sent in parallel. Using serializable transactions seems to fix it.
For example:
ERROR: deadlock detected
DETAIL: Process 176184 waits for ShareLock on transaction 15529683; blocked by process 191002.
Process 191002 waits for ShareLock on transaction 15529684; blocked by process 178678.
Process 178678 waits for ExclusiveLock on tuple (1386,16) of relation 43815 of database 16391; blocked by process 176184.
Process 176184: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.10.240/28'
Process 191002: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.11.0/28'
Process 178678: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.10.208/28'
HINT: See server log for query details.
CONTEXT: while locking tuple (1386,16) in relation "ip_net_plan"
SQL statement "UPDATE ip_net_plan SET children =
(SELECT COUNT(1)
FROM ip_net_plan
WHERE vrf_id = OLD.vrf_id
AND iprange(prefix) << iprange(old_parent.prefix)
AND indent = old_parent.indent+1)
WHERE id = old_parent.id"
PL/pgSQL function tf_ip_net_plan__prefix_iu_after() line 92 at SQL statement
STATEMENT: DELETE FROM ip_net_plan AS p WHERE vrf_id = 0 AND prefix = '10.0.10.240/28'
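PostgreSQL aborts one of the processes because the three DELETEs form a cycle in the wait-for graph: per the DETAIL lines above, 176184 waits on 191002, which waits on 178678, which waits on 176184. A toy illustration of the cycle detection the server performs (the function and its name are this illustration's own, not PostgreSQL code), using the PIDs from the log:

```python
def find_wait_cycle(waits_for):
    """Walk the wait-for graph and return a cycle if one exists.
    waits_for maps a blocked PID to the PID holding the lock it wants."""
    for start in waits_for:
        seen = [start]
        pid = start
        while pid in waits_for:
            pid = waits_for[pid]
            if pid == start:
                return seen           # closed the loop back to start: deadlock
            if pid in seen:
                break                 # a cycle, but not through start
            seen.append(pid)
    return None

# Wait-for edges taken from the DETAIL lines of the error above.
cycle = find_wait_cycle({176184: 191002, 191002: 178678, 178678: 176184})
print(cycle)  # [176184, 191002, 178678]
```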
houndci-bot reviewed May 29, 2018
nipap/nipap/backend.py (outdated)
```diff
  try:
      self._con_pg = psycopg2.connect(**db_args)
-     self._con_pg.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
+     self._con_pg.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE)
```
line too long (98 > 79 characters)
Author
I have more feedback about this change. It works when concurrency is low, but as concurrency increases the deadlocks cause timeouts. I think the actual problem is harder to fix: there are concurrency issues in the database code itself, and a proper fix for those is beyond my reach. So far we have mitigated the problem by adding retries in the frontend code, which hides the errors from our users.
Author
Here is a different patch that reverts the previous change and instead locks the tables for the deletes. We have run it for one week and have had no complaints so far. Performance is around 50% lower, but no more deadlocks for us.
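The patch itself is not reproduced here; what follows is a hedged sketch of the approach it describes, not the author's code. The helper name and the exact lock mode are assumptions (`SHARE ROW EXCLUSIVE` conflicts with itself, so concurrent deletes serialize behind the lock, which matches the reported behavior of no deadlocks at roughly half the throughput):

```python
def locked_delete_statements(table, where_clause):
    """Build a transaction that takes an explicit table lock before the
    DELETE, so the children-count trigger can never deadlock against a
    concurrent delete. Writers are fully serialized on the table."""
    return [
        "BEGIN",
        f"LOCK TABLE {table} IN SHARE ROW EXCLUSIVE MODE",
        f"DELETE FROM {table} WHERE {where_clause}",
        "COMMIT",
    ]

stmts = locked_delete_statements(
    "ip_net_plan", "vrf_id = 0 AND prefix = '10.0.10.240/28'")
```

Each statement would then be executed in order on one connection; the lock is released at COMMIT.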
This problem is happening pretty often in our deployment (several times per day). I can easily reproduce it with a simple (and partial) Python script; list_of_prefixes is a file with one prefix per line.
I guess this patch only hides the problem, but changing every query/transaction/function would not be as easy.
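The reproduction script itself is not included in the thread; a hedged sketch of the pattern it describes (`delete_prefix` is a hypothetical stand-in for whatever issues the DELETE through the NIPAP backend; only the file name `list_of_prefixes` comes from the comment above):

```python
from concurrent.futures import ThreadPoolExecutor

def delete_all(prefixes, delete_prefix, workers=8):
    """Fire one delete per prefix from a pool of worker threads --
    enough parallelism to trigger the trigger-level deadlocks."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(delete_prefix, prefixes))

# In the real reproduction each call would run something like
#   DELETE FROM ip_net_plan WHERE vrf_id = 0 AND prefix = %s
# and the prefixes would be read from the list_of_prefixes file, e.g.:
#   prefixes = open("list_of_prefixes").read().split()
```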