We all know that race conditions are something we should avoid: multiple threads should not access the same variable or memory at the same time without coordination.
But I have some tricky cases. Let’s say we have a user table.
user id | name | age
1 | Paul | 14
What if one thread tries to update the “name” column while another thread tries to update “age”?
Is this always safe?
Let’s also say there is a really bad query that takes 20 seconds to return a result.
If one thread runs this SQL query and another thread wants to run a different SQL query, does the other thread have to wait 20 seconds (until the previous query finishes)?
Most relational database management systems, such as MySQL (with its default InnoDB storage engine), are ACID compliant. The A and I in the ACID principles stand for Atomicity and Isolation. Atomicity means a transaction happens completely or doesn’t happen at all (is rolled back completely). Isolation means transactions run isolated from each other, and when they concurrently need access to the same resource, a locking mechanism based on the isolation level is used to appropriately handle interaction on that resource.
Because of those two properties of an ACID compliant database system, the usual programmatic race condition you’re probably referring to is not typically possible. Isolation levels and locking systems prevent the same resource from being mutated concurrently. What happens exactly depends on the case (two write transactions, a write and a read transaction, or two read transactions) and on the isolation level in use. Where a race condition could otherwise occur, either locking ensures the resource is mutated sequentially, or separate copies of the resource’s state are maintained, accessed, and mutated appropriately.
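As a rough illustration of the question's first scenario, here is a minimal sketch in which two threads update different columns of the same row. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (SQLite locks the whole database file for a write rather than a single row, so the locking is coarser, but the observable outcome is the same idea). The function name run_concurrent_updates and the new values are made up for the example.

```python
import os
import sqlite3
import tempfile
import threading


def run_concurrent_updates():
    """Two threads update different columns of the same row (hypothetical helper)."""
    # A file-backed database, so that each thread can open its own connection.
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(path)
        setup.execute(
            "CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
        )
        setup.execute("INSERT INTO user VALUES (1, 'Paul', 14)")
        setup.commit()
        setup.close()

        def update(sql, params):
            # timeout = how long to wait for the other writer's lock (seconds).
            conn = sqlite3.connect(path, timeout=10)
            try:
                conn.execute(sql, params)
                conn.commit()
            finally:
                conn.close()

        threads = [
            threading.Thread(
                target=update,
                args=("UPDATE user SET name = ? WHERE user_id = 1", ("Jon",)),
            ),
            threading.Thread(
                target=update,
                args=("UPDATE user SET age = ? WHERE user_id = 1", (15,)),
            ),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        check = sqlite3.connect(path)
        row = check.execute(
            "SELECT name, age FROM user WHERE user_id = 1"
        ).fetchone()
        check.close()
        return row
    finally:
        os.remove(path)


print(run_concurrent_updates())
```

Whichever thread gets the write lock first, the other waits and then applies its own change, so both the new name and the new age should be present at the end. Neither update overwrites the other.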
Now you can still create a logical race condition if you don’t design your software properly. For example, with your user table, if you had two separate queries that your application could execute like the following:
UPDATE user SET name = 'Jon' WHERE name = 'Paul' AND age = 20;
UPDATE user SET age = 21 WHERE name = 'Paul';
Because they are two separate queries that result in two separate transactions (which is what happens with autocommit enabled, the MySQL default), the transactions are isolated from each other and execute independently of one another.
So if your goal was to have the data routinely updated in the order of the queries I presented above, first renaming all Pauls who are age 20 to Jon, and then changing the remaining Pauls’ ages to 21, that’s all well and good as long as the two separate queries are always executed in that order and the first query never runs into an error.
But in an application where either one could be executed by the end user, at any time, in any order, you could run into a logical race condition: you first update the age of all Pauls to 21, and then the ones who were previously age 20 will never be renamed to Jon. Or even worse, if the first query is run first and fails, but the second query then runs successfully, you’ll run into the same logical race condition.
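To make the ordering problem concrete, here is a small sketch that runs the same two statements in both orders against identical starting data. It assumes Python's built-in sqlite3 module with an in-memory database instead of MySQL, and the two sample rows (both named Paul, aged 20 and 30) and the helper run_in_order are invented for the example.

```python
import sqlite3

RENAME = "UPDATE user SET name = 'Jon' WHERE name = 'Paul' AND age = 20"
SET_AGE = "UPDATE user SET age = 21 WHERE name = 'Paul'"


def run_in_order(statements):
    """Apply the statements one by one, each in its own transaction (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO user VALUES (?, ?, ?)",
        [(1, "Paul", 20), (2, "Paul", 30)],
    )
    conn.commit()
    for sql in statements:
        conn.execute(sql)
        conn.commit()  # commit after every statement: separate transactions
    rows = conn.execute("SELECT name, age FROM user ORDER BY user_id").fetchall()
    conn.close()
    return rows


intended = run_in_order([RENAME, SET_AGE])
swapped = run_in_order([SET_AGE, RENAME])
print(intended)  # [('Jon', 20), ('Paul', 21)]
print(swapped)   # [('Paul', 21), ('Paul', 21)]
```

In the swapped order the age update runs first, so no row matches name = 'Paul' AND age = 20 anymore and the rename silently affects nothing. Each statement on its own is perfectly safe; only the combination is order-dependent.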
The way you can code your queries to prevent such a logical race condition is by wrapping both inside an explicit transaction. For example:
START TRANSACTION;
UPDATE user SET name = 'Jon' WHERE name = 'Paul' AND age = 20;
UPDATE user SET age = 21 WHERE name = 'Paul';
COMMIT;
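The above now guarantees that the first query is always executed to completion before the second query is executed. The keyword is completion: if the first query fails, the second one should not be applied, and the entire explicit transaction is rolled back instead, which prevents the logical race condition. Keep in mind that in MySQL a failed statement does not usually undo the whole transaction by itself, so the calling code has to detect the error and issue ROLLBACK rather than COMMIT.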
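Here is a minimal sketch of how application code might drive such a transaction, including the rollback path. It assumes Python's built-in sqlite3 module rather than a MySQL driver, so it uses BEGIN where MySQL would use START TRANSACTION. The CHECK (age >= 0) constraint, the sample rows, and the function rename_then_set_age are hypothetical and exist only to force one of the statements to fail.

```python
import sqlite3


def rename_then_set_age(conn, new_age):
    """Run both updates as one unit; undo everything if either fails (hypothetical helper)."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE user SET name = 'Jon' WHERE name = 'Paul' AND age = 20")
        conn.execute("UPDATE user SET age = ? WHERE name = 'Paul'", (new_age,))
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # A failed statement leaves the transaction open; roll it back explicitly.
        conn.execute("ROLLBACK")
        return False


def snapshot(conn):
    return conn.execute("SELECT name, age FROM user ORDER BY user_id").fetchall()


# isolation_level=None disables the module's implicit transaction handling,
# so the BEGIN / COMMIT / ROLLBACK above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE user ("
    "user_id INTEGER PRIMARY KEY, name TEXT, age INTEGER CHECK (age >= 0))"
)
conn.execute("INSERT INTO user VALUES (1, 'Paul', 20)")
conn.execute("INSERT INTO user VALUES (2, 'Paul', 30)")

failed_ok = rename_then_set_age(conn, -1)   # second update violates the CHECK
after_failure = snapshot(conn)
success_ok = rename_then_set_age(conn, 21)  # both updates succeed
after_success = snapshot(conn)

print(failed_ok, after_failure)
print(success_ok, after_success)
```

In the failing run the rename has already been applied inside the transaction when the age update is rejected, and the rollback undoes it, so the table is left exactly as it started. That all-or-nothing behavior is the atomicity described above.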
Please see more information on ACID properties, isolation levels, and transactions, specific to MySQL in this StackOverflow answer.
Note: for lack of a better term, I use “resource” in this answer loosely to mean any object, entity, or piece of data in the context of a database system. This is because the above information is true both for mutations of data via DML statements (e.g. UPDATE, etc.) and for mutations of objects within a database themselves via DDL statements (e.g. ALTER TABLE, etc.).