The question:
Can using duplicate conditions in PostgreSQL cause any problems?
For example, I have a purchases table with a customer id, and my framework in some cases duplicates a lookup condition, like the example below.
SELECT p.*
FROM purchases p
WHERE p.customer_id = 123
AND p.customer_id = 123;
OR
SELECT p.*
FROM purchases p
WHERE p.customer_id = 123
AND p.salesman = 456
AND p.departament = 789
AND p.customer_id = 123;
The Solutions:
Below are the methods you can try. The first solution is probably the best. Try others if the first one doesn’t work. Senior developers aren’t just copying/pasting – they read the methods carefully & apply them wisely to each case.
Method 1
Yes, that can cause bad query estimates. If PostgreSQL thinks that WHERE p.customer_id = 123
will return 10% of the rows from the table, it will think that WHERE p.customer_id = 123 AND p.customer_id = 123
will only return 1% of the table, because it does not realize that the conditions are actually the same and treats them as statistically independent.
Bad estimates can lead to bad plan choices and bad performance.
Perhaps that does not matter so much in your case: PostgreSQL puts a lower limit of 1 on each row count estimate, so if your de-duplicated query is already estimated to return very few rows, not much harm will be done.
Try using EXPLAIN
!
Method 2
According to my tests on 12.6, it depends on the predicate:
db=# explain select * from books where product_form = 'BB';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Index Scan using index_books_on_client_id_and_product_form on books (cost=0.29..7273.75 rows=3604 width=23686)
Index Cond: ((product_form)::text = 'BB'::text)
(2 rows)
db=# explain select * from books where product_form = 'BB' and product_form = 'BB' ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Index Scan using index_books_on_client_id_and_product_form on books (cost=0.29..7273.75 rows=3604 width=23686)
Index Cond: ((product_form)::text = 'BB'::text)
(2 rows)
db=# explain select * from books where product_form = 'BB' and product_form = 'BB' and product_form = 'BB' ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Index Scan using index_books_on_client_id_and_product_form on books (cost=0.29..7273.75 rows=3604 width=23686)
Index Cond: ((product_form)::text = 'BB'::text)
(2 rows)
So in that case, no problem. The row estimate and costs are constant.
Possibly this is because an index is involved, because an ilike predicate is negatively affected by this:
db=# explain select * from books where title ilike '%phys%';
QUERY PLAN
--------------------------------------------------------------
Seq Scan on books (cost=0.00..7613.31 rows=539 width=23686)
Filter: ((title)::text ~~* '%phys%'::text)
(2 rows)
db=# explain select * from books where title ilike '%phys%' and title ilike '%phys%';
QUERY PLAN
---------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..7676.98 rows=11 width=23686)
Filter: (((title)::text ~~* '%phys%'::text) AND ((title)::text ~~* '%phys%'::text))
(2 rows)
db=# explain select * from books where title ilike '%phys%' and title ilike '%phys%' and title ilike '%phys%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..7740.64 rows=1 width=23686)
Filter: (((title)::text ~~* '%phys%'::text) AND ((title)::text ~~* '%phys%'::text) AND ((title)::text ~~* '%phys%'::text))
(2 rows)
Note the slight increase in cost and the reductction in estimate row count.
I think this could well be version sensitive also, so I would suggest checking is on your version with the predicates and schema that you’re concerned with.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0