The question:
I am trying to select unique record “pairs”, where a pair is two rows that have equal and opposite numeric values (e.g. 10 and -10). Once a record has been paired, it can’t be used in another pair. Consider this table:
CREATE TABLE vals (
    id text,
    scalar numeric
);
INSERT INTO vals (id, scalar)
VALUES
    ('A', 10),
    ('B', -10),
    ('C', 10),
    ('D', -10),
    ('E', 10);
I have tried multiple variations of DISTINCT, DISTINCT ON (), and UNIQUE, but the closest I’ve come so far is the following:
WITH all_matching AS (
    SELECT
        v.id id1,
        v2.id id2
    FROM vals v
    JOIN vals v2 ON
        v.scalar = (v2.scalar * -1)
    WHERE v.scalar > 0
), unique_left_id AS (
    SELECT DISTINCT ON (id1) *
    FROM all_matching
)
SELECT * FROM unique_left_id;
Which outputs:
id1 | id2
----+----
A | B
C | B
E | B
The obvious problem is that the id B is being used to pair off against all of the other transactions, when I only want to use it once. If I select distinct on id2 after the final transaction, I would be left with a single pair instead of 2 pairs.
The Solutions:
Method 1
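WITH cte AS (
    -- Number the rows within each scalar value, ordered by id.
    SELECT id,
           scalar,
           ROW_NUMBER() OVER (PARTITION BY scalar ORDER BY id) AS rn
    FROM vals
)
SELECT t1.id AS id1,
       t2.id AS id2
FROM cte t1
JOIN cte t2 ON t1.scalar = -t2.scalar   -- equal and opposite values
           AND t1.scalar > t2.scalar    -- keep the positive row on the left
           AND t1.rn = t2.rn;           -- n-th positive row meets n-th negative row only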
https://dbfiddle.uk/?rdbms=postgres_12&fiddle=2561e29b69b587e8c827169b361bde07
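To see why matching on rn stops a row from being reused, it can help to look at what the CTE produces for the sample data on its own. The query below is only an illustrative sketch: it repeats the CTE body from Method 1 and adds an ORDER BY for readability, nothing else.

```sql
-- Intermediate result of the CTE from Method 1 for the sample table vals.
-- Each distinct scalar value gets its own 1, 2, 3, ... numbering by id.
SELECT id,
       scalar,
       ROW_NUMBER() OVER (PARTITION BY scalar ORDER BY id) AS rn
FROM vals
ORDER BY scalar DESC, rn;
```

For the sample rows this numbers A, C, E as 1, 2, 3 within scalar 10, and B, D as 1, 2 within scalar -10. Joining on equal rn therefore pairs A with B and C with D, while E (rn 3) has no partner and is left out.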
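PS. This code does not pair rows where scalar is 0 or NULL: a zero can never satisfy t1.scalar > t2.scalar against another zero, and any comparison with NULL is not true.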
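If rows with scalar 0 should also be paired, one possible approach is to pair them with each other in a separate query. This is a sketch under an assumption the original answer does not make, namely that 0 counts as its own opposite; the CTE name zeros and the output aliases id1 and id2 are made up for illustration. Rows with a NULL scalar are still ignored.

```sql
-- Assumption: two rows with scalar = 0 form a valid pair.
WITH zeros AS (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY id) AS rn
    FROM vals
    WHERE scalar = 0
)
SELECT z1.id AS id1,
       z2.id AS id2
FROM zeros z1
JOIN zeros z2 ON z2.rn = z1.rn + 1   -- partner is the next zero row by id
WHERE z1.rn % 2 = 1;                 -- start a pair only at odd positions: (1,2), (3,4), ...
```

With an odd number of zero rows the last one stays unpaired, mirroring how E is left over in Method 1. The result can be combined with Method 1 using UNION ALL if a single result set is wanted.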
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.