The question:
I deleted my old question to be more precise about what I need.
We have a big query made up of lots of smaller queries. It runs fine until it reaches the “having max” clause:
having max(DateField) < getdate() - @2_years_ago
Is there any clause that can be faster than having max? It is causing an index scan that loops endlessly over the millions of rows in the table.
I tried using row_number but had no luck.
I also just noticed that this same query runs very well on other databases with the same structure. Only this one (which, funnily enough, has fewer rows) does not work.
All the databases are on the same server.
The Solutions:
Method 1
It seems your primary issue is that poor cardinality estimation is causing the compiler to reorder the joins. This seems to be happening more when you use OPTION (RECOMPILE), because the server is making different estimations.
You should be able to simplify your current query enough that the compiler will find it easier to optimize.
Without seeing your schema it’s hard to say exactly, but it seems you can flip around the EXISTS logic to NOT EXISTS.
- Your original says “ensure the maximum date in the group is less than my date”.
- What you actually want is that no row in the group is on or after that date, so you can say “ensure that there are no rows with a date greater than or equal to my date”.
- You now don’t need a HAVING to find the maximum date.
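To make the rewrite concrete, here is a minimal sketch of the two forms side by side. The table variables @Parent and @Child, their columns, and the @Cutoff variable are hypothetical names made up for illustration; they are not your schema.

```sql
-- Hypothetical data for illustration only; not the asker's schema.
DECLARE @Cutoff datetime = DATEADD(year, -2, GETDATE());

DECLARE @Parent TABLE (ParentId int PRIMARY KEY);
DECLARE @Child  TABLE (ParentId int NOT NULL, DateField datetime NOT NULL);

INSERT @Parent VALUES (1), (2);
INSERT @Child VALUES
    (1, DATEADD(year, -3, GETDATE())),  -- old row
    (1, DATEADD(year, -4, GETDATE())),  -- old row
    (2, DATEADD(year, -3, GETDATE())),  -- old row
    (2, DATEADD(day,  -1, GETDATE()));  -- recent row, disqualifies parent 2

-- Form 1: aggregate every group, then filter on the maximum.
SELECT c.ParentId
FROM @Child c
GROUP BY c.ParentId
HAVING MAX(c.DateField) < @Cutoff;

-- Form 2: look for any row on or after the cutoff; no aggregate needed.
SELECT p.ParentId
FROM @Parent p
WHERE EXISTS (SELECT 1
              FROM @Child c
              WHERE c.ParentId = p.ParentId)
  AND NOT EXISTS (SELECT 1
                  FROM @Child c
                  WHERE c.ParentId = p.ParentId
                    AND c.DateField >= @Cutoff);
```

Both queries return only ParentId 1 here. The two forms are not identical in every case: a group with no rows, or with only NULL dates, fails the HAVING test but passes a bare NOT EXISTS. That is why the sketch declares DateField as NOT NULL and keeps the extra EXISTS. Whether you need that guard depends on your data.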
Your existing query has numerous issues:
- a, b, c are silly aliases; consider using more meaningful ones.
- Use [] to quote column names, not ''. Don’t quote at all unless you have to.
- Why is NOLOCK splattered everywhere like confetti? What do you hope to achieve with it? It has very serious implications for data integrity. Consider using TABLOCK hints instead, or SNAPSHOT isolation.
- The left-join tables are being filtered, so you have an implicit inner join; you should just use inner joins instead.
- An EXISTS subquery does not need to select anything; in fact, its select list is ignored. You can just write EXISTS (SELECT 1 ...
- Why three subqueries? It looks like you should be able to do it in one.
- where b.[MerchantLogId] = a.[MerchantLogId] group by b.[MerchantLogId] makes no sense: you will always have exactly one group, so there is no need for group by.
- It seems you don’t need NTFM_MerchantLog in the subqueries, because that is joined by primary key.
- Don’t use arithmetic on dates, it doesn’t work well. Instead use DATEADD.
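As a small sketch of the last point, this compares integer arithmetic on a date with DATEADD. The variable names and the 730-day value are assumptions for illustration only.

```sql
-- Hypothetical variables for illustration only.
DECLARE @Today datetime = GETDATE();
DECLARE @DeletionCycleDays int = 730;

-- Arithmetic form: subtracts whole days, but only because @Today is datetime.
-- As far as I know, the same expression on a date or datetime2 value
-- raises an operand type clash.
SELECT @Today - @DeletionCycleDays AS CutoffByArithmetic;

-- DATEADD form: explicit about the unit and valid for every date/time type.
SELECT DATEADD(day, -@DeletionCycleDays, @Today) AS CutoffByDateAdd;
```

With a datetime variable both expressions give the same cutoff. The DATEADD version keeps working if the column or variable is later changed to date or datetime2, and it makes the unit (days) obvious to the next reader.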
It also seems you can combine the three subqueries into one by joining all the tables and using OR (for this you will still need some left joins).
Hopefully after making all these improvements you should see better performance:
SELECT
    'NTFM_MerchantLog' AS [Process Master],
    COUNT(DISTINCT ml.MerchantLogId) AS [Registers Count]
FROM
    dbo.NTFM_MerchantLog ml
    JOIN dbo.v_ntfm_merchantlogstatus mls ON mls.MerchantLogStatusId = ml.StatusId
WHERE
    -- only logs created before the deletion cutoff
    ml.CreateDate < DATEADD(day, -@lv_2484deletionCycle, @Today)
    AND mls.Status IN ('Closed', 'Expired', 'Reconciled')
    -- exclude logs that have any related row on or after the cutoff
    AND NOT EXISTS (SELECT 1
        FROM dbo.ntfm_merchantlogtransactions mlt
        JOIN dbo.accounttransaction at1 ON at1.accountTransactionId = mlt.accountTransactionId
        LEFT JOIN (
            dbo.ntfm_merchantlogtransactions mlt2
            JOIN dbo.ntfm_merchantlog ml2 ON ml2.MerchantLogId = mlt2.MerchantLogId
            LEFT JOIN (
                dbo.ntfm_merchantlogtransactions mlt3
                JOIN dbo.accounttransaction at2 ON at2.accountTransactionId = mlt3.accountTransactionId
            ) ON mlt3.MerchantLogId = ml2.MerchantLogId
        ) ON mlt2.accountTransactionId = at1.accountTransactionId
        WHERE
            mlt.MerchantLogId = ml.MerchantLogId
            AND (
                at1.postingdt >= DATEADD(day, -@lv_2484deletionCycle, @Today)
                OR ml2.createdate >= DATEADD(day, -@lv_2484deletionCycle, @Today)
                OR at2.postingdt >= DATEADD(day, -@lv_2484deletionCycle, @Today)
            )
    );
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.