I deleted my old question to be more precise on what I need.
We have a big query containing lots of smaller queries. It runs fine until it reaches the “having max” clause:
having max ( DateField ) < getdate() - @2_years_ago
Is there any clause that can be faster than having max? It is causing an index scan that loops nonstop over the millions of rows in the table.
I tried row_number but had no luck.
I also just noticed that this same query runs very well on other databases with the same structure.
Only on this one (which, funnily enough, has fewer rows) does it not work.
The DBs are on the same server.
It seems your primary issue is that poor cardinality estimation is causing the compiler to reorder the joins. This seems to happen more when you use OPTION (RECOMPILE), because the server is making different estimates.
You should be able to simplify your current query enough that the compiler has an easier time with it.
Without seeing your schema it’s hard to say exactly, but it seems you can flip the EXISTS logic around to NOT EXISTS:
- Your original says “ensure the maximum date in the group is less than my date”.
- What you actually want is that no row in the group is on or after the date, so you can say “ensure that there are no rows with a date on or after my date”.
- You now don’t need a HAVING to find the maximum date.
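As a rough illustration of that rewrite, here is a minimal sketch on made-up temp tables (#Parent, #Child, DateField and @Cutoff are placeholders, not your schema). On this sample data both queries should return only ParentId 1. They are only equivalent if every parent has at least one child row and DateField is never NULL; if either assumption fails, the results can differ.

```sql
-- Hypothetical tables for illustration only; not the asker's schema.
CREATE TABLE #Parent (ParentId int PRIMARY KEY);
CREATE TABLE #Child  (ChildId int PRIMARY KEY,
                      ParentId int NOT NULL,
                      DateField datetime NOT NULL);

INSERT INTO #Parent (ParentId) VALUES (1), (2);
INSERT INTO #Child (ChildId, ParentId, DateField) VALUES
    (1, 1, DATEADD(year, -5, GETDATE())),   -- parent 1: only old rows
    (2, 1, DATEADD(year, -3, GETDATE())),
    (3, 2, DATEADD(year, -4, GETDATE())),   -- parent 2: one recent row
    (4, 2, DATEADD(month, -1, GETDATE()));

DECLARE @Cutoff datetime = DATEADD(year, -2, GETDATE());

-- Original shape: aggregate all child rows, then filter on the maximum.
SELECT p.ParentId
FROM #Parent p
WHERE EXISTS (SELECT 1
    FROM #Child c
    WHERE c.ParentId = p.ParentId
    HAVING MAX(c.DateField) < @Cutoff);

-- Flipped shape: keep a parent only if no child row is on or after the cutoff.
SELECT p.ParentId
FROM #Parent p
WHERE NOT EXISTS (SELECT 1
    FROM #Child c
    WHERE c.ParentId = p.ParentId
      AND c.DateField >= @Cutoff);

DROP TABLE #Child;
DROP TABLE #Parent;
```

The second form has no aggregate, so the server can stop at the first matching child row instead of reading the whole group. Whether that actually helps in your case depends on your indexes and estimates.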
Your existing query has numerous issues:
- a, b, c are silly aliases; consider using more meaningful ones.
- Use [] to quote column names, not ''. Don’t quote at all unless you have to.
- Why is NOLOCK splattered everywhere like confetti? What do you hope to achieve with it? It has very serious implications for data integrity. Consider using TABLOCK hints instead.
- The left-join tables are being filtered, so you have an implicit inner join; you should just use inner joins instead.
- An EXISTS subquery does not need to select anything; in fact the select list is ignored. You can just do EXISTS (SELECT 1 ....
- Why three subqueries? It looks like you should be able to do it in one.
- where b.[MerchantLogId] = a.[MerchantLogId] group by b.[MerchantLogId] makes no sense: you will always have exactly one group, so there is no need for GROUP BY.
- It seems you don’t need NTFM_MerchantLog in the subqueries, because that is joined by primary key.
- Don’t use arithmetic on dates, it doesn’t work well. Instead use DATEADD.
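To illustrate the point about date arithmetic, here is a small sketch that computes a cutoff with DATEADD rather than subtracting a number from a date. @DeletionCycleDays is a made-up variable standing in for whatever holds your retention period.

```sql
-- @DeletionCycleDays is a hypothetical stand-in for the retention period, in days.
DECLARE @DeletionCycleDays int = 730;
DECLARE @Today date = CAST(GETDATE() AS date);

-- Arithmetic such as @Today - @DeletionCycleDays only works for
-- datetime/smalldatetime; with date or datetime2 it fails with an
-- operand type clash. DATEADD works for all of the date and time
-- types and states the unit explicitly.
DECLARE @Cutoff date = DATEADD(day, -@DeletionCycleDays, @Today);

SELECT @Today AS Today, @Cutoff AS Cutoff;
```

Computing the cutoff once into a variable also keeps the comparison in the WHERE clause down to a plain column-versus-value test.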
It also seems you can combine the three subqueries into one, by joining all the tables and using OR (for this you will still need some left-joins).
Hopefully after making all these improvements you should see better performance:
SELECT
    'NTFM_MerchantLog' AS [Process Master],
    COUNT(DISTINCT ml.MerchantLogId) AS [Registers Count]
FROM dbo.NTFM_MerchantLog ml
JOIN dbo.v_ntfm_merchantlogstatus mls
    ON mls.MerchantLogStatusId = ml.StatusId
WHERE ml.CreateDate < @Today - @lv_2484deletionCycle
  AND mls.Status IN ('Closed', 'Expired', 'Reconciled')
  AND NOT EXISTS (SELECT 1
      FROM dbo.ntfm_merchantlogtransactions mlt
      JOIN dbo.accounttransaction at1
          ON at1.accountTransactionId = mlt.accountTransactionId
      LEFT JOIN (
          dbo.ntfm_merchantlogtransactions mlt2
          JOIN dbo.ntfm_merchantlog ml2
              ON ml2.MerchantLogId = mlt2.MerchantLogId
          LEFT JOIN (
              dbo.ntfm_merchantlogtransactions mlt3
              JOIN dbo.accounttransaction at2
                  ON at2.accountTransactionId = mlt3.accountTransactionId
          ) ON mlt3.MerchantLogId = ml2.MerchantLogId
      ) ON mlt2.accountTransactionId = at1.accountTransactionId
      WHERE mlt.MerchantLogId = ml.MerchantLogId
        AND (
              at1.postingdt >= DATEADD(day, -@lv_2484deletionCycle, @Today)
           OR ml2.createdate >= DATEADD(day, -@lv_2484deletionCycle, @Today)
           OR at2.postingdt >= DATEADD(day, -@lv_2484deletionCycle, @Today)
        )
  );