‘EXISTS’ is better than ‘IN’ Rishi Agarwal, July 11, 2008 While working on my last project, I was facing a query which was taking 30 second to execute. The number of tables involved were 10 and total records were around 20k. It was a search functionality for a web-app project. There were lot of sub-queries involved in the first draft, all of them using ‘IN’. After doing some research ( within MySQL Community ) , I just replaced all ‘IN’ by ‘EXISTS’ and voila ! The query took 11 seconds to execute. For example, you could change this query : SELECT * FROM invitees i WHERE i.event_id IN ( SELECT e.id FROM event e WHERE e.city LIKE = ‘ABC’ ) TO SELECT * FROM invitees i WHERE EXISTS ( SELECT 1 FROM event e WHERE e.city LIKE = ‘ABC’ AND e.id = i.event_id ). Try it and let me know if you find the same difference in performance as I have. There were many more modifications to the original query to bring it further down to 1 sec. But I will talk about that later. For more information on EXISTS and IN Sub Queries see Reference below. References : http://dev.mysql.com/doc/refman/5.0/en/optimizing-subqueries.html http://dev.mysql.com/doc/refman/5.0/en/in-subquery-optimization.html Related MySql performancequery optimization
1) You are using LIKE with no wildcard, and so it really should be = ‘ABC’ 2) You would get sub-second performance with a standard join: SELECT i.* FROM invitees i INNER JOIN event e ON e.id = i.event_id AND e.city = ‘ABC’; Reply
Hi there, in my opinion, these SELECT * FROM invitees i WHERE i.event_id IN ( SELECT id FROM event e WHERE city LIKE ‘ABC’ ) Should all be rewritten to joins SELECT i.* FROM invitees i INNER JOIN event e ON i.event_id = e.id WHERE e.city LIKE ‘ABC’ Also, the LIKE does not contain a wildcard, we can just as well write: WHERE e.city = ‘ABC’ instead of WHERE e.city LIKE ‘ABC’ Roland Bouman http://rpbouman.blogspot.com/ Reply
Oh , I am sorry. Should have put LIKE ‘%ABC%’ or = ‘ABC’ for the example. Second, I agree with your example of using a standard join, it will be better. The 2 examples given were to show the difference between IN and EXISTS , so didn’t focus on JOIN. But yeah, there are ways to replace ‘IN’ Sub Queries with JOIN for better performance. Reply
— SELECT * FROM invitees i WHERE i.event_id IN ( SELECT e.id FROM event e WHERE e.city LIKE = ‘ABC’ ) — I believe this is slow because the inner query is evaluated, and then the outer query is evaluated – you do not reference the inner and outer queries directly. If you rewrote the query to be: SELECT * FROM invitees i WHERE i.event_id IN ( SELECT e.id FROM event e WHERE e.city LIKE = ‘ABC’ AND i.event_id = e.id ) You would find the query to be much faster. Reply
Hi, I have this SQL query but instead of using IN i like to use the EXISTS operator, now how can I do that? My brain is hanging on this and need help. Thanks SELECT C.CUSTOMER_NUM, C.CUSTOMER_NAME FROM CUSTOMER C, ORDERS O WHERE C.CUSTOMER_NUM = O.CUSTOMER_NUM AND O.ORDER_DATE IN (‘2007-10-21’); Reply
Hi Ahmed, You can try this query : SELECT C.CUSTOMER_NUM, C.CUSTOMER_NAME FROM CUSTOMER C WHERE EXISTS ( SELECT 1 FROM ORDERS O WHERE O.ORDER_DATE = ‘2007-10-21’ AND C.CUSTOMER_NUM = O.CUSTOMER_NUM ) Reply
that did work many many thanks for that..you guys deserve an oscar award for being so helpfull…cheers thanks Reply
Hi…. can some buddy help me…to use EXISTS …how i can use EXISTS in following query insted of IN ………………………. ……………………. ………..WHERE application_opus.company_id = ’50’ AND application_opus.application_id IN ( SELECT event_log_link_opus.event_log_link_reference_id FROM event_log_link_opus LEFT JOIN event_log_link_type_opus ON event_log_link_type_opus.event_log_link_type_id = event_log_link_opus.event_log_link_type_id LEFT JOIN event_log_opus ON event_log_opus.event_log_id = event_log_link_opus.event_log_id LEFT JOIN event_log_type_opus ON event_log_type_opus.event_log_type_id = event_log_opus.event_log_type_id WHERE event_log_link_type_opus.name_short = ‘application_id’ AND event_log_type_opus.event_type IN (‘dteen_fraud_rule_130’) ) Reply
After looking at what Rishi has to offer, you this might work: WHERE application_opus.company_id = ’50′ AND EXISTS ( SELECT 1 FROM event_log_link_opus LEFT JOIN event_log_link_type_opus ON event_log_link_type_opus.event_log_link_type_id = event_log_link_opus.event_log_link_type_id LEFT JOIN event_log_opus ON event_log_opus.event_log_id = event_log_link_opus.event_log_id LEFT JOIN event_log_type_opus ON event_log_type_opus.event_log_type_id = event_log_opus.event_log_type_id WHERE event_log_link_type_opus.name_short = ‘application_id’ AND event_log_type_opus.event_type = ‘dteen_fraud_rule_130′ AND application_opus.application_id = event_log_link_opus.event_log_link_reference_id ); I, btw, know it’s too late to answer this but I saw it today and thought of giving it a try!!! 🙂 Reply