The question:
I’ve been experimenting with EXPLAIN
and noticed that the row count was big for our query so I tried adding an index to the VARCHAR column. First I added FULLTEXT
but it ended up just making the query longer so I checked it with EXPLAIN
and it was still showing all the rows. but when I tried to do EXPLAIN
on the same table but using another indexed column which is an INT the row count was only 1.
I also tried changing from FULLTEXT
to just a normal INDEX
but it still showed the same result.
Will VARCHAR always result in a high row count even if you INDEX it? Is there any way to improve its performance?
Also, the column is not unique.
Query Time Before FULLTEXT:
'query' => 'select `id` from `pdf_packages` where `private_reference_number` = ? and `pdf_packages`.`deleted_at` is null limit 1',
'bindings' =>
array (
0 => '558376067',
),
'time' => 64.42,
Query Time After FULLTEXT:
'query' => 'select `id` from `pdf_packages` where `private_reference_number` = ? and `pdf_packages`.`deleted_at` is null limit 1',
'bindings' =>
array (
0 => '292314725',
),
'time' => 129.89,
EXPLAIN WITH FULLTEXT:
EXPLAIN WITH INDEX:
EXPLAIN WITH INDEX INT COLUMN:
I’ve already removed the INDEX for the column “private_reference_number” so you won’t see it below.
SHOW CREATE TABLE:
CREATE TABLE `pdf_packages` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`organiser_id` int(11) NOT NULL,
`pro_event_id` int(11) NOT NULL,
`pro_order_id` int(11) NOT NULL,
`pro_order_item_id` int(11) NOT NULL,
`package_id` int(11) NOT NULL,
`package_item_item_id` int(11) NOT NULL,
`private_reference_number` varchar(50) COLLATE utf8_unicode_ci NOT NULL,
`reference_index` int(11) NOT NULL,
`pro_attendee_id` int(11) DEFAULT NULL,
`member_card_id` int(11) DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
`deleted_at` timestamp NULL DEFAULT NULL,
`is_printed` int(11) NOT NULL DEFAULT '0',
`has_arrived` tinyint(4) NOT NULL DEFAULT '0',
`arrival_time` datetime DEFAULT NULL,
`seat_details` longtext COLLATE utf8_unicode_ci,
`svg_seat_details` longtext COLLATE utf8_unicode_ci,
`is_shared` tinyint(1) NOT NULL DEFAULT '0',
`share_from_id` int(11) DEFAULT NULL,
`owner_id` int(11) DEFAULT NULL,
`gates` text COLLATE utf8_unicode_ci,
`pro_order_edit_history_id` int(11) DEFAULT NULL,
`hard_ticket_sent` tinyint(4) NOT NULL DEFAULT '0',
`hard_ticket_sent_on` datetime DEFAULT NULL,
`sent_by` int(11) DEFAULT NULL,
`pdf_sent` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`from_access` tinyint(1) NOT NULL DEFAULT '0',
`pro_promo_code_id` int(11) DEFAULT NULL,
`printed_at` datetime DEFAULT NULL,
`bundle_index` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `pdf_packages_organiser_id_pro_event_id_pro_order_id_index` (`organiser_id`,`pro_event_id`,`pro_order_id`),
KEY `pdf_packages_pro_order_item_id_package_id_index` (`pro_order_item_id`,`package_id`),
KEY `pdf_packages_package_item_item_id_pro_attendee_id_index` (`package_item_item_id`,`pro_attendee_id`),
KEY `pdf_packages_member_card_id_share_from_id_index` (`member_card_id`,`share_from_id`),
KEY `pdf_packages_pro_order_edit_history_id_index` (`pro_order_edit_history_id`),
KEY `pdf_packages_pro_attendee_id_index` (`pro_attendee_id`),
KEY `pdf_packages_pro_order_item_id_index` (`pro_order_item_id`)
) ENGINE=InnoDB AUTO_INCREMENT=184980 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
The Solutions:
Below are the methods you can try. The first solution is probably the best. Try others if the first one doesn’t work. Senior developers aren’t just copying/pasting – they read the methods carefully & apply them wisely to each case.
Method 1
Given your query, I would add this index:
ALTER TABLE pdf_packages ADD INDEX (private_reference_number, deleted_at);
That matches the two conditions you have in your WHERE clause, so it should do the best job at reducing the number of examined rows. An index can help searching a VARCHAR column as well as an INT column.
However, if your query searches for a value that happens to occur on most of the rows in the table, the optimizer may reason that using the index isn’t going to help, so it just does a table-scan.
By analogy, this is the reason why common words aren’t included in the index at the back of a book. Why bother to list the word “the” in the index, followed by a list of every page number? It’s quicker to just read the book cover to cover.
In practice, MySQL’s optimizer skips the index if the value you search for occurs on about 20% of the rows or more. This is not a documented threshold, just something I’ve noticed.
If you think the optimizer has made a bad choice, you can override it by using an index hint. But most of the time, it makes the right choice.
A FULLTEXT index will be used only if you use the MATCH() AGAINST()
predicate. I don’t see that in your query, so you don’t need a FULLTEXT index.
Method 2
(Focusing on the Title)
LIKE
is useful in testing VARCHARs
.
x LIKE 'abc' -- this turns in to x = 'abc'
x LIKE 'abc%' -- will _probably_ use INDEX(x)
x LIKE '%abc' -- cannot use INDEX(x)
x = 123 -- cannot use INDEX(x) because not comparing to string
The general rule about the last example: “Leading wildcard prevents use of an index.”
FULLTEXT
works only with MATCH(...) AGAINST(...)
(And I agree with Bill’s answer plus magicianiam’s Comment.)
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0