Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report low cardinality indices #31

Closed
macbre opened this issue Oct 28, 2017 · 2 comments
Closed

Report low cardinality indices #31

macbre opened this issue Oct 28, 2017 · 2 comments
Labels
Milestone

Comments

@macbre
Copy link
Owner

macbre commented Oct 28, 2017

Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables. The higher the cardinality, the greater the chance that MySQL uses the index when doing joins.

https://dev.mysql.com/doc/refman/5.7/en/show-index.html

Set the threshold of cardinality so that a single value covers at most 20% of table rows:

And finally for case with 20% rows with has_someting=0.

https://www.percona.com/blog/2007/08/28/do-you-always-need-index-on-where-column/

New report - low_cardinality_index

@macbre macbre added the linters label Nov 20, 2017
@macbre macbre added this to the v1.3 milestone Jul 15, 2018
@macbre
Copy link
Owner Author

macbre commented Jul 15, 2018

mysql@localhost[index_digest]>SHOW INDEX FROM 0020_big_table;
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table          | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| 0020_big_table |          0 | PRIMARY  |            1 | item_id     | A         |      100405 |     NULL | NULL   |      | BTREE      |         |               |
| 0020_big_table |          1 | text_idx |            1 | text        | A         |      100405 |     NULL | NULL   |      | BTREE      |         |               |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0,00 sec)

--

mysql@localhost[index_digest]>select TABLE_NAME, INDEX_NAME, COLUMN_NAME, CARDINALITY from INFORMATION_SCHEMA.STATISTICS where TABLE_NAME = '0020_big_table' AND TABLE_SCHEMA = 'index_digest';
+----------------+------------+-------------+-------------+
| TABLE_NAME     | INDEX_NAME | COLUMN_NAME | CARDINALITY |
+----------------+------------+-------------+-------------+
| 0020_big_table | PRIMARY    | item_id     |      100405 |
| 0020_big_table | text_idx   | text        |      100405 |
+----------------+------------+-------------+-------------+
2 rows in set (0,00 sec)

@macbre
Copy link
Owner Author

macbre commented Sep 18, 2018

mysql@localhost[index_digest]>select * from 0020_big_table limit 10;
+---------+-----+-------+-----+
| item_id | val | text  | num |
+---------+-----+-------+-----+
|       1 |   1 | 00001 |   1 |
|       2 |   1 | 00002 |   2 |
|       3 |   1 | 00003 |   0 |
|       4 |   1 | 00004 |   1 |
|       5 |   1 | 00005 |   2 |
|       6 |   2 | 00006 |   0 |
|       7 |   2 | 00007 |   1 |
|       8 |   2 | 00008 |   2 |
|       9 |   2 | 00009 |   0 |
|      10 |   2 | 0000a |   1 |
+---------+-----+-------+-----+
10 rows in set (0,01 sec)

--

mysql@localhost[index_digest]>select TABLE_NAME, INDEX_NAME, COLUMN_NAME, CARDINALITY from INFORMATION_SCHEMA.STATISTICS where TABLE_NAME = '0020_big_table' AND TABLE_SCHEMA = 'index_digest';
+----------------+------------+-------------+-------------+
| TABLE_NAME     | INDEX_NAME | COLUMN_NAME | CARDINALITY |
+----------------+------------+-------------+-------------+
| 0020_big_table | PRIMARY    | item_id     |          13 |
| 0020_big_table | text_idx   | text        |          13 |
| 0020_big_table | num_idx    | num         |           3 |
+----------------+------------+-------------+-------------+
3 rows in set (0,00 sec)

After ANALYZE TABLE query

mysql@localhost[index_digest]>ANALYZE TABLE  0020_big_table;
+-----------------------------+---------+----------+----------+
| Table                       | Op      | Msg_type | Msg_text |
+-----------------------------+---------+----------+----------+
| index_digest.0020_big_table | analyze | status   | OK       |
+-----------------------------+---------+----------+----------+
1 row in set (0,02 sec)

mysql@localhost[index_digest]>select TABLE_NAME, INDEX_NAME, COLUMN_NAME, CARDINALITY from INFORMATION_SCHEMA.STATISTICS where TABLE_NAME = '0020_big_table' AND TABLE_SCHEMA = 'index_digest';
+----------------+------------+-------------+-------------+
| TABLE_NAME     | INDEX_NAME | COLUMN_NAME | CARDINALITY |
+----------------+------------+-------------+-------------+
| 0020_big_table | PRIMARY    | item_id     |      100256 |
| 0020_big_table | text_idx   | text        |       97996 |
| 0020_big_table | num_idx    | num         |           2 |
+----------------+------------+-------------+-------------+
3 rows in set (0,00 sec)

--

mysql@localhost[index_digest]>select num as value, count(*) as cnt from 0020_big_table group by 1;
+-------+-------+
| value | cnt   |
+-------+-------+
|     0 | 33333 |
|     1 | 33334 |
|     2 | 33333 |
+-------+-------+
3 rows in set (0,03 sec)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant