Index Case Summary-Indexes

On 24/01/2023 by Robert Corvino

In my experience, these six cases are the main reasons I find that indexes are not being used. It usually boils down to a case of “They cannot be used—using them would return incorrect results,” or “They should not be used—if they were used, performance would be terrible.”

Myth: Space Is Never Reused in an Index

This is a myth that I would like to dispel once and for all: space is reused in an index. The myth goes like this: you have a table, T, in which there is a column, X. At some point, you put the value X=5 in the table. Later you delete it. The myth is that the space used by X=5 will not be reused unless you put X=5 back into the index later. The myth states that once an index slot is used, it will be there forever and can be reused only by the same value. A corollary to this is the myth that free space is never returned to the index structure, and a block will never be reused. Again, this is simply not true.

The first part of the myth is trivial to disprove. All we need to do is to create a table like this:
$ sqlplus eoda/foo@PDB1
SQL> create table t ( x int, constraint t_pk primary key(x) ); Table created.

So, according to the myth, if I delete from T where X=2, that space will never be reused unless I reinsert the number 2. Currently, this index is using one leaf block of space. If the index key entries are never reused upon deletion, and I keep inserting and deleting and never reuse a value, this index should grow like crazy. Let’s see:

SQL> analyze index t_pk validate structure; Index analyzed.

This shows the space in the index was reused. As with most myths, however, there is a nugget of truth in there. The truth is that the space used by that initial number 2 would remain on that index block forever. The index will not coalesce itself. This means if I load a table with values 1 to 500,000 and then delete every other row (all of the even numbers), there will be 250,000 holes in the index on that column. Only if I reinsert data that will fit onto a block where there is a hole will the space be reused. Oracle will make no attempt to shrink or compact the index. This can be done via an ALTER INDEX REBUILD or COALESCE command. On the other hand, if I load a table with values 1 to 500,000 and then delete from the table every row where the value was 250,000 or less, I would find the blocks that were cleaned out of the index were put back onto the FREELIST for the index. This space can be totally reused.

Myth: Most Discriminating Elements Should Be First

This seems like common sense. If you are going to create an index on the columns C1 and C2 in a table T with 100,000 rows, and you find C1 has 100,000 distinct values and C2 has 25,000 distinct values, you would want to create the index on T(C1,C2). This means that C1 should be first, which is the commonsense approach. The fact is, when comparing vectors of data (consider C1, C2 to be a vector), it doesn’t matter which you put first. Consider the following example. We will create a table based on ALL_OBJECTS and an index on the OWNER, OBJECT_TYPE, and OBJECT_NAME columns (least discriminating to most discriminating) and also on OBJECT_NAME, OBJECT_TYPE, and
OWNER:
$ sqlplus eoda/foo@PDB1
SQL> create table t as select * from all_objects; Table created.
SQL> create index t_idx_1 on t(owner,object_type,object_name); Index created.

They use nearly the exact same amount of space—there are no major differences there. However, the first index is a lot more compressible if we use index key compression, as evidenced by the OPT_CMPR_PCTSAVE value. There is an argument for arranging the columns in the index in order from the least discriminating to the mostdiscriminating. Now let’s see how they perform, to determine if either index is generally more efficient than the other. To test this, we’ll use a PL/SQL block with hinted queries (so as to use one index or the other):
SQL> alter session set sql_trace=true; Session altered.
SQL> declarecnt int;begin
for x in ( select /+FULL(t)/ owner, object_type, object_name from t ) loop

They processed the same exact number of rows and very similar numbers of blocks (minor variations coming from accidental ordering of rows in the table and consequential optimizations made by Oracle), used equivalent amounts of CPU time,and ran in about the same elapsed time (run this same test again and the CPU and ELAPSED numbers will be a little different, but on average they will be the same). There are no inherent efficiencies to be gained by placing the indexed columns in order of how discriminating they are, and as stated previously, with index key compression there is an argument for putting the least selective first. If you run the preceding example with COMPRESS 2 on the indexes, you’ll find that the first index will perform about two-thirds the I/O of the second, given the nature of the query in this case.

However, the fact is that the decision to put column C1 before C2 must be driven by how the index is used. If you have many queries like the following, it makes more sense to place the index on T(C2,C1):
select * from t where c1 = 😡 and c2 = :y; select * from t where c2 = :y;

This single index could be used by either of the queries. Additionally, using index key compression (which we looked at with regard to IOTs and will examine further later), we can build a smaller index if C2 is first. This is because each value of C2 repeats itself, on average, four times in the index. If C1 and C2 are both, on average, 10 bytes in length, the index entries for this index would nominally be 2,000,000 bytes (100,000 × 20). Using index key compression on (C2, C1), we could shrink this index to 1,250,000 (100,000 × 12.5), since three out of four repetitions of C2 could be suppressed.

In Oracle 5 (yes, version 5), there was an argument for placing the most selective columns first in an index. It had to do with the way version 5 implemented index compression (not the same as index key compression). This feature was removed in version 6 with the addition of row-level locking. Since then, it is not true that putting the most discriminating entries first in the index will make the index smaller or more efficient. It seems like it will, but it will not. With index key compression, there is a compelling argument to go the other way since it can make the index smaller. However, it should be driven by how you use the index, as previously stated.

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Index Case Summary-Indexes

Leave a Reply Cancel reply

Archives

Calendar

Categories