b-tree indexes clustered and nonclustered

Database indexes are like indexes in any book designed to decrease the time database engine will take to locate information. Indexes help when you are trying to find specific information e.g. WHERE clause is used in your SQL Statements. If you are retriving the whole table then indexes will not play any big role in your select statement. 

B-Tree Indexes
SQL Server, Oracle and MySQL database Servers support b-tree indexes, these indexes are designed to perform very well with OLTP databases with high unique values. Pages in B-tree indexes are known as index nodes. The top level node is called root node, the middle nodes are known as branch nodes / intermediate levels and the bottom level nodes are called leaf nodes.

  • The root node contains node pointers to branch nodes / intermediate nodes.
  • Branch node contains pointers to other branch nodes / intermediate nodes or leaf nodes.
  • The bottom nodes known as leaf node contains index items and horizontal pointers to other leaf nodes.

Heaps / Heap Table
SQL Server Books Online use this term to define a table with out clustered index, because data is all over the data pages with out any particular order. Many confuse heap with a table with out any kind of index but a heap can have non clustered index.

Clustered Index
The rows in data pages of clustered index are ordered on the value of the clustered index keys. The leaf nodes contains the data pages of the table, root and branch nodes contain actual index pages and index rows.

Each index row contains a key value and a pointer to branch level page or to data row in the leaf node of the index. New rows are inserted to fit the ordering sequence of index keys among existing rows. When you create a primary key in sql server, it will create clustered index by default. The structure of clustered index makes them faster then non clustered index.

Non Clustered Index
The non clustered indexes also use B-Tree index structure, but the data of table is not stored or sorted physically based on key columns, the leaf node is created using index pages unlike clustered indexes where leaf node contain data pages. Each index row contains a key value and a row locator pointing to the data row in the clustered index or heap having the key value.

Column Cardinality
The cardinality of column is refered to the uniqueness of data inside a particular column. The high cardinality means more uniquness and lower cardinality means more duplicate values in the column. Primary keys and columns with unique indexes have the higest cardinality in any table, data fields like gender, ethnicity have lowest cardinality and columns like usernames, order id are common examples for high cardinality.

Index Strategies
Now we know the basics of what indexes are and how they work. Here are few things to keep in mind when creating indexes.

  • Indexes should be created on columns with high cardinality, the basic rule for B-Tree indexes is minimum 10% before index will be useful. If table have 1000 rows then the column you want to index should have atleast 100 different values.
  • Index should be created on columns used in WHERE clause, Order By, Group By, Distinct etc.
  • All columns used in WHERE clause should be included in single index to get best results.
  • Do not create indexes unless you need them, too many indexes will slow INSERT, UPDATE and DELETE.
  • Create indexes on column(s) which are queried frequently.




Add Comments

Name: *
Email: *
URL:
Comments: *