martes, 23 de junio de 2009

Temporary Tables in SQL

Introduction

In the world of databases is very common the use of temporary tables. While everyone knows that these kind of structures are slowing the operation of our consultations, programmers can not avoid using them because they often facilitate the resolution of problems. Store data for later use, save partial results, to analyze large amounts of rows. There are many cases where we may need these temporary tables, but you have to use them properly!.

First advice:Don't use temporary tables

The first advice that we must continue to work with temporary tables is quite simple: do not use them. And why not? Well there are plenty of reasons that we see throughout this text, but begin to see that translated using a temporary table in SQL Server:
  • The temporary tables are created in tempdb, and when you create there are several blocks on the database such as tables and sysobjects sysindex. The locks on tempdb affect the entire server.
  • When you create need to be made in writing to the disk access (not always if the tables are small)
  • When entering data into the temporary tables are again on disk activity, and we know that the disk access is usually the "bottleneck" of our system • When reading data from the temporary table have to resort back to disk. Moreover, these data read from the table often combined with others.
  • Clearing the table is to buy new locks on the tempdb database and perform operations on disk.
  • Using temporary tables inside a stored procedure we lose the advantage of having compiled the implementation plan for the stored procedure and recompiled to occur more often. The same happens when the SQL Server attempts to reuse the execution plan of a parameterized query. If the query is a temporary table is difficult to reuse implementation plan. Viewed these problems I think that there is no need to repeat our first Council.

¿What can we do?

Instead of temporary tables can improve our code is not necessary, we can use subqueries (subquery usually use a drastically improved performance to use temporary tables), using permanent tables, use tables derived. We must always remember that any alternative is good if you avoid using temporary tables (¡cursors excluded of course!) Anyway if you ever have to use it is better to know well, so let's do this.

Types of temporary tables

Temporary tables are of two types in the scope table. We have local temporary tables and global temporary tables.

  • # local: local temporary tables have a # as first character in its name and can only be used in the connection in which the user creates. When the connection drops the temporary table disappears.
  • # # The overall global temporary tables begin with # # and are visible by any user connected to SQL Server. And one more thing, these tables disappear when no user is referring to, not the user is disconnected box that I think.
  • Temp There really is a kind of temporary tables. If we create a table within the database is a temp table in real terms we can use it like any other table in any database, and is temporary in that it disappears as soon turn off the server.

Operation of temporary tables

Create a temporary table is like creating a normal table.

Let's see an example:

CREATE TABLE # TablaTemporal (Field1 int, Field2 varchar (50))

And are used on a regular basis.

INSERT INTO # TalbaTemporal VALUES (1, 'first field')

INSERT INTO # TalbaTemporal VALUES (2, 'Second field')

SELECT * FROM # TablaTemporal

As we see almost no limitations when working with temporary tables (one limitation is that they can not have restrictions Foreign KEY).

Optimizing the use of temporary tables

The use that we can give this type of tables is infinite, but always bearing in mind a few guidelines that we must continue to slow our work as little as possible. For example it's a good habit to create temporary tables with DDL commands as in the previous example (CREATE TABLE) and then fill them with INSERT statements or INSERT INTO. It is true that we can achieve it in one step with SELECT INTO, but this is worse because the locks that are acquired on system objects that last longer.

As is always better to ask the fields we want and do not put the typical SELECT * FROM ... In the same way is highly recommended that we want to qualify the rows and rows do not have that we will not use temporary tables.

Another good habit is to remove us our tables. Yes it is true that at the end of the connection local temporary tables disappear, but if we have a set of sentences over and create a temporary table at the beginning and we are not going to use the remaining time does not make sense to have that table there taking up space and memory.

If the temporary tables are a great option to increase performance is to create an index that will help us to recover data from that table (for smaller tables is an unnecessary waste because they never used the index).

Place the tempdb database on a single disk dedicated to this function will increase overall system performance if it is an intensive use of temporary tables.

And last but not least, do not think temporary tables within transactions within nor triggers. Believe that the existence of your database if you do suffer a lot.

Variables Table With

SQL Server 2000 we can declare variables of type table. Such variables have a number of advantages over temporary tables so that we can always choose to use table variables versus temporary tables. Use temporary variables are easy:

DECLARE @ VariableTabla TABLE (Field1 int, Field2 char (50))

INSERT INTO @ VariableTabla VALUES (1, 'first field')

INSERT INTO @ VariableTabla VALUES (2, 'Second field')

SELECT * FROM @ VariableTabla

Find advantages in using table variables:

  • Have a clearly defined scope. The stored procedure, function or batch in which they are declared.
  • The variables of type table has less recompile the stored procedures in those that use temporary tables.
  • Table variables do not need locks or as many resources as temporary tables.

But they also have disadvantages:

  • We can not change the table definition once declared
  • We can not use non-clustered indexes
  • You can not use INSERT INTO or SELECT INTO
  • We can not use functions in the restrictions

If we balance the pros and cons we see that in general it is better to use variables of type table that the temporary tables. Just in case you have large amounts of data into a temporary table and if we will use repeatedly is the preferred choice of temporary tables because they can define indexes.

An example

This is all very well, but as always it is best to see an example in which we can see that it is worth the effort to reprogram our code to not use temporary tables.

Let's see a simple example from the truth but to illustrate what we explain in this text. We will use the Northwind database.

This database requests are sent through three transportation companies: Speedy Express (1), United Package (2) and Federal Shipping (3). Federal Shipping Company, we offer complete all shipments that do this through the United Package price $ 10.

We decided that this is worth saving and we are going to change in our database, all open orders that have to be sent by United Package to be sent via Federal Shipping.

To make this update of the data we have several options. Let's compare three ways.

Method 1: temporary tables

declare @ st datetime

SET @ st = getdate ()

CREATE TABLE # Actualizar(int OrderId, ShipVia int, Freight money)

INSERT INTO # Actualizar SELECT OrderID, ShipVia, Freight FROM Orders WHERE ShipVia = 2 ShippedDate IS NULL

UPDATE Orders SET ShipVia = 3 Freight = 10 WHERE OrderID IN (SELECT OrderID FROM # Actualizar )

DROP TABLE # Actualizar

PRINT 'Operation completed in:' +

RTRIM (cast (datediff (ms, @ st, getdate ()) as char (10))) + 'Milliseconds'

And as a result we obtain:

(11 rows affected)

(11 rows affected)

Operation completed in 140 milliseconds

Method 1: Variables Table type

DECLARE @ st datetime

SET @ st = getdate ()

DECLARE @Actualizar Table (OrderId int , ShipVia int, Freight money)

INSERT INTO @Actualizar Select OrderID, ShipVia, Freight FROM Orders WHERE ShipVia = 2 ShippedDate IS NULL

UPDATE Orders SET ShipVia = 3 Freight = 10 WHERE OrderID IN (SELECT OrderID FROM @ Actualizar)

PRINT 'Operation completed in' +

rtrim (cast (datediff (ms, @ st, getdate ()) AS char (10))) + 'Milliseconds'

And in this case the result is:

(11 rows affected)

(11 rows affected)

Operation completed in 73 milliseconds

Method 1: No temporary tables

DECLARE @ st datetime

SET @ st = getdate ()

UPDATE Orders SET ShipVia = 3 Freight = 10 WHERE OrderID IN (SELECT OrderID FROM Orders WHERE ShipVia ShippedDate = 2 AND IS NULL)

PRINT 'Operation completed in' + rtrim (cast (datediff (ms, @ st, getdate ()) AS char (10))) + 'Milliseconds'

And finally we get:

(11 rows affected)

Operation completed in 50 milliseconds

Of course this example is not significant, and in each case we must study the situation and compare the results obtained in a working environment to know what is the best option, but anyway I hope this will serve at least for a little better to the "temporary tables".

2 comentarios:

  1. You can also create non-clustered indexes on table-variables, but yes not named constraints.

    ResponderEliminar
  2. Temp tables cannot be partitioned..

    ResponderEliminar