viernes, 26 de junio de 2009

Reduce Log file

Introduction

So there is a recurring question in the newsgroups and forums about the size of the file of transactions from a database of SQL Server. The problem is that the log file grows indefinitely until it is so big that eats the entire hard drive, arriving in the worst cases to fill with the resulting error:

Error: 1105, Severity: 17, State: 2 Could not allocate space for the transaction log for database 'MiBase' because the device is full. Dump the log or enlarge the device to create more space.

This error is not really a problem of SQL Server. Is normal on the server when we take into account in any database is to perform a series of maintenance tasks so that everything works properly. What we are signaling is that the log file can not grow, either because we have limited size, or because we have filled the entire disk, and if the server can not write to the log file can not continue working. But let us see a little how the data storage in SQL Server and that is the file of transactions (or log) to understand how we can resolve this error, and most importantly, how to avoid it.

File type

In any database SQL Server there are at least two files. One file is the database where the data will be stored in our tables (and other objects) and the other is the file of transactions. The transaction file is a series of records of all changes to the database and the transaction that has made each change. In the transaction log contains the start of each transaction. It also records changes to data and provides enough information to undo the modifications (if necessary later) made during each transaction. The file has extension mdf database while the transaction has ldf extension. In this case we will concentrate on the latter which is what can give us problems in terms of space.

Recovery mode

The recovery mode of the database is very important to understand where our problem arises. Depending on how we created the database several ways to recover the database in SQL Server. We have the full mode, the mass registration and simple. If you created the database did not specify any recovery mode chosen is full. To learn how you have configured one can look at the "Corporate Administrator" in the properties of the database in the "Options". Here we can see what the recovery mode of the database. We can also determine the recovery model by running sp_helpdb 'NombreBaseDatos' in Query Analyzer. If we do so in the results pane status where there is a column detailing, among other things, the model of recovery (Recovery = FULL).

Have full recovery mode means you can recover the database until the time when an error occurs or until a certain moment of time. Is the usual way in a production database.

In this recovery mode every transaction that occurs in the database (insert, modify, delete ...) was recorded in the transaction file (the. Log) so that it can reconstruct what happened to the database against time. But using this way means that the transaction file will grow indefinitely, even to be bigger than the database itself.

The solution to this growth is simple: to make backups. When we do a backup of transaction log data to pass to the backup disk is erased, leaving free space. So with every copy of the disc will remove the log file that is no longer necessary, leaving free space for recording new transactions that occur in our database.

The bad thing is that when this happens is released into space from log file, but it does not mean reducing the size of this file to disk. In this case we will have to shrink the log file in a second phase.

Operation of the log file

As they say what each BOL SQL Server transaction log file is divided logically into smaller segments called virtual log files (VLF). The virtual log files are the units of transaction log truncation. When a virtual log file no longer contains records for active transactions may truncated to ensure that there is space available for recording new transactions.

The minimum size of a virtual log file is 256 KB. The minimum size of a transaction log is 512 KB, which provides two virtual log files of 256 KB. The number and size of virtual log files in a file of transactions increases as does the log file. A log file can have a small number of small virtual log files while a log file can be larger virtual log files larger.

As we said when the transaction becomes very large the solution is to make a backup and that part of the file which makes a copy is released. That is, the VLF is emptied and once empty can be truncated to reduce the space occupied by the file on disk. To truncate the log file you can use Enterprise Manager using the "Shrink Database" or do it from Query Analyzer. I prefer Query Analyzer because it gives more of what we are doing and any errors that may appear. Let's see what we have to execute commands to do this from the Query Analyzer with TSQL.

Reduce the log file

Assuming we have a database name MiBase the steps would be:

1 - Open Query Analyzer. Execute the statement USE MiBase For the execution context is the database we want.

2 - Implement the decision CHECKPOINT To write to disk all the pages out of date (they are cached And are not yet on disk)

3 - Run the sentence EXEC sp_addumpdevice 'disk', 'CopiaMiBase', 'd: \ LogMiBase.bak' To create a hardware device to store the backup. This device is a file in d: LogMiBase.bak name. Make sure there is enough room for the backup. We can also create this device from the Corporate Administrator.

4 - Create a backup of the database MiBase BACKUP DATABASE TO CopiaMiBase This created a complete copy of the database

5 - Create a backup log file BACKUP LOG MiBase TO CopiaMiBase We have now created a copy of the log file information with the have released some of the VLF (virtual log file) Now if we look at the log file has been reduced, and if you have not Turn to the next point.

6 - Free space of the log file DBCC SHRINKFILE (MiBase_Log, 3000) This command frees space if there are any VLF gap. The first argument is the logical name of the file you want to reduce the size is 100 Goal Mb Although in this example we use size as 100Mb goal in each case will look what is the desired value and up depending on where you can get of VLF-free after you have the backup of the file Log. Once this is done and we should have a log file size reasonable.

Plan b

If we are interested not have backup of all data from the log file (or there is no space for the backup) we can follow a quicker method but does not back up this file, but that of the base Data: 1 - USE MiBase

2 - CHECKPOINT

3 - EXEC sp_addumpdevice 'disk', 'CopiaMiBase', 'd: \ LogMiBase.bak'

4 - BACKUP DATABASE MiBase TO CopiaMiBase

5 - BACKUP LOG MiBase WITH TRUNCATE_ONLY

6 - DBCC SHRINKFILE (MiBase_Log, 100)

This option simply truncates the log file the missing part of this file is not active.

A final solution

So that does not happen again is to have a maintenance plan for the base damage you make full backups and log file each time. How long is hard to say without knowing how to use the database and what is its size, but could be from several times a day to once a week. Of course it is much easier if we schedule the execution of work as the backup schedule and if you tell us some warning if you exceed the limit that we consider reasonable for the size of our files. Prevention is always better to correct the errors. Endnote I know it is not necessary to say, but if we do any task that can be dangerous is mandatory to backup the database and the database system if there are any unforeseen problems.

martes, 23 de junio de 2009

Temporary Tables in SQL

Introduction

In the world of databases is very common the use of temporary tables. While everyone knows that these kind of structures are slowing the operation of our consultations, programmers can not avoid using them because they often facilitate the resolution of problems. Store data for later use, save partial results, to analyze large amounts of rows. There are many cases where we may need these temporary tables, but you have to use them properly!.

First advice:Don't use temporary tables

The first advice that we must continue to work with temporary tables is quite simple: do not use them. And why not? Well there are plenty of reasons that we see throughout this text, but begin to see that translated using a temporary table in SQL Server:
  • The temporary tables are created in tempdb, and when you create there are several blocks on the database such as tables and sysobjects sysindex. The locks on tempdb affect the entire server.
  • When you create need to be made in writing to the disk access (not always if the tables are small)
  • When entering data into the temporary tables are again on disk activity, and we know that the disk access is usually the "bottleneck" of our system • When reading data from the temporary table have to resort back to disk. Moreover, these data read from the table often combined with others.
  • Clearing the table is to buy new locks on the tempdb database and perform operations on disk.
  • Using temporary tables inside a stored procedure we lose the advantage of having compiled the implementation plan for the stored procedure and recompiled to occur more often. The same happens when the SQL Server attempts to reuse the execution plan of a parameterized query. If the query is a temporary table is difficult to reuse implementation plan. Viewed these problems I think that there is no need to repeat our first Council.

¿What can we do?

Instead of temporary tables can improve our code is not necessary, we can use subqueries (subquery usually use a drastically improved performance to use temporary tables), using permanent tables, use tables derived. We must always remember that any alternative is good if you avoid using temporary tables (¡cursors excluded of course!) Anyway if you ever have to use it is better to know well, so let's do this.

Types of temporary tables

Temporary tables are of two types in the scope table. We have local temporary tables and global temporary tables.

  • # local: local temporary tables have a # as first character in its name and can only be used in the connection in which the user creates. When the connection drops the temporary table disappears.
  • # # The overall global temporary tables begin with # # and are visible by any user connected to SQL Server. And one more thing, these tables disappear when no user is referring to, not the user is disconnected box that I think.
  • Temp There really is a kind of temporary tables. If we create a table within the database is a temp table in real terms we can use it like any other table in any database, and is temporary in that it disappears as soon turn off the server.

Operation of temporary tables

Create a temporary table is like creating a normal table.

Let's see an example:

CREATE TABLE # TablaTemporal (Field1 int, Field2 varchar (50))

And are used on a regular basis.

INSERT INTO # TalbaTemporal VALUES (1, 'first field')

INSERT INTO # TalbaTemporal VALUES (2, 'Second field')

SELECT * FROM # TablaTemporal

As we see almost no limitations when working with temporary tables (one limitation is that they can not have restrictions Foreign KEY).

Optimizing the use of temporary tables

The use that we can give this type of tables is infinite, but always bearing in mind a few guidelines that we must continue to slow our work as little as possible. For example it's a good habit to create temporary tables with DDL commands as in the previous example (CREATE TABLE) and then fill them with INSERT statements or INSERT INTO. It is true that we can achieve it in one step with SELECT INTO, but this is worse because the locks that are acquired on system objects that last longer.

As is always better to ask the fields we want and do not put the typical SELECT * FROM ... In the same way is highly recommended that we want to qualify the rows and rows do not have that we will not use temporary tables.

Another good habit is to remove us our tables. Yes it is true that at the end of the connection local temporary tables disappear, but if we have a set of sentences over and create a temporary table at the beginning and we are not going to use the remaining time does not make sense to have that table there taking up space and memory.

If the temporary tables are a great option to increase performance is to create an index that will help us to recover data from that table (for smaller tables is an unnecessary waste because they never used the index).

Place the tempdb database on a single disk dedicated to this function will increase overall system performance if it is an intensive use of temporary tables.

And last but not least, do not think temporary tables within transactions within nor triggers. Believe that the existence of your database if you do suffer a lot.

Variables Table With

SQL Server 2000 we can declare variables of type table. Such variables have a number of advantages over temporary tables so that we can always choose to use table variables versus temporary tables. Use temporary variables are easy:

DECLARE @ VariableTabla TABLE (Field1 int, Field2 char (50))

INSERT INTO @ VariableTabla VALUES (1, 'first field')

INSERT INTO @ VariableTabla VALUES (2, 'Second field')

SELECT * FROM @ VariableTabla

Find advantages in using table variables:

  • Have a clearly defined scope. The stored procedure, function or batch in which they are declared.
  • The variables of type table has less recompile the stored procedures in those that use temporary tables.
  • Table variables do not need locks or as many resources as temporary tables.

But they also have disadvantages:

  • We can not change the table definition once declared
  • We can not use non-clustered indexes
  • You can not use INSERT INTO or SELECT INTO
  • We can not use functions in the restrictions

If we balance the pros and cons we see that in general it is better to use variables of type table that the temporary tables. Just in case you have large amounts of data into a temporary table and if we will use repeatedly is the preferred choice of temporary tables because they can define indexes.

An example

This is all very well, but as always it is best to see an example in which we can see that it is worth the effort to reprogram our code to not use temporary tables.

Let's see a simple example from the truth but to illustrate what we explain in this text. We will use the Northwind database.

This database requests are sent through three transportation companies: Speedy Express (1), United Package (2) and Federal Shipping (3). Federal Shipping Company, we offer complete all shipments that do this through the United Package price $ 10.

We decided that this is worth saving and we are going to change in our database, all open orders that have to be sent by United Package to be sent via Federal Shipping.

To make this update of the data we have several options. Let's compare three ways.

Method 1: temporary tables

declare @ st datetime

SET @ st = getdate ()

CREATE TABLE # Actualizar(int OrderId, ShipVia int, Freight money)

INSERT INTO # Actualizar SELECT OrderID, ShipVia, Freight FROM Orders WHERE ShipVia = 2 ShippedDate IS NULL

UPDATE Orders SET ShipVia = 3 Freight = 10 WHERE OrderID IN (SELECT OrderID FROM # Actualizar )

DROP TABLE # Actualizar

PRINT 'Operation completed in:' +

RTRIM (cast (datediff (ms, @ st, getdate ()) as char (10))) + 'Milliseconds'

And as a result we obtain:

(11 rows affected)

(11 rows affected)

Operation completed in 140 milliseconds

Method 1: Variables Table type

DECLARE @ st datetime

SET @ st = getdate ()

DECLARE @Actualizar Table (OrderId int , ShipVia int, Freight money)

INSERT INTO @Actualizar Select OrderID, ShipVia, Freight FROM Orders WHERE ShipVia = 2 ShippedDate IS NULL

UPDATE Orders SET ShipVia = 3 Freight = 10 WHERE OrderID IN (SELECT OrderID FROM @ Actualizar)

PRINT 'Operation completed in' +

rtrim (cast (datediff (ms, @ st, getdate ()) AS char (10))) + 'Milliseconds'

And in this case the result is:

(11 rows affected)

(11 rows affected)

Operation completed in 73 milliseconds

Method 1: No temporary tables

DECLARE @ st datetime

SET @ st = getdate ()

UPDATE Orders SET ShipVia = 3 Freight = 10 WHERE OrderID IN (SELECT OrderID FROM Orders WHERE ShipVia ShippedDate = 2 AND IS NULL)

PRINT 'Operation completed in' + rtrim (cast (datediff (ms, @ st, getdate ()) AS char (10))) + 'Milliseconds'

And finally we get:

(11 rows affected)

Operation completed in 50 milliseconds

Of course this example is not significant, and in each case we must study the situation and compare the results obtained in a working environment to know what is the best option, but anyway I hope this will serve at least for a little better to the "temporary tables".