Oct
24
2012

Fix for Web Service Task "Cannot find definition for .... Service Description with namespace ... is missing. Parameter name: name"

The problem is that the SSIS Web Service Task does not handle namespaces other than the default.

While the web service WSDL file may validate just fine, SSIS still balks at it. Here's how to fix up the WSDL file.

Before:

<wsdl:definitions
	...
	targetNamespace="http://tempuri.org"
	xmlns:tns="http://tempuri.org"
	xmlns:ns1="http://www.MyServiceTargetNameSpace.com">
	...
	<wsdl:service name="MyServiceName">
		<wsdl:port name="MyPortName" binding="ns1:MyBindingName">
			...
		</wsdl:port>
	</wsdl:service>
</wsdl:definitions>

 

After:

<wsdl:definitions
	...
	targetNamespace="http://www.MyServiceTargetNameSpace.com"
	xmlns:tns="http://www.MyServiceTargetNameSpace.com">
	...
	<wsdl:service name="MyServiceName">
		<wsdl:port name="MyPortName" binding="tns:MyBindingName">
			...
		</wsdl:port>
	</wsdl:service>
</wsdl:definitions>

You could also just overwrite the targetNamespace and xmlns:tns attribute values with your service target namespace. Essentially, they all have to end up being the same namespace for it to work.

Oct
18
2012

Creating a Backup Maintenance Plan (video)

Maintenance plans are a great way of getting started with a backup solution for your environment.

While maintenance plans don't offer great flexibility, particularly when it comes to managing backups across many different servers, they're often a good starting point because they're easy to create and simple to understand.

In this demo video, I walk through creating a skeleton maintenance plan that contains 3 subplans to perform full, differential, and transaction log backups. The only thing left to do is schedule each of the automatically-generated SQL Agent jobs to meet your backup and recovery needs.

 

 

Oct
16
2012

Choosing a database recovery model

The recovery model of a database is a critical setting that determines how the transaction log both records and retains information about transactions that occur in the database. It's important to remember that the setting always applies to the entire database.

As we'll explore in this post, there are 3 recovery models in SQL Server to choose from: FULL, BULK_LOGGED, and SIMPLE; each model has advantages, disadvantages, and tradeoffs. Only you can decide which model is appropriate for your particular situation and goals; we'll go through some of those tradeoffs and a few typical scenarios.

First of all, let's take a quick step back and ask why we care about what gets recorded in the transaction log.

 

What is recorded in the transaction log?

The transaction log contains an ordered sequence of physical operations that occur within a database. Each physical operation is written to the transaction log before the change itself is applied to the data page(s). This is called write-ahead logging, and it occurs regardless of the recovery model chosen.

Usually, the physical operations written to the transaction log also contain enough information to perform the inverse operation of the change (i.e., to undo a delete, do an insert). This combined with the write-ahead logging mechanism ensures transactions succeed or fail as a whole (atomicity) by giving SQL Server the ability to both undo and redo transactions. Hopefully those two bolded words set off hints about how logging is intimately tied to backup, restore, and several other features in SQL Server -- we'll get to those in a minute.

 

Minimal Logging

There are a handful of operations we can do in a database that can be minimally logged -- this means that only enough information required to undo the transaction is logged. In the case of a BULK INSERT, for example, all that's logged are page allocations -- we can undo those by deallocating the pages, but we can't redo them because the actual contents of the pages aren't recorded in the log.

We gain speed by writing to the transaction log less, but lose the ability to perform point-in-time recovery during the operation.

You might be wondering how a database restored using only transaction log backups can be complete after a point in time where an operation such as a BULK INSERT took place. How does SQL Server know what to put on those data pages, since only the allocations were logged?

The answer is in what is recorded by the first transaction log backup taken after the minimally-logged operation. SQL Server keeps track of extents that are modified by minimally-logged operations in a special metadata page (the ML map, or minimally-logged map, which contains 1 bit per extent), and copies those extents into the transaction log backup, along with the required portion of the transaction log itself. Yes, transaction log backups can contain data!

The net result of minimal logging isn't that we get the operation completely for free; we just defer a good chunk of the work to the backup subsystem for an immediate gain in speed.

 

Armed with this knowledge, let's take a look at the features of each of the three recovery models.

 

FULL

  • Full logging, all the time.
  • Supports point-in-time recovery (you must be taking transaction log backups).
  • Only transaction log backups internally clear the transaction log, so a good backup strategy is a must.
  • Required for certain features, such as Database Mirroring and Availability Replicas.
  • Typically used for OLTP databases, and where data protection is a high priority.

 

BULK_LOGGED

  • Full logging, except when the required conditions are met for minimally-logged operations.
  • Supports point-in-time recovery (again, you have to take transaction log backups) for all transaction log backups that do not contain minimally-logged operations. This means you should take a transaction log backup immediately before, and immediately after the minimally-logged operation(s), to maximize your ability to do a point-in-time restore.
  • Only transaction log backups internally clear the transaction log (same as FULL).
  • Used for performing bulk operations on a database that would otherwise prefer to be in FULL recovery. A database can be switched freely between FULL and BULK_LOGGED without breaking the transaction log backup chain.

 

SIMPLE

  • Full logging, except when the required conditions are met for minimally-logged operations.
  • Does not support point-in-time recovery, and the transaction log cannot be backed up.
  • A database CHECKPOINT internally clears the transaction log as part of the process (since you can't take transaction log backups), which means you don't necessarily have to manage the transaction log as much as the other recovery models.
  • Usually used for databases that frequently handle large quantities of data (i.e., data warehouses), or databases where point-in-time recovery is not important.
  • Switching a database from FULL or BULK_LOGGED to SIMPLE breaks the transaction log backup chain. The transaction log backup chain can be restarted by switching from SIMPLE back to FULL or BULK_LOGGED, and taking a full or differential backup.

 

A good place to start choosing a recovery model would be by answering the following questions. Remember that these are only guidelines, and your specific situation will determine the best course of action.

  • Does the database require point-in-time recovery? (Yes = FULL or BULK_LOGGED, No = SIMPLE)
  • Are special features going be used on this database? (Yes = FULL or BULK_LOGGED -- usually FULL -- depending on the feature)
  • Will large quantities of data be imported into, or manipulated within this database? (Yes = BULK_LOGGED or SIMPLE)
Oct
11
2012

Where does SQL Server store my data?

In this post, I'm going to give a high-level overview of where SQL Server stores user data. One of the most important topics, backup and restore, is closely related to how this (and logging) works, so it's important to have a solid understanding of these concepts to be able to plan and execute a successful backup and restore strategy. I'll cover logging and the transaction log in a future post.

The structures in which SQL Server stores data are arranged hierarchically. We're going to start with the smallest relevant units and work our way up.

 

Pages and Extents

In the SQL Server database engine, a page is an 8 KB region of storage (applies to both memory and disk). Each page has a header region that stores metadata about the page (the page's internal id, the type of the page, etc.), and a data region that contains the actual data.

While there are many different types of pages, there are only two basic classes:

  • Data Pages - Contains information related to data rows themselves.
  • Metadata Pages - Contains state information about other pages or extents.

Pages are grouped into sets of 8 (for a total of 64 KB), called extents, which makes the allocation system more efficient. Most extents only contain a single type of page1.

 

Partitions

Logically groups extents into objects. A normal table is stored in a single partition, while a partitioned table is stored in multiple partitions. This is how ALTER TABLE ... SWITCH is able to efficiently move partitions to/from an independent table, or between partitioned tables -- a partition is a table.

 

Data Files

A data file is the basic unit of a database in the file system2, and it contains some or all of the extents that make up a partition (see next section on filegroups).

All databases have a primary data file, which stores structural (metadata) information about the database, and by convention has the file extension .mdf. The primary data file cannot be removed from the database.

Secondary data files can be added, which have the extension .ndf by convention.

 

Filegroups

A filegroup is a named logical construct that contains data files. (Note: log files do not belong to filegroups.) All databases have a PRIMARY filegroup that contains (at least) the primary data file. The PRIMARY filegroup cannot be removed. Filegroups (except PRIMARY) can be set as read-only.

A filegroup allows for the distribution of extent and page allocations between multiple data files. SQL Server uses two algorithms simultaneously when making allocations:

  1. Round-robin. Allocations are made sequentially from the data files within the filegroup (i.e., for a filegroup with 3 files, the allocations would come from File1, File2, File3, File1, File2, File3, etc.).
  2. Proportional fill. The number of allocations made from a file is proportional to the amount of free space in that file. The purpose of this is to eventually reach a steady-state where allocations are balanced between all the files. However, you can get into trouble by adding a new, empty, data file, which can result in almost all the allocations coming from that one file, probably defeating the purpose of the allocation distribution among the files in the first place.

Filegroups also serve as the target for object storage. For example, the table [dbo].[Sales] can be stored in a filegroup called SALES.

 

Database

In this context, a database is a logical container that contains one or more filegroups (PRIMARY, plus zero or more others). One of the database's filegroups can be set as the default filegroup, where objects get created if the T-SQL statement does not explicitly specify a filegroup.

 

Now that you understand the basic internal storage structures SQL Server uses, you will be able to easily understand how many other areas of the system work, including backups, which is an important topic I will cover in a future post.

 

1 Mixed extents can contain multiple types of pages. These are a small minority of allocations.

2 Not getting into FILESTREAM or snapshot files here.

Oct
4
2012

How are deadlocks created? (video)

Deadlocks occur when two processes both hold locks on resources, and also try to take locks on resources from the other process. This results in neither process being able to continue.

SQL Server resolves these situations automatically by terminating one of the processes, thus allowing the other(s) to proceed. This is done by an internal process called the Deadlock Monitor.

In this video demo, I demonstrate a basic sequence of events needed to create a deadlock. I also look at the locks that are held and requested by each process, and introduce you to reading a deadlock graph (which is what's captured from the Extended Events deadlock capturing system I presented last week).

(Sorry my cursor isn't visible in the video -- I blame the capture program I used. It won't happen again.)

 

If you want to reproduce the situation yourself, you can download the scripts here.