Script and deploy the data for database from SSDT project

Script and deploy the data for database from SSDT project

This post is a part of the series about SSDT and database project leveraging Visual Studio.

We already know that SSDT can cover all objects but only on a database level, such as:

  • tables,
  • view,
  • stored procedures,
  • functions,
  • user types,
  • assemblies,
  • etc.

We can not manage out of the box with SSDT is everything behind the scope, e.g.:

  • SQL Logins,
  • SQL Jobs,
  • Extended Events.

There is another category of thing, which exist on a database level, but can’t be managed by SSDT natively.
Namely: The DATA.
It’s a very common scenario when we have a database (project) and require some data in selected tables even in the freshly deployed, empty database.
Nomenclature for these data/tables that you can meet are:
– reference data
– master data
– dictionary
Which term have you heard and is closest to you?

By default, SSDT does not support scripting or deploying the data.
Hence: the question is: how we can cope with that?
The answer is the Post-Deployment script.
Having said that – you already might know than the game is just beginning…

Post Deployment

There are several ways how to approach this problem.
I will show you some solutions I have used for projects I’ve worked for to date.

I would divide the problem into two things:
1) How to script the existing data
2) How to build a deployment of the reference data, considering the following aspects:
a) Inserting new rows
b) Update existing data (in target table)
c) Should I delete rows that don’t exist in the source?

Let’s take a look (agenda):

Type of reference data

Scenario #1 (insert new rows only)

This is probably the most common way to generate reference data.
It is not fast, but it can work for small tables. Also, any developer can write it, not just database developers.
Target is not empty, but new rows can come from the code and need to be added.

Scenario #2 (insert only) – initial load data

In this scenario, we expect to insert reference data rows only once – for the first time after the creation of a database and table. The table is empty, so that indicates whether inserting the data or not. Once done – we don’t have to touch the data as we know that the data is static in there and no one gonna changes it.

A drawback of this approach: data could change accidentally by developers, external system (when should not) or undiscovered process. That’s why is better to protect these data and redeploy and overwrite all rows over and over again.

MERGE all things
What if you must “simply” reflect all you have in the source (code repository), updating whatever exists or not in the target database tables?
The things are getting complicated, right?
A MERGE statement can come with help then.
Have a look at these 2 scenarios:

Scenario #3 (Full update, including Delete)

By creating a MERGE statement we do ensure that the data is the same as in repo. Not only the new rows but also all unexpected rows which must be deleted, or changes made for existing rows which must be updated and return to the desired state. The script should do all actions at once:

  • INSERT – for these rows which are in the source but not in the target yet
  • UPDATE – for all existing rows
  • DELETE – for all rows which present in the target but not in the source

Let’s create a code – MERGE statement for the table. Now, you will see the code created by one tool that I introduce later:

Scenario #4 (Full update, excluding Delete)

For Scenario 4 you must just omit lines 37-38 in order not to delete rows (added by someone manually in the target database, but that are not present in the source).

Cool. Merge statement is not easy to remember and also adopting it for dozens of tables is a time-consuming process. Furthermore, to standardise the code for all tables – would be perfect to use a tool to script the data.
Thankfully, you are not alone here. You have two options.

Scripting master data

In this post, I want to show you two tools which can help:

  • Data Script Writer (Desktop Application for Windows)
  • Generate SQL Merge (T-SQL Stored Procedure)

Data Script Writer (desktop app)

This is open-source, simple Desktop application for Windows which allows you to connect to your database, list of tables, select the ones you are interested in and with one button – script ALL the data to files.
Let’s familiar with the process:

  1. Run the app and connect to your server and select database:

    Data Script Writer (Desktop App) – Connect to Server

  2. Once hit CONNECT the App will load the list of all tables with information of the number of rows per each table.

    Data Script Writer – select tables and script method

  3. Check all tables you want to script.
  4. Optionally you can select a method of pushing the data into the target:
    a) INSERT (see: scenario 2 above)
    b) MERGE (see: scenario 3 above)
    c) MERGE without DELETE (see: scenario 4 above)
    d) MERGE NEW ONLY (see: scenario 1 above)
  5. Click “Generate script” to do the things.

The last step is to put all these scripts into our SSDT database project.

“Show All Files” button

To do this – follow the steps:

  1. Copy these scripts to the location of the database project. (e.g.: Data/Stored Procedure)
  2. Select the database node.
  3. Switch on the option “Show all files” to be able to see hidden files (not attached to the project)
  4. Select all required files and click “”
  5. Check everything is ok by building project

The application is available to download here (git repo): Data Script Writer.

Generate SQL Merge (script)

It’s an open-source script that helps you generate MERGE statement with static data.
Firstly, you must install the script on the server. Installation is trivial:

Simply execute the script, which will install it in “master” database as a system procedure (making it executable within user databases).

As I’m a PowerShell enthusiast, I have created a simple script that does this for you:

It’s worth to mention that the script can be stored in [master] database to be used from any user database context.

How to use it once installed? Let’s take the table [PhoneNumberType] from [AdventureWorks] database as an example:

As a result, you’ll get the following script:

There is more option to be chosen, but it’s not my goal to present you all of them. One the website you find out all details.

When updating the tables?

The best moment to do that is just after deployment the changes. That means: post-deployment script.
The first, easiest way, is to only append the script file to the deployment script:

Required property of Post-Deployment file

A drawback of that approach is that the whole script appears in there whenever deploying the database. And even worse: the whole script won’t be parsed like all other scripts which are being defined as post-deployment.

My recommended method

I would like to suggest one particular approach to address that case. I have applied it many times in multiple projects and it just works.
Let’s take a look at the code. We require:

  1. Separate schema (e.g. “data”) – optionally
  2. Script inserting the data wrapped up in Stored Procedure
  3. One “principal” stored procedure which calls all others.
  4. Post-deployment script when you have only one line of code – execute the “principal” procedure

Does that make sense? I hope so. To make sure that you understand it better – have a look the following pic:

All you need is… these files

To convince you more to this approach, let me list all its advantages:

  • The stored procedures are part of the database project.
  • They will be validated and compiled, thus you avoid potential errors with uncontrolled code
  • The changes to the SP appears in output script only when something changes.
    It’s easier to review the script before a run as it doesn’t contain unnecessary code
  • Having separate schema – you can control who has access to the SPs to protect against accidental executions
  • Having “principal” SP – you can control what and when will be executed, means populated.
  • In Post-deployment script – there is only one line of code which never is changed.
    It is important as this script is not built, so you must trust that it contains faultless code.

For me – it’s much more than enough reasons to apply this methodology.

icon-github  I populated a small table with data in this post. If you want to see entire code – check that repo on GitHub: ssdt-demo

Summary

Let’s wrap up what we learned today.  I do distinguish a few scenarios of updating the data in the target server. Perhaps not all of them is necessary for your scenario, but it works for me, that’s why I decided to share. The following table allows you to understand them better:

How does your solution look like?

I’m not trying to tell you that my solution is the best (even if I think that 😉 )
You don’t have to like it, maybe it just would not work for you, or you have particular requirements. Each project is specific, although 90% of them can use the examples presented above.
What is your experience with this area?
Maybe you know a better way to do the things?
Do you have something to add here?
Please do and leave a comment.

Previous Last Week Reading (2019-10-27)
Next Last Week Reading (2019-11-03)

About author

Kamil Nowinski
Kamil Nowinski 156 posts

Blogger, speaker. Data Platform MVP, MCSE. Senior Data Engineer & data geek. Member of Data Community Poland, co-organizer of SQLDay, Happy husband & father.

View all posts by this author →

You might also like

DevOps 3 Comments

New DB project with SSDT and import existing database

With this episode I would like to commence long-awaited new series It’s about DevOps with databases. I’ll start with SSDT, but not exclude other tools (like Redgate ReadyRoll) in the

DevOps 0 Comments

SSDT (SqlPackage) thread crashes with error when publishing

In this post, I will show you one of the scenarios when SSDT (SQL Server Data Tools) can fail unexpectedly while publishing database project to target server or preparation of

T-SQL Tuesday 3 Comments

#TSQL2sDay – How SQLPackage.exe can spoil your deployment

Today, I’d like to describe briefly one thing with SQLPackage.exe. I know… It’s not precisely related to PowerShell (this week topic), but… you can use PowerShell to run it (and

0 Comments

No Comments Yet!

You can be first to comment this post!

Leave a Reply