February 2011 - Posts

PowerShell cmdlet for BizTalk db restore
22 February 11 08:59 AM | wmmihaa | 4 comment(s)

Configuring the backup job for BizTalk is a fairly simple task, while restoring is a bit more complicated. By default, the BizTalk backup job makes a full backup once a day and a log backup every 15 minutes. Each backup run sets a mark, which is shared across all databases and should be used to restore all databases to the same point in time, keeping them in a consistent state. Also by default, all backups are written to the same folder.

The only disaster recovery procedure supported by Microsoft is log-shipping. Nick Heppleston has gone through the trouble of describing this in great detail, and I strongly recommend reading these posts before you choose any other approach.

A few weeks ago, I sent out a question on Twitter, asking whether people used log-shipping or not. I got 24 responses, of which only 4 used log-shipping.

Although log-shipping comes with many advantages, it is expensive, since it requires a secondary SQL cluster. Most of the people I asked confirmed this was the main reason why they had chosen other alternatives. Since BizTalk doesn’t come with any restore scripts/features other than log-shipping, everyone is left to fix this on their own.

If you’re in the same situation, feel free to download this sample. If it doesn’t fit your solution, it might at least be a good starting point.

The sample comes with two cmdlets: Get-Marks and New-RestoreDatabaseFromMark. The first gives you a list of all marks from all log files. The second, as the name implies, restores a database to a specific mark. When doing so, the database is first restored from the last full backup taken before the mark; after that, all log backups taken since that full backup are restored in order, and the last log file is restored only up to the specified mark.
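To make the restore order concrete, here is a small sketch of that selection logic (purely illustrative, in Python; the actual cmdlet is implemented in .NET and uses SQL Server's mark-based restore): pick the last full backup taken at or before the target mark, then every log backup after it up to and including the mark.

```python
# Illustrative sketch (not part of the sample): selecting the restore
# sequence for one database, given its backups in chronological order.

def restore_sequence(backups, target_mark):
    """backups: list of (kind, mark) tuples in chronological order,
    where kind is 'Full' or 'Log'. Returns the backups to restore."""
    # Find the last full backup taken at or before the target mark.
    full_index = max(i for i, (kind, mark) in enumerate(backups)
                     if kind == 'Full' and mark <= target_mark)
    sequence = [backups[full_index]]
    # Restore every log backup after it, up to and including the mark;
    # the last log is the one SQL Server would stop at the mark in.
    for kind, mark in backups[full_index + 1:]:
        if kind == 'Log' and mark <= target_mark:
            sequence.append((kind, mark))
    return sequence
```

String comparison works here because marks like BTS_2011_01_18_12_06_50_22 embed the timestamp in a fixed-width format, so they sort chronologically.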


The Get-Marks cmdlet queries the backup output folder to retrieve all marks. The mark is part of the name of each backup file.


Each file is made up of the following parts:

[Server]_[Instance*]_[Database]_[Full|Log]_[Mark]

E.g. SERVER001_DTA_BizTalkDTADb_Log_BTS_2011_01_18_12_06_50_22.bak

* The instance part is only present for non-default instances.
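Just to illustrate the naming convention (this is not how the .NET cmdlet does it), the parts can be picked out by splitting on the backup kind:

```python
# Illustrative sketch: parsing the backup file naming convention
# [Server]_[Instance]_[Database]_[Full|Log]_[Mark].bak

def parse_backup_name(filename):
    """Split a BizTalk backup file name into its parts.
    The instance part is only present for non-default instances."""
    stem = filename[:-len('.bak')]
    for kind in ('_Full_', '_Log_'):
        if kind in stem:
            # Everything after the kind token is the mark.
            head, mark = stem.split(kind, 1)
            parts = head.split('_')
            server, database = parts[0], parts[-1]
            instance = '_'.join(parts[1:-1]) or None  # default instance -> None
            return {'server': server, 'instance': instance,
                    'database': database, 'kind': kind.strip('_'),
                    'mark': mark}
    raise ValueError('not a BizTalk backup file name: ' + filename)
```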

You can use the New-RestoreDatabaseFromMark cmdlet with or without specifying the mark. Leaving the mark empty is equivalent to using the last mark.

The New-RestoreDatabaseFromMark cmdlet is called per database, so it's easier to create a script that restores all databases together. The sample comes with a RestoreScript.ps1 script file, which could work as a good start:

$backupPath = "X:\BizTalkBackUp";
$dataPath = "E:\SQL Server 2008\MSSQL10.MSSQLSERVER\MSSQL\DATA";
$logPath = "E:\SQL Server 2008\MSSQL10.MSSQLSERVER\MSSQL\DATA";

$mark = Read-Host "Specify mark (use the Get-Marks cmdlet to get all marks, or leave blank to use the last mark)";

if ($mark.Length -eq 0)
{
    Write-Output "Restoring to last mark...";
}

trap [Exception]
{
    Exit;
}

New-RestoreDatabaseFromMark SSODB $backupPath $dataPath $logPath $mark;
New-RestoreDatabaseFromMark BAMPrimaryImport $backupPath $dataPath $logPath $mark;
New-RestoreDatabaseFromMark BizTalkDTADb $backupPath $dataPath $logPath $mark;
New-RestoreDatabaseFromMark BizTalkMgmtDb $backupPath $dataPath $logPath $mark;
New-RestoreDatabaseFromMark BizTalkMsgBoxDb $backupPath $dataPath $logPath $mark;
New-RestoreDatabaseFromMark BizTalkRuleEngineDb $backupPath $dataPath $logPath $mark;

Write-Host "Done restoring all BizTalk databases" -ForegroundColor Yellow;

The first section of the script defines a set of path variables. In my simple sample, all database files are located in the same folder. This is never good practice, so in a real environment you will probably have different paths for data and log files for each database.

To use the sample, open PowerShell and navigate to the sample output folder, e.g.:

PS C:\> cd "C:\Program Files\bLogical.BizTalkManagement"

Before you can use the cmdlets, you need to install them. You can do this using the install script:

PS C:\Program Files\bLogical.BizTalkManagement> .\Install.ps1

After you’ve installed the snap-ins, you can start using the commands:

PS C:\Program Files\bLogical.BizTalkManagement> Get-Marks "X:\BizTalkBackUp"

or

PS C:\Program Files\bLogical.BizTalkManagement> .\MyRestoreScript.ps1


Downloads 

HTH

Using AppFabric Cache as WF Persistence store
01 February 11 09:00 PM | wmmihaa | 2 comment(s)

Paolo Salvatori and I had this idea about using AppFabric Cache as a workflow persistence provider. As we’ve been working on this for quite some time now, we’re happy to finally share the work with everyone else thinking about how workflow persistence works, or perhaps thinking of building their own provider.

Developing your own persistence provider is not the simplest task you can take on, but it’s perfectly doable. I was surprised by the absence of persistence provider samples on the web. The only helpful one I could find was the MemoryStore, which helped me get started.

The funny thing with this sample was that I spent most of the time trying to figure out why it worked, not why it didn’t.

I want to thank Paolo Salvatori, Manu Srivastava and Ruppert Koch from the AppFabric CAT and Product group for helping out and clarifying how the underlying plumbing works. More on that at the end of this post.

What is Workflow Persistence?

Unless we enable persistence, a workflow instance is only stored in memory. If the host goes down, it will take any evidence of your workflow instance with it. The process of persisting a workflow to a durable repository or storage is known as dehydration, while restoring it is generally referred to as rehydration. If you are a BizTalk person, you are probably aware that the persistence store for BizTalk is the BizTalkMsgBoxDb database.

Much like BizTalk, Workflow Foundation comes with a SQL persistence store, the SqlWorkflowInstanceStore. However, with WF it’s optional: not only can you choose whether to use persistence at all, you are also free to choose other providers, or even build one yourself.

You can enable the persistence either through configuration or code. If you’re hosting your workflow in IIS/AppFabric you can even manage your persistence through the IIS Manager.

Why and when would we use persistence?

You might end up using persistence for different reasons, which might also have an impact on which provider you’d choose. As I’m a “BizTalk guy”, I generally think of persistence in terms of 1) A safety-net in case something goes wrong and 2) Resource management, off-loading running instances to prevent memory consumption. In most cases you’re looking for a safe and robust provider to make sure you are not going to lose sensitive data along with the internal state of the workflow instance.

AppFabric Cache might not be your first choice if you are looking for a secure and reliable store provider. As Stephen W. Thomas put it: “Never put anything in the Cache that MUST not be lost – it is a non-transactional cache, not a database”.

However, persistence is also about scalability. If you, for example, are using WF to control the page flow of an ASP.Net application running on multiple web servers, you’d need persistence to make sure one server can continue a workflow that was initiated on another server. Furthermore, in any scenario where workflow services are accessed in a session-less manner, it is also important to persist immediately when the workflow goes idle. Otherwise the second request to the workflow might be redirected to another server while the instance is still active in memory on the first. We can control this behavior using the WorkflowIdleBehavior.TimeToUnload setting in the web-/app.config:

<behaviors>
  <serviceBehaviors>
    <behavior>
      <sqlWorkflowInstanceStore connectionStringName="persistenceStore" />
      <serviceMetadata httpGetEnabled="true"/>
      <workflowIdle timeToUnload="00:00:00"/>
      <serviceDebug includeExceptionDetailInFaults="true"/>
    </behavior>
  </serviceBehaviors>
</behaviors>

The default value for the property is 1 minute. Unloading a workflow implies that it is also persisted. If TimeToUnload is set to zero, the workflow instance is persisted and unloaded immediately after the workflow becomes idle. Setting TimeToUnload to TimeSpan.MaxValue effectively disables the unload operation: idle workflow instances are never unloaded.

In some cases, for example when using WF to control the UI of your application, you may consider using a less reliable provider. In such cases, storing your workflow state in AppFabric Cache could be a good fit. In fact, I would not be surprised if Microsoft shipped an AppFabric Cache provider in the near future.

But I won’t lie to you: I think AppFabric Cache is super cool, and would make up any excuse to play with it!

What is correlation?

First, let’s have a look at the sample workflow. It’s a very simple sample simulating the process of working with some sort of document form. The user may save the document and come back later to continue working on its content.


  1. The client calls the workflow service through the CreateDocument method. This will cause the runtime to create a new instance of the workflow.
  2. After receiving the new Document, the workflow assigns it to a local variable.
  3. As the response is sent back to the client, the workflow initializes the correlation on the Document.Id (string).
  4. While the workflow waits for the client to send an updated document it will become idle, and therefore persisted to the AppFabric Cache store.
  5. The client invokes the UpdateDocument method. As the workflow instance is not in memory, the runtime will ask the persistence provider for the serialized workflow and will then rehydrate it.
  6. The workflow instance will perform the following actions:
      • update the local document variable;
      • return the response;
      • persist its internal state,
      • continue to listen for updates until the user submits “I’m done”. 

The only “complex” part of this workflow is making the second call (and all subsequent calls) get routed to the right workflow instance. Keep in mind that there might be any number of concurrent documents being processed in the system. This is solved using WF correlation, in particular content-based correlation.

If you’re a BizTalk guy, this is nothing new; however, when I teach BizTalk classes, people seem to have a bit of trouble understanding the concept. A correlation set is basically a “primary key” for the instance of the workflow. The id is created on the first call to the workflow, and needs to be passed on to the runtime for all subsequent calls. Using WF, this is very simple: all you need to do is set the CorrelationInitializers on, for instance, a Send activity.


This will cause the workflow runtime to create the instanceId (a Guid) and add it to the binding context of the client. This way the instanceId is passed along in the SOAP header. Although this might work fine in most situations, there are some limitations: if the client does not have the instanceId, as when different clients are interacting with the same workflow, this won’t work. You are therefore better off using the content-based correlation type: the query correlation initializer. For more information about the different kinds of correlation that WF provides, have a look at Paolo’s post.


This correlation type lets you define the id of your workflow based on the content of the message. But keep in mind it needs to be unique.

So how does a persistence provider work?

The simple answer is: it serializes the workflow, along with its metadata, to a store. The persistence provider could use any durable medium, such as a file, database or queue. AppFabric Cache does not fall into the category of durable media, but we will ignore Stephen’s lame warnings for now, and just concentrate on the sweet coolness of AppFabric Cache.

To begin with, a persistence provider needs to inherit from InstanceStore (System.Runtime.DurableInstancing), which is an abstract class with some operations you need to override. The most important one is BeginTryCommand. This is a universal operation which will be called by the runtime. With it comes an InstancePersistenceCommand parameter which tells you what the runtime expects you to do. This can be any of the following (there are more commands, but these will do for now):

  • CreateWorkflowOwnerCommand
  • SaveWorkflowCommand
  • LoadWorkflowCommand
  • LoadWorkflowByInstanceKeyCommand
  • DeleteWorkflowOwnerCommand

If the last SaveWorkflowCommand has CompleteInstance set to true, this indicates the workflow is done. In my case, this is where I clean up the cache, even though the cache will eventually clean itself up, as I add to the cache using the timeout parameter. You can set this yourself in the web.config:

...
<behaviors>
  <serviceBehaviors>
    <behavior>
      <bLogicalPersistenceStore cacheName="bLogical" timeout="00:05:00"/>
      <serviceMetadata httpGetEnabled="true"/>
      <workflowIdle timeToUnload="00:00:01"/>
      <serviceDebug includeExceptionDetailInFaults="true"/>
    </behavior>
  </serviceBehaviors>
</behaviors>

The LoadWorkflowByInstanceKeyCommand is where I got stuck. As I said in the beginning, it worked, I just couldn’t figure out why. Below is the code in my LoadWorkflowByInstanceKey method:

private IAsyncResult LoadWorkflowByInstanceKey(InstancePersistenceContext context,
                                               LoadWorkflowByInstanceKeyCommand command,
                                               TimeSpan timeout,
                                               AsyncCallback callback,
                                               object state)
{
    Key key = _cacheHelper.GetKey(command.LookupInstanceKey);
    Instance instance = _cacheHelper.GetInstance(key.Instance);
    // ...

I’m loading the workflow using the LookupInstanceKey value, which is a Guid. Where did that come from? I was expecting my correlation key, the Document.Id. I was starting to believe the correlation key was stored (together with the LookupInstanceKey) somewhere else. I started to trace the messages to see if there was anything hidden in the header, but there wasn’t. I tested different clients collaborating on the same workflow instance, and that worked too. I even set up an environment with multiple servers, AND IT STILL WORKED! How was this possible?

As I was losing sleep over this, I turned to Paolo Salvatori, hoping he could shed some light on it. Eventually, Manu Srivastava and Ruppert Koch explained how the magic works. It turns out the LookupInstanceKey is a 128-bit hash of the actual content-based key.
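WF’s actual hashing algorithm is an internal runtime detail, but the general idea is easy to illustrate: reduce an arbitrary content-based key to a fixed 128-bit, GUID-shaped value. The Python sketch below uses MD5 purely because it happens to produce 128 bits; it is not a claim about what the WF runtime actually uses.

```python
import hashlib
import uuid

# Purely illustrative: WF's real key-hashing algorithm is internal to the
# runtime. This only shows how an arbitrary content-based key (such as
# Document.Id) can be reduced to a fixed 128-bit, GUID-shaped lookup value.
def to_lookup_key(content_key: str) -> uuid.UUID:
    digest = hashlib.md5(content_key.encode('utf-8')).digest()  # 16 bytes
    return uuid.UUID(bytes=digest)
```

The important property is that the mapping is deterministic: the same Document.Id always produces the same lookup key, on every server, which is what makes cross-server routing possible.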

Manu Srivastava:

“The Hash == InstanceKey.Value == CorrelationKey == LookupKey. These terms all mean the same thing. InstanceKey.Value contains the value of the hash, which is of type GUID and represents the correlation / lookup key [http://msdn.microsoft.com/en-us/library/system.runtime.durableinstancing.instancekey.aspx]”

“It is the responsibility of the Provider to store the Workflow Instance, the Correlation Keys *and* the mapping between the two. The LoadWorkflowByInstanceKeyCommand has a LookupKey as an argument. This is the correlation key. This is the key the custom provider implementation must use to identify and then return the Workflow Instance. The hashing algorithm used is irrelevant to the implementation of the Provider. The Provider only has to worry about storing the InstanceKey.Value and retrieving an instance via the InstanceKey.Value. The transformation between Document.Id and the InstanceKey is a WF runtime detail that you do not need to worry about for your implementation; thus, you do not need to worry about the hashing algorithm itself. When you persist an instance, save the InstanceKey.Value and its mapping to InstanceId. When you load an instance, use the LookupKey to find the correct InstanceId and return the WorkflowInstance. That's it. You don't need to go from Document.ID to hash or vice-versa...thats handled at the WF Runtime Layer. Just focus on saving InstanceKey.Values and loading by LookupKey. =) InstanceKey.Value == LookupKey in this particular scenario”

In other words: concentrate on the problem, and leave the plumbing to us! Point taken.
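The bookkeeping Manu describes boils down to two maps. Here is a toy in-memory sketch of it in Python (just to illustrate the contract; a real provider would keep both maps in the cache or store, not in process memory, and would deal with serialization, locking and timeouts):

```python
# Toy sketch of the provider's bookkeeping: serialized instances keyed by
# InstanceId, plus a mapping from InstanceKey.Value (the lookup key) to
# InstanceId. Names are illustrative, not taken from the actual sample.
class ToyInstanceStore:
    def __init__(self):
        self._instances = {}   # InstanceId -> serialized instance data
        self._key_to_id = {}   # InstanceKey.Value (lookup key) -> InstanceId

    def save(self, instance_id, instance_keys, data):
        # SaveWorkflowCommand: store the instance and the key mapping.
        self._instances[instance_id] = data
        for key in instance_keys:
            self._key_to_id[key] = instance_id

    def load_by_instance_key(self, lookup_key):
        # LoadWorkflowByInstanceKeyCommand: resolve the lookup key to an
        # InstanceId, then return that instance.
        return self._instances[self._key_to_id[lookup_key]]

    def complete(self, instance_id):
        # SaveWorkflowCommand with CompleteInstance == true: clean up.
        self._instances.pop(instance_id, None)
        self._key_to_id = {k: v for k, v in self._key_to_id.items()
                           if v != instance_id}
```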

Running the sample

Download and install:

Before running the sample, you need to properly configure the environment and in particular you have to create the cache used by the custom Persistence Provider.

To accomplish this task, you can perform the following steps:

1. Download Windows Server AppFabric.

2. Read Scott Hanselman’s post on how to set it up.

3. Open Caching Administration Windows PowerShell through the Start menu, and run the following commands:

  • Start-CacheCluster
  • New-Cache bLogical


4. Download and run the sample.

This is probably a good time to point out that the sample provided with this post is not production ready. However, if you want to take it further, I’d be happy to help out! (</disclaimer>)

Again, many thanks to Paolo Salvatori, Manu Srivastava and Ruppert Koch from the AppFabric CAT and Product group.

HTH
