There is a wonderful mystery that revolves around Citrix Provisioning Services (PVS) and DFS-R. You’ll need to be using, and be familiar with, the technology to benefit from this article. Why is it that something that should “just work” doesn’t? What’s going on, where did I go wrong, what am I missing?
Let me paint a scenario for you based on a real world example.
You have three PVS machines up and running, and you’ve got your initial gold master image working fine. All the PVS machines are load balanced, and DFS-R is configured properly for this environment (your “master” PVS server is set to replicate the VHD store out to the other servers, which must not be configured in read-only mode and are set to receive only).
You’ve finally come to a point where you need to push out an update. You create your new version in the PVS console, boot up your target device for gold master maintenance, do what you need to do and shut down. So far so good.
DFS-R kicks in and proceeds to replicate your new VHDs to the other hosts. PVS console reports replication status is good and you’re all set to put this baby into production.
One evening you hang back after work. You place the new version of the VHD into production and immediately PVS console is issuing a replication status warning. The other two PVS machines are no longer in sync. What happened?
All of the date/time stamps match perfectly, and the file sizes are identical. Why oh why are you out of sync? At this point it was late and I wanted to go home, so I placed the version back into Maintenance mode and assumed that come Monday morning DFS-R would have brought the servers back in sync. Guess what? DFS-R didn’t. The servers that were out of sync remained out of sync. Even robocopy said the files were all in sync.
This really was bugging me, and it wasn’t until I sat down with another person, who lives and breathes Citrix and was similarly puzzled by this odd behavior, that we started to work it out.
So we did some really simple tests. First, I got rid of the version I wanted to work with so we could start clean again from a working, established gold master version. For the purposes of these tests we used robocopy to handle our replication, as we needed to force replication for the testing.
What we did was this:
- created a new version
- robocopy’d the new files (avhd, pvp, lok) to the other servers
- PVS console reported replication status good
- promoted the version into production – replication warning immediately
- re-ran robocopy
- still got a replication warning.
Something was happening to these files that PVS was able to detect as a difference, yet robocopy didn’t. Our observation, based on a series of tests like this, suggests that robocopy uses the filename, date/time stamp and file size to determine whether the files are the same. It’s our assumption that DFS-R works in the same way, although we didn’t explicitly test DFS-R in this case due to time constraints.
Having observed this behavior, we subtly altered the process as follows:
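To illustrate the gap between the two kinds of comparison, here is a minimal Python sketch. It is not what PVS, robocopy or DFS-R actually run internally; it only contrasts a size/timestamp check (roughly what we believe robocopy relies on) with a full content hash. The function names are made up for this example.

```python
import hashlib
import os


def metadata_signature(path):
    """Return (size, mtime), roughly what a size/timestamp comparison sees."""
    st = os.stat(path)
    return st.st_size, int(st.st_mtime)


def content_hash(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(source, replica):
    """Report whether two copies match by metadata and by content."""
    return {
        "metadata_match": metadata_signature(source) == metadata_signature(replica),
        "content_match": content_hash(source) == content_hash(replica),
    }
```

Pointing `compare_copies` at the same version file on the master store and on a replica store (the paths are whatever your own stores use) would be expected to show `metadata_match` as true and `content_match` as false in the situation described above. Hashing a large VHD takes a while, so this is a diagnostic rather than something to run routinely.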
- created a new version
- placed the current production image into override status
- promoted the new version into production – not bootable due to override
- robocopy’d the new files to the other servers
- PVS reported replication status good
At this stage all three servers are in sync, and the version is marked as production but is not yet being used for booting.
It’s clear from these tests that the act of promoting a version into production makes an internal change in the file, one that does not cause a change in file size or date/time stamp, and something that PVS is clearly checking for and is aware of.
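If you want to see for yourself where such an internal change sits, a byte-by-byte comparison of a pre-promotion copy against the promoted file will show it. The following is a rough Python sketch under that assumption; the function name is hypothetical, and we did not establish which of the version files (avhd, pvp, lok) actually changes.

```python
def first_difference(path_a, path_b, chunk_size=1024 * 1024):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                # Walk the mismatched chunk to find the exact byte.
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # No differing byte in the overlap: one file is shorter.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                # Both files ended at the same point with no mismatch.
                return None
            offset += len(chunk_a)
```

A result of `None` means the two copies are byte-for-byte identical; any number is the offset where they first diverge, which can then be inspected in a hex viewer.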
This is from a real site that was relying on DFS-R to handle its replication requirements for Citrix Provisioning Services. Now they’ve swapped to robocopy for the short term and altered the process of doing updates. Versions are marked for production prior to robocopy running and replication status is consistently reported as good when new versions are created. The override feature is now used as the method to control the release of a version for actual production use.
Individual results may vary, of course. I posted this because I couldn’t find anyone or anything on the subject to help. Only by sitting side by side and working it through logically did we find this behavior.