CloudPrep 2014 Development Update

I wanted to take a few minutes today to talk about what we've been doing since late January regarding CloudPrep and the PowerShell commands for file migration and management of SharePoint Online.

First thing I can say is that one of our most difficult decisions was choosing an e-commerce platform and licensing API for our product. Even though we plan to keep our licensing fairly simple, we wanted to have options for future products as well as for many of the items we also sell through our partners.

This turned out to be more challenging than I imagined, but we have settled on FastSpring and LogicNP CryptoLicensing. Perhaps in some future post I will talk about those more from a software developer's perspective. What I can say today is that it will be at least a couple of weeks before we can get a working prototype of the licensing server and the store online, so we have had to push the release back to late March or early April, mostly for that reason.

Meanwhile, we have been developing features for the different editions of CloudPrep 2014. Progress on that front continues at a rapid pace and I am pretty satisfied with the way our tools are maturing.

When we decided to produce this software, we planned to release the lite and standard editions first and follow up with premium and professional features later this spring. Given where we are putting our development efforts, though, I was a bit surprised to realize that all four editions of CloudPrep will probably be available at the same time.

Now for the geeky stuff. Here's some of what's been happening as we've been building.

Features we've essentially completed:

  • Upload an entire folder or a specific set of files to a document library
    We've tested that these commands will work against network drives and UNC paths. Take that, OneDrive!
  • Preserve metadata about the local file system that the document was uploaded from
  • Created and Modified dates on files are preserved (see the sketch after this list), though we did find that with larger files there are limits to what we can accomplish here
  • You can specify the content type for uploaded files, root folders, and sub-folders - including Document Set and its child content types
  • A bunch of other random stuff, including commands for manipulating SharePoint lists and reports that make sure file uploads won't exceed SharePoint limits
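
As a rough illustration of the date-stamp preservation above, here's the shape of the CSOM calls involved. This is a sketch, not CloudPrep's actual code; the URLs, credentials, and assembly paths are placeholders, and it assumes the SharePoint Online Client Components SDK is installed.

    # Load the SharePoint Online CSOM assemblies (path assumes the SDK's
    # default install location - adjust for your machine).
    Add-Type -Path 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\ISAPI\Microsoft.SharePoint.Client.dll'
    Add-Type -Path 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\15\ISAPI\Microsoft.SharePoint.Client.Runtime.dll'

    # Connect to a (placeholder) site collection.
    $context = New-Object Microsoft.SharePoint.Client.ClientContext('https://contoso.sharepoint.com/sites/demo')
    $securePwd = Read-Host -AsSecureString 'Password'
    $context.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials('admin@contoso.com', $securePwd)

    # After uploading, stamp the local file's dates onto the list item.
    $localFile = Get-Item 'C:\Docs\report.docx'
    $file = $context.Web.GetFileByServerRelativeUrl('/sites/demo/Shared Documents/report.docx')
    $item = $file.ListItemAllFields
    $item['Created']  = $localFile.CreationTimeUtc
    $item['Modified'] = $localFile.LastWriteTimeUtc
    $item.Update()
    $context.ExecuteQuery()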


We noticed that Office 365 throws us a lot of connectivity errors that we don't normally see in on-premises SharePoint environments. If you've been trying to copy files using the standard UI or OneDrive, some of these errors might be hidden from you. However, they're readily apparent if you connect using Web Folders (WebDAV) or the Client-Side Object Model (CSOM). We see unexpected dropped connections quite often, and certain upload methods time out on files that are too big, which required some fun workarounds. Different methods are needed for files under 2 MB, under 35 MB, and larger.
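To make the size-based dispatch concrete, here's a sketch of the pattern (our illustration, not CloudPrep's shipping code; the 2 MB cutoff mirrors the limits discussed above). It assumes a $Context like the one in the earlier sketch and a $Folder whose ServerRelativeUrl property has already been loaded.

    function Send-FileToFolder {
        param(
            [Microsoft.SharePoint.Client.ClientContext]$Context,
            [Microsoft.SharePoint.Client.Folder]$Folder,
            [string]$LocalPath
        )
        $local = Get-Item $LocalPath
        if ($local.Length -lt 2MB) {
            # Small files travel inside the request body itself.
            $info = New-Object Microsoft.SharePoint.Client.FileCreationInformation
            $info.Url = $local.Name
            $info.Content = [System.IO.File]::ReadAllBytes($LocalPath)
            $info.Overwrite = $true
            $Folder.Files.Add($info) | Out-Null
            $Context.ExecuteQuery()
        }
        else {
            # Bigger files are streamed, which sidesteps the request-size
            # ceiling but is where the timeouts on huge uploads show up.
            $stream = [System.IO.File]::OpenRead($LocalPath)
            try {
                $url = "$($Folder.ServerRelativeUrl)/$($local.Name)"
                [Microsoft.SharePoint.Client.File]::SaveBinaryDirect($Context, $url, $stream, $true)
            }
            finally { $stream.Dispose() }
        }
    }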

Our path was also complicated by the fact that on certain Office 365 sites, our rights come from delegated admin privileges. This is the preferred way for consultants to get the rights to help clients manage SharePoint Online, so we figure a lot of folks who are interested in CloudPrep are seeing this phenomenon as well. When you log in to a client's Office 365 site with delegated admin credentials from your own Office 365 account, you sometimes see the access denied page; log in again a few seconds later, and everything is fine. Our code had to expect and handle this contingency.
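The handling amounts to a retry wrapper along these lines (the function name is ours, for illustration): catch the transient access-denied error, wait a few seconds, and try again.

    # Retry a CSOM call a few times when Office 365 intermittently returns
    # "access denied" right after sign-in. Assumes the CSOM assemblies are
    # loaded, as in the earlier sketch.
    function Invoke-WithRetry {
        param(
            [scriptblock]$Action,
            [int]$MaxAttempts = 3,
            [int]$DelaySeconds = 5
        )
        for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
            try {
                return & $Action
            }
            catch [Microsoft.SharePoint.Client.ServerUnauthorizedAccessException] {
                if ($attempt -eq $MaxAttempts) { throw }
                Start-Sleep -Seconds $DelaySeconds
            }
        }
    }

    # Usage: wrap any ExecuteQuery call that might hit the transient denial.
    Invoke-WithRetry { $context.ExecuteQuery() }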

Another thing we did not expect is that we're seeing reasonable evidence that Office 365 uploads are being throttled. Most of the time, file transfers seem to be limited to about 300 KB/sec; there are days when the transfer speed is even slower than that, sometimes by half. As a result, it is difficult for us to estimate file upload times, and we're having to improve our algorithms to take these fluctuations and sea changes into account.
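One way to absorb those swings is to smooth the observed throughput rather than trust the last transfer. Here's a toy sketch of that idea (the numbers and the weighting scheme are illustrative, not what we've settled on): an exponentially weighted moving average keeps the remaining-time estimate stable when the rate bounces between ~300 KB/sec and half that.

    $alpha = 0.2            # weight on the newest observation
    $estimateKBps = 300.0   # seed near the typical rate we observe
    $samples = @(
        @{ SizeKB = 2048; Seconds = 7 },   # ~293 KB/sec
        @{ SizeKB = 1024; Seconds = 8 }    # ~128 KB/sec - a throttled spell
    )
    foreach ($s in $samples) {
        $observed = $s.SizeKB / $s.Seconds
        $estimateKBps = $alpha * $observed + (1 - $alpha) * $estimateKBps
    }
    $remainingKB = 500000
    "{0:N0} seconds remaining (est.)" -f ($remainingKB / $estimateKBps)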

As for the cause, we can't say if this is something Microsoft is doing, or if it comes from the erosion of net neutrality. We do wonder if Comcast or other providers may be limiting traffic to Office 365 in order to give their own offerings a competitive advantage or just to control their own costs. I expect we'll be doing some tests in the near future, and we've been kicking around some ways to circumvent these bandwidth caps - at least partially. One test we did in January showed that if we took half our files to a different physical location, we were able to upload them to SharePoint Online in about half the time it would have taken if we'd uploaded them all from one server.

One thing that became clear during early development was that the disconnected nature of cloud storage was going to introduce random problems along the way. As a result, in any large set of documents to be moved to the cloud, there will be some that, for one reason or another, are not successfully copied. We started by trying to get this failure rate as low as possible, down to less than 0.25% of files in most cases. We did a lot of work in early February to improve the code and reach this threshold.

Even so, we needed to be able to easily run multiple passes on any file copy operation and track the results. Our first prototypes had to crawl the Document Library in SharePoint one folder and file at a time. This proved to be incredibly slow, and it quickly became apparent that we needed to be able to gather status information for thousands of files at a time if we wanted to home in on only those which required an update from the local copy. This is something we added to the code base about a week ago, and we're now in the process of replacing some of our early code to use the new file comparison analytics logic.
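For the curious, the bulk gathering boils down to a paged CAML query that walks every file in a library, thousands of rows per round trip, instead of crawling folder by folder. This is our illustration of the technique, not CloudPrep's actual code; it assumes a $context like the one in the earlier sketch.

    $list = $context.Web.Lists.GetByTitle('Documents')
    $query = New-Object Microsoft.SharePoint.Client.CamlQuery
    $query.ViewXml = @"
    <View Scope='RecursiveAll'>
      <ViewFields>
        <FieldRef Name='FileRef'/><FieldRef Name='File_x0020_Size'/>
        <FieldRef Name='Modified'/>
      </ViewFields>
      <RowLimit Paged='TRUE'>2000</RowLimit>
    </View>
    "@
    do {
        $items = $list.GetItems($query)
        $context.Load($items)
        $context.ExecuteQuery()
        foreach ($item in $items) {
            # Compare server-side metadata against the local copy here.
            $item['FileRef']
        }
        # Carry the paging cookie forward to fetch the next batch.
        $query.ListItemCollectionPosition = $items.ListItemCollectionPosition
    } while ($query.ListItemCollectionPosition -ne $null)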

As a side note, a bevy of SharePoint management features found their way into our PowerShell library simply because we had customers who needed them in short order. For example, we now have the ability to take a View from any SharePoint List and make a copy of it in the same List or in a different one, even on another SharePoint Site. Of course, one must be very careful with this kind of power, since creating Views with field references that don't exist in the List will certainly break the View, if not the entire List itself. Once we've added sufficient safety checks, we'll open the capability up as part of the CloudPrep product.
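The rough shape of the View copy looks like this (a sketch under the same assumptions as the earlier snippets; list and View names are placeholders, and the shipping command adds the safety checks just mentioned):

    $sourceList = $context.Web.Lists.GetByTitle('Invoices')
    $targetList = $context.Web.Lists.GetByTitle('Archived Invoices')
    $sourceView = $sourceList.Views.GetByTitle('By Due Date')
    $context.Load($sourceView)
    $context.Load($sourceView.ViewFields)
    $context.ExecuteQuery()

    $info = New-Object Microsoft.SharePoint.Client.ViewCreationInformation
    $info.Title = $sourceView.Title
    $info.Query = $sourceView.ViewQuery
    $info.RowLimit = $sourceView.RowLimit
    $info.ViewFields = @($sourceView.ViewFields)  # field internal names
    # NOTE: this is the step where field references that don't exist in
    # the target list would break the View - validate them first.
    $targetList.Views.Add($info) | Out-Null
    $context.ExecuteQuery()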

This week, we introduced the concept of using a hash algorithm to test whether files in SharePoint match those on the local drive. Use of a hash in addition to checking the file size and date stamps of a document ensures that the document has been uploaded into SharePoint and that it has not been corrupted in the process. We developed this ability in order to add credibility to Office 365 migrations where we may be moving hundreds of thousands or even millions of files, and we need to establish that the migration process has been completed satisfactorily. This capability can also be used to perform duplicate file detection, and we may develop a follow-on product or feature to do just that later on.
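Conceptually, the check works like this (a sketch, assuming SHA-256 and a $context as in the earlier snippets; CloudPrep's actual algorithm and plumbing may differ): hash the local file, stream the server copy back and hash it too, then flag any mismatch.

    function Get-StreamHashHex([System.IO.Stream]$Stream) {
        $sha = [System.Security.Cryptography.SHA256]::Create()
        try {
            ($sha.ComputeHash($Stream) | ForEach-Object { $_.ToString('x2') }) -join ''
        }
        finally { $sha.Dispose() }
    }

    $localStream = [System.IO.File]::OpenRead('C:\Docs\contract.pdf')
    $localHash = Get-StreamHashHex $localStream
    $localStream.Dispose()

    # OpenBinaryDirect streams the server copy back without loading the
    # whole file into memory at once.
    $fileInfo = [Microsoft.SharePoint.Client.File]::OpenBinaryDirect(
        $context, '/sites/demo/Shared Documents/contract.pdf')
    $remoteHash = Get-StreamHashHex $fileInfo.Stream
    $fileInfo.Stream.Dispose()

    if ($localHash -ne $remoteHash) { Write-Warning 'Hash mismatch: possible corruption.' }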

Next week, we're planning to work on some important features that we feel are a must for getting this product to where we want it to be.

The first is to make sure that we can translate between Active Directory permissions on the local file system and users in SharePoint. The primary purpose here is to preserve meaningful data for the Created By and Modified By fields in SharePoint; this is something we can't do yet. As part of this process, we'll be introducing PowerShell commands to add new users to SharePoint sites and manage groups. For most customers, this is probably of limited use. However, those with several hundred users or groups to manage will find it much easier to deal with them via PowerShell than through the SharePoint admin web pages. For consultants, it will make migrations faster by reducing the time it takes to implement the security configuration. Our goal here is to lower the cost of our migration services.
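At the CSOM level, the mapping step looks roughly like this (a sketch; the claims-format login is a placeholder, and CloudPrep's actual translation logic is more involved). EnsureUser adds the account to the site if it isn't there yet, and Author/Editor are the internal names for Created By and Modified By. It assumes $context and $file from the earlier sketches.

    $user = $context.Web.EnsureUser('i:0#.f|membership|jane@contoso.com')
    $context.Load($user)
    $context.ExecuteQuery()

    $value = New-Object Microsoft.SharePoint.Client.FieldUserValue
    $value.LookupId = $user.Id

    $item = $file.ListItemAllFields
    $item['Author'] = $value   # Created By
    $item['Editor'] = $value   # Modified By
    $item.Update()
    $context.ExecuteQuery()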

After that, the next things we'll work on will be:

  • Download documents from SharePoint to the local drive
  • Assign metadata from a CSV file as you upload documents (sketched below)
  • Flatten a folder structure as you upload it


These are harder to do than you might think. I'll post more on this in coming weeks, including our challenges and progress updates.
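To make the CSV item concrete, here's a rough sketch of what we have in mind (the column names, field names, and file layout are hypothetical - this feature isn't built yet). It assumes $context and $web = $context.Web from the earlier sketches.

    Import-Csv 'C:\Migration\metadata.csv' | ForEach-Object {
        $file = $web.GetFileByServerRelativeUrl($_.ServerRelativeUrl)
        $item = $file.ListItemAllFields
        $item['Department']    = $_.Department
        $item['DocumentOwner'] = $_.Owner
        $item.Update()
    }
    $context.ExecuteQuery()   # CSOM batches the queued updates into one call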

What tool should you use to deploy SharePoint projects?


At this point, I have tried them all. I hand-rolled my solutions; used early versions of WSPBuilder; struggled with the interface for VSeWSS 1.0, 1.1 CTP, and 1.1 final; and even tried a few offbeat solutions.

Until this summer, every method had serious problems that made packaging and deployment problematic. Revisiting my posts, I see that I never mentioned this before, but I owe props to my colleague Ted Calhoon, who turned me on to the WSPBuilder Extensions. I switched over, and I never looked back.

http://www.codeplex.com/wspbuilder/Release/ProjectReleases.aspx?ReleaseId=16820

They are a bit counterintuitive, and the commands to flesh out the folder structure are well buried, but once you get the hang of them, they are the best around - at least until someone comes up with something better.

In a future post, I'll do a walkthrough.

Update 5/1/2010: Microsoft has finally delivered a tool of their own with Visual Studio 2010. We've played with it a few times, and we may still need to use the WSPBuilder Extensions for some time to support our legacy projects. They're very similar, and it appears that the SharePoint team got this part right for the new version.