Earlier this month Microsoft announced the (initial limited) availability of a PST import service for Office 365, the aim being to let businesses import PST data into their users’ mailboxes without resorting to third-party tools (Microsoft’s own PST Capture Tool, BitTitan’s Personal Archive and the MessageOps Office 365 Exchange Migration Tool are examples that come to mind). As of the 14th of May, the service is generally available, though I’d personally still call it a preview as it’s lacking some features. At this stage the service is free, though it seems Microsoft will charge for it in the future.
I’m a long-time user of hosted email services (and email in general): I used hosted POP/IMAP until about 5 years ago, then hosted Exchange email from Intermedia, and switched to Office 365 earlier this year. I only ever delete spam and advertising emails, keeping the rest. I used to archive to PST periodically (every 6 months or so – I last did this about 5 years ago, when I moved to a near-unlimited Intermedia plan) so that I could move historical email out of my mailbox to reduce its size and stay within limits. As a result of all this, I’ve accrued a large number of PST files full of email that I’ve wanted to consolidate and make searchable without having to attach them to Outlook or use third-party tools to open them.
With the Import Preview available, I decided to test the tool on my email archives. I had 30GB of PST files to import, and this seemed like a good time to do it.
First, an overview…
There are two methods that Microsoft have made available, with information for both available via a Technet article:
- Import PSTs by shipping an encrypted drive (or drives) to Microsoft
- Import PSTs by uploading them via the network
I’m only covering the network upload based mechanism in this post.
According to the Technet article above, shipping a drive suits large datasets where network upload performance would make the transfer impractical. That is, if you have a slow upload link and a reasonably sized dataset, it may be faster to send a drive to Microsoft, have them copy it to Azure and make it accessible to you that way. An example might be 1TB of PST files on a 10Mbit upload. Given that a 10Mbit service delivers only around 1MByte/second in real-world conditions, you’re looking at roughly 11-12 days to upload the data even at 100% utilisation without any interruption – not a realistic scenario. In that case, it makes sense to ship drives, since you’d complete the transfer faster and more reliably.
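The back-of-the-envelope maths above can be sketched as follows (the figures are illustrative, matching the 1TB-over-10Mbit example, and assume a sustained ~1MByte/second of real-world throughput):

```python
# Rough upload-time estimate for a PST dataset over a given link.
# Assumes ~1 MByte/second of sustained real-world throughput on a
# 10Mbit service, as per the example above.

def upload_days(dataset_gb: float, throughput_mb_per_s: float) -> float:
    """Days needed to upload dataset_gb gigabytes at the given MB/s."""
    seconds = (dataset_gb * 1024) / throughput_mb_per_s
    return seconds / 86400

# 1TB (~1000 GB) at 1 MB/s works out to roughly 11-12 days of
# uninterrupted, fully utilised transfer.
print(round(upload_days(1000, 1.0), 1))  # ~11.9
```

Any interruption or contention on the link only pushes that figure higher, which is why the drive-shipping option exists at all.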
For my situation, I had access to a 100Mbit link that allowed me to upload all my data rapidly. Additionally, I wasn’t under the time constraints that would exist if this were a project for a client. It’s my stuff – I didn’t particularly care how quickly it happened, just that it got into Office 365. Given the speed of my Internet access and the size of the dataset, the network upload function was the obvious choice.
At a high level, the tasks you’re going to undertake are:
- Upload your PST files to the Azure Blob storage provided by this service
- Establish the proper permissions for the Office 365 Admin account being utilised for this import
- Create a PST User Mapping file
- Create a PST Import job, utilising the PST User Mapping file from the previous step
Let’s start with the annoying things about this service:
- The instructions provided aren’t entirely correct, as you’ll see below
- The PST Import functionality cannot distinguish between duplicate items, it just imports everything
- It isn’t as fast as I’d have hoped
The duplication issue is an annoying one. Along the way, I’ve actually exported some messages two or three times, resulting in lots of duplicates that need to be filtered out. Unfortunately, there’s no duplicate checking in place, meaning you need to ensure your source data is clean before you upload the PSTs. I found this out after the fact, so I now need to decide whether to delete all the messages and perform a re-upload, or work out a way to delete the duplicate messages within the mailboxes as they stand. Additionally, the import speed isn’t the fastest: the first 12GB of email took in excess of 4 days to import, which is excessively long by anyone’s standards.
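Since the service does no duplicate checking, any cleanup has to happen on your side, either before upload or after import. As a rough sketch of the idea – not a supported tool, and the message attributes used here are assumptions about what you’d extract from your PSTs – duplicates can be flagged by keying each message on a few stable fields:

```python
# Sketch: flag duplicate messages by a composite key of stable fields.
# The dict fields (message_id, sent, sender) are assumptions; in practice
# you'd extract these from your PSTs or mailboxes with whatever tooling
# you have available.

def duplicate_keys(messages):
    """Return the set of composite keys that appear more than once."""
    seen, dupes = set(), set()
    for msg in messages:
        key = (msg["message_id"], msg["sent"], msg["sender"])
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes

msgs = [
    {"message_id": "<a@x>", "sent": "2015-05-15", "sender": "a@example.com"},
    {"message_id": "<a@x>", "sent": "2015-05-15", "sender": "a@example.com"},
    {"message_id": "<b@x>", "sent": "2015-05-15", "sender": "b@example.com"},
]
print(len(duplicate_keys(msgs)))  # 1 duplicated key
```

The same keying approach works whether you’re pruning the source PSTs before upload or hunting down duplicates in the mailboxes afterwards.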
That said, it did get the data in, and it was a relatively painless process once I worked out the intricacies of the tools. I’ve detailed my steps below, so hopefully others don’t need to go down the same path as I did.
Initiating a network upload of your PSTs
It should go without saying that the data needs to have been organised beforehand. If you haven’t yet done this, I strongly suggest it as it will make your life much easier. For this article, I will assume you’ve got a naming convention for all your PST files that makes it simple for you to create the CSV file.
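As an illustration of why a naming convention pays off, here’s a small sketch that inventories PST filenames against their likely destination mailboxes, ready to feed into the mapping CSV later. The owner-based filename convention and the example.com domain are my assumptions for the example, not anything the service requires – adjust to your own scheme:

```python
# Sketch: inventory a folder of PSTs as a starting point for the mapping
# CSV. Assumes files are named after their owners with a suffix, e.g.
# "jane.doe-2012.pst" -> jane.doe@example.com. Both the convention and
# the domain are assumptions - adjust to your environment.
import pathlib

DOMAIN = "example.com"

def inventory(pst_names):
    """Map each PST filename to its likely destination mailbox."""
    rows = []
    for name in sorted(pst_names):
        owner = pathlib.Path(name).stem.rsplit("-", 1)[0]
        rows.append((name, f"{owner}@{DOMAIN}"))
    return rows

for name, mailbox in inventory(["jane.doe-2012.pst", "jane.doe-2013.pst"]):
    print(name, "->", mailbox)
```

If your PSTs aren’t named consistently, renaming them now is far less painful than untangling a mis-targeted import later.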
On the assumption you’re reading this whilst preparing to do this in a live environment, take into consideration the issues I’ve mentioned earlier relating to duplicate messages. If you know the PSTs have duplicates you need to inform your users about it, in order to avoid confusion. Also ensure you’re communicating a reasonable schedule to your users – I know this seems simple, but the number of times I’ve seen projects fail to gain user acceptance means I tend to harp on about this.
Performing the PST upload with Azure AZCopy
First things first: download the AZCopy tool – you can do this via the “Import Data” section. Of the 4 icons across the top, the first 2 are important. Clicking on the first icon (shaped as a + symbol) allows you to select the “Upload files over the network” option (click this):
This will bring up the below screen, listing various things you’ll need:
You need to do the following:
- Download the tool
- Copy the secure storage account key
- Copy the secure URL
You can access this information by clicking on the second icon (shaped as a key symbol), which yields the following screen/pop-up:
For reference, this is what your information should look like:
- Secure storage account key: bVqiKup0j7Bxik39vSN/zV3UveKHWsvmYb+NsvyoJs2Dhb8kOYIqmE2IPuC7uA7h3dTKJ1EmA0fWF3lPtBYtqA==
- Secure URL: https://d1717df0c99e4157xyz520a.blob.core.windows.net/ingestiondata
Note: these are examples. Do not try to use them, they will not work.
At this point, you have two options. The first is to install the tool on whatever server the PST files reside on. Alternatively, you can install it to a workstation (highly recommended) and run it from there. I’ll provide basic instructions in both cases.
Once installed, you need to initiate the upload of the PST files through the command prompt, as the Azure AZCopy tool is CLI based. Ensure that the PST files themselves are accessible via a local file path or UNC path. If using a UNC path, ensure that the user you’re logged in with has access to the files. Lastly, double check the paths.
Pick the following command, based on your configuration. The first is for locally stored PSTs, the second is for remotely stored PSTs:
AzCopy /source:C:\PST /dest:https://UNIQUE-AZURE-ID-HERE.blob.core.windows.net/ingestiondata/LOCALVM/PST /destkey: /S /V:C:\PST\LOG\PST_Upload.log
AzCopy /source:\\SERVERNAME\SHARE /dest:https://UNIQUE-AZURE-ID-HERE.blob.core.windows.net/ingestiondata/SERVERNAME/SHARE /destkey: /S /V:C:\TEMP\PST_Upload.log
With the syntax of the command, there are a few things to note:
- You define the actual destination path in the URL – in the “locally accessible” command (which is what I actually used – I spun up a VM for this), the URL ends with /LOCALVM/PST, and similarly for the “remotely accessible” command. This path is important later on
- Paste the secure storage account key you copied earlier after /destkey: – and make sure you copy the whole key, as any missing characters will cause you problems
- I’ve specified additional logging in these commands, as I believe that’s critical for troubleshooting should it be necessary. Make sure you adjust the path for the log file to a location you will remember
After you run the command, it will log in and commence uploading the files. Speed is dependent on the usual range of factors when transferring over the Internet.
On completion of the upload, this is what the log file looks like (this is a modified version of mine):
[2015/05/15 15:28:06.255+10:00] >>>>>>>>>>>>>>>>
[2015/05/15 15:28:06.270+10:00][VERBOSE] 3.1.0 : AzCopy /source:C:\PST /dest:https://.blob.core.windows.net/ingestiondata/LOCALVM/PST /destkey:****** /S /V:C:\TEMP\PST_Upload.log
[2015/05/15 15:28:07.833+10:00][VERBOSE] Start transfer: Name_of_PST_File-1.pst
[2015/05/15 15:28:07.848+10:00][VERBOSE] Start transfer: Name_of_PST_File-2.pst
[2015/05/15 15:43:01.127+10:00][VERBOSE] Finished transfer: Name_of_PST_File-1.pst
[2015/05/15 15:43:21.753+10:00][VERBOSE] Finished transfer: Name_of_PST_File-2.pst
[2015/05/15 15:49:13.759+10:00] Transfer summary:
Total files transferred: 2
Transfer successfully: 2
Transfer skipped: 0
Transfer failed: 0
Elapsed time: 00.00:12:07
Next step is to set up the permissions for the account importing the PSTs.
Granting permissions for the PST import
If you read the Technet article I linked earlier, it says to assign the Compliance Management role to the appropriate admin account. The problem here is that I haven’t yet seen an Office 365 account in which that admin role provides the “Mailbox Import Export” role. I’ve flagged it with Microsoft, so hopefully they fix the permissions or update the article. In any event, I’m going to show you how to create a specific admin role that provides the relevant permissions.
The first step is to click the “Exchange” link under the “Admin” section of the Office 365 Admin panel, shown below:
Once the page has loaded, select “Permissions” on the left hand panel and click the + icon to create a new admin role:
Within the resulting page, name the role (I suggest a descriptive title) and then select the Role and Member. The role you select needs to be the one titled “Mailbox Import Export”:
Click “Save” to complete this task.
PST User Mapping file
You will need to download the PST Mapping example file, which you can do here. The relevant fields in this are:
- Workload – Set this to “Exchange”. At this stage, the Office 365 Import tool only works with Exchange, but I imagine that Microsoft will introduce further enhancements at a later stage
- FilePath – This is the path listed in the URL. As mentioned before, it’s important as it tells the Import tool where the actual PST files are. If you used the “locally accessible” command, you’d enter “LOCALVM/PST” in this field; if you used the “remotely accessible” command, it’d be “SERVERNAME/SHARE”
- Name – The actual filename of the PST
- Mailbox – The destination mailbox for the PST. Use the primary email associated with the account. This will apply for Shared Mailboxes and User Mailboxes equally
- IsArchive – Set this to TRUE to import the data directly into the user’s Exchange Archive, as opposed to their primary mailbox. If you have archival policies in place, this may factor into how you import the PSTs
- TargetRootFolder – This defines the root-level folder the PST gets imported into. If you leave it blank, the contents of the PST are imported at the root level, merging into the mailbox’s existing folders; if you specify a folder, a top-level folder of that name is created and the data is imported into it
As an example, this is a pre-filled out version which I’ll explain:
In this example:
- On the first line, I’m directing “Name_of_PST_File-1.pst” to go to the primary mailbox of “email@example.com”, within a root level folder called “FolderName”
- On the second line, “Name_of_PST_File-2.pst” will go to the archive mailbox of “firstname.lastname@example.org”
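Put together, the mapping file for the two examples above would look something like this. Only the fields discussed here are shown, and the addresses and folder name are placeholders – Microsoft’s example file is the authoritative template and may contain additional columns:

```csv
Workload,FilePath,Name,Mailbox,IsArchive,TargetRootFolder
Exchange,LOCALVM/PST,Name_of_PST_File-1.pst,email@example.com,FALSE,FolderName
Exchange,LOCALVM/PST,Name_of_PST_File-2.pst,firstname.lastname@example.org,TRUE,
```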
We’re now ready to initiate the actual import of the archives.
Initiating the import of PST data
We start by selecting the “Upload files over the network” option:
As we’ve now uploaded the files and prepared the mapping file, select both checkboxes and click “Next”:
Enter in a useful name, so you can track the progress of the job. For this post, I called it “pst_import_attempt_5”:
Click the + icon and select the PST Mapping File you’ve created. Once done, check the box and click “Finish”:
The import will kick off. You should check back in 5 minutes to see if the job has started, or if it has encountered any errors.
Checking the status can be achieved by selecting the job you’ve created, then selecting “View Details” in the right hand panel:
This takes you to the details of the job, where you can review the status of the tasks and of each PST file. Below are the details of one of my import jobs:
If you’re going to import a large number of files, there might be value in establishing a job per user. This will allow you to watch the individual progress of each user’s PST import. With any luck, it should have worked and your PST files are being imported.
If you’re using this for a large-scale import, make sure you plan it out. I suspect the biggest issue will be getting users to understand what is happening and how it impacts them. The tool itself is relatively simple to use once you understand it.