I recently received a request at my client site to review a custom-built feature that was no longer functioning properly within our SharePoint environment. The feature was designed to copy selected files from a project site to the team site of the group that would take over daily operations of the project once the solution was in production. Once copied, the documents were declared as records within our Record Management System. The process was to select the files to be moved and then, from the ribbon, select the option to set the destination. This is the part that was not working.

I grabbed a few sites from production (including the ones specifically affected), moved them to my development machine, and did some testing, hoping to re-create the problem. At this point I suspected that something in a recent CU we had applied (October 2015) might be the culprit, since the feature, while infrequently used, had worked after it first went into production. However, when I tested in my dev environment it worked successfully. It just seemed to take a while. So off to the ULS logs I went.

After some looking around I found the following error in the logs:

System.Threading.ThreadAbortException: Thread was being aborted.

Before this error message, the log shows the process accessing sites within SharePoint, so I knew it was working. This told me the process was most likely timing out, which produced the aborted message (the GUID for the aborted message matched the GUID for the process). So since it was working, I needed to find out why it appeared not to be.

Looping through Sites and Lists in SharePoint

When I reviewed the code I found the culprit. The developer started at the root web application and dug down from there. They looped through each site collection to find the sites the user had access to. From that list of sites, they checked which libraries within each site the user had access to, then finally checked which folders the user had access to, and built a TreeNode with this data. This sounds pretty standard, and while I don’t disagree that it is a fairly easy way to get the data needed, in this client’s environment it is the worst method to gather the necessary data. The reason: the client has allowed the users to create folders all over their team sites. They have basically recreated their network drives within SharePoint. So in some sites there are over 10,000 folders for the code to loop through. If you have a user with a great deal of access, this is going to take a long time to come back, which is what generated the trouble ticket.

So if I shouldn’t loop, what should I do?

I can’t give you a definitive answer for all cases. There isn’t one. However, I can give you some ideas that may help you consider other solutions. For example, if you have read my previous posts, you will have seen the one where I discussed using SPViews instead of iterating through the sites. For this particular case, however, I did the following:

  • Modified the data gathering as follows:
    • Instead of getting everything, I first grabbed a list of all sites so the user could see what is available. Nothing else was done, so it was very quick.
    • I built the site list into links. When a link is clicked, the system builds only the list of libraries and folders within that site that the user has direct access to. So you are no longer building a list of sites, libraries, and folders all at once; you build each part only when it is needed.
    • On top of that, instead of iterating through each folder to find the ones the user has access to, I wrote an SPQuery that returns a list of folders. Let SharePoint and SQL Server do the work. They are built for it. I will illustrate the code I wrote below.
  • Added the sites, libraries, and folders to the TreeNode logic already in place.
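The "build it only when needed" idea can be sketched without any SharePoint types. This is a minimal illustration, not the code from the actual feature: `LazyNode`, `Expand`, and `IsLoaded` are hypothetical names, and the loader delegate stands in for whatever call fetches the libraries and folders for a site.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type: children are fetched the first time the node is expanded.
public class LazyNode
{
    private readonly Func<IList<LazyNode>> _loadChildren;
    private IList<LazyNode> _children;

    public string Title { get; private set; }

    // True once the children have been fetched.
    public bool IsLoaded { get { return _children != null; } }

    public LazyNode(string title, Func<IList<LazyNode>> loadChildren)
    {
        Title = title;
        _loadChildren = loadChildren;
    }

    // Runs the loader on the first call only; later calls return the cached list.
    public IList<LazyNode> Expand()
    {
        if (_children == null)
        {
            _children = _loadChildren != null ? _loadChildren() : new List<LazyNode>();
        }
        return _children;
    }
}
```

With this shape, the initial page only pays for the site list, and the expensive folder lookup for a site runs once, at the moment the user clicks that site.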

How I Implemented the Solution

Most of this is just straight web development, but I wanted to show you how I implemented this behind the scenes. The code is just snippets and not the final result, but I bet you will be surprised by the end result with even this rough implementation.

The first step is to get a list of all the sites and webs in our environment. To do this quickly, we are going to use SharePoint Search and then build out our grid, or however you choose to display the data.

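// Requires: using Microsoft.SharePoint; using Microsoft.Office.Server.Search.Query;
string spSiteURL = SPContext.Current.Web.Site.Url;
SPSecurity.RunWithElevatedPrivileges(delegate
{
    // Re-open the site inside the delegate so it runs under the elevated account
    using (SPSite spSite = new SPSite(spSiteURL))
    {
        // Get the root URL (the first site collection in the web application).
        // An SPSite returned by the Sites indexer should be disposed.
        string rootURL;
        using (SPSite rootSite = spSite.WebApplication.Sites[0])
        {
            rootURL = rootSite.Url;
        }

        KeywordQuery spSearchQuery = new KeywordQuery(spSite);
        // contentclass STS_Site limits the results to site collections
        spSearchQuery.QueryText = string.Format("(Path:{0} AND contentclass:\"STS_Site\")", rootURL);

        SearchExecutor searchExecutor = new SearchExecutor();
        ResultTableCollection resultList = searchExecutor.ExecuteQuery(spSearchQuery);

        // Logic to build grid can go here
    }
});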

We are using elevated privileges here because we want to get everything, even if the user doesn’t have access. The reason is that a user may not have direct access to a site but could still have access to a folder or list within it. Also, if you wish to get the subsites too, you have to modify your query as follows:

// The inner parentheses group the OR so the Path restriction applies to both content classes
spSearchQuery.QueryText = string.Format("(Path:{0} AND (contentclass:\"STS_Site\" OR contentclass:\"STS_Web\"))", rootURL);
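The "logic to build grid" step depends on how you display the data, but reading the rows out of the search results could look roughly like the sketch below. The `SiteSearch` class and `GetSites` method are hypothetical names, and the row limit of 500 and the choice of the `Title` and `Path` managed properties are assumptions to adjust for your environment. It needs the SharePoint server assemblies, so it only runs on a farm server.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;
using Microsoft.Office.Server.Search.Query;
using Microsoft.SharePoint;

// Hypothetical helper: runs a keyword query and returns (title, url) pairs.
public static class SiteSearch
{
    public static List<KeyValuePair<string, string>> GetSites(SPSite spSite, string queryText)
    {
        var sites = new List<KeyValuePair<string, string>>();

        KeywordQuery query = new KeywordQuery(spSite);
        query.QueryText = queryText;
        query.SelectProperties.Add("Title");
        query.SelectProperties.Add("Path");
        query.RowLimit = 500;          // assumption: raise or page if you have more sites
        query.TrimDuplicates = false;  // keep sites that search would treat as near-duplicates

        SearchExecutor executor = new SearchExecutor();
        ResultTableCollection tables = executor.ExecuteQuery(query);

        // The relevant results table holds the actual hits.
        ResultTable relevant = tables
            .Filter("TableType", KnownTableTypes.RelevantResults)
            .FirstOrDefault();
        if (relevant == null)
        {
            return sites;
        }

        foreach (DataRow row in relevant.Table.Rows)
        {
            sites.Add(new KeyValuePair<string, string>(
                row["Title"] as string,
                row["Path"] as string));
        }
        return sites;
    }
}
```

The returned list can then be bound to a grid or turned into links, one per site.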

So now the user has selected a site. We need some code to quickly and efficiently get our list of folders and subfolders. To start, you need to build a special SPQuery XML. It uses a built-in field called FSObjType, which indicates the file system object type of the item, that is, whether it is a file or a folder. A value of 1 means it is a folder. We are going to use this field to gather all of the folders (down to the lowest child folder) within the site.

private static SPQuery BuildFolderListQuery()
{
    SPQuery spQuery = new SPQuery();
    spQuery.Query = "<Where>" +
                        "<Eq>" +
                            "<FieldRef Name='FSObjType'/>" +
                            "<Value Type='Lookup'>1</Value>" +
                        "</Eq>" +
                    "</Where>";
    spQuery.ViewAttributes = "Scope='RecursiveAll'";

    return spQuery;
}

With this in hand, it’s pretty straightforward to access the list and get all the folders. There still needs to be a bit of looping, though: you need to know whether the user has access to each folder, so you have to loop through the results and check access on each one.
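As a rough sketch of that last step, the query can be passed to `SPList.GetItems` and each returned folder item checked against the user. `FolderFinder` and `GetAccessibleFolderUrls` are hypothetical names, the query is repeated inline so the block stands alone, and using `SPBasePermissions.ViewListItems` as the access test is an assumption; your feature may need a stronger permission. Like the other SharePoint snippets, this only runs on a farm server.

```csharp
using System.Collections.Generic;
using Microsoft.SharePoint;

// Hypothetical helper: returns the URLs of all folders in a library that the user can view.
public static class FolderFinder
{
    public static List<string> GetAccessibleFolderUrls(SPList library, SPUser user)
    {
        var urls = new List<string>();

        // Same query as BuildFolderListQuery: every folder, at any depth.
        SPQuery query = new SPQuery();
        query.Query =
            "<Where><Eq>" +
                "<FieldRef Name='FSObjType'/>" +
                "<Value Type='Lookup'>1</Value>" +
            "</Eq></Where>";
        query.ViewAttributes = "Scope='RecursiveAll'";

        // One round trip to the content database instead of walking the folder tree.
        foreach (SPListItem folderItem in library.GetItems(query))
        {
            // Assumption: "can view items" is the access level that matters here.
            if (folderItem.DoesUserHavePermissions(user, SPBasePermissions.ViewListItems))
            {
                urls.Add(folderItem.Url);
            }
        }
        return urls;
    }
}
```

The loop is still there, but it runs over a flat result set that SharePoint has already produced, rather than opening every folder object one at a time.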

The Results

So to prove to myself and to others that this way of building the data I needed was better than the previous method, I built a little test module to compare timings.
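A timing comparison like this does not need much. The sketch below is not my actual test module; `TimingHarness` and `Measure` are hypothetical names, and it simply wraps a piece of work in a `Stopwatch` and reports the process working set afterwards.

```csharp
using System;
using System.Diagnostics;

// Hypothetical harness: times one piece of work and prints elapsed time and memory.
public static class TimingHarness
{
    public static TimeSpan Measure(string label, Action work)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();

        // Working set of the current process, in megabytes, after the work has run.
        long workingSetMb = Process.GetCurrentProcess().WorkingSet64 / (1024 * 1024);
        Console.WriteLine("{0}: {1:N0} ms, working set {2:N0} MB",
            label, stopwatch.ElapsedMilliseconds, workingSetMb);

        return stopwatch.Elapsed;
    }
}
```

You would call it once per approach, for example `TimingHarness.Measure("Looping", () => BuildWithLooping())` and `TimingHarness.Measure("Search", () => BuildWithSearch())`, where both method names are placeholders for your own implementations.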

Initially I tested this in my development environment to compare the results with a small data set.

Original looping method:

[Screenshot: Build Folder List With Looping - Small Data Source]

New method using Search and SPQuery:

[Screenshot: Build Folder List With Search - Small Data Source]

As you can see, the results are pretty similar, which may lead one to believe my suggested method isn’t much better. But before we throw my solution out, let’s take it to an environment with a lot of data and tons of folders (so many folders, you have no idea… 🙁 )

New method with large data set:

[Screenshot: Build Folder List With Search - Large Data Source]

So this actually looks pretty good. We scanned and returned 75 times the original folder count in the same time frame (approximately 15 seconds).

However, things didn’t go so well for the looping method. I finally killed it after a minute because it had started to really use up memory. This leads me to think the original problem may not have been a timeout but a process killed because of its memory usage. In less than a minute, the looping process had already consumed 600 MB of memory. Take a look at the screenshot below. I grabbed the image along with a quick view of the processes running.

[Screenshot: Build Folder List With Looping - Large Data Source]

So Again I Say: Stop Looping through Sites and Lists in SharePoint

This is not a perfect solution for all things SharePoint. You have to consider things like threshold limits and the need to build queries dynamically, either of which could make this method unusable in your situation. But what I want you all to understand is that while looping through data is a guaranteed way to get your data, it may not be the best way. If you are writing code with a lot of looping through SharePoint items, first consider whether there is another way, like Search or SPQuery. If you take your code to an environment like this one, where best practice on folder usage isn’t followed, or you simply have a huge environment, it may not work efficiently.

Hope this gets you thinking.

Thanks for reading!!