So recently at my client site we had a report that was running incredibly long. This was a while ago, so unfortunately I can’t remember the run time, but I believe it was greater than 12 hours. The report was pretty basic, just a determination of which documents were created within the month and an output that displayed things like location, file name, created date and creator.
When reviewing the code I noticed the original solution had been coded to loop through each site in the web and then grab all the documents within a particular date range. The line in particular that scanned the documents was as follows:
var listAddedItems = listItems.Where(x => ((DateTime)x["Created"] >= reportManager.ReportStartDate && (DateTime)x["Created"] <= reportManager.ReportEndDate) || ((DateTime)x["Modified"] >= reportManager.ReportStartDate && (DateTime)x["Modified"] <= reportManager.ReportEndDate));
Pretty straight forward, but what it’s actually doing is going through and gathering all the data and throwing out what you don’t need. Not a big deal unless you are working with a lot of data. Right now we are sitting at around 1 million items in our SP farm. I think I found the culprit.
SPView or SPQuery
So then I thought, ok, let’s redo this as a SPQuery. This would work, because then only the data that I need comes back. However, let me suggest something else to you. What if you dynamically created SharePoint Views (SPView) and then gathered that data from those views instead? “But David, why-ever would we want to do that” you ask? Simply put, like an SPQuery, SharePoint builds the query for you in the backend. You don’t have to loop through all the data to find the items you need. Also, if you create a view instead of a query, you can actually refine your data gathering even further by building a SPQuery against a view, but you won’t have to create a complicated SPQuery string.
So to speed things up I created a function to build my view, created the view, gathered data against the view and then deleted the view as I only wanted it available for this report. Here’s the code:
private SPView CreateDateFilteredView(SPWeb spWeb, SPList spList, DateTime filterDate)
SPViewCollection spViews = spList.Views;
string viewName = "NumOfDocRepView";
System.Collections.Specialized.StringCollection viewFields = new System.Collections.Specialized.StringCollection();
string query = "<Where>" +
"<FieldRef Name=’Created’ />" +
"<Value IncludeTimeValue=’FALSE’ Type=’DateTime’>2015-06-01T08:26:27Z</Value>" +
newView = spViews.Add(viewName, viewFields, query, int.MaxValue – 1, false, false);
newView.Scope = SPViewScope.Recursive;
The code to build the view is simply:
//spLibrary is the library to create the view in
//reportStartDate is the earliest date we want to report on.
reportView = this.CreateDateFilteredView(spWeb, spLibrary, reportStartDate);
To read the data based on the view is a single line of code now (well the first one was too, but the loop was built into that single line).
List<SPListItem> spItems = classifiedLib.GetItems(reportView).Cast<SPListItem>().ToList();
Once the report had finished with that particular library I simply delete the view as follows:
Did this particular report require a SPView instead of a SPQuery? Probably not, but it was a good report to provide a proof of concept moving forward. As I said, you could build yourself a basic view on the fly or maybe a bit more complicated, and then further refine a search with a much simpler SPQUery than you would have had to write in the first place. BTW, that 12+ hour long report? It now runs against the entire farm in less than an hour.
Let me know what you think.