Over a year ago, I wrote about SharePoint’s storage limits and thresholds, why they exist, and whether they can be exceeded. I had started to illustrate how far content databases could be pushed, but I never got around to doing more in-depth tests. I had promised to come back one day, run those tests, and show whether pushing SharePoint content databases beyond 200GB is possible in your environment.
Why is This Important?
Today people are using SharePoint for document collaboration far more than ever before. Gone, or at least going, are the days of files being stored on a network drive with multiple versions floating around in the emailosphere (yah, I made that one up) for people to update. With SharePoint providing an easy way to store, share and collaborate on (co-author) documents and other work related tasks, more and more information is going to be stored in those content databases. This means you are going to hit the documented thresholds much more quickly than before. In many cases archiving isn’t an option, so knowing whether your environment can handle content databases exceeding 200GB is important.
Testing Environment
My testing system was a development environment, so definitely not as powerful as production. I will explain shortly why this is important to note. The environment I tested with was a single server running both the WFE and most APP features. The SharePoint and DB servers each have a 4-core processor running at 2.5 GHz. The SharePoint server has 16GB of RAM and the database server 64GB. The DB server has that much RAM because it handles the database needs of a large number of SharePoint instances. I don’t know all the details of the drive backbone, but speaking with those responsible, it appeared it should at least meet, if not exceed, the levels recommended by Microsoft.
Testing Method
What you need
- A set of three different files. They don’t have to be exactly what I used (other file types are fine), but the size of each worked out almost perfectly.
- PDF base file. I used one that was about 4MB.
- Word base file. I used one that was about 1.5MB.
- JPG base file. I used one that was about 1.5MB.
- A PowerShell script that uploads one of the base files 4998 times to a library. It has to change the name of the file on each upload so duplicates don’t occur. It should take the following parameters:
- Sub Site name
- Library Name
- File Location
- Number of files to upload (optional; I didn’t have to change the number of files because the sizes I used worked almost perfectly for my testing).
PowerShell Script: It looks like I accidentally deleted the PowerShell script I was using for this purpose. I ran these tests a while ago and am using a different system than I did then, and it seems I forgot to copy the scripts over. A rough sketch of what it did is below; if you really need more help here, let me know in the comments and I will look at re-writing it.
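Since the original script is gone, here is a minimal sketch of what it did, assuming it runs on a SharePoint server where the server-side object model is available. The parameter names, naming pattern and defaults are placeholders, not a reconstruction of the exact script I ran:

```powershell
# Minimal sketch only; the original script was lost, so treat names as placeholders.
param(
    [Parameter(Mandatory=$true)][string]$WebUrl,      # URL of the sub site
    [Parameter(Mandatory=$true)][string]$LibraryName, # target document library
    [Parameter(Mandatory=$true)][string]$FilePath,    # base file to upload repeatedly
    [int]$FileCount = 4998                            # number of copies to upload
)

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$web     = Get-SPWeb $WebUrl
$library = $web.Lists[$LibraryName]
$bytes   = [System.IO.File]::ReadAllBytes($FilePath)
$base    = [System.IO.Path]::GetFileNameWithoutExtension($FilePath)
$ext     = [System.IO.Path]::GetExtension($FilePath)

for ($i = 1; $i -le $FileCount; $i++) {
    # Give every copy a unique name so duplicates never occur
    $name = "{0}_{1:D5}{2}" -f $base, $i, $ext
    $library.RootFolder.Files.Add($name, $bytes, $true) | Out-Null
}

$web.Dispose()
```

Reading the file bytes once and reusing them for every upload keeps the loop simple; if you need to run something like this remotely rather than on the server itself, CSOM or PnP PowerShell would be the equivalent route.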
- 10 TB of disk space on SharePoint DB Server
- 5 TB will be used for the “upload” content DB.
- 5 TB will be used for the restore DB (for testing backup and restore process)
- 5 TB of disk space for backups (Optional)
- When I was testing, the backup location I was using already had a ton of space so I didn’t need to add any.
- Another set of files for the load test. I uploaded three files in mine, ranging from 500KB to 1.2MB.
Visual Studio Load Test
I am not going to cover how to build a load test as there is a really good example here. What I am going to do though is describe what the load test did and how it was configured before running.
- Access the site you wish to test.
- Navigate to a site page (I just went to /SitePages/Home.aspx).
- Return to the landing page and navigate to another page in the site. These steps are to simulate users moving from page to page within a sub site.
- Open a document library.
- Upload three files to the library (the load test files I mentioned in the “What you need” list above).
- Perform a search on one of the files (not caring about results, but time it took).
- Open the file.
- Close the file.
Configure the load test to ramp from 1 to 100 users. With a single workstation you really shouldn’t go beyond that, as you run the risk of your system becoming the bottleneck and throwing off the results. However, if you are able to set up load test agents, you can go well beyond that.
You really want to make sure you are testing with a good percentage of the users who will be accessing the system. For the environment I was testing, 100 users was 60-70% of the users who could be accessing the system concurrently. The total number of users is actually much higher, but most of them only access it sporadically; the percentage refers to concurrent access.
Run the tests when there is next to nothing in the environment (this will be your baseline), then again at 1TB, 2TB… you get the idea.
Setting Up the Environment
You don’t have to set things up this way, but it’s just the way I did it to ensure thresholds weren’t exceeded and there was some semblance of organization when adding 4TB of files smaller than 10MB to a content database.
- Create four sub sites. Each one is going to hold a TB of data.
- Within each of the sites, create 90 libraries (yes, 90), 30 for each of the three file types mentioned above. Each library is going to hold 4998 documents of its file type.
- I actually found that when I filled up each site it came out to just a touch over 1TB. Worked out pretty well actually (a scripted sketch of this setup is below the edit note).
Edit: I initially stated it was 30 libraries with 10 per file type, but Pragnesh pointed out in the comments that the numbers didn’t add up. There was a typo; it is actually 90 libraries, 30 per type.
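If you would rather script that structure than click through it, here is a hedged sketch using the SharePoint server-side cmdlets and object model. The root URL, site template and naming pattern are assumptions, not the exact values I used:

```powershell
# Sketch only; the root URL, template and library names are placeholders.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$rootUrl   = "http://sharepoint/sites/dbtest"   # assumed test site collection
$fileTypes = @("PDF", "Word", "JPG")

for ($s = 1; $s -le 4; $s++) {
    # One sub site per terabyte of test data
    $web = New-SPWeb -Url "$rootUrl/TB$s" -Template "STS#0" -Name "TB$s"

    foreach ($type in $fileTypes) {
        for ($l = 1; $l -le 30; $l++) {
            # 30 document libraries per file type = 90 libraries per sub site
            $web.Lists.Add("$type-Lib$l", "", [Microsoft.SharePoint.SPListTemplateType]::DocumentLibrary) | Out-Null
        }
    }
    $web.Dispose()
}
```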
Performing the Tests
- Upload your first 150K files.
- Run your load tests.
- Back up the SQL database using whichever tools your organization uses and record the time it takes. Do not use SharePoint’s backup process; at this size it just isn’t usable. (A hedged sketch of the backup, restore and crawl steps follows this list.)
- Restore the database (If you have the space to do so) to your “restore content DB”. Record the time it takes.
- Run a full search crawl. Granted, unless you are migrating data you aren’t going to add tons of content at once and have to index it all (like you are in your test efforts), but running a full crawl will give you a good idea how long it takes SharePoint to build the index for your files.
- Ensure the drive you have your index on is large enough to hold all the data.
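Here is a minimal sketch of how those backup, restore and crawl steps could be kicked off from PowerShell. It assumes the SqlServer module is available on the database side and a single Search Service Application; every instance name, path and logical file name shown is a placeholder:

```powershell
# Sketch only; instance name, paths and logical file names are placeholders.
Import-Module SqlServer

$instance = "SQL01"
$database = "WSS_Content_Test"

# Back up the content database with SQL tools and time it
$backup = Measure-Command {
    Backup-SqlDatabase -ServerInstance $instance -Database $database `
        -BackupFile "E:\Backups\$database.bak"
}
"Backup took {0:N1} minutes" -f $backup.TotalMinutes

# Restore it as the spare "restore" database and time it.
# RelocateFile moves the data/log files so they don't collide with the source DB;
# the logical file names below are assumptions, so check them with RESTORE FILELISTONLY.
$relocate = @(
    New-Object Microsoft.SqlServer.Management.Smo.RelocateFile($database, "E:\Data\${database}_Restore.mdf")
    New-Object Microsoft.SqlServer.Management.Smo.RelocateFile("${database}_log", "E:\Data\${database}_Restore.ldf")
)
$restore = Measure-Command {
    Restore-SqlDatabase -ServerInstance $instance -Database "${database}_Restore" `
        -BackupFile "E:\Backups\$database.bak" -RelocateFile $relocate
}
"Restore took {0:N1} minutes" -f $restore.TotalMinutes

# Start a full crawl of the default content source
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
$ssa = Get-SPEnterpriseSearchServiceApplication
$cs  = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"
$cs.StartFullCrawl()
```

Timing each step with Measure-Command gives you comparable numbers at each checkpoint (baseline, 1TB, 2TB and so on), which is really all these tests need.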
Pushing SharePoint Content Databases Beyond 200GB
I am not going to share all my results, as the purpose of this post is to illustrate that exceeding the limits can be done and how to test for it, but I can give you an idea of what you will see if your infrastructure can handle it. I am just going to discuss the smaller (1TB) and the largest (4TB) tests.
1TB Test Results
- On average, a test took about 1 second longer to complete at 1TB. Keep in mind, a single test included all 8 steps above. Considering we are OK with a 3-4 second response on a single page load, 1 second across the whole test isn’t a huge increase.
- The slowest step was viewing the contents of a library, and it only added 0.34 seconds. Well within tolerances.
4TB Test Results
- As expected, this took longer, but not by much. The full test took 3.6 seconds longer than the baseline I ran.
- The list view took an average of .5 seconds longer than the baseline. Well below the times we were concerned about.
- Even getting search results was only about .11 seconds longer than the base test on average.
Keep in mind, I was testing against a development environment. Production is much more powerful, so the times would likely be lower. They could end up about the same since more people would be accessing it, but the results should still be within tolerances if your users aren’t too picky. Let me know if you have any questions about my testing methods, or share observations from your own organizations.
Thanks for reading!!
Comments
Hi David,
This is really very helpful to me and will help me in solving the problem that my client is facing.
However, when I calculated the size of one subsite with the numbers you have given, the size of each site comes to 585 GB, assuming an average document size of 4MB.
So, it will be (4998*30*4)/1024=approx 585 GB.
Can you tell me how you could achieve 1TB for each subsite? i.e what is the average size of document you considered?
Thanks,
Pragnesh
Hi Pragnesh,
Sorry for my late reply. I missed this comment somehow. I apologize for the confusion. It is actually supposed to be 30 libraries per file type per site. So a site actually had 90 libraries. The total storage for the files would be 585GB for the PDFs as you state, but the Word and Picture files were 1.5MB so 60 libraries @ 1.5MB gives me about 439GB per site. All together it comes to about 1TB exactly, but with the overhead (metadata storage and other site components) about 1.1TB to 1.2TB was actually stored once the data completed upload.
Again, sorry for the confusion, but thanks for the question.
Dave