SharePointing SQL Server: The Next Byte

SharePoint and SQL Server

 

If you’re all caught up on my first post regarding Installing & Configuring SQL Server for SharePoint Farms, then let’s happily move forward with my next installment of “SharePointing SQL Server: The Next Byte…”

Hello everyone! The sun is shining bright. It’s hot outside. The grass needs watering. I’m sweltering.  But we battle through it all to continue our discussion of…

‘SharePointing’ SQL Server!

For those taking notes, we’ll be covering:

  1. What are drives/partitions?
  2. How many should I have?
  3. How do drives/partitions impact “SharePointing SQL Server”?

Particularly, we’ll be talking about partitions and partitioning. No, not what happened to Berlin in 1945. Rather, the dividing of storage drives into usable, computer-addressable units. No, not breaking or cutting drives up.  Rather, using drive heads to create magnetic boundaries onto a hard drive storage unit. I say storage unit rather than just a drive because the unit can be comprised of a single disk, multiple, vertically stacked disks (called platters) into a multi-plattered drive, or by arrays of multi-platter drives ranging from two such drives to hundreds of such drives.

So, why bother breaking drives into multi-platter disks or arrays of independent drives? It’s possible to make a single-platter drive that is large enough for most user’s computer. But I suspect it would be quite wide through the waist. And that’s why we save space by making many smaller disks and stacking them vertically on a common spindle.

Some would say we also get the advantage of having more read/write heads with vertically stacked platters. However, I can easily perceive an arm with multiple heads on it, moving back and forth on a giving large platter. That would tend to reduce any advantage gained by having stacked platters.

And thus starts our conversation:

(You say)

Ok, I can see why we have multiple disks, but why do we need to divide them up? Isn’t that foolish after bringing them together?

(I say)

Oh, I don’t mean breaking them up, physically. That would be illogical. So we logically divided them up.

(You say)

Smaller first, then larger?

(I say)

No, create magnetic boundaries in such ways that make it look like we have more drives than we actually have. And even make the partitions of different sizes.

(You say)

You’re telling me it’s logical to use magnets to pretend we have more drives than we have? Wow, that’s different. Why in the world would we want to use more than one drive per computer? We should buy the largest one we need for each computer and then put all that user’s data and programs on it. That’s simple, right?

(I say)

Yes, far too simple, it would seem. We’d lose out on the economy of scale. Not all computers will need the same storage as the others will. If we need ten terabytes of data scattered over a hundred computers, some will need more than the average, and others less.  It would be difficult to move unused storage from one computer to another if all drives are the same size. By having central storage for very large amounts of data, and then dividing the drives into partitions where only the desired users can access the various partitions, we save money. Especially if we can tailor the partitions to the users needs. So by being more complicated, we can be simpler. Having a large pool of readily adjusted, centralized storage is much simpler.

(You say)

Ok, I’m persuaded. We need readily adjusted space so we are able to quickly adjust the space available to a given computer, but why not just one really large server with adjustable divisions?

(I say)

Because we also need to consider a fact of computer life. Write/read times count. The more read heads which can reach a given storage area, the faster the read/writes will be. So the more drives and drive heads we have, the faster our operation. And customers like fast. The faster a computer is, the more likely customers are to pay money for it. And I like customers who pay…

(You say)

Ok, we want to have as many small drives as we can fit into a box, so we have happy customers.

(I say)

Not exactly. We want many, many drives. More drives than we can fit into a computer cabinet; so we put them into a box of their own and group them together with software. We then call that group a LUN. We then subdivide that LUN into partitions, usually named with a Letter such as C or D.

(You say)

LUNs! What does that have to do with computers? Sounds like LUNacy to me. So we take a LUN and assign it to a computer. And call it the C drive. And why the C drive? Why not the A drive? Why name a drive after an ancient language, long forgotten, like Latin or COBOL. And what happens if we end up with more than 26 partitions?

(I say)

More on the C drive later… No, we actually give them more than one drive, but it can be from the same LUN. We do that because of a peculiarity of SQL Server. The more partitions and the more LUNs which are connected to a given SQL Server, the more threads it is encouraged to open when doing operations, up to a point.

So we like to have at least four partitions in SQL Server.

  1. C drive for the O/S, and formatted with 4k blocks.
  2. ‘Nother drive, usually D for the primary database data files.
  3. ‘Nother drive, usually L for the database log files
  4. ‘Nother drive, usually T for the Tempdb

One of the considerations in having multiple partitions is that SQL Server tends to open worker threads for each available partition. The more threads opened, the faster the processing. If two databases have the same overall size, but one has several more partitions than the other, we’d expect the one with the more partitions to process faster than the one with fewer partitions.

And as many other partitions needed to keep the data files below 200 gigs (200 gigs is Microsoft’s recommended maximum size for SharePoint databases), we want to encourage our customers to keep their databases smaller than 200 Gigs.

And, of course, there will be different partitions and names if this is a clustered SQL Server.

The log files get their own partition, because everything written to a database data file is first written to the database’s log file, and later from the log file to the database file. By having two different sets of write heads, one writing, in serial fashion to the log file, and the second writing with a search pattern to the data file, we minimize one choke point and improve performance.

Tempdb data file gets its own partition. I’ve never seen an authoritative explanation as to why. I’ve come to believe it’s due to the differences in drive head activity between user databases and tempdbs. User databases use a search movement for the read and write activity as the heads are constantly moving to reach a non-serial data location. Tempdb mostly sorts data, and uses much more of a serial read/write activity. Also, tempdb activity probably can be done more in parallel to data activity – so by giving the tempdb its own threads, we presumably improve the parallel read/write activity.

p.s. Only the tempdb data file is put into the tempdb partition. Tempdb data and log activity have much the same relationship to each other as do the user databases.  So the tempdb log file goes into the log partition.

(You say)

Really, and just why is the C drive called the C drive?

(I say)

Instead, ask me about sizing the partitions. Because I’m guessing you may have tried to size them but were unable to get your hands around them.

The optimal partition size depends on how you actually use SharePoint. Different usages have different size requirements. If you use SharePoint mostly for document presentation, there’s very little demand placed on the tempdb or on the individual log files. So those should be small – say less than 10% of the total for the logs and 5% for the tempdb. If you have a lot of interactivity with your online customers, you might need a lot more for either or for both. Most Sharepoint SQL Servers I’ve worked with never have a log file grow larger than 2 gigs, especially after the initial load. Most tempdbs never grow larger that two gigs, but some have grown to 15 gigs.

My advice for SharePoint users is to use 10% of the available space as log files until that becomes 50 gigs. Then leave it at 50 until usage proves you need more.

Next use 5% of the available space for the tempdb until that space hits 25 gigs. Then leave it at 25 gigs until usage proves you need more.

Create additional partitions out of the remainder for data files. First, create as many equal partitions as you can with a minimum size of 50 gigs and a maximum of 100 gigs. That encourages limiting data files to under 200 gigs. If you have several 100 gig partitions, and a database data file approaching 100 gigs, you can always open another data file for the same database on a different partition. And each time we create a file on a different partition, we can hope to have more threads for processing.

So in summary:

  1. We’ve explained what and why we partition storage units.
  2. We discussed the reasons for have multiple partitions for a SharePoint farm.
  3. We’ve discussed means of allocating available space between data files, the tempdb and the log files.

(You say)

But why do we name the C drive, the C drive?

… Well that, my friend, will have to wait until The Next Byte. Thanks for reading!

 

About Fpweb.net

Fpweb.net provides enterprise SharePoint hosting. Fpweb.net has become the global leader in hosted SharePoint solutions by showcasing an unmatched level of experience and dedication along with unrivaled expertise. That’s why our cloud solutions are trusted in over 80 countries. Save time and money with total server control in your SharePoint private cloud.

2012-07-24T04:35:38+00:00 July 24th, 2012|

Leave A Comment