Welcome

Welcome to my blog of papers on IT technology. Please provide feedback and suggestions. Enjoy.

Thursday, December 30, 2010

Many ways to skin a mailbox server

I was working on the design of my first enterprise Exchange 2010 server cluster and trying to make sense of all the options. I wanted to see which option was potentially the best, how the features affected the number of hard drives, and what the impact of using 1TB SATA drives was in comparison to 600GB SAS drives.
For this I wanted to understand the impact of the various options on the design and performance of the Exchange system. I downloaded and used the “E2010 Mailbox Server Role Requirements Calculator”, which is currently at http://msexchangeteam.com/files/12/attachments/entry453145.aspx .
This is an amazing work of art that shows what you can do with Microsoft Excel, and it is tremendously useful when trying to account for all the factors that impact the performance of the storage array.
An excellent, in-depth article about the details of the storage calculator is on the Microsoft Exchange team blog at: http://msexchangeteam.com/archive/2009/11/09/453117.aspx
The article puts a fine point on the challenge for me:
“Previous versions of Exchange were somewhat rigid in terms of the choices you had in designing your mailbox server role.
The flexibility in the architecture with Exchange 2010, allows you the freedom to design the solution to meet your needs.”
While redundancy in most of the Exchange 2010 server roles is straightforward, the mailbox role in 2010 has a number of features or options that can be deployed. Some of these features can be combined with others, and some cannot.
Picking these options changes how the calculator allocates drives and other factors. The assumptions you put in affect the outcome, and while puttering with the calculator I got some wild and intractable numbers, or at least, in my daily tumble of interruptions, I couldn’t keep track of all the options. So I found a quiet moment and created a table of all the options I was choosing, intentionally keeping the other options the same.
Running through the various iterations and possibilities, I learned some rules about DAGs (Database Availability Groups) that I had not come across anywhere else.
There were questions that I asked, or that other engineers I worked with asked, and I ran simulations through the calculator to see how they would impact drive counts and design.
While much of the world revolves around money, Exchange storage revolves around spindles. Drive performance is measured in IOPS (input/output operations per second), so IOPS was previously used as the standard input to the calculator to abstract the performance of a given hard drive or drive array. Back then there were literally hundreds of different kinds of storage disks to choose from, and IOPS was the common basis. Now that the types of drives available for use in server arrays have been pared down to a dozen or so, the current calculator uses a pull-down table to select from a much shorter list of drives. Since I had reduced the set of drives for the array to two options, I could see plainly in my results that the number of spindles, or disks, was more important than the size of the disks.
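To make the spindle point concrete, here is a rough Python sketch of why the drive count can be set by IOPS rather than by capacity. The per-mailbox IOPS load (0.1) and the per-drive IOPS figure (55 for a 7.2k RPM drive) are illustrative assumptions of mine, not values taken from the calculator, and the function ignores RAID overhead, database copies, logs, and the other factors the real calculator accounts for.

```python
import math

def drives_needed(mailboxes, mailbox_gb, iops_per_mailbox, drive_gb, drive_iops):
    """Return (drives for capacity, drives for IOPS, drives actually required)."""
    by_capacity = math.ceil(mailboxes * mailbox_gb / drive_gb)
    by_iops = math.ceil(mailboxes * iops_per_mailbox / drive_iops)
    # Whichever constraint asks for more drives decides the design.
    return by_capacity, by_iops, max(by_capacity, by_iops)

# Assumed, illustrative inputs (not calculator output):
# 6000 mailboxes of 1 GB, 0.1 IOPS per mailbox, 55 IOPS per 7.2k RPM drive.
for drive_gb in (1000, 2000):
    cap, iops, need = drives_needed(6000, 1, 0.1, drive_gb, 55)
    print(f"{drive_gb} GB drives: capacity needs {cap}, IOPS needs {iops}, total {need}")
```

With these assumed numbers, capacity alone asks for 6 drives at 1000GB or 3 at 2000GB, but the IOPS load asks for 11 either way, so the larger drive does not reduce the count.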

General parameters in design:
4 Mailbox servers at primary site
1 DAG
Log drives and data drives are all the same size
6000 Mailboxes
1GB mailbox

Options:
                RAID – Redundant array of independent disks. This is an option in the calculator that uses the standard RAID modes to configure disks.
                JBOD – JBOD storage refers to placing a database and its transaction logs on a single disk without leveraging RAID.
                2nd Site – Site redundancy, having email servers at more than one data center
                Log Drive – Separate drives for storage of the log files
                DB Copies – The number of database copies
                Disks @ Site – The number of disks / spindles at each data center. The first seven options have only one data center, so there are no disks in the Secondary column.

Tables:
These are the tables I created from all the versions of the data I put into the calculator.

DRIVE TYPE: 1000 GB 7.2k RPM

                                                              DISKS @ Site
            RAID   JBOD   2nd Site   Log Drives   DB Copies   Primary   Secondary
Option 1    x                                     2           76        x
Option 2           x                              2           x         x
Option 3    x                        x            2           84        x
Option 4           x                              3           52        x
Option 5    x                                     3           100       x
Option 6    x                        x            3           116       x
Option 7           x                 x            3 to 5      x         x
Option 8    x             x          x            4           84        84
Option 9    x             x                       4           76        76
Option 10          x      x                       4           36        36

DRIVE TYPE: 600 GB 15k RPM

                                                              DISKS @ Site
            RAID   JBOD   2nd Site   Log Drives   DB Copies   Primary   Secondary
Option 1    x                                     2           96        x
Option 2           x                              2           x         x
Option 3    x                        x            2           104       x
Option 4           x                              3           76        x
Option 5    x                                     3           108       x
Option 6    x                        x            3           136       x
Option 7           x                 x            3 to 5      x         x
Option 8    x             x          x            4           104       104
Option 9    x             x                       4           84        84
Option 10          x      x                       4           52        52


               
Some things I learned:
·         JBOD is not permitted with fewer than 3 copies of the mailbox database, or with separate log drives
·         A drive size of 2000GB didn’t reduce the drive / spindle count
·         The number of drives increased at a mailbox size threshold of 1128MB, and at 1491MB in another model.
·         The number of drives did not decrease with smaller mailboxes
·         Only 100 databases are permitted per DAG, so “max active DB” = 100 / copies of DB
·         The SPEC INT value is NOT CPU clock speed; 100 is a good number to start with.
·         You could have different types of hard drives for active and passive copies, but it’s an administrative “challenge” if there are many copies.
·         Log drives, if used, could be smaller drives
·         Expen$ive $AN storage is not necessary for a high availability solution with Exchange 2010
·         4 JBOD copies of the mailbox database on 4 servers are more fault tolerant than 2 copies on RAID 10 on 2 servers.
·         JBOD minimizes the impact of a single disk failure in a properly maintained system
·         JBOD maximizes backup performance and reduces backup time
·         The space requirements for public folder storage also need to be considered
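The “max active DB” rule in the list above is simple arithmetic, and a small Python sketch makes it easy to try different copy counts. It takes the 100-database limit exactly as stated in the list; the function name is mine, and rounding down to a whole database is my assumption.

```python
DATABASE_LIMIT = 100  # database copies permitted, as stated in the list above

def max_active_databases(copies):
    """Largest number of active databases when each one has `copies` copies."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Integer division: a fraction of a database is not usable.
    return DATABASE_LIMIT // copies

for copies in (2, 3, 4):
    print(f"{copies} copies -> {max_active_databases(copies)} active databases")
```

So moving from 2 copies to 4 copies halves the number of active databases available, from 50 to 25.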

Comments:
Given the calculations and the reduction in drives that JBOD provided, JBOD seems the logical choice from both a cost perspective and a high availability perspective.
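The size of that reduction can be read straight off the tables. The Python sketch below copies the primary-site disk counts for the comparable RAID and JBOD options (same number of copies, no separate log drives) and works out the saving; the dictionary layout and function name are just my way of arranging the numbers.

```python
# Primary-site disk counts copied from the tables above.
# Keys are (drive type, option number).
PRIMARY_DISKS = {
    ("1000 GB 7.2k", 4): 52,    # JBOD, 3 copies
    ("1000 GB 7.2k", 5): 100,   # RAID, 3 copies
    ("1000 GB 7.2k", 9): 76,    # RAID, 2nd site, 4 copies
    ("1000 GB 7.2k", 10): 36,   # JBOD, 2nd site, 4 copies
    ("600 GB 15k", 4): 76,      # JBOD, 3 copies
    ("600 GB 15k", 5): 108,     # RAID, 3 copies
    ("600 GB 15k", 9): 84,      # RAID, 2nd site, 4 copies
    ("600 GB 15k", 10): 52,     # JBOD, 2nd site, 4 copies
}

def jbod_saving(drive, raid_option, jbod_option):
    """Fraction of disks saved by the JBOD option relative to the RAID option."""
    raid = PRIMARY_DISKS[(drive, raid_option)]
    jbod = PRIMARY_DISKS[(drive, jbod_option)]
    return (raid - jbod) / raid

for drive in ("1000 GB 7.2k", "600 GB 15k"):
    for raid_opt, jbod_opt in ((5, 4), (9, 10)):
        saving = jbod_saving(drive, raid_opt, jbod_opt)
        print(f"{drive}: option {jbod_opt} vs option {raid_opt}: {saving:.0%} fewer disks")
```

For the 1TB drives, JBOD needs roughly half the disks of the matching RAID option; for the 600GB drives the saving is smaller, at roughly 30 to 40 percent.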
Using standalone, self-sufficient mailbox storage servers seems the most efficient option. Multiple mailbox servers could be configured as a kind of RAIS (redundant array of independent servers), which could be used as the basis for a DAG or cluster of mailbox servers. Since they would not have a single point of failure or contention in a SAN array, this solution would be easily maintained, as each server is completely independent of outside systems or factors.
A SAN is not beneficial in this case, as the number of drives / spindles is more important than raw storage space. A SAN has tremendous features for managing and sharing drives among numerous servers with regular needs. But since large Exchange server designs use multiple servers and numerous spindles, not just storage space, to get response time, spending money on SAN storage is, in this case, frivolous. The increase in components and the added complexity in maintenance and supportability do not provide any benefit here, as you would simply be adding the same number of drives in a SAN as you would in a DAS (direct-attached storage) solution. The SAN equipment produced by the major vendors provides great performance, but that gets us back to the primary factor in this case, which is drive performance per dollar.
If one steps back and takes a purely functional view, there are a lot of hard drives involved in satisfying the requirements. Some solutions exist, and may be agreed on, to provide a “complete solution” to an organization's storage requirements. The counter effect is that these solutions may provide features and functionality that are not required or beneficial for a high availability messaging system.
The "HP DL180 G6 rack server" is an excellent example of a server with storage and processing combined in a single chassis, and it is what I would recommend based on my research. The server can be selected with a drive cage with room for 14 drives in 2U of rack space, using 1TB hard drives and 2 SSDs. These servers provide the required storage and processing in the same rack space that a SAN solution would require. Additional storage space could be added by attaching an HP StorageWorks MSA70 to each server, increasing the total capacity by 25 drives. These systems are inexpensive as enterprise mail servers go, and they allow increased separation of the cluster nodes, which increases the resilience of the entire cluster to failures. This approach also minimizes the overall amount of rack space required for this solution.
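As a quick check of how the disk counts from the tables map onto this hardware, here is a small Python sketch. It uses the 14 internal bays and 25 MSA70 bays mentioned above and the 4 mailbox servers from the general parameters. It assumes, as a simplification of mine, that disks are spread evenly across the servers and that every internal bay is available for Exchange data, ignoring operating system disks and the SSDs.

```python
import math

INTERNAL_BAYS = 14   # drive bays per server, as described above
MSA70_BAYS = 25      # extra bays per attached MSA70, as described above

def shelves_per_server(total_disks, servers=4):
    """MSA70 shelves each server needs to hold its even share of total_disks."""
    per_server = math.ceil(total_disks / servers)
    overflow = max(0, per_server - INTERNAL_BAYS)
    return math.ceil(overflow / MSA70_BAYS)

# Primary-site disk counts for the 1000 GB drives, taken from the tables.
for option, disks in (("Option 4 (JBOD, 3 copies)", 52), ("Option 5 (RAID, 3 copies)", 100)):
    print(f"{option}: {disks} disks -> {shelves_per_server(disks)} MSA70 per server")
```

Under these assumptions the 52-disk JBOD design fits in the internal bays of the four servers, while the 100-disk RAID design needs one MSA70 per server.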

Conclusion:
Whether an Exchange deployment is new or is being reviewed for growth, the sizing calculator is a very helpful and powerful tool for understanding how much storage and processing resources are required to provide an acceptable level of performance.
In this example we can see that the storage requirements come out to approximately 10 times the mailbox size, based on the requirements designed into the mailbox sizing calculator. This is a significant amount of storage and would be a surprise to any organization contemplating a large deployment. These resources will need to be in place
It would be useful to consider the optimal mailbox size, or the number of mailboxes that you can accommodate, with a given number of drives. The space requirements for public folder storage also need to be considered.
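The “approximately 10 times” figure can be sanity-checked against the tables with a few lines of Python. This is my own back-of-envelope check, not how the calculator derives its numbers: it divides raw primary-site disk capacity by the combined size of all mailboxes, and it treats 1000 GB and 600 GB as the full usable size of each drive.

```python
MAILBOXES = 6000   # from the general parameters
MAILBOX_GB = 1     # from the general parameters

def raw_to_mailbox_ratio(disks, disk_gb):
    """Raw disk capacity divided by the combined size of all mailboxes."""
    return (disks * disk_gb) / (MAILBOXES * MAILBOX_GB)

# Option 1 (RAID, 2 copies) primary-site disk counts from the tables.
for label, disks, disk_gb in (("1000 GB 7.2k", 76, 1000), ("600 GB 15k", 96, 600)):
    print(f"{label}: {raw_to_mailbox_ratio(disks, disk_gb):.1f}x the mailbox data")
```

Under these assumptions the two Option 1 designs come out at about 12.7 and 9.6 times the mailbox data, which is in line with the rough factor of 10.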
