Anyone know about RAIDs?

Discussion in 'Domains, Hosting and Servers' started by Abomination, Feb 11, 2010.

  1. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    I've been watching a thread on HostGator's support forums, and to me it makes absolutely no sense whatsoever.

    http://forums.hostgator.com/emergency-maintenance-lebaron-t64092.html

    They had a problem with the RAID and 'rebuilt' it over the course of several days. They never say they replaced the RAID, although people are assuming that happened.

    If they could read the information on the disk, then why didn't they copy it onto another RAID?

    Why didn't they take the server offline to quickly fix the problem?

    Why is that particular server so busy? Why didn't they remove certain accounts from it and put them on another server?


    Is what they did reasonable? Does it normally take 4-5 days to fix a problem like that?

    :waiting:
     
  2. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    There are a lot of different configurations. I don't use RAID-5 or RAID-6, so I have no idea what the time frame for rebuilding those sorts of arrays would be.

    The entire point of a RAID (besides RAID 0) is not to 'get a new RAID'. A drive fails, and you replace that drive, which then gets rebuilt using the contents from the other drives - it's possible to do this live while the server is running.

    Naturally, this hits performance, since at least one drive is already out of commission (because it broke) and you need one or more of the remaining drives dedicated to the rebuild. In a RAID 1, this only takes one drive out of a typical 3-drive setup (the third drive here is sometimes called a business continuance drive - if one goes down, one of the remaining drives can be dedicated to rebuilding the replacement while the third keeps the business running), so things will generally work. RAID 10 will take a single set of drives, etc.
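
    Very roughly, that business continuance idea looks like this - a toy Python sketch of my own, not how a real controller works, and the drive letters and block counts are made up:

        # Toy model of a 3-drive RAID 1 during a rebuild (illustrative only).
        data = {block: f"payload-{block}" for block in range(8)}      # the array's contents
        drives = {"A": dict(data), "B": dict(data), "C": dict(data)}  # three identical mirrors

        del drives["C"]      # drive C dies
        drives["D"] = {}     # replacement drive goes in

        # Dedicate drive B to the rebuild while drive A keeps serving reads.
        for block, value in drives["B"].items():
            drives["D"][block] = value   # copy every block onto the replacement

        def read(block):
            return drives["A"][block]    # service continues from the untouched mirror

        assert drives["D"] == data       # replacement now holds a full copy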

    On a RAID 5, however, this is going to require every other drive, and on RAID 6, most of them. If they're doing something idiotic like trying to keep the server live while rebuilding it, I can certainly believe it's taking them a few days.
     
  3. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    That is what they did. They kept what they said was an extremely busy server online while rebuilding a RAID 10.

    If the server was offline, how long might it take? Four hours? If you read that thread, the server was not fully functional during those four days. Many were without email and other essential services.
     
  4. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    Depends on the size and speed of the drives, but if they only have a four-drive RAID 10 that's a potentially serious problem and I'd be leery of them.
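
    As a rough back-of-the-envelope - the figures below are hypothetical, plug in whatever the drives actually are:

        # Rough rebuild-time estimate for an idle array (hypothetical figures).
        capacity_gb = 500         # size of the failed drive
        rebuild_mb_per_sec = 80   # sustained copy rate a 7,200 RPM SATA drive might manage

        hours = capacity_gb * 1024 / rebuild_mb_per_sec / 3600
        print(f"~{hours:.1f} hours at full speed")   # ~1.8 hours

        # Serving live traffic at the same time can throttle the rebuild to a small
        # fraction of that rate, which is how "a few hours" turns into days.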
     
  5. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    Well, you can read the thread - it is not that long - and like so many threads in their tech support forums, they locked it.

    I am leery of them for many reasons. They got me through the first couple of years, but I was glad to leave. Really glad.

    If anyone else has input that would be great.
     
  6. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    ...yeah. A four-drive RAID 10. So, no rebuilding and serving data at the same time.

    Most professional hosts will use an additional mirror set, so the number of drives will be a multiple of three. They should have shut down and rebuilt, because if the drive it was rebuilding from had a problem during all that stressed activity, they would have lost all the data on the server.

    And then their customers would go from happy to ecstatic.
     
  7. benivolent

    benivolent Newcomer

    Joined:
    Oct 13, 2009
    Messages:
    3
    Likes Received:
    2
    First Name:
    Subramanian
    Basically, RAID combines two or more hard disks into a single logical unit using special hardware or software. It is built on three concepts: mirroring, striping, and error correction. Each RAID scheme is named with a number, as in RAID 0, RAID 1, etc.

     
  8. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    Wow.

    Additional mirror set: two different 4-drive RAID 10s, or a 5-drive RAID 10?
     
  9. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    Six drives. You typically only see one RAID on a machine, especially with RAID 10.

    A RAID 1 is a set of mirrored drives - they all have the same content. It only speeds up read access, but provides full reliability in the event a drive fails.
    A RAID 0 is a set of striped drives - each contains a separate part of the whole data set, striped (so drive 0 contains block 0, drive 1 contains block 1, drive 0 contains block 2, ...).

    A RAID 10 is a combination of these schemes - you have multiple RAID 0's, and then you mirror them. Most hosts only stripe pairs, so each set will have two drives, and when you see a RAID 10 on a professional host, it's usually six drives - three two-drive RAID 0's that each contain the same data.

    So my reference to the four-drive RAID 10 - they have two drives per set, striped, and two sets. This provides four times the read speed and twice the write speed of a single drive. Okay.
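
    If it helps, here's where a logical block lands in that four-drive case - a toy Python sketch (real controllers stripe in large chunks, not single blocks, but the idea is the same):

        # Toy placement for a striped + mirrored layout (4 drives: 2-drive stripes, 2 copies).
        DRIVES_PER_STRIPE = 2   # blocks alternate across these
        COPIES = 2              # each stripe is mirrored this many times

        def placement(block):
            """Return the (copy, drive-within-stripe) slots holding a logical block."""
            position = block % DRIVES_PER_STRIPE                  # which drive within a stripe
            return [(copy, position) for copy in range(COPIES)]   # the same slot in every copy

        for b in range(4):
            print(b, placement(b))
        # 0 [(0, 0), (1, 0)]   block 0 lives on drive 0 of both copies
        # 1 [(0, 1), (1, 1)]   block 1 lives on drive 1 of both copies
        # 2 [(0, 0), (1, 0)]   and so on, alternating
        # 3 [(0, 1), (1, 1)]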

    When one of the drives in that RAID fails, the system still functions and all of the data remains, but now it's vulnerable: half of your data now exists on only one drive, and the other half - though mirrored - is useless without it (because it's striped).

    So they plug in the fourth drive so that the RAID can be rebuilt. While the server is still running.

    If they had three sets of drives (six drives), this would be fine - you'd still notice a slowdown for a few hours, but it wouldn't be a critical problem, as one full set is still able to serve data to web users while another helps rebuild.

    With only the two sets, though, the drive holding the vulnerable half of the data has to be dedicated to helping rebuild. Worse, any additional write that occurs (like, say, web server logs) needs another write afterwards. And I would not be surprised if they didn't disable atime, either, which means every single page access generates a write.

    Insane.
     
  10. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    Your careful and well-thought-out reply is much appreciated. I've spent several minutes digesting the information. I actually understand the speed of the reads and writes, because the seek times are presumably interleaved.


    From my simple point of view, having a RAID 1 with a mirrored drive would allow the server to be stopped, a new drive added, and the RAID rebuilt while the server is offline. That makes sense to me. But that would mean the server would be down during that time.

    So I suppose the entire point of having a RAID 10, with either 4 or 6 drives, is to keep the server online during the rebuild process.


    Is what I typed basically the way it works?
     
  11. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    Hot-swapping is a different concept from RAID. Any RAID (aside from RAID 0) can rebuild live if the drive bays are hot-swappable. Pop the old drive out and put the new drive in while the machine is running.

    The 1 in the RAID simply means mirroring - you have x copies of the same data, where x is the number of drives you use. You can read x times as fast but only write at the speed of the slowest drive (sort of - many RAID cards come with a hefty amount of RAM and can buffer writes). When a drive fails, one of the remaining drives is going to want to spend its time rebuilding the replacement.

    The 0 in the RAID means striping - you have y drives storing a single set of data, and both reads and writes are y times as fast. But - if one drive fails, you lose everything.

    So in any sane production setup, no one is going to have a RAID 0 for data they are worried about.

    RAID 10 combines these two methods - you have multiple sets of mirrored drives (data is mirrored between x drives, then striped across y sets of x drives - read speeds are x * y times faster, while write speeds are y times faster). RAID 01 is the opposite - data is striped and then mirrored, but the final effect is basically the same (I think I just confused the order above - sorry).
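
    In idealized terms (ignoring controller caches and seek times), those multipliers work out like this - a quick sketch using the x/y notation above:

        # Idealized throughput multipliers for RAID 10: x mirrors per set, y striped sets.
        def raid10_speedup(x, y):
            reads = x * y   # every drive can satisfy a read
            writes = y      # a write must land on all x mirrors of its set
            return reads, writes

        print(raid10_speedup(2, 2))   # four-drive array: (4, 2) - 4x reads, 2x writes
        print(raid10_speedup(3, 2))   # six drives with the extra mirror: (6, 2)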

    If you only have one mirror - whether in RAID 1 or RAID 10 - that mirror needs to be dedicated to rebuilding the array when the set fails. That means if you are also trying to do other things with that mirror (like trying to run several hundred websites), it's going to get exceedingly slow.

    If you have -two- mirrors, though, you can effectively continue business as usual with only minor disruption. Three-drive RAID 1, six-drive RAID 10, and so on.

    ...I hope that was clearer. : /
     
  12. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    You were quite clear, although my typing may not have shown it.

    But that will not affect the amount of rebuild time it would take, right? A 4-disk RAID 10 with hot swap would have all of the same issues you mentioned.


    Reference link I found (warning there is a pop-up of some kind):
    Drive Swapping
     
  13. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    Correct.

    There needs to be another mirror (for a minimum of six drives in RAID 10, or three drives in RAID 1) in order to handle the situation gracefully. One drive or set of drives can then still serve data while the other performs the repair. It will still be slower, but various things can ease up on this a lot - disabling atime and mounting /tmp as tmpfs should be done anyway, and if it's a serious issue you can turn off logging for a few hours, keeping the number of writes to a minimum.
     
  14. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    Thank you once again.

    I looked at WiredTree and they have "8 Hot-Swap SATA II Raid optimized drives w/ NCQ in battery-backed RAID-10 on each system."

    Looked at ServInt and they have "15K RPM SCSI/SAS Hard Drives in RAID 10", but I'm not sure how many disks.
     
  15. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    Can always e-mail Servint and ask : )

    I've always been rather skeptical of WiredTree's technical competence, personally. They've made me go O_o a few times. I use Ubiquity, but in the professional VPS market you'll find most respectable companies charge similar prices and offer similar quality of service. Just don't try to 'fool' the VPS if it's Virtuozzo - go with CentOS or RHEL. Taking the less-traveled path there is fraught with headaches. >_>
     
  16. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    You bring up several points that could take this conversation beyond RAID drive configurations : )

    I looked at Ubiquity; they look like a good company. How many drives on your RAID 10?
     
  17. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
    I'm poor, so I only use two drives in RAID 1. They're not hot-swappable anyway, so downtime is unavoidable. Rather than go with RAID 10, I plan on using multiple RAID 1 arrays if it ever comes to that.

    Their VPSes run on six-drive RAID 10 arrays, I believe.
     
  18. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    How do you find out what RAID configuration they use on their standard VPSes?

    I see nothing on their site about RAID configurations.
     
  19. Vekseid

    Vekseid Regular Member

    Joined:
    Jun 2, 2009
    Messages:
    393
    Likes Received:
    13
  20. Abomination

    Abomination Zealot

    Joined:
    Jun 1, 2009
    Messages:
    1,514
    Likes Received:
    102
    Strange they are using a wiki. Perhaps they did not want to clutter up their main page.

    Something does not add up.

    Ubiquity VPS: 24 containers, and each of those containers gets 1.5GB of RAM & 80GB of 'web space', which I assume should be considered hard disk space. They state 24 of those on a node, which would make the node need 36GB of RAM and 1.92TB of 'web space', yet their servers come with 36GB of RAM, leaving nothing left over for the hardware container (container 0).

    And they state the disks are:
    6x 500 GB 7,200 RPM Nearline SAS Hard Disks or 8x 300 GB 10k RPM SAS
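
    A quick sanity check on their numbers - assuming the 6x 500 GB option and a plain RAID 10, where only half the raw capacity is usable:

        # Sanity-checking the advertised allocation against the listed disks.
        containers = 24
        ram_per_container_gb = 1.5
        disk_per_container_gb = 80

        usable_gb = 6 * 500 // 2                   # RAID 10 keeps half the raw capacity
        print(containers * ram_per_container_gb)   # 36.0 -> every byte of the node's RAM
        print(containers * disk_per_container_gb)  # 1920 -> 'web space' promised
        print(usable_gb)                           # 1500 -> what the array can actually hold

        # 1920 GB promised vs ~1500 GB usable: either the space is oversold
        # or there is more hardware in those nodes than the wiki lists.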

    There must be much more going on.

    I will say that because of your input they are now on a list of potential hosting providers in case I ever need to switch. : )
     
