My friend and I were discussing this the other day, and he said RAID is no longer needed. His reasoning was that SSDs have gotten so big, and that apparently they can replace bad sectors internally when a problem occurs, so having an array is not needed.

I replied that arrays provide redundancy, which means better uptime when there are issues and a drive needs to be replaced. Depending on what you are doing, that is more valuable than just doing the new thing, especially because RAID's redundancy can rebuild lost data if needed, depending on the configuration.

What do you all think?

  • Doombot1@lemmy.one
    8 months ago

    …absolutely, positively, super false. I work in a sector where we’re constantly dealing with huge capacity enterprise SSDs - 15 and 30 terabytes at times. Always using RAID. It’s not even a question. Not only can you have controller malfunctions, but even though you’ve got what’s known as “over provisioning” on the SSDs, you still need to watch out for total disk failures!

  • redcalcium@lemmy.institute
    8 months ago

    Unlike with HDDs, I have never experienced a graceful disk failure on an SSD. Instead, they just randomly decide to die at the most inconvenient time. RAID 1 has saved my hide a couple of times now from those SSD failures.

    • r00ty@kbin.life
      8 months ago

      Yep. It has been decades since I had a home SSD failure, but I have had 2 SSD failures in the last 10 years in server hardware. In the first case the drive was part of a striped RAID array and I needed to restore from backup. In the second case it was part of a RAID 1 array and I just requested a replacement and got on with my day.

      In my house, I have non-RAID SSDs in my own PC. But the important stuff is on my NAS, which is made up of 4 HDDs in RAID 5 (and also has the important folders backed up to an encrypted cloud).

      RAID still has a place in an overall data security solution. Especially for servers that you want to keep up.

  • neidu2@feddit.nl
    8 months ago

    I wholeheartedly agree with you. It is worth noting that a lot of the use cases of RAID can now be solved via software, but there are some places where hardware RAID still shines, such as redundancy. Yes, software also can provide redundancy, but I still haven’t seen a software solution that is equivalent to a proper RAID controller with a dedicated battery to keep the I/O buffer alive in case of hardware failure. That one has saved me a few times.

    Source: I’m in charge of 6 storage clusters at work. BeeGFS takes care of the actual clustering, with each cluster clocking in at 1.2PB of storage. Each cluster consists of four machines with three storage volumes each, and each storage volume consists of 12 drives in a RAID6 configuration.
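As a rough sanity check on those numbers, here is a small sketch of the arithmetic. It assumes the 1.2PB figure is the usable capacity of one cluster and that all drives are the same size. The per-drive size it derives is an inference, not something stated in the comment, and `cluster_layout` is just an illustrative name.

```python
# Layout figures taken from the comment above.
MACHINES_PER_CLUSTER = 4
VOLUMES_PER_MACHINE = 3
DRIVES_PER_VOLUME = 12
RAID6_PARITY_DRIVES = 2  # RAID 6 spends two drives' worth of space on parity

# Assumption: 1.2 PB is the usable capacity of one cluster, in decimal units.
USABLE_TB_PER_CLUSTER = 1200


def cluster_layout():
    """Return (total drives, data-bearing drives, implied TB per drive)."""
    volumes = MACHINES_PER_CLUSTER * VOLUMES_PER_MACHINE
    total_drives = volumes * DRIVES_PER_VOLUME
    data_drives = volumes * (DRIVES_PER_VOLUME - RAID6_PARITY_DRIVES)
    implied_drive_tb = USABLE_TB_PER_CLUSTER / data_drives
    return total_drives, data_drives, implied_drive_tb


total, data, drive_tb = cluster_layout()
print(f"{total} drives, {data} holding data, about {drive_tb:.0f} TB each")
```

Under those assumptions, one sixth of the raw capacity in each 12-drive volume goes to parity, which is the price of being able to lose any two drives per volume.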

    I can yank faulty drives and toss them out and have them replaced with no downtime. I know some like to set up hot spares, but I for one don’t. I’ve even had entire servers die on me, and thanks to additional redundancy provided by beegfs, I’ve changed motherboard with no cluster downtime either. Just move the drives over to an identical machine (yes, each cluster has a dedicated spare machine), import the RAID, and you’re good to go.

    • Revan343@lemmy.ca
      8 months ago

      a dedicated battery to keep the I/O buffer alive in case of hardware failure

      Unless I’m misunderstanding, that sounds like you’re worried about the write hole, which RAIDZ doesn’t have

      • neidu2@feddit.nl
        8 months ago

        It’s mostly a matter of making sure any writes that are interrupted part way through (power failure, etc) are kept alive until the issue has been resolved. The raid controller caches everything until the write is complete.

        It’s not so much about disks being out of sync, but more about preventing data loss.

        • Revan343@lemmy.ca
          8 months ago

          RAIDZ is copy-on-write, and will notice and correct parity discrepancies if interrupted partway through. Doesn’t help if you don’t get at least one copy of the data written, but I’d take RAIDZ and a UPS over a hardware raid any day
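For anyone unfamiliar with the write hole being discussed, here is a toy model of it using single XOR parity, as in RAID 5. It is only a sketch: the function names are made up for illustration, and it does not reflect how ZFS or any particular RAID controller is actually implemented.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_parity(surviving_blocks + [parity])


# A consistent stripe: three data blocks and their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)
assert rebuild_missing([data[0], data[2]], parity) == b"BBBB"

# Interrupted update: the new data block lands on disk, but power is lost
# before the matching parity is written.
data[0] = b"ZZZZ"
stale_parity = parity

# If the second disk later dies, the rebuild uses mismatched data and parity.
rebuilt = rebuild_missing([data[0], data[2]], stale_parity)
print(rebuilt == b"BBBB")  # False: the rebuilt block is garbage
```

A battery-backed cache attacks this by letting the controller finish the interrupted write after power returns, while a copy-on-write design avoids overwriting a live stripe in place.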

          • neidu2@feddit.nl
            8 months ago

            And at the scale I’m operating, I’ll take hardware raid over raidz any day. I did some performance benchmarking when initially building these clusters, and beegfs really doesn’t like raidz.

            I use raidz at home, though.

            • Revan343@lemmy.ca
              8 months ago

              That’s fair. My biggest concern with hardware RAID is the risk of having trouble finding compatible hardware if/when a controller dies, but I expect that’s not really an issue at larger scale; you probably buy hardware in bulk and have replacements on hand.

  • originalucifer@moist.catsweat.com
    8 months ago

    It’s not about the individual drive… it’s about total drive failure… if that SSD’s controller dies, it doesn’t matter if it has extra data sectors.

    That said, I moved on from RAID by mirroring multiple non-RAID NAS devices for redundancy, with data placed on the drives in such a way as to eliminate cross-disk logical volumes.

  • AlternateRoute@lemmy.ca
    8 months ago

    • Bit rot is still a problem; you need a high-integrity file system and/or RAID to avoid it.
    • Full drive failure is still about as likely, i.e. the main reason for RAID across multiple drives in the first place.

    A good read on the problems with SSDs: “SSD 101: How Reliable are SSDs?”
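To illustrate why the file system matters for bit rot, here is a minimal sketch of checksum-verified reads over two mirrored copies. The helper names are hypothetical and the "disks" are just Python byte strings. It is meant to show the idea, not how any real file system does it.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def read_with_repair(copies, expected):
    """Return the first copy matching the checksum and heal the others."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("every copy is corrupt; restore from backup")
    for i in range(len(copies)):
        copies[i] = good  # overwrite any rotten copy with the verified one
    return good


original = b"family photos"
stored_checksum = checksum(original)

# Two mirrored copies; one has silently flipped a bit.
mirror = [bytes([original[0] ^ 0x01]) + original[1:], original]
assert read_with_repair(mirror, stored_checksum) == original
assert mirror[0] == original  # the rotten copy was rewritten
```

Without the stored checksum, a plain mirror only knows that the two copies disagree, not which one is right.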

  • tobogganablaze@lemmus.org
    8 months ago

    He said it was due to how big SSDs have gotten and that apparently you can replace sectors within them if a problem occurs which is why having an array is not needed.

    Buying SSDs with the same capacity as my 70TB NAS (after RAID 6) would cost almost triple what my setup (including the NAS) costs.

    So unless you shit money, SSDs are not an option for anything with a decent capacity.

  • spaghetti_carbanana@krabb.org
    8 months ago

    It’s very much still needed and heavily utilised in the enterprise world. Volume size is usually the lowest priority when it comes to arrays; redundancy and IOPS (I/O operations per second, i.e. how many transactions the storage can handle each second) are typically the priorities. The exception here would be backup and archive storage, where IOPS is less important and volume size is more important.

    As far as replacing sectors goes, I’ve never heard of this and I might just be ignorant on the subject but as far as I know you can’t “replace” a bad sector. Only mark it as bad and not use it, and whatever was there before is gone. This has existed since HDD days. This is also why we use RAID - parity across disks to protect data.

    Generally production storage will be in RAID-10, and backup/archive storage in RAID-6 or in some cases RAID-60 but I’m personally not a fan.

    You would also consider how many disks are in the volume, because there is a sweet spot. Too many disks means a higher likelihood of total array failure due to simultaneous disk failures, and more data lost if that happens. Too few disks and you won’t have good redundancy, capacity or performance either (depending on RAID level).
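The "too many disks" half of that trade-off can be put into rough numbers. This is a simplified sketch that assumes disks fail independently with the same probability over some window, which real arrays do not strictly obey (shared batches, rebuild stress). The 3% figure and the function name are made up for illustration.

```python
from math import comb


def p_array_loss(n_disks: int, tolerated: int, p_disk: float) -> float:
    """Probability that more than `tolerated` of `n_disks` fail,
    assuming independent failures with probability `p_disk` each."""
    p_survive = sum(
        comb(n_disks, k) * p_disk**k * (1 - p_disk) ** (n_disks - k)
        for k in range(tolerated + 1)
    )
    return 1 - p_survive


# Hypothetical 3% failure chance per disk over the window; RAID 6 tolerates 2.
for n in (6, 12, 24):
    print(n, f"{p_array_loss(n, 2, 0.03):.4%}")
```

With the tolerated failure count fixed, the chance of losing the array rises with every disk added, which is why very wide single arrays tend to be split into several smaller ones.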

    The biggest change I see in RAID these days is moving away from hardware RAID cards and into software-based solutions like Microsoft Storage Spaces, md, ZFS and similar. These all have their own way of doing things and some can even synchronise the data with other hosts.

    Hope this helps!

    • Blue_Morpho@lemmy.world
      8 months ago

      As far as replacing sectors goes, I’ve never heard of this and I might just be ignorant on the subject but as far as I know you can’t “replace” a bad sector.

      SSDs maintain stats on cell writes and move data when a cell nears the end of its life. They keep spare capacity hidden from end users for this. Leaving part of the drive unused also increases this spare capacity.

      However, SSDs do fail, and moving data to spare cells doesn’t change that.
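Here is a toy model of the spare-capacity idea described above. It is a deliberately crude sketch: real SSD firmware also spreads writes across cells (wear leveling) and is far more involved, and `ToySSD` and its parameters are invented for illustration.

```python
class ToySSD:
    """Toy model: logical blocks mapped onto physical cells with hidden spares."""

    def __init__(self, visible_blocks: int, spare_cells: int, max_writes: int):
        self.max_writes = max_writes
        self.wear = [0] * (visible_blocks + spare_cells)
        self.mapping = list(range(visible_blocks))  # logical -> physical
        self.spares = list(range(visible_blocks, visible_blocks + spare_cells))

    def write(self, logical: int) -> None:
        cell = self.mapping[logical]
        if self.wear[cell] >= self.max_writes:
            # The cell is worn out: retire it and remap to a hidden spare.
            if not self.spares:
                raise IOError("no spare cells left: the drive is worn out")
            cell = self.spares.pop(0)
            self.mapping[logical] = cell
        self.wear[cell] += 1


ssd = ToySSD(visible_blocks=2, spare_cells=1, max_writes=3)
for _ in range(6):
    ssd.write(0)    # the fourth write is silently redirected to the spare
print(ssd.mapping)  # logical block 0 now lives on the spare cell
```

The remapping is invisible from outside the drive, but it only covers worn cells. Nothing in this model helps if the whole drive stops responding.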

  • Scrubbles@poptalk.scrubbles.tech
    8 months ago

    SSDs man, I personally still don’t trust them for primary storage. My data array is unraid, several spinning disks. Spinners just always work for me, there are gotchas of being jostled or turned off incorrectly, but if you treat them well they’ll last a real long time. Plus the double redundancy of my array and I’m very happy with it. (Plus I don’t see 20TB ssds on the market for 300 bucks either).

    SSDs do wear out, though; they only have so many writes in them. I had some in a traditional RAID and it just ate through them. Too many writes, and I had 5 of 6 fail on me. I use them now as cache drives: in Unraid you can set a faster drive to store data temporarily, and it will move the data off the cache drive onto the main array later. That’s a level of risk I’m happy with.
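The wear-out point comes down to simple arithmetic against the drive's rated write endurance (usually quoted as TBW, terabytes written). The numbers below are hypothetical and not taken from any specific drive or from the setup described above.

```python
def years_until_worn_out(rated_tbw: float, tb_written_per_day: float) -> float:
    """Years until the rated write endurance (TBW) is used up."""
    return rated_tbw / (tb_written_per_day * 365)


# Hypothetical drive rated for 600 TBW.
light = years_until_worn_out(600, 0.05)  # light desktop use
heavy = years_until_worn_out(600, 2.0)   # write-heavy array member
print(f"light use: {light:.1f} years, heavy use: {heavy:.1f} years")
```

The same drive can be fine for decades in one role and used up in under a year in another, which fits the experience of an array eating through its SSDs.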

  • lemmylommy@lemmy.world
    8 months ago

    This has nothing to do with SSDs or their size. Hard disks also have a little spare area (though not as big) and can mark and remap failing sectors.

    RAID (1) is still (possibly) good for the only thing it ever was (possibly) good for: keeping the system running long enough for you to put in a new hard disk if one fails.

    Think of industrial systems where every minute of downtime can cost thousands of dollars. And even there the usefulness of RAID can be questioned: should you not in that case have a whole spare system, easy to swap in, because more than just storage can fail?

    And what about the RAID controller itself? Does it not add complexity and another point of failure to the whole system?

    And most importantly: will anyone actually get notified of a failing disk and replace it quickly? Or will the whole thing just prolong the inevitable?

    Would you even trust a system that had one disk fail already to keep going in a critical place? Or would it not be safer to just replace the whole thing anyway after one failure?

    • redcalcium@lemmy.institute
      8 months ago

      And what about the RAID controller itself? Does it not add complexity and another point of failure to the whole system?

      This is why people prefer software RAID these days instead of hardware RAID.

      • Atemu@lemmy.ml
        8 months ago

        That does not address the point made. It doesn’t matter whether it’s a complex hardware or software component in the stack; they will both fail.

        • redcalcium@lemmy.institute
          8 months ago

          Yes, I didn’t address the point made, just want to mention that people are increasingly avoiding hardware raid these days.

  • Dekkia@this.doesnotcut.it
    8 months ago

    I don’t think the internal wear-leveling and overprovisioning of SSDs can or should be able to replace raid. Disregarding a dead sector without losing capacity is great, but it won’t help you when (for example) the controller dies.

    Depending on the amount of data you’re storing SSDs also might be too expensive.

    The only exception is maybe Raid 0 in a normal PC. Here it’s probably better to just get one disk for each logical drive.

  • bluGill@kbin.social
    8 months ago

    RAID often comes with snapshots, which can recover from your mistakes. Often the RAID can even recover after malware encrypts your disk. You still need offline, offsite backups for the best protection, but RAID is still a useful part of keeping your data safe.

  • DontTakeMySky@lemmy.world
    8 months ago

    Maybe maybe MAYBE for a prosumer desktop situation it’s less necessary than it used to be. But it’s absolutely still needed, your friend is dumb and reckless with their data.

    Drives fail all the time, not just sectors.

  • lemmyreader@lemmy.ml
    8 months ago

    Reminds me of the days when CD-ROMs were brand new and advertised as indestructible, with photos of elephants walking over them. Having said that, I assume SSDs can break like other disks can, and in that case RAID can save a lot of time getting a computer back up, especially when a lot of data is involved.

  • SkaveRat@discuss.tchncs.de
    8 months ago

    due to how big SSDs have gotten and that apparently you can replace sectors within them if a problem occurs

    True, but that’s something an SSD does internally and is just there to prolong the lifespan.

    You definitely still want RAID if you want to keep a system running during a disk failure. No amount of extra sectors and wear leveling will save you from that.

    • dbilitated@aussie.zone
      8 months ago

      Yeah, but if an SSD failing is now less likely than other parts of the machine failing, it might be better to focus on a redundant server to fail over to… it’s an interesting thought. I don’t think RAID is obsolete, but it’s an interesting question.

      • szczuroarturo@programming.dev
        8 months ago

        Hmm, but in a server environment wouldn’t it be possible for SSDs to reach their wear limit much faster and therefore fail because of that (depending on the workload, of course)?

        • dbilitated@aussie.zone
          8 months ago

          yeah true. I guess what I’m saying is the considerations probably have changed, I seriously doubt RAID is no longer useful though.