My friend and I were discussing this the other day, and he said RAID is no longer needed. He said it was due to how big SSDs have gotten and that apparently you can replace sectors within them if a problem occurs, which is why having an array is not needed.
I replied that arrays provide redundancy, which means less downtime if there are issues and a drive needs to be replaced. And depending on what you are doing, that is more valuable than just doing the new thing. Especially because RAID allows redundancy that can rebuild lost data if needed, depending on the configuration.
What do you all think?
Yeah and Titanic was unsinkable.
If the controller in your SSD fries, it doesn’t matter how many unused gigabytes your SSD has got for relocating bad sectors. It is still fried. For you, that data is forever gone.
This is why you have redundancy. Full redundancy. You can go for RAID1, where one disk can die and you still have no data loss, or go bananas with RAID6, where two full disks can die and you’re still going strong.
PS. Spinning hard drives have had hidden sectors used for relocation of bad sectors for ages. It’s nothing new. If you have too much time on your hands, Google harddrive hidden sectors nsa.
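To put numbers on the RAID1 and RAID6 comparison, here is a minimal sketch. It assumes identical disks and the textbook definitions of the two levels; the `raid_summary` function and its field names are made up for this example and are not part of any real tool.

```python
def raid_summary(level: str, disks: int, disk_tb: float) -> dict:
    """Usable capacity and guaranteed failure tolerance for a simple array.

    Assumes all disks have the same size (disk_tb, in terabytes).
    """
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID1 needs at least 2 disks")
        # Every disk holds a full copy, so all but one disk may fail.
        return {"usable_tb": disk_tb, "tolerates": disks - 1}
    if level == "raid6":
        if disks < 4:
            raise ValueError("RAID6 needs at least 4 disks")
        # Two disks' worth of space goes to parity; any two disks may fail.
        return {"usable_tb": (disks - 2) * disk_tb, "tolerates": 2}
    raise ValueError(f"level not covered by this sketch: {level}")


print(raid_summary("raid1", 2, 4.0))
print(raid_summary("raid6", 6, 4.0))
```

The point is the trade: RAID1 pays for its redundancy with capacity, while RAID6 gives up only two disks' worth of space no matter how wide the array is.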
…absolutely, positively, super false. I work in a sector where we’re constantly dealing with huge capacity enterprise SSDs - 15 and 30 terabytes at times. Always using RAID. It’s not even a question. Not only can you have controller malfunctions, but even though you’ve got what’s known as “over provisioning” on the SSDs, you still need to watch out for total disk failures!
Unlike hdd, I never experienced graceful disk failures on ssd. Instead, they just randomly decided to die at the most inconvenient time. Raid 1 saved my hide a couple times now from those ssd failures.
Yep. It has been decades since I had a home SSD failure, but I have had 2 SSD failures in the last 10 years in server hardware. In the first case it was RAID striped and I needed to restore from backup. In the second case it was part of a RAID 1 array and I just requested a replacement and got on with my day.
In my house, I have non raid SSDs on my own PC. But important stuff is on my NAS made up of 4xHDD drives in raid 5 (that also has the important folders backed up to an encrypted cloud).
RAID still has a place in an overall data security solution. Especially for servers that you want to keep up.
Reminds me of the days when CD-ROMs were brand new and advertised as indestructible, with photos of elephants walking over them. Having said that, I assume SSDs can break like other hard disks can break, and in that case RAID can save a lot of time getting a computer back up, especially when a lot of data is involved.
Had a microsd card literally break in half last week. They’re definitely not invincible
Yeah they sometimes get touted as that
Was that a SteamDeck? 🙃
3ds
Ok. Coz it is really common for Steam Deck users to forget to remove the SD card when disassembling the device. Lots of cards have been lost.
Actually that’s kinda what happened. I inserted the card to test if it was working before I put the bottom back on, but forgot to take it out. When I started screwing the bottom back on I heard a snap and that’s when I realized…
Definitely a lot of data lost, but most of it is redownloadable.
Funny. Growing up, I was taught to be extra careful with CDs because the moment you look at them wrong, all your data gets corrupted.
you can replace sectors within them if a problem occurs
That won’t help you if sector where your data is located dies!
It’s not about the individual drive… it’s about total drive failure… if that SSD’s controller dies, it doesn’t matter if it has extra data sectors.
That said, I moved on from RAID by mirroring multiple un-RAIDed NAS devices for redundancy, with data stored specifically on the drives in such a way as to eliminate cross-disk logical volumes.
This is a total load of bullshit, your friend is wrong
I don’t think the internal wear-leveling and overprovisioning of SSDs can or should be able to replace raid. Disregarding a dead sector without losing capacity is great, but it won’t help you when (for example) the controller dies.
Depending on the amount of data you’re storing SSDs also might be too expensive.
The only exception is maybe Raid 0 in a normal PC. Here it’s probably better to just get one disk for each logical drive.
RAID0 has always been playing with fire
It’s very much still needed and heavily utilised in the enterprise world. Volume size is usually the lowest priority when it comes to arrays; redundancy and IOPS (input/output operations per second, i.e. how many transactions the storage can handle) are typically the priority. The exception here would be backup and archive storage, where IOPS is less important and volume size is more important.
As far as replacing sectors goes, I’ve never heard of this and I might just be ignorant on the subject but as far as I know you can’t “replace” a bad sector. Only mark it as bad and not use it, and whatever was there before is gone. This has existed since HDD days. This is also why we use RAID - parity across disks to protect data.
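The parity idea can be sketched in a few lines. This is a toy illustration of single-parity (RAID 5 style) reconstruction using XOR, not how any particular controller or software RAID implements it; the block contents and the `xor_blocks` helper are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per disk, plus a parity block on a fourth disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] died: rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

Because XOR is its own inverse, any single missing block can be recomputed from the rest, which is why losing one disk in such an array does not lose data.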
Generally production storage will be in RAID-10, and backup/archive storage in RAID-6 or in some cases RAID-60 but I’m personally not a fan.
You would also consider how many disks are in the volume, because there is a sweet spot. Too many disks = higher likelihood of total array failure due to simultaneous disk failures, and more data loss in the event it does fail, but with too few disks you won’t have good redundancy, capacity or performance either (depending on RAID level).
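A rough way to see the "too many disks" side of that sweet spot is a simple binomial model. This sketch assumes every disk fails independently with the same probability over some period, which is optimistic (real failures tend to be correlated, and rebuild windows are ignored). The 2% figure and the `array_loss_probability` name are hypothetical.

```python
from math import comb


def array_loss_probability(disks: int, tolerated: int, p_disk: float) -> float:
    """Probability that more than `tolerated` of `disks` fail.

    Assumes each disk fails independently with probability `p_disk`
    over the period of interest.
    """
    survive = sum(
        comb(disks, k) * p_disk**k * (1 - p_disk) ** (disks - k)
        for k in range(tolerated + 1)
    )
    return 1 - survive


# Hypothetical 2% per-disk failure chance, RAID 6 style tolerance of two failures.
for n in (4, 8, 16, 24):
    print(n, f"{array_loss_probability(n, 2, 0.02):.6f}")
```

Under these assumptions the chance of losing the array grows quickly as disks are added while the tolerance stays at two.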
The biggest change I see in RAID these days is moving away from hardware RAID cards and into software-based solutions like Microsoft Storage Spaces, md, ZFS and similar. These all have their own way of doing things and some can even synchronise the data with other hosts.
Hope this helps!
As far as replacing sectors goes, I’ve never heard of this and I might just be ignorant on the subject but as far as I know you can’t “replace” a bad sector.
SSDs maintain stats on cell writes and move data when a cell nears its end. They keep spare capacity hidden from end users for this. Leaving part of the drive unused also increases this spare capacity.
However ssds do fail and moving data to spare cells doesn’t change that.
- Bit rot is still a problem; you need a high-integrity file system and/or RAID to avoid that.
- Full drive failure is still about as likely, i.e. the main reason for RAID across multiple drives in the first place.
A good read on the problems with SSDs: SSD 101: How Reliable are SSDs?
I found this article from the one you posted. It is crazy to think DNA could be used for storage one day.
I do recall google apparently stopped using raid in some data centres, but it was because they had whole-machine redundancy.
RAID is probably redundant for some of the uses it used to have, like optimising read performance by using many drives (SSD is fast) and honestly I suspect that SSDs are probably more reliable as they don’t have a bunch of platters and bearings and screaming rotational speeds.
So if you needed it for a base level of reliability, an SSD on its own may have exceeded that.
I suspect there are still uses for drive redundancy in some high availability setups… although your friend might be right. If the likelihood of drive failure is lower than other parts in the machine and you need high redundancy for availability it might make more sense to replicate the whole machine rather than the drives.
It’s possible redundancy specifically for the drives was an artifact of unreliable drives back in the day 🤔 they might have a point! I think it’s likely still useful at times though.
I’d rather hotswap a drive than set up a new server, even if it’s a less likely scenario.
due to how big SSDs have gotten and that apparently you can replace sectors within them if a problem occurs
True, but that’s something an SSD does internally and is just there to prolong the lifespan.
You definitely still want a RAID if you want to keep a system running during a disk failure. No amount of extra sectors and wear leveling will save you from that.
Yeah, but if an SSD failing is now less likely than other parts of the machine failing, it might be better to focus on a redundant server to fail over to… it’s an interesting thought. RAID isn’t obsolete, I don’t think, but it’s an interesting question.
Hmm, but in a server environment, wouldn’t it be possible for SSDs to reach their wear limit much faster and therefore fail due to that (depending on the workload, of course)?
yeah true. I guess what I’m saying is the considerations probably have changed, I seriously doubt RAID is no longer useful though.
Higher-end Samsung SSDs were dying a lot faster than they should. I don’t know what drugs your friend is on thinking they can’t fail, but they’d better have enough for the rest of the class.
I’d say “old” RAID could be dead if you have proper backups and have the ability to replace a defect drive fast in the case uptime is crucial. But there’s also modern RAID like btrfs and zfs that can also repair corrupted files, caused by bitrot for example. Old RAID can’t do that, and hardware-based RAID couldn’t either when I last used it years ago. Maybe that changed, but I don’t see the point of hardware-based RAID in most cases anymore.
Hardware raid can 100% do any of the above tasks, and has always been able to do them. You need an actual raid card, not some half assed baked in mobo raid.
Hardware RAID was doing all of the above before software RAID was available to end users.
But AFAIK real RAID doesn’t perform CRC checks; it relies on the drive to report bad sectors. With mirroring, if data got silently corrupted on one drive, it would just return data from one drive or the other. Unless we’re talking about RAID 6, I think.
I wonder how to tell a real RAID card from a simple switch? I guess you look at the price, and it should be really high?
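The difference between a plain mirror and a checksumming setup can be sketched like this. It is a toy model, not how btrfs, ZFS or any RAID card actually stores or verifies data; CRC32 from `zlib` is just a stand-in checksum, and `read_mirrored` is a made-up name.

```python
import zlib


def read_mirrored(copy_a: bytes, copy_b: bytes, stored_crc=None) -> bytes:
    """Pick a copy to return from a two-way mirror.

    Identical copies are returned as-is in this sketch. If the copies differ,
    a stored checksum is needed to tell which one is still good.
    """
    if copy_a == copy_b:
        return copy_a
    if stored_crc is None:
        raise ValueError("copies differ and there is no checksum to pick the good one")
    for copy in (copy_a, copy_b):
        if zlib.crc32(copy) == stored_crc:
            return copy
    raise ValueError("both copies fail the checksum")


good = b"important data"
rotted = b"important dat\x00"  # one silently corrupted byte
crc = zlib.crc32(good)  # checksum recorded when the data was written

assert read_mirrored(rotted, good, crc) == good
```

Without the stored checksum, the mirror knows the copies disagree but has no way to decide which one to trust.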
Most discrete raid cards will do the job, but look for on card caching and a battery for “quality.”
AFAIK only officially supported RAID modes in BTRFS are RAID0 and RAID1.
RAID56 is officially considered unstable.
RAID56 is a risky one in more filesystems than just btrfs though, but if you have a UPS as backup, you should be fine.
UPS won’t protect from Kernel Panic, sadly
True
What about dm-raid? Is it still risky? I guess so, because it’s separate devices. So any software raid with 5-6 would be problematic?
I’d say “old” RAID could be dead if you have proper backups and have the ability to replace a defect drive fast in the case uptime is crucial.
RAID and backups serve different purposes. Backups are to prevent data loss, RAID is to prevent downtime in case of hardware failure. They are not interchangeable.
Different purposes true, but not exclusively. RAID only has effect on drive failure specifically. If downtime is intolerable then it’s not the right solution to just use RAID and you should look into total redundancy of the hardware and more. It also comes with performance bottlenecks or improvements depending on the setup, that’s another factor to take into account. So in the end it really depends on your requirements and backups can actually serve as an alternative, depending on your setup and as long as it meets your RTO
SSDs man, I personally still don’t trust them for primary storage. My data array is unraid, several spinning disks. Spinners just always work for me, there are gotchas of being jostled or turned off incorrectly, but if you treat them well they’ll last a real long time. Plus the double redundancy of my array and I’m very happy with it. (Plus I don’t see 20TB ssds on the market for 300 bucks either).
SSDs wear out, though; they only have so many writes in them. I had some in a traditional RAID and it just ate through them. Too many writes, and I had 5/6 fail on me. I use them now as cache drives: in unraid you can set a faster drive to store data temporarily, and it will later move it off the cache drive onto the main array. That’s a level of risk I’m happy with.
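For a feel of how fast a write-heavy array can eat through an SSD, here is a back-of-the-envelope sketch based on a drive's rated endurance (TBW, terabytes written). All numbers are hypothetical, and `extra_write_factor` is an assumed multiplier for additional writes the array layer might generate (parity updates, for example), not a measured value.

```python
def years_until_worn_out(
    tbw_rating_tb: float,
    host_writes_gb_per_day: float,
    extra_write_factor: float = 1.0,
) -> float:
    """Rough years until the rated endurance is used up at a steady write rate."""
    tb_per_day = host_writes_gb_per_day * extra_write_factor / 1000
    return tbw_rating_tb / tb_per_day / 365


# Hypothetical drive rated for 600 TBW.
print(f"light desktop use: {years_until_worn_out(600, 50):.1f} years")
print(f"busy array member: {years_until_worn_out(600, 2000, 1.5):.1f} years")
```

The same drive that outlives a desktop can be used up in well under a year when it sits in a write-heavy array, which fits the experience above.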