Hey folks, I’m at my wits’ end. I’ve been screwing with Proxmox for years now, but I’m at a tipping point. I’ve only ever used consumer SSDs in it to run my VMs off of, but after a dozen or so crashes over the last week I think the SSDs are the culprit. (Really, really terrible write speeds leading to kernel crashes, I believe.)
I’ve never gotten an enterprise SSD, if that’s even what I need. Any recommendations? New? Used? Brands?
Appreciate it
Really, anything branded from Samsung or Crucial (Micron) is going to be fine. They are the top producers of NAND, produce high quality products, and stand behind warranties. But you are gonna pay out the nose for the privilege of enterprise grade hardware.
You might just be buying lower quality consumer SSDs though, since even they should be able to handle a surprising amount of abuse.
How do you know you’re getting higher quality? When you’re looking at them they all seem the same
I recently upgraded three of my Proxmox hosts with SSDs to make use of ceph. While researching I faced the same question - everyone said you need an enterprise SSD, or ceph would eat it alive. The feature that apparently matters the most in my case is Power Loss Protection (PLP). It’s not even primarily needed to protect against a possible outage. Ceph issues sync writes instead of relying on a cache for performance, and drives with PLP can acknowledge those quickly and safely, while drives without it are much slower at it.
There are some SSDs marketed for usage in data centers, these are generally enterprisey. Often they are classified for “Mixed Use” (read and write) or “Read Intensive”. Other interesting metrics are the Drive Writes Per Day (DWPD) and obviously TBW and IOPS.
In the end I went with used Samsung PM883s.
But before you fall into this rabbit hole, you might check if you really need an enterprise SSD. If all you’re doing is running a few VMs in a homelab, I would expect consumer SSDs to work just fine.
Well I have the exact same use case and I just checked and yup, 3 out of 4 drives failed in a year. Those were shitty WD blues though, so I think it’s time to shell out real money
To expand on @doeknius_gloek’s comment, those categories usually directly correlate to a range of DWPD (endurance) figures. I’m most familiar with buying servers from Dell, but other brands are pretty similar.
Usually, the split is something like this:
- Read-intensive (RI): 0.8 - 1.2 DWPD (commonly used for file servers and the like, where data is relatively static)
- Mixed-use (MU): 3 - 5 DWPD (normal for databases or cache servers, where data is changing relatively frequently)
- Write-intensive (WI): ≥10 DWPD (for massive databases, heavily-used write cache devices like ZFS ZIL/SLOG devices, that sort of thing)
(Consumer SSDs frequently have endurances only in the 0.1 - 0.3 DWPD range for comparison, and I’ve seen as low as 0.05)
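If it helps when comparing datasheets, here is a minimal Python sketch of those ranges as a lookup. The ranges are the rough ones quoted above, not vendor definitions; the function name and the "between tiers" label are made up for illustration, and real drives do land in the gaps.

```python
# DWPD ranges as quoted in this thread (rough, not a vendor standard).
TIERS = [
    ("consumer", 0.05, 0.3),
    ("read-intensive", 0.8, 1.2),
    ("mixed-use", 3.0, 5.0),
    ("write-intensive", 10.0, float("inf")),
]


def endurance_tier(dwpd: float) -> str:
    """Return the tier whose quoted range contains dwpd, else 'between tiers'."""
    for name, low, high in TIERS:
        if low <= dwpd <= high:
            return name
    # The quoted ranges leave gaps (e.g. 1.2 to 3 DWPD); don't guess there.
    return "between tiers"


print(endurance_tier(1.0))   # read-intensive
print(endurance_tier(0.2))   # consumer
print(endurance_tier(2.0))   # between tiers
```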
You’ll also find these tiers roughly line up with the SSDs that expose different capacities while having the same amount of flash inside; where a consumer drive would be 512GB, an enterprise RI would be 480GB, and a MU/WI only 400GB. Similarly 1TB/960GB/800GB, 2TB/1.92TB/1.6TB, etc.
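To put numbers on that, here is a quick Python sketch of how much capacity each tier holds back relative to the consumer label. It assumes the consumer figure is a fair stand-in for the flash inside, which is a simplification (the raw flash is usually a bit larger still), and the function name is just illustrative.

```python
def reserved_fraction(consumer_gb: float, exposed_gb: float) -> float:
    """Fraction of the consumer-labelled capacity that the drive keeps as spare area."""
    return (consumer_gb - exposed_gb) / consumer_gb


# Capacity triples from the post above: consumer / enterprise RI / enterprise MU-WI.
for consumer, ri, mu in [(512, 480, 400), (1000, 960, 800), (2000, 1920, 1600)]:
    print(
        f"{consumer} GB class: RI holds back {reserved_fraction(consumer, ri):.1%}, "
        f"MU/WI holds back {reserved_fraction(consumer, mu):.1%}"
    )
```

That extra spare area is a big part of why the lower-capacity models are rated for more writes.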
If you only get a TBW figure, just divide it by the capacity and by the length of the warranty in days to get DWPD. For instance, a 1.92TB 1 DWPD drive with a 5-year warranty might list 3.5PBW (1.92 TB × 365 days × 5 years ≈ 3,504 TB).
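The same conversion as a small Python sketch, in both directions. The function names are hypothetical, and it assumes decimal TB and a 365-day year, which is how these figures are usually quoted.

```python
DAYS_PER_YEAR = 365


def dwpd_from_tbw(tbw_tb: float, capacity_tb: float, warranty_years: float) -> float:
    """Drive writes per day implied by a TBW rating over the warranty period."""
    return tbw_tb / (capacity_tb * warranty_years * DAYS_PER_YEAR)


def tbw_from_dwpd(dwpd: float, capacity_tb: float, warranty_years: float) -> float:
    """Total terabytes written implied by a DWPD rating over the warranty period."""
    return dwpd * capacity_tb * warranty_years * DAYS_PER_YEAR


# The 1.92TB, 1 DWPD, 5-year example from above:
print(round(tbw_from_dwpd(1.0, 1.92, 5)))       # 3504 (TB, i.e. about 3.5 PBW)
print(round(dwpd_from_tbw(3500, 1.92, 5), 2))   # 1.0
```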
Got it. So I’m thinking my ZFS is what killed these poor drives, who didn’t sign up for that sort of life. I think short term I’ll run over to best buy and get a decent 1 or 2 TB drive to migrate things to just to keep it running (and not use ZFS). From what I’m reading on other forums - yeah ZFS was the killer here.
Long term, maybe enterprise drives, or really deciding if my app server even needs a pool. I did that last time as an “I don’t want to run out of storage for a while” move, but I’m seeing 4TB drives now for a few hundred bucks. Not cheap, but much cheaper than the 2k they were just a few years ago. I don’t store anything on the app servers, just containers and VMs.
Read the data sheets.
You’re mostly going to be concerned with IOPS and endurance for VM hosting.
Endurance rating in TBW or PBW is the main indicator I look for. That and a DRAM cache.
Samsung SM863 enterprise SSDs can be found cheap on ebay, and they’re rated for quite a lot.
SSDs often need firmware upgrades.
Sadly for consumer SSDs the upgrades are not so easy to do on Linux.
Other than that you are unlikely to really need anything other than higher quality consumer SSDs for homeserver needs.
Even the slowest SSD write speeds should be faster than an HDD, and those have been running systems perfectly fine for decades. I’ve never used enterprise SSDs (usually one little consumer SSD, or even USB, for boot/cache and a bunch of HDDs for storage) and I’ve never had a problem.
What kind of hardware are you using?
Currently a few Samsung drives. I thought I’d be smart and ZFS them together for Proxmox, but that hasn’t been working well. Maybe that’s the issue and I just need to split them. I just liked the idea of a lot of storage spread across drives, and thought that might give me even faster reads/writes. It’s been nothing but a pain though. Hell, maybe one of them failed and I haven’t even noticed.
What are you using for a drive controller?
I use mostly Samsung, SK Hynix, Micron and SanDisk. For bulk storage it doesn’t really matter which of those you pick but for fast storage you’ll want to be sure the drive offers PLP.
Go hit up fleaBay and see what’s available in the way of enterprise drives in the size you need, then google the model numbers and check out the datasheets. Once you know what each drive is capable of you can decide which to buy. I usually try to buy 3 DWPD models for VM storage and 1.3 DWPD for bulk. You might prefer to focus on IOPS over endurance; it’ll depend on your application.
Edit: for a VM host pool you’re primarily going to be concerned with IOPS, endurance and having PLP for better ZFS performance. For bulk storage you can skimp on specs to some extent. I prefer to use cheaper drives like the SanDisk CloudSpeed Eco line for a bulk storage pool and whatever high IOPS+endurance drives I can find cheap for my VM host pool. When you split your pools you can do things like use mirror vdevs for performance for VMs and RAIDZ whatever for bulk storage.
How many drives are you looking to use, what are they for, what interfaces do you have available on the machine (SAS backplane, SATA, any number of available NVMe hookups of some flavor, etc), what pool topology are you trying to use and what is the intended workload you want to jenga tower off of all of the above? With more info people can give you more specific recommendations. (E: and what sort of machine are you running things on while I’m at it, processors and amount of RAM would be useful)
When looking at the drives, check the “drive writes per day” (DWPD) rating, as that will give an estimate of what the vendor thinks the life will be.
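A rating only turns into a lifespan once you compare it with how much you actually write. Here is a rough Python sketch of that estimate. The 600 TBW and 100 GB/day figures are made-up placeholders rather than numbers from any drive in this thread, and it ignores write amplification from things like ZFS, so treat the result as an upper bound.

```python
def years_until_rated_endurance(tbw_tb: float, host_writes_gb_per_day: float) -> float:
    """Years until the rated TBW is used up at a constant daily write volume."""
    days = (tbw_tb * 1000) / host_writes_gb_per_day  # decimal units: 1 TB = 1000 GB
    return days / 365


# Hypothetical example: a drive rated for 600 TBW seeing 100 GB of writes per day.
print(f"{years_until_rated_endurance(600, 100):.1f} years")
```

Plug in the TBW from the datasheet and the daily write volume you actually see on the host to get your own number.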
Seagate brands some good ones. I had a few that went years and none failed. Samsung makes some PCI monsters which are likely overkill.
I would only get new and register them ASAP for the warranty, or cheap from a trusted friend.
You can get used Samsung 22110 1TB NVMe drives for $25-$35, and they can be an excellent choice despite the used flash.
Where are you finding them for $25-$35? eBay has them going for $70+
They admittedly seem to have gone up in price since I last looked, but here is a listing for only $40
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:
| Fewer Letters | More Letters |
|---|---|
| NVMe | Non-Volatile Memory Express interface for mass storage |
| SATA | Serial AT Attachment interface for mass storage |
| SSD | Solid State Drive mass storage |
3 acronyms in this thread; the most compressed thread commented on today has 13 acronyms.
[Thread #323 for this sub, first seen 1st Dec 2023, 20:25]