After reading over most of this thread... The requirements are vague, but I'll take a stab at an interpretation of them and a solution to fulfill those requirements.
Sidenote: in the following stream of thinking, I realized I am using byte and tebibyte measurements interchangeably (GB/GiB, TB/TiB, PB/PiB, etc.). If this triggers your inner pedant, you will get over it...
Requirements:
1PB+
Two systems - replicate data between them
Ability to grow the filesystem without rebuilding
Standard hybrid performance
Backup solution that keeps all changes for 1 year
To give you anything better than that, the following information would be helpful:
Current system specs
IOPS and throughput metrics during normal use (see the zpool iostat example just after the command list below)
Network utilization metrics during normal use
The output from the following commands:
lsblk

lsblk -d -o VENDOR,MODEL,NAME,LOG-SEC,PHY-SEC,MIN-IO,SIZE,HCTL,ROTA,TRAN,TYPE

zpool status

zpool list -o health,capacity,size,free,allocated,fragmentation,dedupratio,dedup_table_size,ashift

sudo zfs list -o type,volsize,used,available,referenced,usedbysnapshots,usedbydataset,usedbychildren,dedup,logicalused,logicalreferenced,recordsize,volblocksize,compression,compressratio,atime,special_small_blocks
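For the IOPS/throughput piece specifically, a stretch of samples taken during a busy period would tell us a lot. A minimal example (the interval and count are arbitrary, and leaving out the pool name covers all pools):

zpool iostat -vly 5 12    # per-vdev ops, bandwidth, and latency; twelve 5-second samples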
Replacement Systems Spec:
If it were me in your shoes, with the information about your situation that we have, I'd do the following.
Get two of the following systems: one as the primary storage server and the other as your replica target (there's a rough send/receive sketch near the end of this post).
Dell R750/R760/R770 (or similar, any brand will do)
24 x 2.5" NVMe
NVMe is key here.
2 x Xeon Gold (or AMD equivalent; I'm just not as well versed in AMD server CPUs)
12+ cores / CPU
Fewer, faster cores are better than many slow cores, but it's a balance.
IMHO, and I'm open to others' thoughts on this.
It's a bit difficult to know how much CPU overhead will be required, so better to spec too much than not enough.
512GB+ memory
More if possible; your ARC will thank you.
Recent Xeon CPUs have 8 memory channels each
8 channels x 2 CPUs = 16 sticks of memory
16 x 32GB = 512GB
16 x 64GB = 1TB
Dell BOSS card
or any RAID1 boot device
Multiple 10/25GbE NIC ports
or 40/50/100GbE if your usage justifies it
SAS HBA with external ports
JBOD expansion disk shelf(s)
SAS connected
3.5" drive slots
Enough drive slots to hit the space requirement, plus redundancy and spares
Multiple options for this part.
Let's go with the Dell ME484 (for the sake of discussion...)
SAS JBOD
84 x 3.5" SAS drive slots
Storage Setup:
Let's assume we have all of our hardware except the storage drives.
Our hardware is racked, connected, powered on, and the OS is installed. (I'll ramble about OS selection later.)
We now need to select the drives and pool configuration for our new storage server.
What we have to work with:
24 x 2.5" NVMe drive slots
84 x 3.5" SAS drive slots
Assumptions:
3.5" capacity drives
Intended use: primary storage
84 x 20TiB SAS
2.5" NVMe drives
Intended use:
Special vdev
SLOG
L2ARC
Multiple possibilities here.
Option 1 - Easy setup / Good performance
3 x 3.2TiB NVMe mixed-use SSD
Special
This could be a single mirror if your risk tolerance allows it
2 x 400/800GiB NVMe write-intensive/mixed-use SSD
SLOG
This could be a single disk if your risk tolerance allows it
400GiB+ is way overkill for a SLOG, but the best-performing NVMe drives don't come in 10-20GiB sizes...
1 x 3TiB+ NVMe/SAS mixed-use/read-intensive SSD
L2ARC
Option 2 - More challenging setup / Better performance
6 x 3.2TiB or 6.4TiB NVMe mixed-use SSD
Special / SLOG / L2ARC
For a general-use workload, I'd build out something like this (there's a command-level sketch after the storage summary below)...
zPool Structure:
8 x RAIDZ2 vdevs
Each vdev = 10 x 3.5" 20TiB
Usable space ≈ 1.28PB (8 vdevs x 8 data disks x 20TiB = 1,280TiB)
Support vdevs
Option 1 (Easy setup / Slower / Boring)
Special vdev
triple mirror - 3.2TiB
SLOG
mirror - 400/800GiB
Depending on your risk tolerance, this could be a stripe
L2ARC
single 3TiB+
Option 2 (Significantly better performance / Challenging setup)
6 x 3.2TiB+ mixed-use
Split each NVMe disk into three separate namespaces (see the nvme-cli sketch after this list)
NS1 - SLOG - 10GiB
This will likely never need to be larger than 10GiB. (A SLOG only has to absorb a few seconds' worth of sync writes, roughly two transaction groups at the default 5-second interval, so even ~1GiB/s of sustained sync writes fits in about 10GiB.)
NS2 - L2ARC - 1TiB
~30% of the remaining space (loose guideline I made up just now)
NS3 - Special - 2TiB
The rest of the remaining free space
Config
SLOG - NS1
3 x mirrors (safe option)
1 x 6-disk stripe (double the performance / slightly higher risk)
Likely, either option will be bottlenecked by the spinning disks.
L2ARC - NS2
1 x 6-disk stripe
6TiB total L2ARC
Special vdev - NS3
2 x triple mirror (safe option)
4TiB for metadata
3 x 2-way mirror (50% faster / slightly higher risk)
6TiB for metadata
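For the namespace split, something along these lines should work with nvme-cli. Treat it as a rough sketch rather than a recipe: the controller device, controller ID, LBA format index, and block counts are assumptions you'd adjust for your actual drives, and not every SSD supports multiple namespaces, so verify that first.

# Sizes are in logical blocks and assume a 4KiB LBA format: 10GiB = 2621440 blocks, 1TiB = 268435456, 2TiB = 536870912
# --flbas picks the LBA format index; make sure the index you pick is actually 4K on your drive
nvme create-ns /dev/nvme0 --nsze=2621440 --ncap=2621440 --flbas=0         # NS1 - SLOG (10GiB)
nvme create-ns /dev/nvme0 --nsze=268435456 --ncap=268435456 --flbas=0     # NS2 - L2ARC (1TiB)
nvme create-ns /dev/nvme0 --nsze=536870912 --ncap=536870912 --flbas=0     # NS3 - Special (2TiB)
# Attach each new namespace to the controller (controller ID 0 is an assumption; check `nvme list-ctrl`)
nvme attach-ns /dev/nvme0 --namespace-id=1 --controllers=0
nvme attach-ns /dev/nvme0 --namespace-id=2 --controllers=0
nvme attach-ns /dev/nvme0 --namespace-id=3 --controllers=0
# Repeat for the other five drives; the namespaces then show up as /dev/nvme0n1, /dev/nvme0n2, /dev/nvme0n3, and so on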
Storage Summary:
1.28 Petabytes = total usable space
4/6 Terabytes = NVMe SSD storage for metadata (special vdev)
6 Terabytes = NVMe SSD storage for L2ARC (read cache)
60 Gigabytes = NVMe SSD storage for SLOG (sync write log)
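To make that concrete, here is roughly what pool creation could look like for the Option 2 layout. The pool name (tank), the device names (sda..sdt for the first twenty SAS disks, nvmeXnY for the NVMe namespaces), and ashift=12 are all placeholders/assumptions; substitute your own, and ideally reference disks by /dev/disk/by-id paths rather than the short names shown here.

# First two RAIDZ2 vdevs; the remaining six are added the same way
zpool create -o ashift=12 tank raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj
zpool add tank raidz2 sdk sdl sdm sdn sdo sdp sdq sdr sds sdt
# Support vdevs built from the NVMe namespaces (NS3 = special, NS1 = SLOG, NS2 = L2ARC)
zpool add tank special mirror nvme0n3 nvme1n3 nvme2n3 mirror nvme3n3 nvme4n3 nvme5n3
zpool add tank log nvme0n1 nvme1n1 nvme2n1 nvme3n1 nvme4n1 nvme5n1
zpool add tank cache nvme0n2 nvme1n2 nvme2n2 nvme3n2 nvme4n2 nvme5n2
# Optional: send small data blocks to the special vdev as well
zfs set special_small_blocks=64K tank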
Future Expansion:
Primary storage:
Add another disk shelf populated with a minimum of 10 disks, then:
zpool add POOL-NAME raidz2 new-disk1..10
Boom! You just added 160TiB of usable space to your pool (10-disk RAIDZ2 = 8 data disks x 20TiB).
Support vdevs:
This gets a bit more complicated, since it depends on which support-vdev config you picked. But the minimum number of disks needed to expand the SSD vdevs is equal to the width of the widest mirror: with a triple mirror you have to add 3 disks to expand, while a plain 2-way mirror only needs 2.
Let's assume you went with the better-performing, more complex config.
Now, since all three support vdevs occupy part of each NVMe disk, when we expand one, for simplicity's sake, we expand all of them.
SLOG and L2ARC are both stripes of single disks, so either can be expanded with just one new disk. But the special vdev is made of multiple 2-disk mirrors, so to expand it we need 2 new disks.
So, pop two new matching NVMe disks into the available slots, create your three namespaces on each, and then:
zpool add POOL-NAME log new-disk1-ns1 new-disk2-ns1
zpool add POOL-NAME special mirror new-disk1-ns3 new-disk2-ns3
zpool add POOL-NAME cache new-disk1-ns2 new-disk2-ns2
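One last piece, since the second box exists purely as a replica target: keeping it in sync is just scheduled snapshots plus send/receive. A bare-bones sketch, assuming a pool named tank on both sides, a receiving host called replica-host, and SSH between them (in practice you'd probably reach for syncoid/sanoid or zrepl instead of hand-rolling it):

# One-time full seed of the replica
zfs snapshot -r tank@rep-0001
zfs send -R tank@rep-0001 | ssh replica-host zfs receive -F tank
# Then, on a schedule: take a new snapshot and send only the delta since the last one
zfs snapshot -r tank@rep-0002
zfs send -R -i tank@rep-0001 tank@rep-0002 | ssh replica-host zfs receive -F tank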
I have thoughts on your backups too. But that will need to wait for another time.