We have two storage backends with 24 NVMe drives each, and multiple frontend servers. I am testing different ways of bringing that storage to a frontend server, and I might as well share the results publicly here.
I benchmarked with fio and hdparm; the configurations tested so far are presented below.
The network is 56/40 Gbps RDMA InfiniBand with ConnectX-3 cards. Each server has a single active connection; I could probably speed things up by using a separate connection per storage server, but that may not be necessary.
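If you want to check the link on your own hosts, the standard InfiniBand diagnostics are enough; a quick sketch, nothing specific to this setup:

# Port state and rate (ConnectX-3 should report 40 or 56 Gb/sec)
ibstat
# More detailed device and port capabilities
ibv_devinfo -v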
For testing I am using two commands, shown below (the device path varies per test):
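hdparm -Tt <device>
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --size=4g --iodepth=1 --runtime=60 --time_based --end_fsync=1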
First, as a baseline: a Supermicro MegaRAID-based RAID of two SSDs, used as the operating system disk.
hdparm -Tt /dev/sda:
Timing cached reads: 21894 MB in 1.99 seconds = 11005.29 MB/sec
Timing buffered disk reads: 3186 MB in 3.00 seconds = 1061.92 MB/sec
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=3542: Mon Jun 28 18:39:00 2021
write: IOPS=46.5k, BW=182MiB/s (191MB/s)(12.0GiB/67589msec); 0 zone resets
slat (nsec): min=879, max=127715, avg=1779.53, stdev=650.57
clat (nsec): min=202, max=1674.4k, avg=8290.97, stdev=2533.87
lat (usec): min=7, max=1676, avg=10.07, stdev= 2.73
clat percentiles (nsec):
| 1.00th=[ 6688], 5.00th=[ 6944], 10.00th=[ 7136], 20.00th=[ 7328],
| 30.00th=[ 7456], 40.00th=[ 7520], 50.00th=[ 7648], 60.00th=[ 7712],
| 70.00th=[ 7904], 80.00th=[ 8512], 90.00th=[10816], 95.00th=[11456],
| 99.00th=[16192], 99.50th=[24192], 99.90th=[28032], 99.95th=[36096],
| 99.99th=[50432]
bw ( KiB/s): min=78840, max=418144, per=100.00%, avg=363084.28, stdev=77272.87, samples=69
iops : min=19710, max=104536, avg=90771.12, stdev=19318.21, samples=69
lat (nsec) : 250=0.01%
lat (usec) : 10=85.06%, 20=13.99%, 50=0.94%, 100=0.01%, 250=0.01%
lat (usec) : 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%
cpu : usr=15.36%, sys=14.12%, ctx=3195292, majf=0, minf=701
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,3145729,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=182MiB/s (191MB/s), 182MiB/s-182MiB/s (191MB/s-191MB/s), io=12.0GiB (12.9GB), run=67589-67589msec
Disk stats (read/write):
dm-0: ios=0/127661, merge=0/0, ticks=0/1201685, in_queue=1201685, util=50.69%, aggrios=0/161924, aggrmerge=0/12, aggrticks=0/2302922, aggrin_queue=2302922, aggrutil=56.08%
sda: ios=0/161924, merge=0/12, ticks=0/2302922, in_queue=2302922, util=56.08%
An Intel Corporation SSDPE2KE016T8O PCIe NVMe drive, tested as a local device.
hdparm -Tt /dev/nvme2n1:
Timing cached reads: 17856 MB in 2.00 seconds = 8943.15 MB/sec
Timing buffered disk reads: 6910 MB in 3.00 seconds = 2303.19 MB/sec
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][100.0%][w=349MiB/s][w=89.2k IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=3782: Mon Jun 28 18:45:44 2021
write: IOPS=88.6k, BW=346MiB/s (363MB/s)(20.3GiB/60194msec); 0 zone resets
slat (nsec): min=580, max=81950, avg=1524.49, stdev=380.18
clat (nsec): min=220, max=600600, avg=7236.48, stdev=1917.52
lat (usec): min=4, max=602, avg= 8.76, stdev= 2.18
clat percentiles (nsec):
| 1.00th=[ 4320], 5.00th=[ 4576], 10.00th=[ 4768], 20.00th=[ 5344],
| 30.00th=[ 6368], 40.00th=[ 7520], 50.00th=[ 7776], 60.00th=[ 7904],
| 70.00th=[ 8096], 80.00th=[ 8256], 90.00th=[ 8768], 95.00th=[ 9152],
| 99.00th=[11072], 99.50th=[12224], 99.90th=[16768], 99.95th=[19072],
| 99.99th=[34560]
bw ( KiB/s): min=20728, max=622960, per=100.00%, avg=411116.83, stdev=106584.98, samples=103
iops : min= 5182, max=155740, avg=102779.18, stdev=26646.25, samples=103
lat (nsec) : 250=0.01%, 750=0.01%
lat (usec) : 4=0.01%, 10=97.95%, 20=2.00%, 50=0.04%, 100=0.01%
lat (usec) : 250=0.01%, 500=0.01%, 750=0.01%
cpu : usr=18.80%, sys=25.26%, ctx=5379783, majf=0, minf=780
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,5333109,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=346MiB/s (363MB/s), 346MiB/s-346MiB/s (363MB/s-363MB/s), io=20.3GiB (21.8GB), run=60194-60194msec
Disk stats (read/write):
nvme0n1: ios=1/1405100, merge=0/23, ticks=0/2932174, in_queue=2932174, util=24.13%
The same Intel NVMe as above, exported with NVME-of and tested from one of the frontend servers.
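For context, this is roughly how an NVME-of export over RDMA can be configured with the in-kernel nvmet target; a minimal sketch where the NQN, addresses and device path are placeholders:

# Storage backend (target)
modprobe nvmet
modprobe nvmet-rdma
cd /sys/kernel/config/nvmet
mkdir subsystems/nqn.2021-06.com.example:nvme1
echo 1 > subsystems/nqn.2021-06.com.example:nvme1/attr_allow_any_host
mkdir subsystems/nqn.2021-06.com.example:nvme1/namespaces/1
echo /dev/nvme2n1 > subsystems/nqn.2021-06.com.example:nvme1/namespaces/1/device_path
echo 1 > subsystems/nqn.2021-06.com.example:nvme1/namespaces/1/enable
mkdir ports/1
echo rdma > ports/1/addr_trtype
echo ipv4 > ports/1/addr_adrfam
echo 10.0.0.1 > ports/1/addr_traddr
echo 4420 > ports/1/addr_trsvcid
ln -s /sys/kernel/config/nvmet/subsystems/nqn.2021-06.com.example:nvme1 ports/1/subsystems/

# Frontend (initiator)
modprobe nvme-rdma
nvme connect -t rdma -a 10.0.0.1 -s 4420 -n nqn.2021-06.com.example:nvme1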
hdparm -Tt /dev/nvme0n3:
Timing cached reads: 21334 MB in 1.99 seconds = 10722.82 MB/sec
Timing buffered disk reads: 5940 MB in 3.00 seconds = 1979.68 MB/sec
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][w=4839KiB/s][w=1209 IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=3456: Mon Jun 28 18:37:08 2021
write: IOPS=77.8k, BW=304MiB/s (319MB/s)(18.2GiB/61171msec); 0 zone resets
slat (nsec): min=860, max=110739, avg=1750.52, stdev=316.16
clat (usec): min=4, max=576, avg= 8.12, stdev= 2.13
lat (usec): min=7, max=578, avg= 9.87, stdev= 2.26
clat percentiles (nsec):
| 1.00th=[ 6624], 5.00th=[ 6944], 10.00th=[ 7072], 20.00th=[ 7200],
| 30.00th=[ 7328], 40.00th=[ 7392], 50.00th=[ 7456], 60.00th=[ 7584],
| 70.00th=[ 7712], 80.00th=[ 8384], 90.00th=[10688], 95.00th=[11328],
| 99.00th=[13632], 99.50th=[17280], 99.90th=[29824], 99.95th=[44288],
| 99.99th=[61696]
bw ( KiB/s): min=15976, max=430968, per=100.00%, avg=370133.55, stdev=90692.00, samples=102
iops : min= 3994, max=107742, avg=92533.38, stdev=22673.00, samples=102
lat (usec) : 10=85.87%, 20=13.78%, 50=0.32%, 100=0.02%, 250=0.01%
lat (usec) : 500=0.01%, 750=0.01%
cpu : usr=23.38%, sys=20.65%, ctx=4824959, majf=0, minf=526
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,4762087,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=304MiB/s (319MB/s), 304MiB/s-304MiB/s (319MB/s-319MB/s), io=18.2GiB (19.5GB), run=61171-61171msec
Disk stats (read/write):
nvme0n1: ios=1/738871, merge=0/28, ticks=0/279727, in_queue=279727, util=25.48%
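Two NVME-of attached devices combined into an MD RAID1 mirror with an LVM volume on top; see the disk stats below (dm-3 on md0 on nvme0n1/nvme1n1). The stack can be built roughly like this, with the volume size as a placeholder:

# Frontend: mirror the two NVME-of attached devices, then LVM on top
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
pvcreate /dev/md0
vgcreate datavault /dev/md0
lvcreate -L 500G -n testvolume datavault
mkfs.xfs /dev/datavault/testvolume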
hdparm -Tt /dev/datavault/testvolume:
Timing cached reads: 21458 MB in 1.99 seconds = 10784.51 MB/sec
Timing buffered disk reads: 7640 MB in 3.00 seconds = 2546.47 MB/sec
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][w=2029KiB/s][w=507 IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=4526: Tue Jun 29 08:05:20 2021
write: IOPS=76.2k, BW=298MiB/s (312MB/s)(17.8GiB/61302msec); 0 zone resets
slat (nsec): min=802, max=142478, avg=1831.73, stdev=521.29
clat (usec): min=5, max=3873, avg= 8.61, stdev= 6.04
lat (usec): min=7, max=3874, avg=10.44, stdev= 6.13
clat percentiles (nsec):
| 1.00th=[ 6624], 5.00th=[ 6880], 10.00th=[ 7072], 20.00th=[ 7264],
| 30.00th=[ 7392], 40.00th=[ 7520], 50.00th=[ 7648], 60.00th=[ 7840],
| 70.00th=[ 8768], 80.00th=[10432], 90.00th=[11072], 95.00th=[11584],
| 99.00th=[14016], 99.50th=[22656], 99.90th=[29312], 99.95th=[43264],
| 99.99th=[52480]
bw ( KiB/s): min= 838, max=438626, per=100.00%, avg=349216.78, stdev=92392.10, samples=106
iops : min= 209, max=109656, avg=87303.93, stdev=23097.92, samples=106
lat (usec) : 10=74.54%, 20=24.87%, 50=0.57%, 100=0.01%, 250=0.01%
lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%
cpu : usr=23.60%, sys=22.23%, ctx=4868012, majf=0, minf=979
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,4674124,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=298MiB/s (312MB/s), 298MiB/s-298MiB/s (312MB/s-312MB/s), io=17.8GiB (19.1GB), run=61302-61302msec
Disk stats (read/write):
dm-3: ios=1/891284, merge=0/0, ticks=0/694354549, in_queue=694354549, util=25.73%, aggrios=1/904069, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md0: ios=1/904069, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/903755, aggrmerge=0/188, aggrticks=0/333576, aggrin_queue=333577, aggrutil=23.45%
nvme0n1: ios=1/903756, merge=0/188, ticks=1/338804, in_queue=338805, util=23.45%
nvme1n1: ios=0/903755, merge=0/189, ticks=0/328349, in_queue=328349, util=23.27%
I set up an iSER target on the storage backend and an iSER initiator on the frontend, and created an XFS filesystem on top, exactly as in the NVME-of cases. Write speed looks slightly lower, but latency is very solid.
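Roughly, such a target/initiator pair can be set up like this; a sketch using the LIO target via targetcli, where the IQN, IP address and backing device are placeholders (ACL setup omitted):

# Storage backend: block backstore, iSCSI target, iSER enabled on the portal
targetcli /backstores/block create name=lun1 dev=/dev/nvme2n1
targetcli /iscsi create iqn.2021-06.com.example:backend1
targetcli /iscsi/iqn.2021-06.com.example:backend1/tpg1/luns create /backstores/block/lun1
targetcli /iscsi/iqn.2021-06.com.example:backend1/tpg1/portals create 10.0.0.1
targetcli /iscsi/iqn.2021-06.com.example:backend1/tpg1/portals/10.0.0.1:3260 enable_iser boolean=true

# Frontend: discover, switch the transport to iSER, log in
iscsiadm -m discovery -t sendtargets -p 10.0.0.1
iscsiadm -m node -T iqn.2021-06.com.example:backend1 -o update -n iface.transport_name -v iser
iscsiadm -m node -T iqn.2021-06.com.example:backend1 -p 10.0.0.1 --login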
hdparm -Tt /dev/disk/by-path/ip-xxx-lun-1:
Timing cached reads: 21130 MB in 1.99 seconds = 10618.36 MB/sec
Timing buffered disk reads: 3704 MB in 3.00 seconds = 1234.42 MB/sec
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=3116: Tue Jun 29 15:52:07 2021
write: IOPS=70.8k, BW=277MiB/s (290MB/s)(17.3GiB/63904msec); 0 zone resets
slat (nsec): min=836, max=324487, avg=1694.97, stdev=519.52
clat (nsec): min=267, max=938988, avg=7613.02, stdev=1547.30
lat (usec): min=6, max=940, avg= 9.31, stdev= 1.67
clat percentiles (nsec):
| 1.00th=[ 6688], 5.00th=[ 6880], 10.00th=[ 7008], 20.00th=[ 7136],
| 30.00th=[ 7264], 40.00th=[ 7328], 50.00th=[ 7392], 60.00th=[ 7520],
| 70.00th=[ 7584], 80.00th=[ 7776], 90.00th=[ 8256], 95.00th=[ 8512],
| 99.00th=[10432], 99.50th=[22400], 99.90th=[25728], 99.95th=[26752],
| 99.99th=[29568]
bw ( KiB/s): min=33408, max=423456, per=100.00%, avg=394176.32, stdev=72408.07, samples=91
iops : min= 8352, max=105864, avg=98544.10, stdev=18102.02, samples=91
lat (nsec) : 500=0.01%
lat (usec) : 4=0.01%, 10=98.87%, 20=0.55%, 50=0.58%, 100=0.01%
lat (usec) : 250=0.01%, 750=0.01%, 1000=0.01%
cpu : usr=21.22%, sys=20.55%, ctx=4836908, majf=0, minf=893
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,4526608,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=277MiB/s (290MB/s), 277MiB/s-277MiB/s (290MB/s-290MB/s), io=17.3GiB (18.5GB), run=63904-63904msec
Disk stats (read/write):
sdd: ios=1/620934, merge=0/5878, ticks=1/224797, in_queue=224798, util=34.91%
The same MD RAID1 + LVM setup as with NVME-of above, but over iSER instead.
hdparm -Tt /dev/datavault/testvolume:
Timing cached reads: 21368 MB in 1.99 seconds = 10739.75 MB/sec
Timing buffered disk reads: 3898 MB in 3.00 seconds = 1298.67 MB/sec
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][100.0%][w=74.6MiB/s][w=19.1k IOPS][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=10059: Mon Jul 5 12:01:40 2021
write: IOPS=69.7k, BW=272MiB/s (285MB/s)(16.1GiB/60525msec); 0 zone resets
slat (nsec): min=839, max=261971, avg=1676.40, stdev=620.25
clat (nsec): min=205, max=1344.2k, avg=8360.25, stdev=8926.87
lat (usec): min=7, max=1345, avg=10.04, stdev= 8.97
clat percentiles (usec):
| 1.00th=[ 7], 5.00th=[ 7], 10.00th=[ 8], 20.00th=[ 8],
| 30.00th=[ 8], 40.00th=[ 8], 50.00th=[ 8], 60.00th=[ 8],
| 70.00th=[ 8], 80.00th=[ 8], 90.00th=[ 9], 95.00th=[ 10],
| 99.00th=[ 25], 99.50th=[ 35], 99.90th=[ 155], 99.95th=[ 186],
| 99.99th=[ 273]
bw ( KiB/s): min=21752, max=424928, per=100.00%, avg=367624.85, stdev=96931.23, samples=91
iops : min= 5440, max=106232, avg=91906.21, stdev=24232.72, samples=91
lat (nsec) : 250=0.01%, 500=0.01%
lat (usec) : 4=0.01%, 10=95.68%, 20=2.66%, 50=1.27%, 100=0.14%
lat (usec) : 250=0.23%, 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%
cpu : usr=20.85%, sys=21.57%, ctx=4476176, majf=0, minf=871
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,4215656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=272MiB/s (285MB/s), 272MiB/s-272MiB/s (285MB/s-285MB/s), io=16.1GiB (17.3GB), run=60525-60525msec
Disk stats (read/write):
dm-3: ios=1/548969, merge=0/0, ticks=1/830242240, in_queue=830242241, util=36.64%, aggrios=1/623957, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md0: ios=1/623957, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/623048, aggrmerge=0/526, aggrticks=0/213754, aggrin_queue=213754, aggrutil=35.33%
sdb: ios=1/623078, merge=0/497, ticks=1/211402, in_queue=211402, util=35.33%
sdc: ios=0/623019, merge=0/556, ticks=0/216106, in_queue=216106, util=35.32%
hdparm -Tt /dev/sdb:
Timing cached reads: 21054 MB in 1.99 seconds = 10581.09 MB/sec
Timing buffered disk reads: 3908 MB in 3.00 seconds = 1302.55 MB/sec
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --size=4g --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.19
Starting 1 process
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=11449: Tue Jul 6 14:14:06 2021
write: IOPS=62.5k, BW=244MiB/s (256MB/s)(15.7GiB/65801msec); 0 zone resets
slat (nsec): min=913, max=134212, avg=1668.50, stdev=498.55
clat (usec): min=3, max=1588, avg= 7.58, stdev= 2.04
lat (usec): min=7, max=1590, avg= 9.25, stdev= 2.13
clat percentiles (nsec):
| 1.00th=[ 6560], 5.00th=[ 6752], 10.00th=[ 6880], 20.00th=[ 7072],
| 30.00th=[ 7200], 40.00th=[ 7328], 50.00th=[ 7456], 60.00th=[ 7520],
| 70.00th=[ 7648], 80.00th=[ 7776], 90.00th=[ 8032], 95.00th=[ 8512],
| 99.00th=[10176], 99.50th=[22656], 99.90th=[25984], 99.95th=[27008],
| 99.99th=[36096]
bw ( KiB/s): min=67536, max=451744, per=100.00%, avg=396963.38, stdev=70142.99, samples=82
iops : min=16884, max=112936, avg=99240.79, stdev=17535.73, samples=82
lat (usec) : 4=0.01%, 10=98.97%, 20=0.42%, 50=0.61%, 100=0.01%
lat (usec) : 250=0.01%, 500=0.01%
lat (msec) : 2=0.01%
cpu : usr=20.62%, sys=16.00%, ctx=4239733, majf=0, minf=807
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,4111551,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=244MiB/s (256MB/s), 244MiB/s-244MiB/s (256MB/s-256MB/s), io=15.7GiB (16.8GB), run=65801-65801msec
Disk stats (read/write):
sdb: ios=1/189398, merge=0/3781, ticks=1/229437, in_queue=229437, util=38.13%
hdparm -Tt /dev/datavault/test:
Timing cached reads: 20898 MB in 1.99 seconds = 10501.52 MB/sec
Timing buffered disk reads: 3812 MB in 3.00 seconds = 1270.48 MB/sec
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --size=4g --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.19
Starting 1 process
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=11743: Tue Jul 6 14:25:35 2021
write: IOPS=61.8k, BW=241MiB/s (253MB/s)(15.6GiB/66078msec); 0 zone resets
slat (nsec): min=815, max=152844, avg=1670.57, stdev=552.98
clat (usec): min=5, max=164, avg= 7.63, stdev= 1.64
lat (usec): min=7, max=170, avg= 9.30, stdev= 1.76
clat percentiles (nsec):
| 1.00th=[ 6752], 5.00th=[ 6944], 10.00th=[ 7072], 20.00th=[ 7200],
| 30.00th=[ 7264], 40.00th=[ 7392], 50.00th=[ 7456], 60.00th=[ 7520],
| 70.00th=[ 7584], 80.00th=[ 7712], 90.00th=[ 8032], 95.00th=[ 8512],
| 99.00th=[10944], 99.50th=[23168], 99.90th=[26240], 99.95th=[27264],
| 99.99th=[40704]
bw ( KiB/s): min= 1912, max=419584, per=100.00%, avg=399272.78, stdev=66461.85, samples=81
iops : min= 478, max=104896, avg=99818.16, stdev=16615.56, samples=81
lat (usec) : 10=98.81%, 20=0.42%, 50=0.76%, 100=0.01%, 250=0.01%
cpu : usr=18.56%, sys=16.97%, ctx=4211106, majf=0, minf=763
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,4084062,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=241MiB/s (253MB/s), 241MiB/s-241MiB/s (253MB/s-253MB/s), io=15.6GiB (16.7GB), run=66078-66078msec
Disk stats (read/write):
dm-3: ios=1/103363, merge=0/0, ticks=0/116893, in_queue=116893, util=35.07%, aggrios=1/204245, aggrmerge=0/1728, aggrticks=0/235478, aggrin_queue=235478, aggrutil=38.83%
sdb: ios=1/204245, merge=0/1728, ticks=0/235478, in_queue=235478, util=38.83%
I still have not studied hdparm/fio analysis, or possible alternative analysis methods, well enough to draw big conclusions. A quick look, however, suggests that the InfiniBand network is quite transparent and that performance is good.
Latency seems to be unchanged on NVME-of compared to local NVMe, which is incredible. With RAID1 + LVM there is a slight increase, but we are talking about only a three microsecond increase, from roughly 10 µs to 13 µs (fio reports these latencies in usec), unless I am reading that fio output wrong.
iSER surprises with its solid latency, but for some reason it gains no advantage from LVM striping.
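For reference, a striped logical volume across two physical volumes can be created like this; the stripe count, stripe size and capacity are illustrative:

# Stripe across 2 PVs with a 64 KiB stripe size
lvcreate -L 100G -i 2 -I 64 -n testvolume datavault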
My only worry is the CPU usage of the MD RAID1. I may try DRBD replication between the backends, plus some mechanism on the frontend for swapping block devices on the fly.
So far the best performing networked storage options (with rough rounding) would be:
NVME-of: ~300 MiB/s random 4k write, ~2000 MB/s buffered read, ~10 µs write latency.
NVME-of + MD RAID1 + LVM: ~300 MiB/s random 4k write, ~2500 MB/s buffered read, ~10 µs write latency.
iSER: ~280 MiB/s random 4k write, ~1200 MB/s buffered read, ~9 µs write latency.
Please leave a comment if you have any ideas; all comments and corrections are welcome.