Dolphin Issues on IX/PX Deb10/11
Overview
As a general rule, Debian 11 works better with PX and Debian 10 works better with IX.
| | Debian 11 | Debian 10 |
|---|---|---|
| PX 5.19 | Working on V4 and W22s; V1 High Max Latency | Segment ID issue workaround found. Dolphin Bug: cannot create more than one segment, so latency test does not include cdsrfm |
| PX 5.20 | Working on V4 and W22s; V1 High Max Latency | Segment ID issue workaround found. Dolphin Bug: cannot create more than one segment, so latency test does not include cdsrfm |
| IX 5.20 | V1 High Max Latency | Untested |
| IX 5.19 | One way communication issues, expect 5.20 to fix | Working in production |
Open Issues
- Debian 10 with PX on the large test stand still has some transient errors on DTS0.
  - Not a blocker, as Debian 10/PX is not a planned configuration.
Debian 10 PX Workaround
The workaround assigns segments PCIE->0, RFM_CS->1, RFM_EX->2, RFM_EY->3. This uses up most of our segment IDs (4 of 4 on the corner station and 2 of 4 on the end stations) but allows us to run.
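The segment budget above can be sketched as a small lookup. This is only an illustration: the four assignments come from the workaround, but the helper names are hypothetical, and which two networks an end station actually uses is not stated here, so the `end_x` list is an assumption.

```python
# Segment assignments taken from the workaround described above.
SEGMENT_IDS = {"PCIE": 0, "RFM_CS": 1, "RFM_EX": 2, "RFM_EY": 3}
MAX_SEGMENTS = 4  # the "out of 4" budget quoted above


def segments_used(networks):
    """Return the sorted segment IDs a host needs for the given networks."""
    return sorted(SEGMENT_IDS[name] for name in networks)


def remaining(networks):
    """Number of segment IDs still free on a host after the workaround."""
    return MAX_SEGMENTS - len(set(segments_used(networks)))


# The corner station consumes all four IDs.
corner = ["PCIE", "RFM_CS", "RFM_EX", "RFM_EY"]
# ASSUMPTION: an X-end host needs PCIE plus its own RFM_EX network.
end_x = ["PCIE", "RFM_EX"]

print("corner:", segments_used(corner), "free:", remaining(corner))
print("end x: ", segments_used(end_x), "free:", remaining(end_x))
```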
V1 Max Latency Issue
On Debian 11 machines, V1s experience a transient, very large max latency that was causing timing issues over Dolphin on DTS1. Debian 10 does not have this issue.
Configurations
PX 5.19
Dolphin Benchmark Debian 11
x2lsc0 <-> x2seih16
[68670.595965] Count was 891104862, err_cnt: 0
[68670.595965] min: 2219 ns, max: 14607 ns, avg: 2625 ns
[68670.595966] Histogram Of All Latencies (ns)
[68670.595966] <2000 : 0
[68670.595966] [2000, 4000) : 891086829
[68670.595967] [4000, 5000) : 8738
[68670.595967] [5000, 6000) : 1826
[68670.595967] [6000, 7000) : 1314
[68670.595968] [7000, 8000) : 1305
[68670.595968] [8000, 9000) : 1451
[68670.595969] [9000, 10000) : 1238
[68670.595969] [10000, 11000) : 686
[68670.595969] [11000, 12000) : 568
[68670.595970] [12000, 13000) : 486
[68670.595970] [13000, 14000) : 326
[68670.595970] [14000, 15000) : 93
[68670.595971] [15000, 16000) : 0
[68670.595971] [16000, 17000) : 0
[68670.595971] [17000, 18000) : 0
[68670.595972] [18000, 20000) : 0
[68670.595972] [20000, 22000) : 0
[68670.595972] [22000, 25000) : 0
[68670.595973] [25000, 30000) : 0
[68670.595973] >30000 : 0
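To compare runs like the one above, it can help to pull the histogram bins out of the kernel log and count how many samples land in the tail. The following is a minimal sketch, assuming the log lines keep exactly the layout shown on this page; `parse_histogram` and `count_at_or_above` are hypothetical helper names, not part of the Dolphin or RCG tooling.

```python
import re

# Matches bins such as "[2000, 4000) : 891086829", "<2000 : 0" and
# ">30000 : 0", with or without the leading dmesg timestamp.
BIN_RE = re.compile(
    r"(?:\[\s*\d+\.\d+\]\s*)?"             # optional dmesg timestamp
    r"(?:\[(\d+),\s*(\d+)\)|([<>])(\d+))"  # "[lo, hi)" or "<n" / ">n"
    r"\s*:\s*(\d+)\s*$"
)


def parse_histogram(text):
    """Return a list of (low_ns, high_ns, count); open-ended edges are None."""
    bins = []
    for line in text.splitlines():
        m = BIN_RE.match(line.strip())
        if not m:
            continue  # skip "Count was", "min:" and header lines
        lo, hi, sign, edge, count = m.groups()
        if sign == "<":
            bins.append((None, int(edge), int(count)))
        elif sign == ">":
            bins.append((int(edge), None, int(count)))
        else:
            bins.append((int(lo), int(hi), int(count)))
    return bins


def count_at_or_above(bins, threshold_ns):
    """Sum the counts of bins whose lower edge is >= threshold_ns."""
    return sum(c for lo, _hi, c in bins if lo is not None and lo >= threshold_ns)


# A few lines copied from the Debian 11 run above (not the full histogram).
sample = """\
[68670.595965] Count was 891104862, err_cnt: 0
[68670.595966] <2000 : 0
[68670.595966] [2000, 4000) : 891086829
[68670.595967] [4000, 5000) : 8738
[68670.595970] [14000, 15000) : 93
[68670.595973] >30000 : 0
"""

bins = parse_histogram(sample)
print(count_at_or_above(bins, 4000))
```

Feeding the full `dmesg` output of each run through the same two functions gives directly comparable tail counts for the Debian 10 and Debian 11 results.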
Dolphin Deb 10 Bug
The PX drivers have a bug where you cannot create more than one segment, so the latency test does not include cdsrfm.
Dolphin Benchmark Debian 10
x2lsc0 <-> x2seih16
ligo-dolphin-px-srcdis/unstable,now 5.19.2-2 all
[ 8319.599174] Count was 1123301708, err_cnt: 0
[ 8319.599175] min: 2199 ns, max: 5420 ns, avg: 2436 ns
[ 8319.599175] Histogram Of All Latencies (ns)
[ 8319.599176] <2000 : 0
[ 8319.599176] [2000, 4000) : 1123272300
[ 8319.599177] [4000, 5000) : 29403
[ 8319.599177] [5000, 6000) : 3
[ 8319.599177] [6000, 7000) : 0
[ 8319.599177] [7000, 8000) : 0
[ 8319.599178] [8000, 9000) : 0
[ 8319.599178] [9000, 10000) : 0
[ 8319.599178] [10000, 11000) : 0
[ 8319.599179] [11000, 12000) : 0
[ 8319.599179] [12000, 13000) : 0
[ 8319.599179] [13000, 14000) : 0
[ 8319.599180] [14000, 15000) : 0
[ 8319.599180] [15000, 16000) : 0
[ 8319.599180] [16000, 17000) : 0
[ 8319.599180] [17000, 18000) : 0
[ 8319.599181] [18000, 20000) : 0
[ 8319.599181] [20000, 22000) : 0
[ 8319.599181] [22000, 25000) : 0
[ 8319.599181] [25000, 30000) : 0
[ 8319.599182] >30000 : 0
The Debian 11 Dolphin benchmark records a max round-trip latency of ~15 us (14607 ns), suggesting a one-way max latency of ~7.5 us. The RCG code suggests an expected max latency of ~5 us with previous versions.
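The arithmetic behind that estimate can be written out as a short check. This is a sketch under two assumptions that are not confirmed here: the round trip is symmetric, so one-way latency is half the round trip, and the ~5 us RCG figure refers to one-way latency. The constant and function names are hypothetical.

```python
RCG_EXPECTED_MAX_NS = 5000  # the "~5 us" expected max (assumed to be one way)


def one_way_ns(round_trip_ns):
    """Estimate one-way latency, assuming a symmetric round trip."""
    return round_trip_ns / 2


# Max round-trip values from the two PX 5.19 benchmark runs above.
for label, max_rtt_ns in [("Debian 11", 14607), ("Debian 10", 5420)]:
    est = one_way_ns(max_rtt_ns)
    status = "over" if est > RCG_EXPECTED_MAX_NS else "within"
    print(f"{label}: ~{est / 1000:.1f} us one way, {status} the ~5 us expectation")
```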
PX 5.20
No cdsrfm round trip times
[ 6038.500263] min: 2204 ns, max: 11409 ns, avg: 2417 ns
[ 6038.500263] Histogram Of All Latencies (ns)
[ 6038.500263] <2000 : 0
[ 6038.500264] [2000, 4000) : 39372536
[ 6038.500264] [4000, 5000) : 315
[ 6038.500265] [5000, 6000) : 83
[ 6038.500265] [6000, 7000) : 60
[ 6038.500265] [7000, 8000) : 51
[ 6038.500266] [8000, 9000) : 55
[ 6038.500266] [9000, 10000) : 40
[ 6038.500266] [10000, 11000) : 8
[ 6038.500267] [11000, 12000) : 1
[ 6038.500267] [12000, 13000) : 0
[ 6038.500267] [13000, 14000) : 0
[ 6038.500268] [14000, 15000) : 0
[ 6038.500268] [15000, 16000) : 0
[ 6038.500268] [16000, 17000) : 0
[ 6038.500269] [17000, 18000) : 0
[ 6038.500269] [18000, 20000) : 0
[ 6038.500269] [20000, 22000) : 0
[ 6038.500270] [22000, 25000) : 0
[ 6038.500270] [25000, 30000) : 0
[ 6038.500270] >30000 : 0
With cdsrfm running
[ 6522.464323] Count was 39371649, err_cnt: 0
[ 6522.464324] min: 2204 ns, max: 12399 ns, avg: 2417 ns
[ 6522.464324] Histogram Of All Latencies (ns)
[ 6522.464325] <2000 : 0
[ 6522.464325] [2000, 4000) : 39371069
[ 6522.464326] [4000, 5000) : 290
[ 6522.464326] [5000, 6000) : 74
[ 6522.464326] [6000, 7000) : 56
[ 6522.464327] [7000, 8000) : 46
[ 6522.464327] [8000, 9000) : 60
[ 6522.464327] [9000, 10000) : 41
[ 6522.464328] [10000, 11000) : 9
[ 6522.464328] [11000, 12000) : 1
[ 6522.464328] [12000, 13000) : 1
[ 6522.464329] [13000, 14000) : 0
[ 6522.464329] [14000, 15000) : 0
[ 6522.464329] [15000, 16000) : 0
[ 6522.464330] [16000, 17000) : 0
[ 6522.464330] [17000, 18000) : 0
[ 6522.464330] [18000, 20000) : 0
[ 6522.464330] [20000, 22000) : 0
[ 6522.464331] [22000, 25000) : 0
[ 6522.464331] [25000, 30000) : 0
[ 6522.464331] >30000 : 0
IX
Debian 11 Small Test Stand
[80549.898008] Count was 86863792, err_cnt: 0
[80549.898009] min: 2239 ns, max: 14379 ns, avg: 3149 ns
[80549.898010] Histogram Of All Latencies (ns)
[80549.898010] <2000 : 0
[80549.898011] [2000, 4000) : 86861947
[80549.898011] [4000, 5000) : 985
[80549.898012] [5000, 6000) : 191
[80549.898012] [6000, 7000) : 93
[80549.898013] [7000, 8000) : 113
[80549.898013] [8000, 9000) : 119
[80549.898014] [9000, 10000) : 115
[80549.898014] [10000, 11000) : 105
[80549.898015] [11000, 12000) : 80
[80549.898015] [12000, 13000) : 35
[80549.898016] [13000, 14000) : 6
[80549.898016] [14000, 15000) : 1
[80549.898017] [15000, 16000) : 0
[80549.898017] [16000, 17000) : 0
[80549.898018] [17000, 18000) : 0
[80549.898018] [18000, 20000) : 0
[80549.898019] [20000, 22000) : 0
[80549.898019] [22000, 25000) : 0
[80549.898020] [25000, 30000) : 0
[80549.898020] >30000 : 0
Edited by Ezekiel Dohmen