Possible race condition/deadlock in the CA SDF code
From Bugzilla ticket #1127
Jonathan Hanks (LIGO - Hanford Observatory) 2018-10-23 14:18:13 PDT
See FRS
https://services.ligo-la.caltech.edu/FRS/show_bug.cgi?id=11694
Daniel noticed that the Beckhoff SDF monitoring system for h1sysecatc1plc[1,3] and h1sysecaty1plc2 was frozen. The medm pages resolved correctly, but button presses were not being acted on.
While investigating, I noticed that the system was stuck waiting for a mutex. The point where it left the application code (and entered EPICS base) was a call to ca_clear_subscription(...), made while it was syncing the internal state of the system with changes that had taken place in EPICS.
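For readers unfamiliar with this kind of hang, the following is a generic illustration of how a thread can end up stuck on a mutex when two threads take two locks in opposite order. It is a sketch, not a diagnosis: the ticket does not establish which locks were involved. The names `lib_lock`, `app_lock`, and `run_lock_order_demo` are hypothetical stand-ins (a lock inside a client library and an application state lock), and timeouts are used only so that the demo terminates instead of hanging.

```python
import threading


def run_lock_order_demo(timeout=0.2):
    """Show two threads blocking each other by taking locks in opposite order."""
    lib_lock = threading.Lock()   # hypothetical: a mutex inside a client library
    app_lock = threading.Lock()   # hypothetical: the application's state lock
    barrier = threading.Barrier(2)
    acquired = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both threads now hold their first lock
            # A real deadlock would block here forever; the timeout lets us observe it.
            got = second.acquire(timeout=timeout)
            acquired[name] = got
            barrier.wait()  # keep holding `first` until both attempts have finished
            if got:
                second.release()

    threads = [
        # Application thread: holds its own lock, then calls into the library.
        threading.Thread(target=worker, args=("sync", app_lock, lib_lock)),
        # Library thread: holds the library lock, then runs an application callback.
        threading.Thread(target=worker, args=("callback", lib_lock, app_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acquired


result = run_lock_order_demo()
print(result)
```

Neither thread obtains its second lock, which is the situation a stack trace would show as "stuck waiting for a mutex". Without the timeouts, both threads would wait indefinitely.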
The specific spot happened to be while handling an enum value; it looks like it was trying to change how it handled the enum. CA SDF tries to display enums as strings if it can and falls back to numeric values otherwise, but it takes some work to figure out which one to use.
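The string-or-numeric choice described above can be sketched roughly as follows. This is only an illustration under assumptions: the ticket does not say which criteria CA SDF actually applies, so the rule used here (use the label when the value indexes a non-empty state string, otherwise show the number) and the name `format_enum` are hypothetical.

```python
def format_enum(value, state_strings):
    """Return the state label for an enum value, or the number as text.

    Assumed rule (not taken from the CA SDF source): a label is usable only
    if the value is a valid index and the label at that index is non-empty.
    """
    if 0 <= value < len(state_strings) and state_strings[value]:
        return state_strings[value]
    # Fallback: no usable label, so display the raw numeric value.
    return str(value)


print(format_enum(1, ["Off", "On"]))   # label available
print(format_enum(5, ["Off", "On"]))   # index out of range, numeric fallback
print(format_enum(0, []))              # no labels known, numeric fallback
```

Switching a channel between the two display modes is the kind of change that would require dropping one subscription and creating another, which is consistent with the ca_clear_subscription(...) call seen in the stuck thread.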
Jonathan Hanks (LIGO - Hanford Observatory) 2018-10-23 14:21:11 PDT