GStreamer 1.0: Fix illegal memory access in postcoh
I've uncovered an issue in my latest version of the GStreamer 1.0 upgrade,
i.e. using gstlal here:
/fred/oz996/tdavies/spiir_project/sources/gstlal/gstlal
at commit 8adb683058382897c44380e41002dc0f6dd48300,
and spiir here:
/fred/oz996/tdavies/spiir_project/sources/spiir/gstlal-spiir
at commit 8adb683058382897c44380e41002dc0f6dd48300
On running for a short time, I see the following error:
*** Error in `python': corrupted double-linked list: 0x00005621445563a0 ***
CUDA_CHECK: Error 'an illegal memory access was encountered' at line '1725' in file 'postcoh/postcoh.c'
CUDA_CHECK: Error 'an illegal memory access was encountered' at line '1725' in file 'postcoh/postcoh.c'
CUDA_CHECK: Error 'an illegal memory access was encountered' at line '1398' in file 'postcoh/postcoh_kernel.cu'
CUDA_CHECK: Error 'driver shutting down' at line '839' in file 'multiratespiir/multiratespiir_kernel.cu'
CUDA_CHECK: Error 'driver shutting down' at line '839' in file 'multiratespiir/multiratespiir_kernel.cu'
The versions I'm using are still being finalized for an MR, but a similar issue occurs if I run the 'gw170817' script for an extended duration.
The precise error message varies, which isn't surprising given all the async tasks in flight, but so far it has always included an error at line 1725 in postcoh.c:
CUDA_CHECK(cudaMemcpyAsync(pklist->d_snglsnr_buffer, snglsnr, one_take_size, cudaMemcpyHostToDevice, postcoh->stream));
See the cudaMemcpyAsync docs for reference. This call copies from 'snglsnr' to pklist->d_snglsnr_buffer. snglsnr itself is now created using gst_adapter_map instead of gst_adapter_peek_cuda. We did this for simplicity, assuming it would at worst slow down the memory copy, but it may be the source of this error. That would imply that the ACCELERATE_POSTCOH_MEMORY_COPY code path was the only working one, which would be unfortunate but not unlikely.
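
One way to narrow this down is sketched below. It is only a sketch under stated assumptions, not the postcoh code. CUDA reports faults from asynchronous work at a later API call, so the error at line 1725 may come from an earlier kernel on the same stream rather than from the copy itself. The sketch drains the stream before the copy to separate the two cases, then waits for the copy to finish before releasing the host buffer. The function name copy_with_checks, the helper check_cuda, and the malloc'd buffer standing in for the memory returned by gst_adapter_map are all hypothetical.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <cuda_runtime.h>

/* Hypothetical helper: print a labelled CUDA error, return 1 on failure. */
static int check_cuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        return 1;
    }
    return 0;
}

/* Hypothetical stand-in for the copy at postcoh.c:1725.
 * Returns 0 if every step succeeds, 1 otherwise. */
int copy_with_checks(size_t one_take_size)
{
    cudaStream_t stream = NULL;
    void *d_snglsnr_buffer = NULL;
    void *snglsnr = NULL;
    int failed = 1;

    if (check_cuda(cudaStreamCreate(&stream), "cudaStreamCreate"))
        return 1;
    if (check_cuda(cudaMalloc(&d_snglsnr_buffer, one_take_size), "cudaMalloc"))
        goto cleanup;

    /* Assumption: pageable host memory, standing in for the pointer
     * that gst_adapter_map would return in the real element. */
    snglsnr = malloc(one_take_size);
    if (snglsnr == NULL)
        goto cleanup;
    memset(snglsnr, 0, one_take_size);

    /* 1. Drain the stream. If earlier async work faulted, the error
     *    surfaces here instead of being blamed on the copy. */
    if (check_cuda(cudaStreamSynchronize(stream), "earlier work on stream"))
        goto cleanup;

    /* 2. The copy itself, same argument order as in postcoh.c. */
    if (check_cuda(cudaMemcpyAsync(d_snglsnr_buffer, snglsnr, one_take_size,
                                   cudaMemcpyHostToDevice, stream),
                   "cudaMemcpyAsync"))
        goto cleanup;

    /* 3. Conservatively wait for the copy before the host buffer is
     *    released (the point where the adapter would be unmapped). */
    if (check_cuda(cudaStreamSynchronize(stream), "after copy"))
        goto cleanup;

    failed = 0;

cleanup:
    free(snglsnr);
    if (d_snglsnr_buffer != NULL)
        cudaFree(d_snglsnr_buffer);
    cudaStreamDestroy(stream);
    return failed;
}
```

If the synchronize in step 1 already reports the illegal access, the mapped host pointer is probably not the culprit and the fault lies in a kernel launched earlier. If only the copy or step 3 fails, that points back at the gst_adapter_map buffer. The extra synchronization costs throughput, so it is meant for diagnosis rather than as a fix.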