Investigate modifying ligolw_publish_threaded_dqxml_dqsegdb to accept https connections to ligolw_dtd.txt

Currently, if ligolw_publish_threaded_dqxml_dqsegdb tries to read http://ldas-sw.ligo.caltech.edu/doc/ligolwAPI/html/ligolw_dtd.txt and is redirected to https, the read fails with the following error:

Error: connection to host "ldas-sw.ligo.caltech.edu" failed for http URL "http://ldas-sw.ligo.caltech.edu/doc/ligolwAPI/html/ligolw_dtd.txt": Connection timed out
Traceback (most recent call last):
  File "/usr/bin/ligolw_publish_threaded_dqxml_dqsegdb", line 240, in <module>
    result=InsertMultipleDQXMLFileThreaded(infiles,logger,options.segment_url,hackDec11=False,debug=local_debug,threads=thread_count)
  File "/usr/lib/python3.6/site-packages/dqsegdb/apicalls.py", line 801, in InsertMultipleDQXMLFileThreaded
    segment_md = setupSegment_md(filename,xmlparser,lwtparser,debug)
  File "/usr/lib/python3.6/site-packages/dqsegdb/apicalls.py", line 738, in setupSegment_md
    segment_md.parse(xmltext)
  File "/usr/lib/python3.6/site-packages/dqsegdb/ldbd.py", line 286, in parse
    ligolwtup = self.xmlparser(xml.encode("utf-8"))
pyRXPU.error: Error: Couldn't open dtd entity http://ldas-sw.ligo.caltech.edu/doc/ligolwAPI/html/ligolw_dtd.txt
 in unnamed entity at line 1 char 115 of [unknown]
Couldn't open dtd entity http://ldas-sw.ligo.caltech.edu/doc/ligolwAPI/html/ligolw_dtd.txt
Parse Failed!

Many files point specifically to the http URL, so that version needs to remain available. Even so, it would improve resilience if this tool could also fetch the file over https.
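
As a starting point for the investigation, here is a minimal sketch of the fallback idea: try the http URL named in the file first, and if that fails, retry the same path over https. The helper names `https_variant` and `fetch_dtd` are hypothetical and do not exist in dqsegdb, and the sketch uses only the Python standard library. How the fetched DTD text would be handed to the pyRXP parser in dqsegdb/ldbd.py is not shown and would need to be worked out separately.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def https_variant(url):
    """Return url with an http scheme switched to https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    return parts._replace(scheme="https").geturl()


def fetch_dtd(url, timeout=10, opener=urllib.request.urlopen):
    """Fetch the DTD, trying the original URL first and then its https variant.

    `opener` is injectable so the fallback logic can be tested without a network.
    """
    candidates = [url]
    alternative = https_variant(url)
    if alternative != url:
        candidates.append(alternative)

    last_error = None
    for candidate in candidates:
        try:
            with opener(candidate, timeout=timeout) as response:
                return response.read()
        except (urllib.error.URLError, OSError) as exc:
            # Remember the failure and move on to the next candidate, if any.
            last_error = exc
    raise last_error
```

Because the http URL is still tried first, files that point to the http version keep working unchanged; the https request is only made when the first attempt fails.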