ligo-scald issues
https://git.ligo.org/gstlal-visualisation/ligo-scald/-/issues
2024-03-20T20:14:14Z

https://git.ligo.org/gstlal-visualisation/ligo-scald/-/issues/61
More informative messages for scald Influx queries (Feature request/Bug Fix)
2024-03-20T20:14:14Z
Matthew Carney

Currently, when querying data from InfluxDB using scald tools, the `_query_influx_data()` method of the `Consumer` class does not return useful information upon an unsuccessful query. For example, querying a time for which no data exists returns the same error as querying a non-existent measurement for a time when data **does** exist:
```
File "/cvmfs/software.igwn.org/conda/envs/igwn/lib/python3.10/site-packages/ligo/scald/io/influx.py", line 1339, in _query_influx_data
return data['results'][0]['series'][0]['columns'], data['results'][0]['series'][0]['values']
KeyError: 'series'
```
For reproducibility, the above error message was produced by running `/home/matthew.carney/projects/influx_test.py` on the LIGO Hanford cluster. As stated above, the same message was produced in both of these cases.
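
As a rough illustration of what a more informative failure could look like, the sketch below wraps the lookup that fails in the traceback above. It is not ligo-scald code: `InfluxQueryError`, `extract_series()`, and the `known_measurements` argument are hypothetical names. It assumes the InfluxDB 1.x JSON response shape, in which an empty result omits the `series` key and failures are reported under an `error` key. Since an empty response alone may not reveal why it is empty, the sketch also assumes the caller can supply the set of existing measurements (for example from a separate `SHOW MEASUREMENTS` query).

```python
class InfluxQueryError(Exception):
    """Hypothetical exception type; not part of ligo-scald."""


def extract_series(data, measurement, known_measurements=None):
    """Return (columns, values) from a parsed InfluxDB 1.x JSON response.

    `data` is the decoded JSON body of a query response (assumed shape).
    `known_measurements` is an optional collection of measurement names
    that exist in the database; it is an assumption of this sketch.
    """
    # A top-level "error" key signals that the request itself failed.
    if "error" in data:
        raise InfluxQueryError(f"query failed: {data['error']}")

    results = data.get("results") or [{}]
    result = results[0]

    # A per-statement "error" key signals that this statement failed.
    if "error" in result:
        raise InfluxQueryError(f"query failed: {result['error']}")

    series = result.get("series")
    if not series:
        # No rows came back: say why, if we have enough information.
        if known_measurements is not None and measurement not in known_measurements:
            raise InfluxQueryError(f"measurement {measurement!r} does not exist")
        raise InfluxQueryError(
            f"no data returned for measurement {measurement!r} in the requested time range"
        )

    return series[0]["columns"], series[0]["values"]
```

With a check like this, the two cases described above would raise different messages instead of the same `KeyError: 'series'`.
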
It would be very helpful for debugging purposes if there were a way to differentiate between error causes when a query is unsuccessful. Both my calibration Influx querying code [ligo-calibplot](https://git.ligo.org/matthew.carney/ligo-calibplot) and the portion of [pydarm](https://git.ligo.org/Calibration/pydarm) that adapts ligo-calibplot rely on this scald submodule, so there are downstream implications to this request as well.

https://git.ligo.org/gstlal-visualisation/ligo-scald/-/issues/60
Off by one bug in format_point?
2023-04-30T18:28:40Z
Chad Hanna

I have code that calls ```Consumer.query()``` for "rows" datatypes. Eventually this calls ```_retrieve_rows_by_tag```, and I am seeing the following error:
```
...
File "./process_log", line 28, in get_keys_from_influx
out = consumer.query(MEASUREMENT, 'rows', start, end)
File "/ligo/shared-scratch/observing/4/dev/runs/log_test/influx.py", line 661, in query
return self.retrieve_rows_by_tag(s['measurement'], start, end, s['tag_key'], aggregate=s['aggregate'], **kwargs)
File "/ligo/shared-scratch/observing/4/dev/runs/log_test/influx.py", line 709, in retrieve_rows_by_tag
return _retrieve_rows_by_tag(self.client, self.database, measurement, self.schema[measurement], start, end, tag, aggregate=aggregate, dt=dt, datetime=datetime)
File "/ligo/shared-scratch/observing/4/dev/runs/log_test/influx.py", line 1205, in _retrieve_rows_by_tag
tag_val = row['tags'][tag]
KeyError: 'job_type'
```
where "job_type" is my only tag in the schema. Looking at the raw output of ```points``` defined [here](https://git.ligo.org/gstlal-visualisation/ligo-scald/-/blob/main/ligo/scald/io/influx.py#L1193), I see (for one item):
```python
[1992658615000000000, -7997921891068774238, '02/17/23', 'gstlal_inspiral.002E0', '55761486', '11', ' warnings.warn("disabling service discovery, this web server won\'t be able to advertise the location of the services it provides.")\\n\\n%4|1676693829.661|CONFWARN|rdkafka#producer-1| [thrd:app]: Configuration property group.id is a consumer property and will be ignored by this producer instance\\n\\n%4|1676693830.192|CONFWARN|rdkafka#producer-3| [thrd:app]: Configuration property group.id is a consumer property and will be ignored by this producer instance\\n', 'gstlal_inspiral']
```
Visual inspection shows that the tag is the last column. However, [this code](https://git.ligo.org/gstlal-visualisation/ligo-scald/-/blob/main/ligo/scald/io/influx.py#L1360), which formats the point, ignores this last column and returns an empty tag dictionary. In this case, deleting the `1` would fix it, but I have no idea why that should be the case in the context of the broader functionality supported here.
Patrick Godwin

https://git.ligo.org/gstlal-visualisation/ligo-scald/-/issues/56
Replace use of `distutils` once support for python < 3.8 is dropped
2022-02-18T14:11:34Z
Patrick Godwin

`distutils` will be removed in a future version of Python, but we rely on it for the `copy_tree()` feature. In Python 3.8+, however, a solution with `shutil` can be used to emulate the needed behavior. See https://stackoverflow.com/a/64340026.

https://git.ligo.org/gstlal-visualisation/ligo-scald/-/issues/50
First-class support for raw data
2019-10-08T15:55:41Z
Patrick Godwin

Currently the main focus is on storing aggregates with a maximum sampling rate of 1 Hz. However, it would be nice to also store raw data in a cleaner way.
There is a way to store raw data with `aggregate=None`, but it seems like a bit of a hack and is also undocumented.

https://git.ligo.org/gstlal-visualisation/ligo-scald/-/issues/37
Add a retrieve_event() method to consumer in influx.py
2019-08-27T01:46:10Z
Patrick Godwin

This should be very similar to `retrieve_triggers()` except that it doesn't aggregate at all, just like calling `retrieve_timeseries()` with `aggregate=None`.

https://git.ligo.org/gstlal-visualisation/ligo-scald/-/issues/22
Add 'store_events' method into aggregators
2019-08-27T01:46:22Z
Patrick Godwin

This would allow ingestion of things like inspiral triggers with FARs that would not be aggregated (although possibly at the 1 Hz level). At the very least, things like 'reduce_by_tag' and 'reduce_across_tags' wouldn't apply in the same way they do for timeseries.
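
For the `distutils` replacement raised in issue 56 above, a minimal sketch of the `shutil`-based approach could look like the following. It relies on the `dirs_exist_ok` parameter that `shutil.copytree()` gained in Python 3.8. The `copy_tree()` wrapper and `demo()` helper are hypothetical names for illustration, not ligo-scald code, and the wrapper only emulates merging into an existing destination directory, not every option of `distutils.dir_util.copy_tree()`.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tree(src, dst):
    """Copy the directory tree at src into dst, merging if dst exists."""
    # dirs_exist_ok=True (Python 3.8+) lets dst exist already instead of
    # raising FileExistsError; files with the same name are overwritten.
    shutil.copytree(src, dst, dirs_exist_ok=True)


def demo():
    """Copy a small tree into a non-empty destination and list the result."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        dst = Path(tmp) / "dst"
        (src / "sub").mkdir(parents=True)
        (src / "sub" / "new.txt").write_text("new")
        dst.mkdir()
        (dst / "old.txt").write_text("old")

        copy_tree(src, dst)

        # Paths relative to dst, so the listing is independent of tmp.
        return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*"))


if __name__ == "__main__":
    print(demo())
```

The existing file in the destination is kept and the new subtree is added alongside it, which is the behavior the issue says is needed.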