Saving data

Ensure data outputs are compatible between different Python versions (this may come for free with the above changes, but let's use this checkbox to remind ourselves to check that this is satisfied).

mentioned in issue #72 (closed)

mentioned in issue #73 (closed)

marked the checklist item Ensure that the h5 file does not just pickle the data (see #73 (closed), and maybe #72 (closed)). Currently we just dump the Result() object into the pickle, which deepdish does not know how to save so it just pickles it. We need to rewrite it as a dictionary. as completed

I've forced the output to be a dictionary in 6bf77295.
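
For anyone reading along, here is a minimal sketch of the idea of flattening a result object into a dictionary of built-in types before saving. The `Result` class, its attribute names, and `result_to_dict` are hypothetical stand-ins, not the actual tupak code, and JSON is used only to keep the sketch dependency-free (the real output is written with deepdish).

```python
import json


class Result(object):
    """Hypothetical stand-in for the Result object discussed above."""

    def __init__(self, label, log_evidence, search_parameter_keys, samples):
        self.label = label
        self.log_evidence = log_evidence
        self.search_parameter_keys = search_parameter_keys
        self.samples = samples


def result_to_dict(result):
    """Return a plain dictionary that holds only built-in types."""
    return {
        "label": str(result.label),
        "log_evidence": float(result.log_evidence),
        "search_parameter_keys": [str(k) for k in result.search_parameter_keys],
        "samples": [[float(x) for x in row] for row in result.samples],
    }


result = Result("demo", -12.5, ["mass_1", "mass_2"], [[30.1, 25.2], [31.0, 24.8]])
as_dict = result_to_dict(result)

# A dictionary of built-in types can be written by a non-pickle serialiser
# and read back unchanged; JSON stands in for the real file format here.
restored = json.loads(json.dumps(as_dict))
```

The point of the design is that nothing in the saved structure depends on the class definition, so a reader does not need the same code (or the same pickle protocol) to load it.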

changed milestone to %0.2

changed milestone to %0.1.1

A sub-issue raised by @paul-lasky: when reading in results in python3 that were made in python2, the search_parameter_keys are byte arrays. These need to be converted to strings so that things like plot_corner() work.
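
A rough sketch of the kind of conversion meant here, written for Python 3. The helper names `to_text` and `decode_keys` are made up for illustration, and UTF-8 is an assumed encoding; this is not the code that went into tupak.

```python
def to_text(value, encoding="utf-8"):
    """Decode bytes to str; return anything else unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def decode_keys(keys):
    """Apply to_text to every key, so mixed bytes/str lists are handled."""
    return [to_text(k) for k in keys]


# Keys as they might come back from a file written under python2.
loaded_keys = [b"mass_1", b"mass_2", "psi"]
search_parameter_keys = decode_keys(loaded_keys)
```

Running the conversion once at load time means the rest of the code only ever sees text keys.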

mentioned in commit b15a91ca

mentioned in merge request !50 (merged)

mentioned in commit 3a5771ef

marked the checklist item Add labels to the saved prior.txt file and implement loading such a file (not sure if this last part is already done?) as completed

marked the checklist item Add some option to save as a text file for when people inevitably can't handle h5 files as completed

marked the checklist item Reduce the saved data filesize by thinking about what we want to save - for example we currently save the samples twice, once in an array and once in a data frame as completed

marked the checklist item maybe separate the output: save the samples in a separate data frame and the Results() (which contain the logz and details of the run). The upside is this might reduce the filesizes and allow quick concatenation of samples. The downside is that samples can get separated from information about how they were produced. as completed

marked the checklist item Add a help tutorial for the saved data, noting things like how we save the data and tools such as ddls (see discussion in here) which can be used to quickly check what data is saved. as completed

I've now checked this writing data and reading it back with a different version of python and it seems to work fine. I'll close this issue.

closed

reopened

I hate to re-open old wounds, but running tupak with python2, and plotting results with python3 is not working. Specifically, chains.plot_corner() returns

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-9e79cd3ad4da> in <module>()
----> 1 chains.plot_corner()

/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tupak/core/result.py in plot_corner(self, parameters, save, dpi, **kwargs)
    232             defaults_kwargs['color'] = '#FF8C00'
    233
--> 234         xs = self.posterior[parameters].values
    235         kwargs['labels'] = kwargs.get(
    236             'labels', self.get_latex_labels_from_parameter_keys(

/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2131         if isinstance(key, (Series, np.ndarray, Index, list)):
   2132             # either boolean or fancy integer index
-> 2133             return self._getitem_array(key)
   2134         elif isinstance(key, DataFrame):
   2135             return self._getitem_frame(key)

/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_array(self, key)
   2175             return self._take(indexer, axis=0, convert=False)
   2176         else:
-> 2177             indexer = self.loc._convert_to_indexer(key, axis=1)
   2178             return self._take(indexer, axis=1, convert=True)
   2179

/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1267                 if mask.any():
   1268                     raise KeyError('{mask} not in index'
-> 1269                                    .format(mask=objarr[mask]))
   1270
   1271                 return _values_from_object(indexer)

KeyError: "[b'phi_jl' b'psi' b'a_2' b'a_1' b'geocent_time' b'phi_12'\n b'luminosity_distance' b'ra' b'phase' b'mass_2' b'mass_1'\n b'dec' b'tilt_2' b'iota' b'tilt_1'] not in index"

I think this is the issue mentioned above in this thread, but in reverse; i.e., the search_parameter_keys are byte arrays.
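
A small reproduction of why the lookup misses, under the assumption that the posterior columns are labelled with text while the keys come back as bytes. A plain dictionary stands in for the pandas DataFrame, and the values are invented.

```python
# Stand-in for the posterior table: column labels are text (str).
posterior = {"psi": [0.1, 0.2], "ra": [1.0, 1.1]}

# Keys as they come back from the file in the failing case.
loaded_keys = [b"psi", b"ra"]

# On Python 3, bytes and str never compare equal, so every lookup misses.
missing = [k for k in loaded_keys if k not in posterior]

# Decoding before the lookup finds every column.
decoded = [k.decode("utf-8") for k in loaded_keys]
columns = [posterior[k] for k in decoded]
```

On Python 2 the same lookup would succeed, because there `b"psi"` and `"psi"` are the same type, which is why the problem only shows up when crossing versions.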

changed milestone to %0.3

added Bug and removed ~678 labels

It looks like we can try from __future__ import unicode_literals (http://python-future.org/compatible_idioms.html#strings-and-bytes).

Coming back to this after a while. I'm actually in favour of removing all the hacks that have been put in to get interoperability. This is for two reasons:

Numpy won't support python 2 from sometime in 2019
The hacks are ugly and ultimately make the code less clear.

I therefore propose to solve this bug by removing the things we have done. People can still read in the results using their favourite hdf5 converter and make corner plots, but it will require some manual lifting to do so.

mentioned in commit a7caec77

mentioned in merge request !131 (closed)

After more discussion with other people, I'm going to close this issue as it seems it just isn't possible to solve within tupak (at least with our current expertise). Fundamentally, python 2 and python 3 handle strings differently, and handling every case is going to end up making the code more complicated for little gain. There seems to be no problem running in either python 2 or python 3, just not both. So, if this error comes up in future, the advice is to use a single version of python.

closed

unassigned @gregory.ashton
