A wish-list for things to change in how we save data. This should probably be at most v 0.2.
- Ensure that the `h5` file does not just pickle the data (see #73 (closed), and maybe #72 (closed)). Currently we just dump the `Result()` object into the file, which `deepdish` does not know how to save, so it just pickles it. We need to rewrite it as a dictionary.
- Reduce the saved data filesize by thinking about what we want to save. For example, we currently save the samples twice: once in an array and once in a data frame.
- Maybe separate the output: save the samples in one data frame and the `Results()` (which contains the logz and details of the run) separately. The upside is that this might reduce the filesizes and allow quick concatenation of samples. The downside is that samples can get separated from the information about how they were produced.
- Add an option to save as a text file for when people inevitably can't handle `h5` files.
- Add a help tutorial for the saved data, noting things like how we save the data and tools such as `ddls` (see discussion in here), which can be used to quickly check what data is saved.
- Add labels to the saved `prior.txt` file and implement loading such a file (not sure if this last part is already done?).
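
To make the first point concrete, here is a rough sketch of what "rewrite it as a dictionary" could look like. Everything in it is an assumption for illustration: the `Result` class is a hypothetical stand-in with made-up fields (`label`, `log_evidence`, `parameter_labels`, `samples`), not our actual object, and `json` is only used to show that the dictionary round-trips without pickling.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Result:
    """Hypothetical stand-in for the real Result() object."""
    label: str
    log_evidence: float
    parameter_labels: list = field(default_factory=list)
    samples: list = field(default_factory=list)  # one row per sample

    def to_dict(self):
        """Return only plain Python types, so nothing needs pickling."""
        return {
            "label": self.label,
            "log_evidence": float(self.log_evidence),
            "parameter_labels": list(self.parameter_labels),
            # Store the samples once, as a list of rows.
            "samples": [[float(x) for x in row] for row in self.samples],
        }

    @classmethod
    def from_dict(cls, data):
        """Rebuild the object from the dictionary written by to_dict()."""
        return cls(**data)


def is_plain(obj):
    """Check that obj is built only from dicts, lists, and scalars."""
    if isinstance(obj, dict):
        return all(isinstance(k, str) and is_plain(v) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return all(is_plain(v) for v in obj)
    return isinstance(obj, (str, int, float, bool, type(None)))


result = Result("run0", -12.5, ["mass", "spin"], [[1.0, 0.2], [1.1, 0.3]])
data = result.to_dict()

# Round trip through a non-pickle format to show nothing is lost.
restored = Result.from_dict(json.loads(json.dumps(data)))
```

With `deepdish`, the same dictionary would be handed to `deepdish.io.save(path, data)` in place of the object itself, so that it is stored as native HDF5 groups and datasets and shows up in `ddls`. Whether the samples should live in this dictionary or in a separate file is the open question from the list above.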
Feel free to add other things as well. I'm not actively working on any of this as I don't think it is urgent (if you use the same version of python it's fine), but it's good to gather everything together in one place.