
Adding counter to likelihood

Merged Rhys Green requested to merge rhys.green/bilby:adding_counter_to_likelihood into master
All threads resolved!

This adds a counter for each likelihood evaluation, which can then be used as a fairly consistent benchmarking metric.

Edited by Rhys Green

Merge request reports

Pipeline #65262 passed

Pipeline passed for 39949d13 on rhys.green:adding_counter_to_likelihood

Approved by

Merged by Moritz Huebner on Jun 13, 2019, 6:35am UTC

Merge details

  • Changes merged into master with 4a340c1e (commits were squashed).
  • Did not delete the source branch.

Pipeline #67109 passed

Pipeline passed for 4a340c1e on master

Test coverage 69.00% from 0 jobs

Activity

  • Rhys Green resolved all discussions

  • Rhys Green added 1 commit

    • beef3589 - removing redundant line calling super

  • Hi @rhys.green, can you merge master into this branch? We (mostly Greg) fixed a bunch of stuff in the CI today, so it may just work.

  • Rhys Green added 3 commits

  • Gregory Ashton
  • Gregory Ashton
  • Gregory Ashton
  • Hi Rhys, nice idea. I can see the motivation for this, but I just want to check a few things.

    1. Is it not possible to get this in post-processing already? I can't directly see how, but it would be cleaner than doing it while sampling.

    2. There is a built-in Counter object in Python; it would be really good to use it, as this is the sort of thing that someone sees and then wants to add "prior_evaluation_counter", "other_thing_counter", etc. With a Counter there is only one object, with explicit names for what it is counting.

    3. There are a few things I'm a little concerned about in the implementation (see the review comments). If we have to hack things into weird places then okay, but it would be best if we don't.

    4. There are a few places where you have added an extra line that isn't relevant to this MR. Can you remove these? It is petty, I know, but it makes things much simpler if the MR doesn't include various minor additions/deletions.
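    As a sketch of point 2, a single `collections.Counter` held on a wrapper gives one object with a named key per counted event. The class and key names below are illustrative, not the actual bilby API:

    ```python
    from collections import Counter

    class CountingLikelihood:
        """Hypothetical wrapper: one Counter with explicit, named keys,
        so further counters ('prior', etc.) can be added without new attributes."""

        def __init__(self, log_likelihood_fn):
            self._log_likelihood_fn = log_likelihood_fn
            self.evaluations = Counter()

        def log_likelihood(self, theta):
            self.evaluations['likelihood'] += 1  # named key per counted event
            return self._log_likelihood_fn(theta)

    likelihood = CountingLikelihood(lambda theta: -0.5 * theta ** 2)
    for theta in (0.0, 1.0, 2.0):
        likelihood.log_likelihood(theta)
    print(likelihood.evaluations)  # Counter({'likelihood': 3})
    ```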

  • Author Reporter

    Hi Greg, thanks for looking at this !

    1. That was my first thought too, but I couldn't see any obvious way to do it. I tried something like profiling, but that seemed messier than this approach. If you have any suggestions I'd be happy to change the implementation.

    2. I was not aware of the Counter object, so yes, it makes sense to use that.

    3. I agree (see replies in the review above).

    4. Sure, I don't think those were actually supposed to be in this MR.

  • Author Reporter

    To follow up on point 2: I'm not sure the Counter object is the best thing to use; it looks like it's not really designed to count function calls. I think the manual counter is probably cleaner. If we want to make it more general, maybe specify self.likelihood_counts rather than self.count? You could then keep the add_count function but have other self.something_else_counts like you mentioned.

  • Rhys Green resolved all discussions

  • Rhys Green added 1 commit

    • faffbc60 - replacing self.count with result.num counter, this makes the code cleaner as...

  • Rhys Green added 1 commit

    • edec6e8a - adding changes to all samplers, re-adding line after init

  • Rhys Green added 35 commits

  • Rhys Green added 1 commit

  • Author Reporter

    @john-veitch @colm.talbot @gregory.ashton @christopher-berry this works fine for each of the samplers apart from cpnest. The counter works for cpnest, i.e. you can force it to print whilst sampling and the counter goes up, but I think at some point after sampling the result object is called again and num_likelihood_evals is set to None. Does anyone have any idea why this is happening, or a fix?

    Edited by Rhys Green
  • Rhys Green added 1 commit

  • Hi @rhys.green. What do you mean the result object is "called again"? CPNest can't access anything in its Bilby control class so it can't be over-writing it.

    I'm guessing this is because CPNest runs its samplers in separate processes. That means the wrapper each sampler sees is a copy of the parent process's object as it was before you called cpnest.run(). When the samplers finish, their copies of the object are destroyed, and the likelihood counts along with them. I don't see any way of passing this back to Bilby without inter-process communication, so how do the other multi-threaded samplers do it?

    Probably the simplest thing to do is to add this functionality to CPNest. I'm already thinking of overhauling the IPC to fix the checkpointing bug, and that'll make it much simpler to track this sort of information.

    Edited by John Douglas Veitch
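  The per-process-copy behaviour described above can be sketched in a few lines. The `Wrapper` class and names below are hypothetical, not bilby's actual classes; the point is only that a plain attribute incremented in a child process never reaches the parent:

    ```python
    import multiprocessing

    class Wrapper:
        """Hypothetical stand-in for the sampler's likelihood wrapper."""
        def __init__(self):
            self.count = 0

        def evaluate(self):
            # Increments whichever *copy* of the object this process holds.
            self.count += 1

    def worker(wrapper):
        for _ in range(100):
            wrapper.evaluate()
        # wrapper.count == 100 here, but this copy dies with the child process.

    if __name__ == '__main__':
        w = Wrapper()
        p = multiprocessing.Process(target=worker, args=(w,))
        p.start()
        p.join()
        print(w.count)  # 0 in the parent: the child incremented its own copy
    ```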
  • Alternatively, and as a quick sanity check, you could declare your counter as a multiprocessing.Value object so that all processes can see it. I wouldn't recommend using that for benchmarking though as it'll slow things down to some extent.

    Edited by John Douglas Veitch
  • Rhys Green added 3 commits

    • 8f368893 - reverting to using add_count function
    • 323071a8 - reverting to using add_count function
    • be2df0af - reverting to using add_count function

  • Rhys Green added 1 commit
  • Author Reporter

    Hi @john-veitch, what I meant is that if I print self.likelihood_count inside the cpnest model log_likelihood (line 70 of cpnest.py), the numbers go up, so it must be getting that count information from the base sampler? After it has finished sampling the count goes back to zero. I didn't realise the per-process copies of the result object were destroyed, so this potentially explains it.

    Ideally I'd like to be able to return the likelihood count information to the "parent" result objects in a similar way that the samples are but I'm not sure how to do this through bilby.

    I'll try it with multi-processing value object as I suppose using that would still give a fair score in terms of samples per likelihood evaluation.

  • Author Reporter

    Also, in the logging information when a cpnest run finishes, a quantity called input samples is returned. What is this?

  • Rhys Green added 2 commits

    • 3f99d311 - changing implementation to allow for multiprocessing, this is much nicer than before and works
    • 7ea67c04 - changing implementation to allow for multiprocessing, this is much nicer than before and works

  • Author Reporter

    OK, so I think this now works; it's using the multiprocessing Value object. I haven't noticed any significant slowdown from it, but I haven't checked quantitatively. I think this implementation is okay and potentially ready to merge, if people want to include this. @john-veitch @christopher-berry @gregory.ashton @colm.talbot, what do you think? Obviously I'm happy to change things if you want.

  • Sounds good to me. As long as the feature can be turned on/off, we shouldn't worry too much about a slowdown in wall-time.

    Is there any unit testing that we need?
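    A minimal unit test for the counter could look like the sketch below. The `DummyLikelihood` class and the attribute name are illustrative assumptions, not necessarily the names used in this MR:

    ```python
    import unittest

    class DummyLikelihood:
        """Hypothetical minimal likelihood carrying an evaluation counter."""
        def __init__(self):
            self.num_likelihood_evaluations = 0

        def log_likelihood(self):
            self.num_likelihood_evaluations += 1
            return 0.0

    class TestLikelihoodCounter(unittest.TestCase):
        def test_starts_at_zero(self):
            self.assertEqual(DummyLikelihood().num_likelihood_evaluations, 0)

        def test_increments_once_per_evaluation(self):
            likelihood = DummyLikelihood()
            for _ in range(10):
                likelihood.log_likelihood()
            self.assertEqual(likelihood.num_likelihood_evaluations, 10)
    ```

    Run with `python -m unittest` alongside the existing test suite.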

  • Rhys Green added 1 commit

    • 6d5dcb78 - typo - only assigned num_likelihood_evals in test section

  • @rhys.green,

    if I print the self.likelihood_count inside of the cpnest model log_likelihood (line 70 of cpnest.py) the numbers go up, so it must be getting that count information from the base sampler? After it has finished sampling the count goes back to zero, I didn't realise the thread copies of the result object were destroyed so this potentially explains it.

    Just to be clear, you are conflating multiple objects that are held by different processes. When you add a print statement, it is executed in one of the parallel processes, so it is printing the counter that belongs to that process. The version held by the main process does not get printed or incremented unless the likelihood is called from that process, which it isn't. So it's not "going back" to zero; it always was zero.

    Using a multiprocessing.Value will let you work around this, because then all processes see the same counter. It should work for any code based on the multiprocessing module, but that is not the only possibility. However, since you just want to make progress and it doesn't add much overhead, it's probably OK.
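    A minimal sketch of the shared-counter approach, with hypothetical function names (only `multiprocessing.Value` and its `get_lock()` are real library API):

    ```python
    import multiprocessing

    def log_likelihood(theta, counter):
        """Hypothetical likelihood; counter is a multiprocessing.Value
        living in shared memory, visible to every process."""
        with counter.get_lock():  # the lock prevents lost updates between processes
            counter.value += 1
        return -0.5 * theta ** 2

    def worker(counter, n):
        for i in range(n):
            log_likelihood(float(i), counter)

    if __name__ == '__main__':
        count = multiprocessing.Value('i', 0)  # 'i' = shared C int, initialised to 0
        # Pass the Value explicitly so it is shared under both fork and spawn.
        procs = [multiprocessing.Process(target=worker, args=(count, 50))
                 for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(count.value)  # 200: every process incremented the same shared counter
    ```

    Taking the lock on every evaluation is where the (small) overhead mentioned above comes from.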

  • John Douglas Veitch
  • Rhys Green added 1 commit

    • d9d3add0 - adding boolean switch to turn on/off likelihood benchmarking

  • Gregory Ashton changed milestone to %0.5.0

  • Is there any update on this?

  • @rhys.green The MR looks mostly good to me, you need to resolve the PEP8 issues though

  • Gregory Ashton changed milestone to %0.5.1

  • Author Reporter

    I'll get this tidied up over the next couple of days; sorry it's been left on the back burner for a little while now!

  • Rhys Green added 235 commits

  • Rhys Green added 1 commit

    • 9704e7fb - fixing problem where not counting caused error

  • Rhys Green added 1 commit
  • Rhys Green added 1 commit

  • Rhys Green unmarked as a Work In Progress

  • Author Reporter

    @moritz.huebner @colm.talbot @gregory.ashton I think this is probably ready to be merged

  • Colm Talbot
  • Colm Talbot
  • Rhys Green added 1 commit

    • bd367f3b - moving likelihood count function to base sampler to avoid duplication

  • Rhys Green added 1 commit
  • Rhys Green added 1 commit
  • Rhys Green added 1 commit
  • Rhys Green added 1 commit

    • fb070eef - typo - didnt have self.result.num....

  • Rhys Green added 1 commit
  • Rhys Green added 1 commit

    • 7a0d5e68 - removing redundant else statements

  • Author Reporter

    @colm.talbot cheers for the helpful comments! I think I've managed to address them and this implementation is a bit tidier. What do you think?

  • Looks good to me! It looks like there are some conflicts. If you fix them we promise we'll merge this one next.

  • Gregory Ashton changed milestone to %0.5.2

  • Rhys Green added 35 commits

  • Colm Talbot approved this merge request

  • Moritz Huebner approved this merge request

  • Moritz Huebner resolved all discussions

  • Moritz Huebner mentioned in commit 4a340c1e
