Add a catch for linalg exceptions

Merged Gregory Ashton requested to merge add-catch-for-linalg-error into master

Pipeline #516560 passed

Pipeline passed for 062c5c32 on add-catch-for-linalg-error

Test coverage 69.00% (0.00%) from 1 job

Merged by Colm Talbot (Apr 28, 2023 2:54pm UTC)

Pipeline #520339 passed with warnings

Pipeline passed with warnings for 11dde570 on master

Test coverage 69.00% (0.00%) from 1 job

Activity

  • Gregory Ashton added 1 commit

  • Colm Talbot approved this merge request

  • Just for the record, I assume the linalg error happens when the fisher matrix is ill-conditioned?

  • Colm Talbot changed milestone to %2.1.1

  • added <10 lines Bug labels

  • added Sampling label

  • Matthew Pitkin approved this merge request

  • Colm Talbot mentioned in commit 11dde570

  • merged

  • Colm Talbot picked the changes into the branch release/2.1.x with commit 95a450ab

  • Colm Talbot mentioned in commit 95a450ab

  • Colm Talbot mentioned in merge request !1255 (merged)

  • Is there some more information about the issue this resolves somewhere? I see that the changes are trivial, but how do I find out what the original issue was, or see that this resolves it?

  • @simon-stevenson happy to give context (sorry, I should have added this in the first place).

    In !1242 (merged) we added the Fisher Information Matrix (FIM) proposals to bilby_mcmc, which showed a mild improvement in autocorrelation time (ACT) for BBH runs. Unfortunately, I did not test it on BNS runs. As a result, when @natalie.williams ran bilby_mcmc on GW190425, she encountered this error:

    Traceback (most recent call last):
      File "/local/condor/execute/dir_575469/condor_exec.exe", line 8, in <module>
        sys.exit(main())
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/site-packages/bilby_pipe/data_analysis.py", line 379, in main
        analysis.run_sampler()
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/site-packages/bilby_pipe/data_analysis.py", line 264, in run_sampler
        self.result = bilby.run_sampler(
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/site-packages/bilby/core/sampler/__init__.py", line 234, in run_sampler
        result = sampler.run_sampler()
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/site-packages/bilby/core/sampler/base_sampler.py", line 96, in wrapped
        output = method(self, *args, **kwargs)
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/site-packages/bilby/bilby_mcmc/sampler.py", line 241, in run_sampler
        self.draw()
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/site-packages/bilby/bilby_mcmc/sampler.py", line 327, in draw
        self.ptsampler.step_all_chains()
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/site-packages/bilby/bilby_mcmc/sampler.py", line 767, in step_all_chains
        self.sampler_list = self.pool.map(call_step, self.sampler_list)
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/multiprocessing/pool.py", line 364, in map
        return self._map_async(func, iterable, mapstar, chunksize).get()
      File "/home/gregory.ashton/.conda/envs/o4-review/lib/python3.9/multiprocessing/pool.py", line 771, in get
        raise self._value
    numpy.linalg.LinAlgError: singular matrix

    This is raised by scipy.linalg.inv in this line, and it occurs when the matrix being inverted is singular.
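
    As a minimal illustration of the failure mode (not the bilby code itself), the sketch below inverts an exactly singular 2x2 matrix. It uses numpy.linalg.inv as a stand-in for scipy.linalg.inv, on the assumption that both raise the same numpy.linalg.LinAlgError class, which is what the traceback above reports. The matrix values are made up for the example.

```python
import numpy as np

# A rank-deficient matrix: the second row is exactly twice the first,
# so the determinant is zero and no inverse exists.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])

try:
    np.linalg.inv(singular)
    raised = False
except np.linalg.LinAlgError as err:
    # Same exception class as in the traceback above.
    raised = True
    print(f"LinAlgError: {err}")
```

    Note that a matrix that is only nearly singular (ill-conditioned) may still invert without raising, so the exception is not a complete check on the quality of the inverse.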

    This seems to happen for the BNS case early in the run, while the MCMC chain is far away from the maximum likelihood. I'm unclear on exactly why it happens, and why it doesn't happen in the BBH case, but it can presumably occur.

    The change does two things. First, it catches the exception and falls back to the previous inverse FIM (iFIM) or, failing that, simply returns the current sample. Both of these options are robust and will not bias an MCMC chain (they just potentially make it less efficient). Second, it changes fd_eps, which controls the size of the perturbation used for numerical finite differencing. In testing, I found that increasing it slightly reduced the number of cases where the error was hit.
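
    The fallback logic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this MR: the helper names safe_inverse and propose, the Gaussian step drawn with the iFIM as covariance, and the toy matrices are all hypothetical, and numpy.linalg.inv stands in for scipy.linalg.inv. The fd_eps change is not shown.

```python
import numpy as np


def safe_inverse(fim, previous_ifim=None):
    """Invert `fim`, falling back to `previous_ifim` if it is singular.

    Hypothetical helper for illustration; not bilby's actual API.
    Returns None when the inversion fails and no earlier inverse exists.
    """
    try:
        return np.linalg.inv(fim)
    except np.linalg.LinAlgError:
        return previous_ifim


def propose(sample, fim, previous_ifim, rng):
    """Draw a proposal using the inverse FIM as the step covariance.

    Returns (proposed_sample, ifim_to_cache).
    """
    ifim = safe_inverse(fim, previous_ifim)
    if ifim is None:
        # No usable covariance: return the current sample unchanged.
        return sample.copy(), None
    step = rng.multivariate_normal(np.zeros(sample.size), ifim)
    return sample + step, ifim


rng = np.random.default_rng(0)
sample = np.array([1.0, 2.0])
good_fim = np.array([[2.0, 0.0], [0.0, 4.0]])      # invertible
singular_fim = np.array([[1.0, 2.0], [2.0, 4.0]])  # rank deficient

# Normal case: the inverse is computed and cached.
_, cached = propose(sample, good_fim, None, rng)
# Singular FIM with a cached inverse: the previous iFIM is reused.
_, reused = propose(sample, singular_fim, cached, rng)
# Singular FIM with nothing cached: the current sample is returned.
unchanged, none_ifim = propose(sample, singular_fim, None, rng)
```

    In this sketch the caller keeps the last good iFIM and passes it back in, so a singular matrix costs one less-informed proposal rather than aborting the run.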

    This has been tested: Natalie ran her GW190425 tests using a special environment (see here) that included bilby 2.1.0 plus only this MR. You can verify this on the PESummary page here, where the bilby version for the MCMC run states the commit 062c5c32 (the tip of this MR).

    Let me know if you have any more questions.

  • Thanks @gregory.ashton, this is what I was missing! This all looks good.
