## Questions asked

1. **For "number_of_karoo_runs = 2", shouldn't it be 200?**

Ans: Yes. It was limited to 2 only to test changes. The final probabilities are generated with all 200 runs.

2. **What is at the output of gp.fx_karoo_gp? Was it reviewed?**

Ans: Karoo GP is an external package based on genetic programming. Its performance on well-known datasets is shown in the user guide: [guide](https://github.com/kstaats/karoo_gp/blob/master/Karoo_GP_User_Guide.pdf). My understanding was that reviewing external packages is outside the scope of LIGO.

3. **"far_threshold = 3.858e-7", why this threshold?**

Ans: The FAR threshold is on combined_far and is currently set to 1 per month.

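As a quick sanity check on that number, the sketch below converts 1 per month into a rate per second. It assumes the threshold is expressed in Hz and that a month means 30 days; neither assumption is stated in the answer, but together they reproduce the quoted value.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # assumption: a "month" is taken as 30 days

# One false alarm per month, expressed as a rate per second (Hz).
far_threshold_hz = 1.0 / (DAYS_PER_MONTH * SECONDS_PER_DAY)

print(f"{far_threshold_hz:.3e}")  # 3.858e-07
```
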
4. **If you do the split 0.7, do you have enough data to train? I think I've already asked about it.**

Ans: The total dataset has ~200,000 samples. Using 70% of it (~140,000 samples) is enough to train the algorithm.

5. **How often are the end expressions not unique? And how much overlap is between non-unique expressions?**

Ans: For has_Remnant, there were 2-3 repeating expressions per 100 training runs. For has_NS, there were ~20 repeating expressions per 100 training runs. We checked the performance of the repeating expressions; they usually have a high TPR and a low FPR.

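One simple way to count repeats like these is sketched below. It is illustrative only: `count_repeats` and the sample expression strings are hypothetical, and the comparison is purely textual, so algebraically equivalent expressions that are written differently would not be counted as repeats.

```python
from collections import Counter


def count_repeats(expressions):
    """Return {expression: count} for expressions that occur more than once."""
    # Strip surrounding whitespace so trivially different strings still match.
    counts = Counter(expr.strip() for expr in expressions)
    return {expr: n for expr, n in counts.items() if n > 1}


# Hypothetical final expressions from four training runs.
runs = ["a + b", "a*b - c", "a + b ", "c/a"]
repeats = count_repeats(runs)
print(repeats)  # {'a + b': 2}
```
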
6. **What happens when the training fails? Did it happen?**

Ans: Training GP for all EOS is a lengthy process. A single training usually takes ~30 min to 1 hr, depending on GPU compatibility. Currently each training instance is carried out independently in Condor. One of the training instances can occasionally fail, leaving only 199 expressions instead of 200. We do not currently have a feature that identifies this automatically and restarts the training process, so it has to be done manually for now. I can think of ways to automate this in future versions.

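A possible first step toward automating this is a check that lists which runs produced no output, so only those need to be resubmitted. The sketch below assumes a hypothetical layout in which run `i` writes its result to `run_<i>/expression.txt`; the directory names, file name, and the `find_failed_runs` helper are illustrative and not part of the current code.

```python
from pathlib import Path


def find_failed_runs(output_dir, n_runs=200, filename="expression.txt"):
    """Return the indices of runs whose result file is missing or empty.

    Assumes a hypothetical layout: <output_dir>/run_<i>/<filename>.
    """
    failed = []
    for i in range(n_runs):
        result = Path(output_dir) / f"run_{i}" / filename
        # A run counts as failed if it left no file or an empty one.
        if not result.is_file() or result.stat().st_size == 0:
            failed.append(i)
    return failed
```

The returned indices could then be passed to whatever script resubmits individual Condor jobs.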
7. **What is the distribution of the expression scores?**

Ans: The distribution of expression scores is shown below. The top figure is for the has_Remnant expressions and the bottom figure is for has_NS.



