Results, Experiments and Project Notes

The results of the QT analysis have not yet been published. Here, I summarize the preliminary findings and outline some future directions.

Model verification

We used simulation-based calibration to verify that our likelihood implementation is consistent with the generative model:

  • we simulate data with the simulators (short “genes” of 10–100 nucleotides, small trees with fewer than 50 species)
  • we then infer the parameters from the simulated data using TreePPL
  • we plot the known true values against their inferred credible intervals and expect the credible intervals to be well calibrated
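
The three steps above can be illustrated with a toy coverage check. This is only a minimal sketch under stated assumptions: it replaces the phylogenetic model and TreePPL inference with a conjugate normal model whose posterior is known in closed form, and all function names and parameter values (`simulate_dataset`, `posterior`, `coverage`, the prior and noise settings) are hypothetical, not taken from the project code. If the inference is correct, roughly 90% of the true values should fall inside their 90% credible intervals.

```python
import math
import random
from statistics import NormalDist, fmean

# Toy stand-in for the real model (assumption): mu ~ Normal(PRIOR_MEAN, PRIOR_SD),
# observations ~ Normal(mu, NOISE_SD). This is NOT the phylogenetic model.
PRIOR_MEAN, PRIOR_SD, NOISE_SD = 0.0, 1.0, 2.0


def simulate_dataset(rng, mu, n_obs):
    """Step 1: simulate data from the generative model for a known mu."""
    return [rng.gauss(mu, NOISE_SD) for _ in range(n_obs)]


def posterior(data, prior_mean, prior_sd, noise_sd):
    """Step 2: exact conjugate posterior (mean, sd) for mu.

    Stands in for the inference engine; a real check would call the
    actual sampler here instead.
    """
    prior_prec = 1.0 / prior_sd**2
    data_prec = len(data) / noise_sd**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * fmean(data))
    return post_mean, math.sqrt(post_var)


def coverage(n_rep=2000, level=0.9, n_obs=20, seed=1):
    """Step 3: fraction of true values inside their central credible interval."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # half-width in posterior sds
    hits = 0
    for _ in range(n_rep):
        true_mu = rng.gauss(PRIOR_MEAN, PRIOR_SD)  # draw the truth from the prior
        data = simulate_dataset(rng, true_mu, n_obs)
        mean, sd = posterior(data, PRIOR_MEAN, PRIOR_SD, NOISE_SD)
        if mean - z * sd <= true_mu <= mean + z * sd:
            hits += 1
    return hits / n_rep


if __name__ == "__main__":
    print(f"empirical coverage of 90% intervals: {coverage():.3f}")
```

A coverage far from the nominal level would point to a mismatch between the simulator and the likelihood, which is the kind of error this verification is meant to catch.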

We found what we expected: the credible intervals were well calibrated.

Biological results

As the bird dataset was not available, we attempted to apply the model to an echolocation dataset from Pelican. However, all three genes linked to echolocation were longer than 2,000 nucleotides. SMC inference resulted in out-of-memory errors, whereas MCMC inference did not converge in a reasonable time. We are currently improving the MCMC backend of TreePPL, and the new backend may be able to handle the model.

Archive

Our Python notebooks containing numerical explorations, as well as lab notes, are available to download here. However, they are unorganized and contain experimental, exploratory code, so the archive is password-protected [1].


  1. To decrypt the archive (PWD stands for the password): openssl enc -d -aes-256-cbc -pbkdf2 -in experiments.tar.bz2.enc -out experiments-decrypted.tar.bz2 -k PWD -iter 10000