This would be a task added to stage1, and could also run as a standalone task in pipeline.py, to survey the subsampled reads for their organism composition in an unsupervised way.
It would consume a prebuilt index made with the kraken2 index-building repo:
https://github.com/BenjaminJPerry/build-TaFFE-DBs
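
As a rough sketch of what such a task might call, the snippet below builds and runs a kraken2 command for one subsampled FASTQ file. The function names (`build_kraken2_command`, `run_kraken2`), the output file naming, and the directory layout are assumptions made for illustration and are not taken from pipeline.py. Only the standard kraken2 options `--db`, `--threads`, `--report`, `--output` and `--gzip-compressed` are used.

```python
import subprocess
from pathlib import Path


def build_kraken2_command(reads, db_dir, out_dir, threads=4):
    """Return the kraken2 argument list for one subsampled FASTQ file.

    All names and paths here are hypothetical; adapt them to the
    conventions already used in pipeline.py.
    """
    reads = Path(reads)
    out_dir = Path(out_dir)
    # Assumption: the sample name is everything before the first dot,
    # e.g. "S1.fastq.gz" -> "S1".
    sample = reads.name.split(".")[0]
    cmd = [
        "kraken2",
        "--db", str(db_dir),            # prebuilt kraken2 index directory
        "--threads", str(threads),
        "--report", str(out_dir / f"{sample}.kraken2.report"),
        "--output", str(out_dir / f"{sample}.kraken2.out"),
    ]
    if reads.suffix == ".gz":
        cmd.append("--gzip-compressed")
    cmd.append(str(reads))
    return cmd


def run_kraken2(reads, db_dir, out_dir, threads=4):
    """Create the output directory and run kraken2, raising on failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_kraken2_command(reads, db_dir, out_dir, threads)
    subprocess.run(cmd, check=True)
```

Keeping the command construction separate from the `subprocess.run` call means the arguments can be checked without kraken2 or an index being installed, and the same list could be passed to whatever task runner stage1 already uses.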