Here are some of the common problems which may occur while running Leif.

Don't use network drives

Leif creates many large files which should be stored locally. Leif will run extremely slowly and may even crash when running on a network drive. Run it from a directory stored on a local drive to avoid these issues.

facheck/qblast report gi2taxid errors

The NCBI gi_taxid_nucl.dmp file often does not list all gis present in NCBI BLAST FASTA database files. It is not clear why this happens: there may be a bug in the NCBI tool which writes gi_taxid_nucl.dmp. Any gi which does not have a taxid assigned to it is given the default taxid of 1 ("root"). As a consequence, if reads match NCBI BLAST FASTA sequences bearing such gis, their consensus taxid will become non-specific ("root").

Why do I have exotic eukaryote reads (such as Pantholops hodgsonii)?

Some NCBI BLAST FASTA sequences bear the wrong taxonomic identifier. This occurs because the reads used to create de novo assembled genomes often include a small number of bacterial contaminants. Some of these contaminants' genomes had not yet been uploaded to GenBank when the genome of the exotic eukaryote was de novo assembled, so the bacterial sequences looked like small unplaced contigs and were retained in the final genome uploaded to GenBank. Bacterial reads which match these contigs are therefore reported under the eukaryote's taxid.

"Time to go" estimate for qblast is not very accurate

This is normal. Some regions of the NCBI BLAST database may take longer to align than others, and this property is run/read specific. It is common for the initial time estimate to be much longer than the actual run time, and it is also common for the end of the alignment to take much longer than estimated (e.g. the "time to go" field decreases very slowly at the end of the run).

qblast crashes while loading (usually reporting an "out of memory" error)

qblast is the most RAM intensive component of the Leif Microbiome Analyzer. It cannot use more than ~1.5 GB of RAM since it is a Win32 application, which means no more than ~200000 read pairs can be processed in a single run. fxsample's parameters can be modified to limit the number of read pairs reaching qblast.
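
To illustrate what capping the number of read pairs means in practice, here is a minimal Python sketch of uniform random subsampling (reservoir sampling). It is not fxsample's implementation and does not use fxsample's parameters, which are not described here; the function name `sample_pairs`, the `seed` argument and the `MAX_READ_PAIRS` constant are all hypothetical, with the constant simply set to the approximate ~200000 ceiling quoted above.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")

# Assumption: the approximate per-run ceiling quoted above (~200000 read pairs).
MAX_READ_PAIRS = 200_000


def sample_pairs(pairs: Iterable[T], limit: int = MAX_READ_PAIRS, seed: int = 0) -> List[T]:
    """Keep at most `limit` read pairs, chosen uniformly at random.

    Reservoir sampling (Algorithm R): the input is streamed once, so the
    full set of read pairs never has to be held in memory.
    """
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    reservoir: List[T] = []
    for i, pair in enumerate(pairs):
        if i < limit:
            # Fill the reservoir with the first `limit` pairs.
            reservoir.append(pair)
        else:
            # Replace an existing entry with probability limit / (i + 1).
            j = rng.randint(0, i)
            if j < limit:
                reservoir[j] = pair
    return reservoir
```

Each element of `pairs` would be one read pair (for example a tuple of forward and reverse records). If the input holds fewer pairs than `limit`, all of them are kept.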