Running and Verification

Procedure

  1. Use PuTTY to log in to the server as the root user.
  2. Download the case files.
    wget http://labshare.cshl.edu/shares/schatzlab/www-data/ectools/w303/Illumina_500bp_2x300_R1.fastq.gz
    wget http://labshare.cshl.edu/shares/schatzlab/www-data/ectools/w303/Pacbio.fasta.gz
  3. Decompress the case files.
    gzip Pacbio.fasta.gz -d
    gzip Illumina_500bp_2x300_R1.fastq.gz -d
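Because the archives are fetched from a remote server, it can help to test them before decompressing. The following is a minimal sketch, assuming GNU gzip; safe_gunzip is a hypothetical helper name and not part of the DBG2OLC toolchain.

```shell
# Hypothetical helper: test the archive first, decompress only if it is intact.
safe_gunzip() {
    # gzip -t checks the integrity of the compressed data without writing output.
    if gzip -t "$1" 2>/dev/null; then
        gzip -d "$1"
    else
        echo "corrupt or incomplete archive: $1" >&2
        return 1
    fi
}

# Usage (files from step 2):
#   safe_gunzip Pacbio.fasta.gz
#   safe_gunzip Illumina_500bp_2x300_R1.fastq.gz
```

If the check fails, the archive is left in place so that it can be downloaded again.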
  4. Process the data.
    SelectLongestReads sum 600000000 longest 0 o Illumina_50x.fastq f Illumina_500bp_2x300_R1.fastq

    SelectLongestReads sum 260000000 longest 0 o Pacbio_20x.fasta f Pacbio.fasta
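The sum values can be related to the genome size passed to SparseAssembler in step 8 (GS 12000000), assuming that sum is the total number of bases to keep: depth of coverage is total bases divided by genome size. The sketch below works through that arithmetic and also measures the bases actually present in a file. The helper names are hypothetical, and the FASTQ helper assumes four lines per record.

```shell
# Hypothetical helpers for checking sequencing depth after subsampling.

# Total bases in a FASTQ file, assuming 4 lines per record (sequence on line 2).
fastq_bases() {
    awk 'NR % 4 == 2 { n += length($0) } END { printf "%.0f\n", n }' "$1"
}

# Total bases in a FASTA file: every non-header line counts.
fasta_bases() {
    awk '!/^>/ { n += length($0) } END { printf "%.0f\n", n }' "$1"
}

# Depth of coverage = total bases / genome size, to one decimal place.
coverage() {
    awk -v bases="$1" -v genome="$2" 'BEGIN { printf "%.1fx\n", bases / genome }'
}

coverage 600000000 12000000   # Illumina base budget from step 4
coverage 260000000 12000000   # PacBio base budget from step 4
```

The two calls print 50.0x and 21.7x, which is consistent with the 50x and 20x labels in the output file names. To check a generated file, run for example: coverage "$(fastq_bases Illumina_50x.fastq)" 12000000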

  5. Create an Illumina_data directory and copy the generated FASTQ file to the Illumina_data directory.
    mkdir Illumina_data && cp Illumina_50x.fastq Illumina_data/
  6. Create a Pacbio_data directory and copy the generated FASTA file to the Pacbio_data directory.
    mkdir Pacbio_data && cp Pacbio_20x.fasta Pacbio_data/
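When the procedure is re-run, mkdir fails because the directory already exists, and the && then skips the copy. The following sketch is one way to make steps 5 and 6 repeatable; stage_file is a hypothetical helper name.

```shell
# Hypothetical helper: copy a file into a directory, creating the directory if needed.
stage_file() {
    src=$1
    dir=$2
    # Refuse to stage a file that is missing or empty.
    [ -s "$src" ] || { echo "missing or empty: $src" >&2; return 1; }
    # mkdir -p succeeds even if the directory already exists.
    mkdir -p "$dir" && cp "$src" "$dir/"
}

# Usage (files from step 4):
#   stage_file Illumina_50x.fastq Illumina_data
#   stage_file Pacbio_20x.fasta Pacbio_data
```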
  7. Create a step1 directory and switch to the directory.
    mkdir step1 && cd step1
  8. Assemble contigs from the Illumina fragment library data.
    SparseAssembler LD 0 k 51 g 15 NodeCovTh 1 EdgeCovTh 0 GS 12000000 f ../Illumina_data/Illumina_50x.fastq

    SparseAssembler LD 1 NodeCovTh 2 EdgeCovTh 1 k 51 g 15 GS 12000000 f ../Illumina_data/Illumina_50x.fastq

    The assembly generates several files, including Contigs.txt, which step 9 takes as input.
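A quick way to judge the assembly before moving on is to summarize the contigs. The sketch below assumes that Contigs.txt is FASTA-formatted (step 10 concatenates it with the FASTA reads); fasta_stats is a hypothetical helper, not a SparseAssembler utility.

```shell
# Hypothetical helper: print contig count, total length, and N50 of a FASTA file.
# Stage 1 emits one sequence length per line (multi-line records are summed).
# Stage 2 sorts the lengths in descending order.
# Stage 3 reports N50: the length at which the running sum reaches half the total.
fasta_stats() {
    awk '/^>/ { if (len) print len; len = 0; next }
         { len += length($0) }
         END { if (len) print len }' "$1" |
    sort -rn |
    awk '{ l[NR] = $1; total += $1 }
         END {
             for (i = 1; i <= NR; i++) {
                 run += l[i]
                 if (run >= total / 2) { n50 = l[i]; break }
             }
             printf "contigs=%d total=%.0f n50=%d\n", NR, total, n50
         }'
}

# Usage in the step1 directory:
#   fasta_stats Contigs.txt
```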

  9. Find the overlaps between the contigs and the PacBio reads, and perform the layout.
    DBG2OLC k 17 AdaptiveTh 0.0001 KmerCovTh 2 MinOverlap 20 RemoveChimera 1 Contigs Contigs.txt f ../Pacbio_data/Pacbio_20x.fasta

    Information similar to the following is displayed:

    The generated files include backbone_raw.fasta and DBG2OLC_Consensus_info.txt, which step 10 takes as input.
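Checking that the DBG2OLC outputs needed by the consensus step exist and are non-empty can save a failed run later. This is a minimal sketch; require_nonempty is a hypothetical helper name.

```shell
# Hypothetical helper: fail if any of the named files is missing or empty.
require_nonempty() {
    status=0
    for f in "$@"; do
        # -s is true only for an existing file with size greater than zero.
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage in the step1 directory, before step 10:
#   require_nonempty backbone_raw.fasta DBG2OLC_Consensus_info.txt || exit 1
```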

  10. Use the Python and shell scripts in the /opt/biosoft/DBG2OLC/utility/ directory to invoke blasr and the Sparc consensus module to compute the consensus.
    1. Modify the split_and_run_sparc.sh script.
      vi split_and_run_sparc.sh
    2. Press i to enter the insert mode, comment out line 27, and add line 28.

      To display line numbers, press Esc after opening the file in step 10.1, type :set nu, and press Enter.

    3. Press Esc, type :wq!, and press Enter to save the file and exit.
    4. Run the following commands in the step1 directory:
      cp ../Pacbio_20x.fasta .
      cat Contigs.txt Pacbio_20x.fasta > ctg_pb.fasta
      mkdir consensus_dir
      split_and_run_sparc.sh backbone_raw.fasta DBG2OLC_Consensus_info.txt ctg_pb.fasta ./consensus_dir 2 >cns_log.txt

      The following files are generated: