Running and Verifying Velvet

Procedure

  1. Use PuTTY to log in to the server as the root user.
  2. Run the following command to go to the Velvet directory:
    cd /path/to/VELVET/velvet-1.2.10
  3. Run the velveth executable to prepare the data:
    ./velveth output/ 31 -shortPaired -fastq.gz tests/read1.fq.gz -shortPaired -fastq.gz tests/read2.fq.gz
    • output/: specifies the output directory.
    • 31: specifies the hash length (k-mer length).
    • -shortPaired: specifies the read type (short paired-end reads).
    • -fastq.gz: specifies the format of the input file. The supported options are -fasta (FASTA), -fastq (FASTQ), -fasta.gz (gzip-compressed FASTA), -fastq.gz (gzip-compressed FASTQ), -sam (SAM), and -bam (BAM).

    The following is an example of the command output:

    [0.000000] Reading FastQ file tests/read1.fq.gz;
    [0.083994] 25000 sequences found
    [0.084004] Done
    [0.084266] Reading FastQ file tests/read2.fq.gz;
    [0.162083] 25000 sequences found
    [0.162091] Done
    [0.364858] Reading read set file output//Sequences;
    [0.376048] 50000 sequences found
    [0.427913] Done
    [0.427929] 50000 sequences in total.
    [0.428268] Writing into roadmap file output//Roadmaps...
    [0.468365] Inputting sequences...
    [0.468486] Inputting sequence 0 / 50000
    [1.182309]  === Sequences loaded in 0.713954 s
    [1.303762] Done inputting sequences
    [1.303770] Destroying splay table
    [1.305914] Splay table destroyed
  4. Run the velvetg executable to assemble the genome:
    ./velvetg output/ -min_contig_lgth 100
    • output/: specifies the output directory created in step 3.
    • -min_contig_lgth: specifies the minimum contig length. Contigs shorter than this value are discarded and do not appear in the final result.

    After the command finishes, the contigs.fa file in the output directory contains the final assembly.

    The following is an example of the command output:

    [1.258599] Concatenation over!
    [1.259034] Writing contigs into output//contigs.fa...
    [1.292427] Writing into stats file output//stats.txt...
    [1.327635] Writing into graph file output//LastGraph...
    Final graph has 987 nodes and n50 of 199, max 2546, total 110172, using 0/50000 reads
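
To confirm the run beyond reading the log, a small check along the following lines may help. It is a minimal sketch, not part of Velvet: the helper names fasta_summary and check_velvet_output are hypothetical, and the script only assumes that the three files named in the log above (contigs.fa, stats.txt, LastGraph) are in the output directory and that contigs.fa is an ordinary FASTA file.

```shell
#!/bin/sh
# Hypothetical helpers for checking a velvetg run; not shipped with Velvet.

# Print the number of FASTA records and their shortest, longest,
# and total sequence lengths.
fasta_summary() {
    awk '
        function record() {
            n++; total += len
            if (n == 1 || len < min) min = len
            if (len > max) max = len
        }
        /^>/ { if (seen) record(); seen = 1; len = 0; next }
             { len += length($0) }
        END  {
            if (seen) record()
            printf "contigs=%d min=%d max=%d total=%d\n", n, min, max, total
        }
    ' "$1"
}

# Fail if any expected output file is missing or empty,
# otherwise summarize contigs.fa.
check_velvet_output() {
    dir=$1
    for f in contigs.fa stats.txt LastGraph; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            return 1
        fi
    done
    fasta_summary "$dir/contigs.fa"
}
```

After sourcing the script, run check_velvet_output output from the Velvet directory. If the min value it reports is lower than the value passed to -min_contig_lgth, check that contigs.fa was produced by the velvetg command shown in step 4.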