Running GATK4 in Non-Spark Mode

Procedure

  1. Use PuTTY to log in to the server as the root user.
  2. Run the following command to create a BWA index for the reference FASTA file:
    bwa index -a bwtsw human_g1k_v37.fasta
  3. Run the following commands to create the sequence dictionary for the reference and align the paired-end FASTQ reads to the reference:
    gatk CreateSequenceDictionary -R human_g1k_v37.fasta -O human_g1k_v37.dict
    bwa mem -M -t 64 human_g1k_v37.fasta SRR742200_1.fastq SRR742200_2.fastq > SRR7.sam
  4. Run the following command to reorder the aligned reads to match the contig order of the reference:
    gatk ReorderSam -I SRR7.sam -O SRR7_reorder.bam -R human_g1k_v37.fasta
  5. Run the following command to sort the reads by coordinate:
    gatk SortSam -I SRR7_reorder.bam -O SRR7_sorted.bam --SORT_ORDER coordinate
  6. Run the following command to add read group information to the BAM header:
    gatk AddOrReplaceReadGroups -I SRR7_sorted.bam -O SRR7_addhead.bam -LB lib1 -PL illumina -PU unit1 -SM 20
  7. Run the following command to mark duplicate reads:
    gatk MarkDuplicates -I SRR7_addhead.bam -M test.metric -O SRR7_markdup.bam
  8. Perform base quality score recalibration (BQSR):
    1. Index the FASTA reference file; samtools writes the index to human_g1k_v37.fasta.fai:
      samtools faidx human_g1k_v37.fasta
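    2. Add an index to the VCF file of known sites:
      gatk IndexFeatureFile -F dbsnp132_20101103.vcf
    3. Generate the recalibration table (the output is a table, despite the .bam extension used here):
      gatk BaseRecalibrator -I SRR7_markdup.bam --known-sites dbsnp132_20101103.vcf -O SRR7_bqsr.bam -R human_g1k_v37.fasta
    4. Apply the recalibration table to the BAM file:
      gatk ApplyBQSR -bqsr SRR7_bqsr.bam -I SRR7_markdup.bam -O SRR7_aybqsr.bam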
  9. Run the following command to call variants with HaplotypeCaller:
    gatk HaplotypeCaller -I SRR7_aybqsr.bam -O SRR7_raw.vcf -R human_g1k_v37.fasta
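
Steps 4 to 9 above can be chained into a single script so that a failed command stops the run. The following is a minimal sketch, not part of the original procedure. It assumes that SRR7.sam from step 3 already exists and that gatk and samtools are on the PATH. The run helper, the pipeline function, and the REF, SAMPLE, KNOWN_SITES, and DRY_RUN variables are hypothetical names introduced for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: chains steps 4-9 of the procedure above.
# Assumes SRR7.sam (step 3), the reference FASTA and its .dict file exist.
set -eu

# Hypothetical variables; the defaults mirror the file names used above.
REF="${REF:-human_g1k_v37.fasta}"
SAMPLE="${SAMPLE:-SRR7}"
KNOWN_SITES="${KNOWN_SITES:-dbsnp132_20101103.vcf}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print a command, then execute it unless this is a dry run.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

pipeline() {
    # Step 4: reorder reads to match the reference contig order.
    run gatk ReorderSam -I "${SAMPLE}.sam" -O "${SAMPLE}_reorder.bam" -R "$REF"
    # Step 5: sort by coordinate.
    run gatk SortSam -I "${SAMPLE}_reorder.bam" -O "${SAMPLE}_sorted.bam" --SORT_ORDER coordinate
    # Step 6: add read group information.
    run gatk AddOrReplaceReadGroups -I "${SAMPLE}_sorted.bam" -O "${SAMPLE}_addhead.bam" \
        -LB lib1 -PL illumina -PU unit1 -SM 20
    # Step 7: mark duplicate reads.
    run gatk MarkDuplicates -I "${SAMPLE}_addhead.bam" -M test.metric -O "${SAMPLE}_markdup.bam"
    # Step 8: index the inputs, build the recalibration table, apply it.
    run samtools faidx "$REF"
    # Flag as given in the procedure; check "gatk IndexFeatureFile --help" for your version.
    run gatk IndexFeatureFile -F "$KNOWN_SITES"
    run gatk BaseRecalibrator -I "${SAMPLE}_markdup.bam" --known-sites "$KNOWN_SITES" \
        -O "${SAMPLE}_bqsr.bam" -R "$REF"
    run gatk ApplyBQSR -bqsr "${SAMPLE}_bqsr.bam" -I "${SAMPLE}_markdup.bam" -O "${SAMPLE}_aybqsr.bam"
    # Step 9: call variants.
    run gatk HaplotypeCaller -I "${SAMPLE}_aybqsr.bam" -O "${SAMPLE}_raw.vcf" -R "$REF"
}

pipeline
```

Running the script as is prints the nine commands for review. Setting DRY_RUN=0 in the environment executes them, and set -e stops the script at the first command that fails.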