Sep 14th, 2022
  1. {
  2. "class": "Workflow",
  3. "cwlVersion": "v1.0",
  4. "doc": "This workflow represents the GATK Best Practices for SNP and INDEL calling on DNA data.\n\nStarting from a processed **BAM** file, the workflow performs variant calling with respect to the reference genome. Depending on **HaplotypeCaller's** output file type (**VCF** or **g.VCF**), the resulting file of this workflow can be used as a standalone result for single-sample analysis, or as one of the cohort files for downstream joint calling analysis. On the GATK website you can find more detailed information about calling germline variants for single-sample or joint calling analysis [1].\n\n### Common Use Cases\n\n* The **haplotypecaller-gvcf-gatk4** (original WDL name) workflow [1] runs the **HaplotypeCaller** tool from GATK4 in GVCF mode on a single sample according to GATK Best Practices. \n* To run HaplotypeCaller in a mode appropriate for joint calling analysis, set the `--emit_ref_confidence` parameter to GVCF. \n* When executed, the workflow scatters the **HaplotypeCaller** tool over the **Calling intervals** file (`--in_intervals`). \n* The resulting g.VCF files are merged with **GATK MergeVcfs**. \n* The output is a single g.VCF file which can be further processed in a joint-discovery workflow. \n* By default, the output file is compressed with gzip, leading to a G.VCF.GZ extension.\n\n* This workflow can also be used for single-sample analysis. For that purpose, it produces a VCF file, which is obtained by setting the `--emit_ref_confidence` parameter to NONE. \n\n\n### Changes Introduced by Seven Bridges\n\n* The original **Generic germline variant per-sample calling** WDL implementation has a step called **CramToBamTask** which accepts CRAM files and converts them to BAM files with **samtools view**, while also indexing them. In this CWL implementation, this step is skipped because **GATK HaplotypeCaller** can work with CRAM files directly. Keep in mind that CRAM files need to be indexed. 
\n\n* To enable scattering of the **GATK HaplotypeCaller** tool, we have introduced **GATK IntervalListTools**, following the solution given in **GATK Production Germline short variant per-sample calling** [2]. \n\n### Common Issues and Important Notes\n\n* The **HaplotypeCaller** app uses the **Calling intervals** file to restrict processing to specific genomic intervals. You can set the **Scatter count** value in order to split the intervals into smaller subsets. **HaplotypeCaller** processes these subsets in parallel, which can significantly reduce workflow execution time in some cases.\n\n* The workflow accepts multiple flowcell BAMs as input; however, they must all share the same sample ID. Otherwise, some GATK tools will fail.\n\n* Running a **batch task**: Batching is performed by the **Sample ID** metadata field on the **Input alignments** input port. To run analyses in batches, set the **Sample ID** metadata for each **Input alignments** file.\n\n### Performance Benchmarking\n \n| BAM Input size | Experiment type | Coverage | Duration | Cost | Instance |\n|-----------------------|----------------- |------------ |-------------|--------|--------------|\n| 55.8GiB | WGS (scatter count = 20) | ~50x | 17h 35min | $9.42 | c4.2xlarge |\n| 55.8GiB | WGS (scatter count = 80) | ~50x | 10h 32min | $5.64 | c4.2xlarge |\n| 24.6GiB | WGS (scatter count = 80) | ~10x | 4h 12min | $2.25 | c4.2xlarge |\n| 3.5GiB | WES (scatter count = 1) | ~70x | 17min | $0.16 | c4.2xlarge |\n| 1.9GiB | WES (scatter count = 1) | ~40x | 11min | $0.11 | c4.2xlarge | \n| 1.1GiB | WES (scatter count = 1) | ~20x | 9min | $0.08 | c4.2xlarge | \n| 434MiB | WES (scatter count = 1) | ~10x | 6min | $0.06 | c4.2xlarge | \n\n\n\n### API Python Implementation\nThe app's draft task can also be submitted via the **API**. 
To learn how to get your **Authentication token** and **API endpoint** for the corresponding platform, visit our [documentation](https://github.com/sbg/sevenbridges-python#authentication-and-configuration).\n\n```python\n# Initialize the SBG Python API\nfrom sevenbridges import Api\napi = Api(token=\"enter_your_token\", url=\"enter_api_endpoint\")\nproject_id = \"your_username/project\"\napp_id = \"your_username/project/app\"\n# Replace inputs with appropriate values\ninputs = {\n    \"in_reference\": api.files.query(project=project_id, names=[\"Homo_sapiens_assembly38.fasta\"])[0],\n    \"in_alignments\": list(api.files.query(project=project_id, names=[\"HCC1143BL_WES_1.processed.bam\"])),\n    \"in_intervals\": list(api.files.query(project=project_id, names=[\"wgs_calling_regions.hg38.interval_list\"]))}\n\n# Create a draft task (run=False means it is not started automatically)\ntask = api.tasks.create(name=\"GATK Best Practice Germline snps and indels 4.1.0.0 - API Run\", project=project_id, app=app_id, inputs=inputs, run=False)\n```\n\nInstructions for installing and configuring the API Python client are provided on [GitHub](https://github.com/sbg/sevenbridges-python#installation). For more information about using the API Python client, consult [the client documentation](http://sevenbridges-python.readthedocs.io/en/latest/). **More examples** are available [here](https://github.com/sbg/okAPI).\n\nAdditionally, [API R](https://github.com/sbg/sevenbridges-r) and [API Java](https://github.com/sbg/sevenbridges-java) clients are available. To learn more about using these API clients, please refer to the [API R client documentation](https://sbg.github.io/sevenbridges-r/) and the [API Java client documentation](https://docs.sevenbridges.com/docs/java-library-quickstart).\n\n### References\n\n[1] [Broad germline SNPs and INDELs](https://github.com/gatk-workflows/gatk4-germline-snps-indels)\n\n[2] [Broad Production WGS germline SNPs and INDELs](https://github.com/gatk-workflows/broad-prod-wgs-germline-snps-indels)",
  5. "label": "GATK Broad Best Practice Germline snps and indels variant calling 4.1.0.0",
  6. "$namespaces": {
  7. "sbg": "https://sevenbridges.com"
  8. },
  9. "inputs": [
  10. {
  11. "id": "in_reference",
  12. "sbg:fileTypes": "FASTA, FA",
  13. "type": "File",
  14. "label": "Reference",
  15. "doc": "Reference FASTA file.",
  16. "secondaryFiles": [
  17. ".fai",
  18. "^.dict"
  19. ],
  20. "sbg:suggestedValue": {
  21. "class": "File",
  22. "path": "5772b6c7507c1752674486d1",
  23. "name": "Homo_sapiens_assembly38.fasta"
  24. },
  25. "sbg:x": -697,
  26. "sbg:y": -168
  27. },
  28. {
  29. "id": "in_intervals",
  30. "sbg:fileTypes": "VCF, INTERVAL_LIST",
  31. "type": "File[]",
  32. "label": "Calling intervals",
  33. "doc": "File with intervals that should be considered for variant calling. This file can be obtained from a BED file using GATK BedToIntervalList.",
  34. "sbg:suggestedValue": [
  35. {
  36. "class": "File",
  37. "path": "5e3c4438c80cb0e4c9353b0e",
  38. "name": "wgs_calling_regions.hg38.interval_list"
  39. }
  40. ],
  41. "sbg:x": -743,
  42. "sbg:y": -345
  43. },
  44. {
  45. "id": "in_alignments",
  46. "sbg:fileTypes": "BAM, SAM, CRAM",
  47. "type": "File[]",
  48. "label": "Input alignments",
  49. "doc": "BAM/SAM/CRAM file containing reads. This argument must be specified at least once.",
  50. "secondaryFiles": [
  51. "${ \n if(self) { if(self.nameext == '.bam'){\n return self.nameroot + \".bai\";\n }\n else if(self.nameext == '.cram'){\n return self.nameroot + \".crai\";\n } else {\n return null;\n }}\n}"
  52. ],
  53. "sbg:x": -676,
  54. "sbg:y": -15
  55. },
  56. {
  57. "id": "contamination_fraction_to_filter",
  58. "type": "float?",
  59. "label": "Contamination fraction to filter",
  60. "doc": "Fraction of contamination in sequencing data (for all samples) to aggressively remove.",
  61. "sbg:exposed": true
  62. },
  63. {
  64. "id": "create_output_variant_index",
  65. "type": [
  66. "null",
  67. {
  68. "type": "enum",
  69. "symbols": [
  70. "true",
  71. "false"
  72. ],
  73. "name": "create_output_variant_index"
  74. }
  75. ],
  76. "label": "Create output variant index",
  77. "doc": "If true, create a VCF index when writing a coordinate-sorted VCF file.",
  78. "sbg:exposed": true
  79. },
  80. {
  81. "id": "emit_ref_confidence",
  82. "type": [
  83. "null",
  84. {
  85. "type": "enum",
  86. "symbols": [
  87. "NONE",
  88. "BP_RESOLUTION",
  89. "GVCF"
  90. ],
  91. "name": "emit_ref_confidence"
  92. }
  93. ],
  94. "label": "Emit ref confidence",
  95. "doc": "Mode for emitting reference confidence scores.",
  96. "sbg:exposed": true
  97. },
  98. {
  99. "id": "output_mode",
  100. "type": [
  101. "null",
  102. {
  103. "type": "enum",
  104. "symbols": [
  105. "EMIT_VARIANTS_ONLY",
  106. "EMIT_ALL_CONFIDENT_SITES",
  107. "EMIT_ALL_SITES"
  108. ],
  109. "name": "output_mode"
  110. }
  111. ],
  112. "label": "Output mode",
  113. "doc": "Specifies which type of calls we should output.",
  114. "sbg:exposed": true
  115. },
  116. {
  117. "id": "output_extension",
  118. "type": [
  119. "null",
  120. {
  121. "type": "enum",
  122. "symbols": [
  123. "vcf",
  124. "vcf.gz"
  125. ],
  126. "name": "output_extension"
  127. }
  128. ],
  129. "label": "Output VCF extension",
  130. "doc": "Output VCF extension.",
  131. "sbg:exposed": true
  132. },
  133. {
  134. "id": "output_file_format",
  135. "type": [
  136. "null",
  137. {
  138. "type": "enum",
  139. "symbols": [
  140. "vcf",
  141. "bcf",
  142. "vcf.gz"
  143. ],
  144. "name": "output_file_format"
  145. }
  146. ],
  147. "label": "Output file format",
  148. "doc": "Output file format.",
  149. "sbg:exposed": true
  150. },
  151. {
  152. "id": "output_prefix",
  153. "type": "string?",
  154. "label": "Output prefix",
  155. "doc": "Output file name prefix.",
  156. "sbg:exposed": true
  157. }
  158. ],
  159. "outputs": [
  160. {
  161. "id": "out_variants",
  162. "outputSource": [
  163. "gatk_mergevcfs_4_1_0_0/out_variants"
  164. ],
  165. "sbg:fileTypes": "VCF, VCF.GZ, BCF",
  166. "type": "File?",
  167. "label": "VCF file",
  168. "doc": "Merged VCF file.",
  169. "secondaryFiles": [
  170. "${\n if(self){ return self.basename + \".idx\";}\n}\n"
  171. ],
  172. "sbg:x": 17,
  173. "sbg:y": -174.2436981201172
  174. }
  175. ],
  176. "steps": [
  177. {
  178. "id": "gatk_haplotypecaller_4_1_0_0",
  179. "in": [
  180. {
  181. "id": "contamination_fraction_to_filter",
  182. "source": "contamination_fraction_to_filter"
  183. },
  184. {
  185. "id": "create_output_variant_index",
  186. "default": "true",
  187. "source": "create_output_variant_index"
  188. },
  189. {
  190. "id": "emit_ref_confidence",
  191. "source": "emit_ref_confidence"
  192. },
  193. {
  194. "id": "in_alignments",
  195. "source": [
  196. "in_alignments"
  197. ]
  198. },
  199. {
  200. "id": "include_intervals_file",
  201. "source": "gatk_intervallisttools_4_1_0_0/output_interval_list"
  202. },
  203. {
  204. "id": "output_mode",
  205. "source": "output_mode"
  206. },
  207. {
  208. "id": "in_reference",
  209. "source": "in_reference"
  210. },
  211. {
  212. "id": "output_extension",
  213. "default": "vcf",
  214. "source": "output_extension"
  215. }
  216. ],
  217. "out": [
  218. {
  219. "id": "out_variants"
  220. },
  221. {
  222. "id": "out_alignments"
  223. },
  224. {
  225. "id": "out_graph"
  226. },
  227. {
  228. "id": "out_activity_profile"
  229. },
  230. {
  231. "id": "out_assembly_region"
  232. }
  233. ],
  234. "run": {
  235. "class": "CommandLineTool",
  236. "cwlVersion": "v1.0",
  237. "$namespaces": {
  238. "sbg": "https://sevenbridges.com"
  239. },
  240. "id": "uros_sipetic/gatk-4-1-0-0-demo/gatk-haplotypecaller-4-1-0-0/21",
  241. "baseCommand": [
  242. "/opt/gatk-4.1.0.0/gatk --java-options"
  243. ],
  244. "inputs": [
  245. {
  246. "sbg:category": "Advanced Arguments",
  247. "sbg:toolDefaultValue": "0.002",
  248. "id": "active_probability_threshold",
  249. "type": "float?",
  250. "inputBinding": {
  251. "prefix": "--active-probability-threshold",
  252. "shellQuote": false,
  253. "position": 4
  254. },
  255. "label": "Active probability threshold",
  256. "doc": "Minimum probability for a locus to be considered active."
  257. },
  258. {
  259. "sbg:category": "Optional Arguments",
  260. "sbg:toolDefaultValue": "null",
  261. "id": "activity_profile_out",
  262. "type": "string?",
  263. "inputBinding": {
  264. "prefix": "--activity-profile-out",
  265. "shellQuote": false,
  266. "position": 4,
  267. "valueFrom": "${\n if(inputs.activity_profile_out) {\n var tmp = inputs.activity_profile_out.slice(-4).toLowerCase();\n if(tmp == \".igv\") {\n return inputs.activity_profile_out;\n }\n else {\n return inputs.activity_profile_out + '.igv';\n }\n }\n else {\n return null;\n }\n}"
  268. },
  269. "label": "Activity profile output",
  270. "doc": "Output the raw activity profile results in IGV format."
  271. },
  272. {
  273. "sbg:category": "Advanced Arguments",
  274. "sbg:toolDefaultValue": "false",
  275. "id": "adaptive_pruning",
  276. "type": "boolean?",
  277. "inputBinding": {
  278. "prefix": "--adaptive-pruning",
  279. "shellQuote": false,
  280. "position": 4
  281. },
  282. "label": "Adaptive pruning",
  283. "doc": "Use Mutect2's adaptive graph pruning algorithm."
  284. },
  285. {
  286. "sbg:category": "Advanced Arguments",
  287. "sbg:toolDefaultValue": "0.001",
  288. "id": "adaptive_pruning_initial_error_rate",
  289. "type": "float?",
  290. "inputBinding": {
  291. "prefix": "--adaptive-pruning-initial-error-rate",
  292. "shellQuote": false,
  293. "position": 4
  294. },
  295. "label": "Adaptive pruning initial error rate",
  296. "doc": "Initial base error rate estimate for adaptive pruning."
  297. },
  298. {
  299. "sbg:altPrefix": "-add-output-sam-program-record",
  300. "sbg:category": "Optional Arguments",
  301. "sbg:toolDefaultValue": "true",
  302. "id": "add_output_sam_program_record",
  303. "type": [
  304. "null",
  305. {
  306. "type": "enum",
  307. "symbols": [
  308. "true",
  309. "false"
  310. ],
  311. "name": "add_output_sam_program_record"
  312. }
  313. ],
  314. "inputBinding": {
  315. "prefix": "--add-output-sam-program-record",
  316. "shellQuote": false,
  317. "position": 4
  318. },
  319. "label": "Add output SAM program record",
  320. "doc": "If true, adds a PG tag to created SAM/BAM/CRAM files."
  321. },
  322. {
  323. "sbg:altPrefix": "-add-output-vcf-command-line",
  324. "sbg:category": "Optional Arguments",
  325. "sbg:toolDefaultValue": "true",
  326. "id": "add_output_vcf_command_line",
  327. "type": [
  328. "null",
  329. {
  330. "type": "enum",
  331. "symbols": [
  332. "true",
  333. "false"
  334. ],
  335. "name": "add_output_vcf_command_line"
  336. }
  337. ],
  338. "inputBinding": {
  339. "prefix": "--add-output-vcf-command-line",
  340. "shellQuote": false,
  341. "position": 4
  342. },
  343. "label": "Add output VCF command line",
  344. "doc": "If true, adds a command line header line to created VCF files."
  345. },
  346. {
  347. "sbg:category": "Advanced Arguments",
  348. "sbg:toolDefaultValue": "false",
  349. "id": "all_site_pls",
  350. "type": "boolean?",
  351. "inputBinding": {
  352. "prefix": "--all-site-pls",
  353. "shellQuote": false,
  354. "position": 4
  355. },
  356. "label": "Annotate all sites with PLs",
  357. "doc": "Advanced, experimental argument: if SNP likelihood model is specified, and if EMIT_ALL_SITES output mode is set, when we set this argument then we will also emit PLs at all sites. This will give a measure of reference confidence and a measure of which alt alleles are more plausible (if any). WARNINGS: - This feature will inflate VCF file size considerably. - All SNP ALT alleles will be emitted with corresponding 10 PL values. - An error will be emitted if EMIT_ALL_SITES is not set, or if anything other than diploid SNP model is used."
  358. },
  359. {
  360. "sbg:category": "Optional Arguments",
  361. "sbg:toolDefaultValue": "null",
  362. "id": "alleles",
  363. "type": "File?",
  364. "inputBinding": {
  365. "prefix": "--alleles",
  366. "shellQuote": false,
  367. "position": 4
  368. },
  369. "label": "Alleles",
  370. "doc": "The set of alleles at which to genotype when --genotyping-mode is GENOTYPE_GIVEN_ALLELES.",
  371. "sbg:fileTypes": "VCF, VCF.GZ",
  372. "secondaryFiles": [
  373. "${\n if(self) {\n if (self.basename.slice(-4).toLowerCase() == \".vcf\") {\n return self.basename + \".idx\";\n }\n else if (self.basename.slice(-7).toLowerCase() == \".vcf.gz\") {\n return self.basename + \".tbi\";\n }\n else {\n return self.basename + \".idx\";\n }\n }\n else {\n return null;\n }\n}"
  374. ]
  375. },
  376. {
  377. "sbg:category": "Advanced Arguments",
  378. "sbg:toolDefaultValue": "false",
  379. "id": "allow_non_unique_kmers_in_ref",
  380. "type": "boolean?",
  381. "inputBinding": {
  382. "prefix": "--allow-non-unique-kmers-in-ref",
  383. "shellQuote": false,
  384. "position": 4
  385. },
  386. "label": "Allow non unique kmers in ref",
  387. "doc": "Allow graphs that have non-unique kmers in the reference."
  388. },
  389. {
  390. "sbg:category": "Optional Arguments",
  391. "sbg:toolDefaultValue": "false",
  392. "id": "annotate_with_num_discovered_alleles",
  393. "type": "boolean?",
  394. "inputBinding": {
  395. "prefix": "--annotate-with-num-discovered-alleles",
  396. "shellQuote": false,
  397. "position": 4
  398. },
  399. "label": "Annotate with num discovered alleles",
  400. "doc": "If provided, we will annotate records with the number of alternate alleles that were discovered (but not necessarily genotyped) at a given site."
  401. },
  402. {
  403. "sbg:altPrefix": "-A",
  404. "sbg:category": "Optional Arguments",
  405. "sbg:toolDefaultValue": "null",
  406. "id": "annotation",
  407. "type": [
  408. "null",
  409. {
  410. "type": "array",
  411. "items": {
  412. "type": "enum",
  413. "name": "annotation",
  414. "symbols": [
  415. "AlleleFraction",
  416. "AS_BaseQualityRankSumTest",
  417. "AS_FisherStrand",
  418. "AS_InbreedingCoeff",
  419. "AS_MappingQualityRankSumTest",
  420. "AS_QualByDepth",
  421. "AS_ReadPosRankSumTest",
  422. "AS_RMSMappingQuality",
  423. "AS_StrandOddsRatio",
  424. "BaseQuality",
  425. "BaseQualityRankSumTest",
  426. "ChromosomeCounts",
  427. "ClippingRankSumTest",
  428. "CountNs",
  429. "Coverage",
  430. "DepthPerAlleleBySample",
  431. "DepthPerSampleHC",
  432. "ExcessHet",
  433. "FisherStrand",
  434. "FragmentLength",
  435. "GenotypeSummaries",
  436. "InbreedingCoeff",
  437. "LikelihoodRankSumTest",
  438. "MappingQuality",
  439. "MappingQualityRankSumTest",
  440. "MappingQualityZero",
  441. "OriginalAlignment",
  442. "OxoGReadCounts",
  443. "PolymorphicNuMT",
  444. "PossibleDeNovo",
  445. "QualByDepth",
  446. "ReadOrientationArtifact",
  447. "ReadPosition",
  448. "ReadPosRankSumTest",
  449. "ReferenceBases",
  450. "RMSMappingQuality",
  451. "SampleList",
  452. "StrandArtifact",
  453. "StrandBiasBySample",
  454. "StrandOddsRatio",
  455. "TandemRepeat",
  456. "UniqueAltReadCount"
  457. ]
  458. }
  459. }
  460. ],
  461. "inputBinding": {
  462. "shellQuote": false,
  463. "position": 4,
  464. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--annotation', self[i]);\n }\n return cmd.join(' ');\n }\n \n}"
  465. },
  466. "label": "Annotation",
  467. "doc": "One or more specific annotations to add to variant calls. This argument may be specified 0 or more times."
  468. },
  469. {
  470. "sbg:altPrefix": "-G",
  471. "sbg:category": "Optional Arguments",
  472. "sbg:toolDefaultValue": "null",
  473. "id": "annotation_group",
  474. "type": [
  475. "null",
  476. {
  477. "type": "array",
  478. "items": {
  479. "type": "enum",
  480. "name": "annotation_group",
  481. "symbols": [
  482. "AS_StandardAnnotation",
  483. "OrientationBiasMixtureModelAnnotation",
  484. "ReducibleAnnotation",
  485. "StandardAnnotation",
  486. "StandardHCAnnotation",
  487. "StandardMutectAnnotation"
  488. ]
  489. }
  490. }
  491. ],
  492. "inputBinding": {
  493. "shellQuote": false,
  494. "position": 4,
  495. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--annotation-group', self[i]);\n }\n return cmd.join(' ');\n }\n \n}"
  496. },
  497. "label": "Annotation group",
  498. "doc": "One or more groups of annotations to apply to variant calls. This argument may be specified 0 or more times."
  499. },
  500. {
  501. "sbg:altPrefix": "-AX",
  502. "sbg:category": "Optional Arguments",
  503. "sbg:toolDefaultValue": "null",
  504. "id": "annotations_to_exclude",
  505. "type": [
  506. "null",
  507. {
  508. "type": "array",
  509. "items": {
  510. "type": "enum",
  511. "name": "annotations_to_exclude",
  512. "symbols": [
  513. "BaseQualityRankSumTest",
  514. "ChromosomeCounts",
  515. "Coverage",
  516. "DepthPerAlleleBySample",
  517. "DepthPerSampleHC",
  518. "ExcessHet",
  519. "FisherStrand",
  520. "InbreedingCoeff",
  521. "MappingQualityRankSumTest",
  522. "QualByDepth",
  523. "ReadPosRankSumTest",
  524. "RMSMappingQuality",
  525. "StrandOddsRatio"
  526. ]
  527. }
  528. }
  529. ],
  530. "inputBinding": {
  531. "shellQuote": false,
  532. "position": 4,
  533. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--annotations-to-exclude', self[i]);\n }\n return cmd.join(' ');\n }\n \n}"
  534. },
  535. "label": "Annotations to exclude",
  536. "doc": "One or more specific annotations to exclude from variant calls. This argument may be specified 0 or more times. Which annotations to exclude from output in the variant calls. Note that this argument has higher priority than the -A or -G arguments, so these annotations will be excluded even if they are explicitly included with the other options."
  537. },
  538. {
  539. "sbg:category": "Optional Arguments",
  540. "sbg:toolDefaultValue": "null",
  541. "id": "assembly_region_out",
  542. "type": "string?",
  543. "inputBinding": {
  544. "prefix": "--assembly-region-out",
  545. "shellQuote": false,
  546. "position": 4,
  547. "valueFrom": "${\n if(inputs.assembly_region_out) {\n var tmp = inputs.assembly_region_out.slice(-4).toLowerCase();\n if(tmp == \".igv\") {\n return inputs.assembly_region_out;\n }\n else {\n return inputs.assembly_region_out + '.igv';\n }\n }\n else {\n return null;\n }\n}"
  548. },
  549. "label": "Assembly region output",
  550. "doc": "Output the assembly region to this IGV formatted file."
  551. },
  552. {
  553. "sbg:category": "Advanced Arguments",
  554. "sbg:toolDefaultValue": "100",
  555. "id": "assembly_region_padding",
  556. "type": "int?",
  557. "inputBinding": {
  558. "prefix": "--assembly-region-padding",
  559. "shellQuote": false,
  560. "position": 4
  561. },
  562. "label": "Assembly region padding",
  563. "doc": "Number of additional bases of context to include around each assembly region."
  564. },
  565. {
  566. "sbg:altPrefix": "-bamout",
  567. "sbg:category": "Advanced Arguments",
  568. "sbg:toolDefaultValue": "null",
  569. "id": "bam_output",
  570. "type": "string?",
  571. "inputBinding": {
  572. "prefix": "--bam-output",
  573. "shellQuote": false,
  574. "position": 4,
  575. "valueFrom": "${\n if(inputs.bam_output) {\n var tmp = inputs.bam_output.slice(-4).toLowerCase();\n if(tmp == \".bam\") {\n return inputs.bam_output;\n }\n else {\n return inputs.bam_output + '.bam';\n }\n }\n else {\n return null;\n }\n}"
  576. },
  577. "label": "BAM output",
  578. "doc": "File to which assembled haplotypes should be written."
  579. },
  580. {
  581. "sbg:category": "Advanced Arguments",
  582. "sbg:toolDefaultValue": "CALLED_HAPLOTYPES",
  583. "id": "bam_writer_type",
  584. "type": [
  585. "null",
  586. {
  587. "type": "enum",
  588. "symbols": [
  589. "ALL_POSSIBLE_HAPLOTYPES",
  590. "CALLED_HAPLOTYPES"
  591. ],
  592. "name": "bam_writer_type"
  593. }
  594. ],
  595. "inputBinding": {
  596. "prefix": "--bam-writer-type",
  597. "shellQuote": false,
  598. "position": 4
  599. },
  600. "label": "BAM writer type",
  601. "doc": "Which haplotypes should be written to the BAM."
  602. },
  603. {
  604. "sbg:category": "Optional Arguments",
  605. "sbg:toolDefaultValue": "18",
  606. "id": "base_quality_score_threshold",
  607. "type": "int?",
  608. "inputBinding": {
  609. "prefix": "--base-quality-score-threshold",
  610. "shellQuote": false,
  611. "position": 4
  612. },
  613. "label": "Base quality score threshold",
  614. "doc": "Base qualities below this threshold will be reduced to the minimum (6)."
  615. },
  616. {
  617. "sbg:altPrefix": "-comp",
  618. "sbg:category": "Advanced Arguments",
  619. "sbg:toolDefaultValue": "null",
  620. "id": "comp",
  621. "type": "File[]?",
  622. "inputBinding": {
  623. "shellQuote": false,
  624. "position": 4,
  625. "valueFrom": "${\n if (inputs.comp)\n {\n var c = [].concat(inputs.comp);\n var cmd = [];\n for (var i = 0; i < c.length; i++) \n {\n cmd.push('--comp', c[i].path);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  626. },
  627. "label": "Comparison VCF",
  628. "doc": "Comparison VCF file(s). If a call overlaps with a record from the provided comp track, the INFO field will be annotated as such in the output with the track name. Records that are filtered in the comp track will be ignored. Note that 'dbSNP' has been special-cased (see the --dbsnp argument).",
  629. "sbg:fileTypes": "VCF, VCF.GZ",
  630. "secondaryFiles": [
  631. "${\n if(self) {\n var comp = self;\n if (comp.nameext == '.vcf' || comp.nameext == '.VCF') {\n return comp.basename + \".idx\";\n }\n else if (comp.nameext == \".gz\" || comp.nameext == '.GZ') {\n var tmp = comp.basename.slice(-7);\n if(tmp.toLowerCase() == '.vcf.gz') {\n return comp.basename + \".tbi\"; \n }\n }\n else {\n return comp.basename + \".idx\";\n }\n }\n}"
  632. ]
  633. },
  634. {
  635. "sbg:category": "Advanced Arguments",
  636. "sbg:toolDefaultValue": "false",
  637. "id": "consensus",
  638. "type": "boolean?",
  639. "inputBinding": {
  640. "prefix": "--consensus",
  641. "shellQuote": false,
  642. "position": 4
  643. },
  644. "label": "Consensus",
  645. "doc": "1000g consensus mode."
  646. },
  647. {
  648. "sbg:altPrefix": "-contamination-file",
  649. "sbg:category": "Advanced Arguments",
  650. "id": "contamination_fraction_per_sample_file",
  651. "type": "File?",
  652. "inputBinding": {
  653. "prefix": "--contamination-fraction-per-sample-file",
  654. "shellQuote": false,
  655. "position": 4
  656. },
  657. "label": "Contamination fraction per sample",
  658. "doc": "Tab-separated file containing fraction of contamination in sequencing data (per sample) to aggressively remove. Format should be \"<SampleID><TAB><Contamination>\" (Contamination is double) per line; No header.",
  659. "sbg:fileTypes": "TSV"
  660. },
  661. {
  662. "sbg:altPrefix": "-contamination",
  663. "sbg:category": "Optional Arguments",
  664. "sbg:toolDefaultValue": "0.0",
  665. "id": "contamination_fraction_to_filter",
  666. "type": "float?",
  667. "inputBinding": {
  668. "prefix": "--contamination-fraction-to-filter",
  669. "shellQuote": false,
  670. "position": 4
  671. },
  672. "label": "Contamination fraction to filter",
  673. "doc": "Fraction of contamination in sequencing data (for all samples) to aggressively remove."
  674. },
  675. {
  676. "sbg:category": "Optional Arguments",
  677. "sbg:toolDefaultValue": "false",
  678. "id": "correct_overlapping_quality",
  679. "type": "boolean?",
  680. "inputBinding": {
  681. "prefix": "--correct-overlapping-quality",
  682. "shellQuote": false,
  683. "position": 4
  684. },
  685. "label": "Correct overlapping quality",
  686. "doc": "Undocumented option."
  687. },
  688. {
  689. "sbg:altPrefix": "-OBI",
  690. "sbg:category": "Optional Arguments",
  691. "sbg:toolDefaultValue": "true",
  692. "id": "create_output_bam_index",
  693. "type": [
  694. "null",
  695. {
  696. "type": "enum",
  697. "symbols": [
  698. "true",
  699. "false"
  700. ],
  701. "name": "create_output_bam_index"
  702. }
  703. ],
  704. "inputBinding": {
  705. "prefix": "--create-output-bam-index",
  706. "shellQuote": false,
  707. "position": 4
  708. },
  709. "label": "Create output BAM index",
  710. "doc": "If true, create a BAM/CRAM index when writing a coordinate-sorted BAM/CRAM file."
  711. },
  712. {
  713. "sbg:altPrefix": "-OVI",
  714. "sbg:category": "Optional Arguments",
  715. "sbg:toolDefaultValue": "true",
  716. "id": "create_output_variant_index",
  717. "type": [
  718. "null",
  719. {
  720. "type": "enum",
  721. "symbols": [
  722. "true",
  723. "false"
  724. ],
  725. "name": "create_output_variant_index"
  726. }
  727. ],
  728. "inputBinding": {
  729. "prefix": "--create-output-variant-index",
  730. "shellQuote": false,
  731. "position": 4
  732. },
  733. "label": "Create output variant index",
  734. "doc": "If true, create a VCF index when writing a coordinate-sorted VCF file."
  735. },
  736. {
  737. "sbg:altPrefix": "-D",
  738. "sbg:category": "Optional Arguments",
  739. "sbg:toolDefaultValue": "null",
  740. "id": "dbsnp",
  741. "type": "File?",
  742. "inputBinding": {
  743. "prefix": "--dbsnp",
  744. "shellQuote": false,
  745. "position": 4
  746. },
  747. "label": "dbSNP",
  748. "doc": "dbSNP file.",
  749. "sbg:fileTypes": "VCF, VCF.GZ",
  750. "secondaryFiles": [
  751. "${\n if(self) {\n if (self.basename.slice(-4).toLowerCase() == \".vcf\") {\n return self.basename + \".idx\";\n }\n else if (self.basename.slice(-7).toLowerCase() == \".vcf.gz\") {\n return self.basename + \".tbi\";\n }\n else {\n return self.basename + \".idx\";\n }\n }\n else {\n return null;\n }\n}"
  752. ]
  753. },
  754. {
  755. "sbg:altPrefix": "-debug",
  756. "sbg:category": "Advanced Arguments",
  757. "sbg:toolDefaultValue": "false",
  758. "id": "debug",
  759. "type": "boolean?",
  760. "inputBinding": {
  761. "prefix": "--debug",
  762. "shellQuote": false,
  763. "position": 4
  764. },
  765. "label": "Debug",
  766. "doc": "Print out very verbose debug information about each triggering active region."
  767. },
  768. {
  769. "sbg:altPrefix": "-DBIC",
  770. "sbg:category": "Optional Arguments",
  771. "sbg:toolDefaultValue": "false",
  772. "id": "disable_bam_index_caching",
  773. "type": "boolean?",
  774. "inputBinding": {
  775. "prefix": "--disable-bam-index-caching",
  776. "shellQuote": false,
  777. "position": 4
  778. },
  779. "label": "Disable BAM index caching",
  780. "doc": "If true, don't cache BAM indexes. This will reduce memory requirements but may harm performance if many intervals are specified. Caching is automatically disabled if there are no intervals specified."
  781. },
  782. {
  783. "sbg:category": "Advanced Arguments",
  784. "sbg:toolDefaultValue": "false",
  785. "id": "disable_optimizations",
  786. "type": "boolean?",
  787. "inputBinding": {
  788. "prefix": "--disable-optimizations",
  789. "shellQuote": false,
  790. "position": 4
  791. },
  792. "label": "Disable optimizations",
  793. "doc": "Don't skip calculations in active regions with no variants."
  794. },
  795. {
  796. "sbg:altPrefix": "-DF",
  797. "sbg:category": "Optional Arguments",
  798. "sbg:toolDefaultValue": "null",
  799. "id": "disable_read_filter",
  800. "type": [
  801. "null",
  802. {
  803. "type": "array",
  804. "items": {
  805. "type": "enum",
  806. "name": "disable_read_filter",
  807. "symbols": [
  808. "GoodCigarReadFilter",
  809. "MappedReadFilter",
  810. "MappingQualityAvailableReadFilter",
  811. "MappingQualityReadFilter",
  812. "NonZeroReferenceLengthAlignmentReadFilter",
  813. "NotDuplicateReadFilter",
  814. "NotSecondaryAlignmentReadFilter",
  815. "PassesVendorQualityCheckReadFilter",
  816. "WellformedReadFilter"
  817. ]
  818. }
  819. }
  820. ],
  821. "inputBinding": {
  822. "shellQuote": false,
  823. "position": 4,
  824. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--disable-read-filter', self[i]);\n }\n return cmd.join(' ');\n }\n return null;\n}"
  825. },
  826. "label": "Disable read filter",
  827. "doc": "Read filters to be disabled before analysis. This argument may be specified 0 or more times."
  828. },
  829. {
  830. "sbg:altPrefix": "-disable-sequence-dictionary-validation",
  831. "sbg:category": "Optional Arguments",
  832. "sbg:toolDefaultValue": "false",
  833. "id": "disable_sequence_dictionary_validation",
  834. "type": "boolean?",
  835. "inputBinding": {
  836. "prefix": "--disable-sequence-dictionary-validation",
  837. "shellQuote": false,
  838. "position": 4
  839. },
  840. "label": "Disable sequence dictionary validation",
  841. "doc": "If specified, do not check the sequence dictionaries from our inputs for compatibility. Use at your own risk!"
  842. },
  843. {
  844. "sbg:altPrefix": "-disable-tool-default-annotations",
  845. "sbg:category": "Advanced Arguments",
  846. "sbg:toolDefaultValue": "false",
  847. "id": "disable_tool_default_annotations",
  848. "type": "boolean?",
  849. "inputBinding": {
  850. "prefix": "--disable-tool-default-annotations",
  851. "shellQuote": false,
  852. "position": 4
  853. },
  854. "label": "Disable tool default annotations",
  855. "doc": "Disable all tool default annotations."
  856. },
  857. {
  858. "sbg:altPrefix": "-disable-tool-default-read-filters",
  859. "sbg:category": "Advanced Arguments",
  860. "sbg:toolDefaultValue": "false",
  861. "id": "disable_tool_default_read_filters",
  862. "type": "boolean?",
  863. "inputBinding": {
  864. "prefix": "--disable-tool-default-read-filters",
  865. "shellQuote": false,
  866. "position": 4
  867. },
  868. "label": "Disable tool default read filters",
  869. "doc": "Disable all tool default read filters (warning: many tools will not function correctly without their default read filters on)."
  870. },
  871. {
  872. "sbg:category": "Advanced Arguments",
  873. "sbg:toolDefaultValue": "false",
  874. "id": "do_not_run_physical_phasing",
  875. "type": "boolean?",
  876. "inputBinding": {
  877. "prefix": "--do-not-run-physical-phasing",
  878. "shellQuote": false,
  879. "position": 4
  880. },
  881. "label": "Do not run physical phasing",
  882. "doc": "Disable physical phasing."
  883. },
  884. {
  885. "sbg:category": "Advanced Arguments",
  886. "sbg:toolDefaultValue": "false",
  887. "id": "dont_increase_kmer_sizes_for_cycles",
  888. "type": "boolean?",
  889. "inputBinding": {
  890. "prefix": "--dont-increase-kmer-sizes-for-cycles",
  891. "shellQuote": false,
  892. "position": 4
  893. },
  894. "label": "Do not increase kmer sizes for cycles",
  895. "doc": "Disable iterating over kmer sizes when graph cycles are detected."
  896. },
  897. {
  898. "sbg:category": "Advanced Arguments",
  899. "sbg:toolDefaultValue": "false",
  900. "id": "dont_trim_active_regions",
  901. "type": "boolean?",
  902. "inputBinding": {
  903. "prefix": "--dont-trim-active-regions",
  904. "shellQuote": false,
  905. "position": 4
  906. },
  907. "label": "Do not trim active regions",
  908. "doc": "If specified, we will not trim down the active region from the full region (active + extension) to just the active interval for genotyping."
  909. },
  910. {
  911. "sbg:category": "Advanced Arguments",
  912. "sbg:toolDefaultValue": "false",
  913. "id": "dont_use_soft_clipped_bases",
  914. "type": "boolean?",
  915. "inputBinding": {
  916. "prefix": "--dont-use-soft-clipped-bases",
  917. "shellQuote": false,
  918. "position": 4
  919. },
  920. "label": "Do not use soft clipped bases",
  921. "doc": "Do not analyze soft clipped bases in the reads."
  922. },
  923. {
  924. "sbg:altPrefix": "-ERC",
  925. "sbg:category": "Advanced Arguments",
  926. "sbg:toolDefaultValue": "NONE",
  927. "id": "emit_ref_confidence",
  928. "type": [
  929. "null",
  930. {
  931. "type": "enum",
  932. "symbols": [
  933. "NONE",
  934. "BP_RESOLUTION",
  935. "GVCF"
  936. ],
  937. "name": "emit_ref_confidence"
  938. }
  939. ],
  940. "inputBinding": {
  941. "prefix": "--emit-ref-confidence",
  942. "shellQuote": false,
  943. "position": 4
  944. },
  945. "label": "Emit ref confidence",
  946. "doc": "Mode for emitting reference confidence scores."
  947. },
  948. {
  949. "sbg:category": "Advanced Arguments",
  950. "sbg:toolDefaultValue": "false",
  951. "id": "enable_all_annotations",
  952. "type": "boolean?",
  953. "inputBinding": {
  954. "prefix": "--enable-all-annotations",
  955. "shellQuote": false,
  956. "position": 4
  957. },
  958. "label": "Enable all annotations",
  959. "doc": "Use all possible annotations (not for the faint of heart)."
  960. },
  961. {
  962. "sbg:altPrefix": "-XL",
  963. "sbg:category": "Optional Arguments",
  964. "sbg:toolDefaultValue": "null",
  965. "id": "exclude_intervals_string",
  966. "type": "string[]?",
  967. "inputBinding": {
  968. "shellQuote": false,
  969. "position": 4,
  970. "valueFrom": "${\n if (inputs.exclude_intervals_string)\n {\n var exclude_string = [].concat(inputs.exclude_intervals_string);\n var cmd = [];\n for (var i = 0; i < exclude_string.length; i++) \n {\n cmd.push('--exclude-intervals', exclude_string[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}\n"
  971. },
  972. "label": "Exclude intervals string",
  973. "doc": "One or more genomic intervals to exclude from processing. This argument may be specified 0 or more times."
  974. },
  975. {
  976. "sbg:altPrefix": "-founder-id",
  977. "sbg:category": "Optional Arguments",
  978. "sbg:toolDefaultValue": "null",
  979. "id": "founder_id",
  980. "type": "string[]?",
  981. "inputBinding": {
  982. "shellQuote": false,
  983. "position": 4,
  984. "valueFrom": "${\n if (inputs.founder_id)\n {\n var f_id = [].concat(inputs.founder_id);\n var cmd = [];\n for (var i = 0; i < f_id.length; i++) \n {\n cmd.push('--founder-id', f_id[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  985. },
  986. "label": "Founder ID",
  987. "doc": "Samples representing the population \"founders\". This argument may be specified 0 or more times."
  988. },
  989. {
  990. "sbg:category": "Advanced Arguments",
  991. "sbg:toolDefaultValue": "false",
  992. "id": "genotype_filtered_alleles",
  993. "type": "boolean?",
  994. "inputBinding": {
  995. "prefix": "--genotype-filtered-alleles",
  996. "shellQuote": false,
  997. "position": 4
  998. },
  999. "label": "Genotype filtered alleles",
  1000. "doc": "Whether to genotype all given alleles, even filtered ones, when --genotyping-mode is GENOTYPE_GIVEN_ALLELES."
  1001. },
  1002. {
  1003. "sbg:category": "Optional Arguments",
  1004. "sbg:toolDefaultValue": "DISCOVERY",
  1005. "id": "genotyping_mode",
  1006. "type": [
  1007. "null",
  1008. {
  1009. "type": "enum",
  1010. "symbols": [
  1011. "DISCOVERY",
  1012. "GENOTYPE_GIVEN_ALLELES"
  1013. ],
  1014. "name": "genotyping_mode"
  1015. }
  1016. ],
  1017. "inputBinding": {
  1018. "prefix": "--genotyping-mode",
  1019. "shellQuote": false,
  1020. "position": 4
  1021. },
  1022. "label": "Genotyping mode",
  1023. "doc": "Specifies how to determine the alternate alleles to use for genotyping."
  1024. },
  1025. {
  1026. "sbg:altPrefix": "-graph",
  1027. "sbg:category": "Optional Arguments",
  1028. "sbg:toolDefaultValue": "null",
  1029. "id": "graph_output",
  1030. "type": "string?",
  1031. "inputBinding": {
  1032. "prefix": "--graph-output",
  1033. "shellQuote": false,
  1034. "position": 4,
  1035. "valueFrom": "${\n if(inputs.graph_output) {\n var tmp = inputs.graph_output.slice(-4).toLowerCase();\n if(tmp == \".txt\") {\n return inputs.graph_output;\n }\n else {\n return inputs.graph_output + '.txt';\n }\n }\n else {\n return null;\n }\n}"
  1036. },
  1037. "label": "Graph output",
  1038. "doc": "Write debug assembly graph information to this file."
  1039. },
  1040. {
  1041. "sbg:altPrefix": "-GQB",
  1042. "sbg:category": "Advanced Arguments",
  1043. "sbg:toolDefaultValue": "1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 70, 80, 90, 99",
  1044. "id": "gvcf_gq_bands",
  1045. "type": "int[]?",
  1046. "inputBinding": {
  1047. "shellQuote": false,
  1048. "position": 4,
  1049. "valueFrom": "${\n if (inputs.gvcf_gq_bands)\n {\n var gq = [].concat(inputs.gvcf_gq_bands);\n var cmd = [];\n for (var i = 0; i < gq.length; i++) \n {\n cmd.push('--gvcf-gq-bands', gq[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  1050. },
  1051. "label": "GVCF GQ bands",
  1052. "doc": "Exclusive upper bounds for reference confidence GQ bands (must be in [1, 100] and specified in increasing order). This argument may be specified 0 or more times."
  1053. },
  1054. {
  1055. "sbg:category": "Optional Arguments",
  1056. "sbg:toolDefaultValue": "0.001",
  1057. "id": "heterozygosity",
  1058. "type": "float?",
  1059. "inputBinding": {
  1060. "prefix": "--heterozygosity",
  1061. "shellQuote": false,
  1062. "position": 4
  1063. },
  1064. "label": "Heterozygosity",
  1065. "doc": "Heterozygosity value used to compute prior likelihoods for any locus. See the GATKDocs for full details on the meaning of this population genetics concept."
  1066. },
  1067. {
  1068. "sbg:category": "Optional Arguments",
  1069. "sbg:toolDefaultValue": "0.01",
  1070. "id": "heterozygosity_stdev",
  1071. "type": "float?",
  1072. "inputBinding": {
  1073. "prefix": "--heterozygosity-stdev",
  1074. "shellQuote": false,
  1075. "position": 4
  1076. },
  1077. "label": "Heterozygosity stdev",
  1078. "doc": "Standard deviation of heterozygosity for SNP and indel calling."
  1079. },
  1080. {
  1081. "sbg:category": "Optional Arguments",
  1082. "sbg:toolDefaultValue": "1.25E-4",
  1083. "id": "indel_heterozygosity",
  1084. "type": "float?",
  1085. "inputBinding": {
  1086. "prefix": "--indel-heterozygosity",
  1087. "shellQuote": false,
  1088. "position": 4
  1089. },
  1090. "label": "Indel heterozygosity",
  1091. "doc": "Heterozygosity for indel calling. See the GATKDocs on heterozygosity for full details on the meaning of this population genetics concept."
  1092. },
  1093. {
  1094. "sbg:category": "Advanced Arguments",
  1095. "sbg:toolDefaultValue": "10",
  1096. "id": "indel_size_to_eliminate_in_ref_model",
  1097. "type": "int?",
  1098. "inputBinding": {
  1099. "prefix": "--indel-size-to-eliminate-in-ref-model",
  1100. "shellQuote": false,
  1101. "position": 4
  1102. },
  1103. "label": "Indel size to eliminate in ref model",
  1104. "doc": "The size of an indel to check for in the reference model."
  1105. },
  1106. {
  1107. "sbg:altPrefix": "-I",
  1108. "sbg:category": "Required Arguments",
  1109. "id": "in_alignments",
  1110. "type": "File[]",
  1111. "inputBinding": {
  1112. "shellQuote": false,
  1113. "position": 4,
  1114. "valueFrom": "${\n if (inputs.in_alignments) {\n var alignments = [].concat(inputs.in_alignments);\n var cmd = [];\n for (var i=0; i<alignments.length; i++) {\n cmd.push('--input', alignments[i].path);\n }\n return cmd.join(' ');\n } \n return '';\n}"
  1115. },
  1116. "label": "Input alignments",
  1117. "doc": "BAM/SAM/CRAM file containing reads. This argument must be specified at least once.",
  1118. "sbg:fileTypes": "BAM, CRAM",
  1119. "secondaryFiles": [
  1120. "${\n var in_alignments = self;\n if (in_alignments.nameext == '.bam' || in_alignments.nameext == '.BAM') {\n return [in_alignments.basename + \".bai\", in_alignments.nameroot + \".bai\"];\n }\n else if (in_alignments.nameext == \".cram\" || in_alignments.nameext == '.CRAM') {\n return [in_alignments.basename + \".crai\", in_alignments.nameroot + \".crai\", in_alignments.basename + \".bai\"]; \n }\n return '';\n}\n\n"
  1121. ]
  1122. },
  1123. {
  1124. "sbg:category": "Advanced Arguments",
  1125. "sbg:toolDefaultValue": "null",
  1126. "id": "input_prior",
  1127. "type": "float[]?",
  1128. "inputBinding": {
  1129. "shellQuote": false,
  1130. "position": 4,
  1131. "valueFrom": "${\n if (inputs.input_prior)\n {\n var prior = [].concat(inputs.input_prior);\n var cmd = [];\n for (var i = 0; i < prior.length; i++) \n {\n cmd.push('--input-prior', prior[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  1132. },
  1133. "label": "Input prior",
  1134. "doc": "Input prior for calls. This argument may be specified 0 or more times."
  1135. },
  1136. {
  1137. "sbg:altPrefix": "-ixp",
  1138. "sbg:category": "Optional Arguments",
  1139. "sbg:toolDefaultValue": "0",
  1140. "id": "interval_exclusion_padding",
  1141. "type": "int?",
  1142. "inputBinding": {
  1143. "prefix": "--interval-exclusion-padding",
  1144. "shellQuote": false,
  1145. "position": 4
  1146. },
  1147. "label": "Interval exclusion padding",
  1148. "doc": "Amount of padding (in bp) to add to each interval you are excluding."
  1149. },
  1150. {
  1151. "sbg:altPrefix": "-imr",
  1152. "sbg:category": "Optional Arguments",
  1153. "sbg:toolDefaultValue": "ALL",
  1154. "id": "interval_merging_rule",
  1155. "type": [
  1156. "null",
  1157. {
  1158. "type": "enum",
  1159. "symbols": [
  1160. "ALL",
  1161. "OVERLAPPING_ONLY"
  1162. ],
  1163. "name": "interval_merging_rule"
  1164. }
  1165. ],
  1166. "inputBinding": {
  1167. "prefix": "--interval-merging-rule",
  1168. "shellQuote": false,
  1169. "position": 4
  1170. },
  1171. "label": "Interval merging rule",
  1172. "doc": "Interval merging rule for abutting intervals."
  1173. },
  1174. {
  1175. "sbg:altPrefix": "-ip",
  1176. "sbg:category": "Optional Arguments",
  1177. "sbg:toolDefaultValue": "0",
  1178. "id": "interval_padding",
  1179. "type": "int?",
  1180. "inputBinding": {
  1181. "prefix": "--interval-padding",
  1182. "shellQuote": false,
  1183. "position": 4
  1184. },
  1185. "label": "Interval padding",
  1186. "doc": "Amount of padding (in bp) to add to each interval you are including."
  1187. },
  1188. {
  1189. "sbg:altPrefix": "-isr",
  1190. "sbg:category": "Optional Arguments",
  1191. "sbg:toolDefaultValue": "UNION",
  1192. "id": "interval_set_rule",
  1193. "type": [
  1194. "null",
  1195. {
  1196. "type": "enum",
  1197. "symbols": [
  1198. "UNION",
  1199. "INTERSECTION"
  1200. ],
  1201. "name": "interval_set_rule"
  1202. }
  1203. ],
  1204. "inputBinding": {
  1205. "prefix": "--interval-set-rule",
  1206. "shellQuote": false,
  1207. "position": 4
  1208. },
  1209. "label": "Interval set rule",
  1210. "doc": "Set merging approach to use for combining interval inputs."
  1211. },
  1212. {
  1213. "sbg:altPrefix": "-L",
  1214. "sbg:category": "Optional Arguments",
  1215. "sbg:toolDefaultValue": "null",
  1216. "id": "include_intervals_file",
  1217. "type": "File[]?",
  1218. "inputBinding": {
  1219. "prefix": "--intervals",
  1220. "shellQuote": false,
  1221. "position": 4
  1222. },
  1223. "label": "Include intervals file",
  1224. "doc": "One or more genomic intervals over which to operate.",
  1225. "sbg:fileTypes": "INTERVAL_LIST, LIST, BED"
  1226. },
  1227. {
  1228. "sbg:altPrefix": "-L",
  1229. "sbg:category": "Optional Arguments",
  1230. "sbg:toolDefaultValue": "null",
  1231. "id": "include_intervals_string",
  1232. "type": "string[]?",
  1233. "inputBinding": {
  1234. "shellQuote": false,
  1235. "position": 4,
  1236. "valueFrom": "${\n if (inputs.include_intervals_string)\n {\n var include_string = [].concat(inputs.include_intervals_string);\n var cmd = [];\n for (var i = 0; i < include_string.length; i++) \n {\n cmd.push('--intervals', include_string[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}\n\n\n"
  1237. },
  1238. "label": "Include intervals string",
  1239. "doc": "One or more genomic intervals over which to operate. This argument may be specified 0 or more times."
  1240. },
  1241. {
  1242. "sbg:category": "Advanced Arguments",
  1243. "sbg:toolDefaultValue": "10, 25",
  1244. "id": "kmer_size",
  1245. "type": "int[]?",
  1246. "inputBinding": {
  1247. "shellQuote": false,
  1248. "position": 4,
  1249. "valueFrom": "${\n if (inputs.kmer_size)\n {\n var kmer = [].concat(inputs.kmer_size);\n var cmd = [];\n for (var i = 0; i < kmer.length; i++) \n {\n cmd.push('--kmer-size', kmer[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  1250. },
  1251. "label": "Kmer size",
  1252. "doc": "Kmer size to use in the read threading assembler. This argument may be specified 0 or more times."
  1253. },
  1254. {
  1255. "sbg:altPrefix": "-LE",
  1256. "sbg:category": "Optional Arguments",
  1257. "sbg:toolDefaultValue": "false",
  1258. "id": "lenient",
  1259. "type": "boolean?",
  1260. "inputBinding": {
  1261. "prefix": "--lenient",
  1262. "shellQuote": false,
  1263. "position": 4
  1264. },
  1265. "label": "Lenient",
  1266. "doc": "Lenient processing of VCF files."
  1267. },
  1268. {
  1269. "sbg:category": "Advanced Arguments",
  1270. "sbg:toolDefaultValue": "6",
  1271. "id": "max_alternate_alleles",
  1272. "type": "int?",
  1273. "inputBinding": {
  1274. "prefix": "--max-alternate-alleles",
  1275. "shellQuote": false,
  1276. "position": 4
  1277. },
  1278. "label": "Max alternate alleles",
  1279. "doc": "Maximum number of alternate alleles to genotype."
  1280. },
  1281. {
  1282. "sbg:category": "Advanced Arguments",
  1283. "sbg:toolDefaultValue": "300",
  1284. "id": "max_assembly_region_size",
  1285. "type": "int?",
  1286. "inputBinding": {
  1287. "prefix": "--max-assembly-region-size",
  1288. "shellQuote": false,
  1289. "position": 4
  1290. },
  1291. "label": "Max assembly region size",
  1292. "doc": "Maximum size of an assembly region."
  1293. },
  1294. {
  1295. "sbg:category": "Advanced Arguments",
  1296. "sbg:toolDefaultValue": "1024",
  1297. "id": "max_genotype_count",
  1298. "type": "int?",
  1299. "inputBinding": {
  1300. "prefix": "--max-genotype-count",
  1301. "shellQuote": false,
  1302. "position": 4
  1303. },
  1304. "label": "Max genotype count",
  1305. "doc": "Maximum number of genotypes to consider at any site."
  1306. },
  1307. {
  1308. "sbg:altPrefix": "-mnp-dist",
  1309. "sbg:category": "Advanced Arguments",
  1310. "sbg:toolDefaultValue": "0",
  1311. "id": "max_mnp_distance",
  1312. "type": "int?",
  1313. "inputBinding": {
  1314. "prefix": "--max-mnp-distance",
  1315. "shellQuote": false,
  1316. "position": 4
  1317. },
  1318. "label": "Max MNP distance",
  1319. "doc": "Two or more phased substitutions separated by this distance or less are merged into MNPs. Warning: when used in GVCF mode, resulting GVCFs cannot be joint-genotyped."
  1320. },
  1321. {
  1322. "sbg:category": "Advanced Arguments",
  1323. "sbg:toolDefaultValue": "128",
  1324. "id": "max_num_haplotypes_in_population",
  1325. "type": "int?",
  1326. "inputBinding": {
  1327. "prefix": "--max-num-haplotypes-in-population",
  1328. "shellQuote": false,
  1329. "position": 4
  1330. },
  1331. "label": "Max num haplotypes in population",
  1332. "doc": "Maximum number of haplotypes to consider for your population."
  1333. },
  1334. {
  1335. "sbg:category": "Advanced Arguments",
  1336. "sbg:toolDefaultValue": "50",
  1337. "id": "max_prob_propagation_distance",
  1338. "type": "int?",
  1339. "inputBinding": {
  1340. "prefix": "--max-prob-propagation-distance",
  1341. "shellQuote": false,
  1342. "position": 4
  1343. },
  1344. "label": "Max prob propagation distance",
  1345. "doc": "Upper limit on how many bases away probability mass can be moved around when calculating the boundaries between active and inactive assembly regions."
  1346. },
  1347. {
  1348. "sbg:category": "Optional Arguments",
  1349. "sbg:toolDefaultValue": "50",
  1350. "id": "max_reads_per_alignment_start",
  1351. "type": "int?",
  1352. "inputBinding": {
  1353. "prefix": "--max-reads-per-alignment-start",
  1354. "shellQuote": false,
  1355. "position": 4
  1356. },
  1357. "label": "Max reads per alignment start",
  1358. "doc": "Maximum number of reads to retain per alignment start position. Reads above this threshold will be downsampled. Set to 0 to disable."
  1359. },
  1360. {
  1361. "sbg:category": "Advanced Arguments",
  1362. "sbg:toolDefaultValue": "100",
  1363. "id": "max_unpruned_variants",
  1364. "type": "int?",
  1365. "inputBinding": {
  1366. "prefix": "--max-unpruned-variants",
  1367. "shellQuote": false,
  1368. "position": 4
  1369. },
  1370. "label": "Max unpruned variants",
  1371. "doc": "Maximum number of variants in graph the adaptive pruner will allow."
  1372. },
  1373. {
  1374. "sbg:category": "Conditional Arguments for readFilter",
  1375. "sbg:toolDefaultValue": "null",
  1376. "id": "maximum_mapping_quality",
  1377. "type": "int?",
  1378. "inputBinding": {
  1379. "prefix": "--maximum-mapping-quality",
  1380. "shellQuote": false,
  1381. "position": 5
  1382. },
  1383. "label": "Maximum mapping quality",
  1384. "doc": "Valid only if \"MappingQualityReadFilter\" is specified:\nMaximum mapping quality to keep (inclusive)."
  1385. },
  1386. {
  1387. "sbg:category": "Platform Options",
  1388. "sbg:toolDefaultValue": "100",
  1389. "id": "mem_overhead_per_job",
  1390. "type": "int?",
  1391. "label": "Memory overhead per job",
  1392. "doc": "It allows a user to set the desired overhead memory (in MB) when running a tool or adding it to a workflow."
  1393. },
  1394. {
  1395. "sbg:category": "Platform Options",
  1396. "sbg:toolDefaultValue": "4000",
  1397. "id": "mem_per_job",
  1398. "type": "int?",
  1399. "label": "Memory per job",
  1400. "doc": "It allows a user to set the desired memory requirement (in MB) when running a tool or adding it to a workflow."
  1401. },
  1402. {
  1403. "sbg:category": "Advanced Arguments",
  1404. "sbg:toolDefaultValue": "50",
  1405. "id": "min_assembly_region_size",
  1406. "type": "int?",
  1407. "inputBinding": {
  1408. "prefix": "--min-assembly-region-size",
  1409. "shellQuote": false,
  1410. "position": 4
  1411. },
  1412. "label": "Min assembly region size",
  1413. "doc": "Minimum size of an assembly region."
  1414. },
  1415. {
  1416. "sbg:altPrefix": "-mbq",
  1417. "sbg:category": "Optional Arguments",
  1418. "sbg:toolDefaultValue": "10",
  1419. "id": "min_base_quality_score",
  1420. "type": "int?",
  1421. "inputBinding": {
  1422. "prefix": "--min-base-quality-score",
  1423. "shellQuote": false,
  1424. "position": 4
  1425. },
  1426. "label": "Min base quality score",
  1427. "doc": "Minimum base quality required to consider a base for calling."
  1428. },
  1429. {
  1430. "sbg:category": "Advanced Arguments",
  1431. "sbg:toolDefaultValue": "4",
  1432. "id": "min_dangling_branch_length",
  1433. "type": "int?",
  1434. "inputBinding": {
  1435. "prefix": "--min-dangling-branch-length",
  1436. "shellQuote": false,
  1437. "position": 4
  1438. },
  1439. "label": "Min dangling branch length",
  1440. "doc": "Minimum length of a dangling branch to attempt recovery."
  1441. },
  1442. {
  1443. "sbg:category": "Advanced Arguments",
  1444. "sbg:toolDefaultValue": "2",
  1445. "id": "min_pruning",
  1446. "type": "int?",
  1447. "inputBinding": {
  1448. "prefix": "--min-pruning",
  1449. "shellQuote": false,
  1450. "position": 4
  1451. },
  1452. "label": "Min pruning",
  1453. "doc": "Minimum support to not prune paths in the graph."
  1454. },
  1455. {
  1456. "sbg:category": "Conditional Arguments for readFilter",
  1457. "sbg:toolDefaultValue": "20",
  1458. "id": "minimum_mapping_quality",
  1459. "type": "int?",
  1460. "inputBinding": {
  1461. "prefix": "--minimum-mapping-quality",
  1462. "shellQuote": false,
  1463. "position": 5
  1464. },
  1465. "label": "Minimum mapping quality",
  1466. "doc": "Valid only if \"MappingQualityReadFilter\" is specified:\nMinimum mapping quality to keep (inclusive)."
  1467. },
  1468. {
  1469. "sbg:category": "Optional Arguments",
  1470. "sbg:toolDefaultValue": "4",
  1471. "id": "native_pair_hmm_threads",
  1472. "type": "int?",
  1473. "inputBinding": {
  1474. "prefix": "--native-pair-hmm-threads",
  1475. "shellQuote": false,
  1476. "position": 4
  1477. },
  1478. "label": "Native pairHMM threads",
  1479. "doc": "Number of threads a native pairHMM implementation should use."
  1480. },
  1481. {
  1482. "sbg:category": "Optional Arguments",
  1483. "sbg:toolDefaultValue": "false",
  1484. "id": "native_pair_hmm_use_double_precision",
  1485. "type": "boolean?",
  1486. "inputBinding": {
  1487. "prefix": "--native-pair-hmm-use-double-precision",
  1488. "shellQuote": false,
  1489. "position": 4
  1490. },
  1491. "label": "Native pairHMM use double precision",
  1492. "doc": "Use double precision in the native pairHMM. This is slower but matches the Java implementation better."
  1493. },
  1494. {
  1495. "sbg:category": "Advanced Arguments",
  1496. "sbg:toolDefaultValue": "1",
  1497. "id": "num_pruning_samples",
  1498. "type": "int?",
  1499. "inputBinding": {
  1500. "prefix": "--num-pruning-samples",
  1501. "shellQuote": false,
  1502. "position": 4
  1503. },
  1504. "label": "Num pruning samples",
  1505. "doc": "Number of samples that must pass the minPruning threshold."
  1506. },
  1507. {
  1508. "sbg:category": "Optional Arguments",
  1509. "sbg:toolDefaultValue": "0",
  1510. "id": "num_reference_samples_if_no_call",
  1511. "type": "int?",
  1512. "inputBinding": {
  1513. "prefix": "--num-reference-samples-if-no-call",
  1514. "shellQuote": false,
  1515. "position": 4
  1516. },
  1517. "label": "Num reference samples if no call",
  1518. "doc": "Number of hom-ref genotypes to infer at sites not present in a panel."
  1519. },
  1520. {
  1521. "sbg:category": "Config Inputs",
  1522. "id": "prefix",
  1523. "type": "string?",
  1524. "label": "Output name prefix",
  1525. "doc": "Output file name prefix of a file to which variants should be written."
  1526. },
  1527. {
  1528. "sbg:category": "Optional Arguments",
  1529. "sbg:toolDefaultValue": "EMIT_VARIANTS_ONLY",
  1530. "id": "output_mode",
  1531. "type": [
  1532. "null",
  1533. {
  1534. "type": "enum",
  1535. "symbols": [
  1536. "EMIT_VARIANTS_ONLY",
  1537. "EMIT_ALL_CONFIDENT_SITES",
  1538. "EMIT_ALL_SITES"
  1539. ],
  1540. "name": "output_mode"
  1541. }
  1542. ],
  1543. "inputBinding": {
  1544. "prefix": "--output-mode",
  1545. "shellQuote": false,
  1546. "position": 4
  1547. },
  1548. "label": "Output mode",
  1549. "doc": "Specifies which type of calls we should output."
  1550. },
  1551. {
  1552. "sbg:category": "Advanced Arguments",
  1553. "sbg:toolDefaultValue": "10",
  1554. "id": "pair_hmm_gap_continuation_penalty",
  1555. "type": "int?",
  1556. "inputBinding": {
  1557. "prefix": "--pair-hmm-gap-continuation-penalty",
  1558. "shellQuote": false,
  1559. "position": 4
  1560. },
  1561. "label": "Pair HMM gap continuation penalty",
  1562. "doc": "Flat gap continuation penalty for use in the pairHMM."
  1563. },
  1564. {
  1565. "sbg:altPrefix": "-pairHMM",
  1566. "sbg:category": "Advanced Arguments",
  1567. "sbg:toolDefaultValue": "FASTEST_AVAILABLE",
  1568. "id": "pair_hmm_implementation",
  1569. "type": [
  1570. "null",
  1571. {
  1572. "type": "enum",
  1573. "symbols": [
  1574. "EXACT",
  1575. "ORIGINAL",
  1576. "LOGLESS_CACHING",
  1577. "AVX_LOGLESS_CACHING",
  1578. "AVX_LOGLESS_CACHING_OMP",
  1579. "EXPERIMENTAL_FPGA_LOGLESS_CACHING",
  1580. "FASTEST_AVAILABLE"
  1581. ],
  1582. "name": "pair_hmm_implementation"
  1583. }
  1584. ],
  1585. "inputBinding": {
  1586. "prefix": "--pair-hmm-implementation",
  1587. "shellQuote": false,
  1588. "position": 4
  1589. },
  1590. "label": "Pair HMM implementation",
  1591. "doc": "The pairHMM implementation to use for genotype likelihood calculations."
  1592. },
  1593. {
  1594. "sbg:category": "Advanced Arguments",
  1595. "sbg:toolDefaultValue": "CONSERVATIVE",
  1596. "id": "pcr_indel_model",
  1597. "type": [
  1598. "null",
  1599. {
  1600. "type": "enum",
  1601. "symbols": [
  1602. "NONE",
  1603. "HOSTILE",
  1604. "AGGRESSIVE",
  1605. "CONSERVATIVE"
  1606. ],
  1607. "name": "pcr_indel_model"
  1608. }
  1609. ],
  1610. "inputBinding": {
  1611. "prefix": "--pcr-indel-model",
  1612. "shellQuote": false,
  1613. "position": 4
  1614. },
  1615. "label": "PCR indel model",
  1616. "doc": "The PCR indel model to use."
  1617. },
  1618. {
  1619. "sbg:altPrefix": "-ped",
  1620. "sbg:category": "Optional Arguments",
  1621. "sbg:toolDefaultValue": "null",
  1622. "id": "pedigree",
  1623. "type": "File?",
  1624. "inputBinding": {
  1625. "prefix": "--pedigree",
  1626. "shellQuote": false,
  1627. "position": 4
  1628. },
  1629. "label": "Pedigree",
  1630. "doc": "Pedigree file for determining the population \"founders\".",
  1631. "sbg:fileTypes": "PED"
  1632. },
  1633. {
  1634. "sbg:category": "Advanced Arguments",
  1635. "sbg:toolDefaultValue": "45",
  1636. "id": "phred_scaled_global_read_mismapping_rate",
  1637. "type": "int?",
  1638. "inputBinding": {
  1639. "prefix": "--phred-scaled-global-read-mismapping-rate",
  1640. "shellQuote": false,
  1641. "position": 4
  1642. },
  1643. "label": "Phred scaled global read mismapping rate",
  1644. "doc": "The global assumed mismapping rate for reads."
  1645. },
  1646. {
  1647. "sbg:altPrefix": "-population",
  1648. "sbg:category": "Optional Arguments",
  1649. "sbg:toolDefaultValue": "null",
  1650. "id": "population_callset",
  1651. "type": "File?",
  1652. "inputBinding": {
  1653. "prefix": "--population-callset",
  1654. "shellQuote": false,
  1655. "position": 4
  1656. },
  1657. "label": "Population callset",
  1658. "doc": "Callset to use in calculating genotype priors.",
  1659. "sbg:fileTypes": "VCF, VCF.GZ",
  1660. "secondaryFiles": [
  1661. "${\n if(self) {\n if (self.basename.slice(-4).toLowerCase() == \".vcf\") {\n return self.basename + \".idx\";\n }\n else if (self.basename.slice(-7).toLowerCase() == \".vcf.gz\") {\n return self.basename + \".tbi\";\n }\n else {\n return self.basename + \".idx\";\n }\n }\n else {\n return null;\n }\n}"
  1662. ]
  1663. },
  1664. {
  1665. "sbg:category": "Advanced Arguments",
  1666. "sbg:toolDefaultValue": "1.0",
  1667. "id": "pruning_lod_threshold",
  1668. "type": "float?",
  1669. "inputBinding": {
  1670. "prefix": "--pruning-lod-threshold",
  1671. "shellQuote": false,
  1672. "position": 4
  1673. },
  1674. "label": "Pruning LOD threshold",
  1675. "doc": "Log-10 likelihood ratio threshold for adaptive pruning algorithm."
  1676. },
  1677. {
  1678. "sbg:altPrefix": "-RF",
  1679. "sbg:category": "Optional Arguments",
  1680. "sbg:toolDefaultValue": "null",
  1681. "id": "read_filter",
  1682. "type": [
  1683. "null",
  1684. {
  1685. "type": "array",
  1686. "items": {
  1687. "type": "enum",
  1688. "name": "read_filter",
  1689. "symbols": [
  1690. "AlignmentAgreesWithHeaderReadFilter",
  1691. "AllowAllReadsReadFilter",
  1692. "AmbiguousBaseReadFilter",
  1693. "CigarContainsNoNOperator",
  1694. "FirstOfPairReadFilter",
  1695. "FragmentLengthReadFilter",
  1696. "GoodCigarReadFilter",
  1697. "HasReadGroupReadFilter",
  1698. "LibraryReadFilter",
  1699. "MappedReadFilter",
  1700. "MappingQualityAvailableReadFilter",
  1701. "MappingQualityNotZeroReadFilter",
  1702. "MappingQualityReadFilter",
  1703. "MatchingBasesAndQualsReadFilter",
  1704. "MateDifferentStrandReadFilter",
  1705. "MateOnSameContigOrNoMappedMateReadFilter",
  1706. "MetricsReadFilter",
  1707. "NonChimericOriginalAlignmentReadFilter",
  1708. "NonZeroFragmentLengthReadFilter",
  1709. "NonZeroReferenceLengthAlignmentReadFilter",
  1710. "NotDuplicateReadFilter",
  1711. "NotOpticalDuplicateReadFilter",
  1712. "NotSecondaryAlignmentReadFilter",
  1713. "NotSupplementaryAlignmentReadFilter",
  1714. "OverclippedReadFilter",
  1715. "PairedReadFilter",
  1716. "PassesVendorQualityCheckReadFilter",
  1717. "PlatformReadFilter",
  1718. "PlatformUnitReadFilter",
  1719. "PrimaryLineReadFilter",
  1720. "ProperlyPairedReadFilter",
  1721. "ReadGroupBlackListReadFilter",
  1722. "ReadGroupReadFilter",
  1723. "ReadLengthEqualsCigarLengthReadFilter",
  1724. "ReadLengthReadFilter",
  1725. "ReadNameReadFilter",
  1726. "ReadStrandFilter",
  1727. "SampleReadFilter",
  1728. "SecondOfPairReadFilter",
  1729. "SeqIsStoredReadFilter",
  1730. "ValidAlignmentEndReadFilter",
  1731. "ValidAlignmentStartReadFilter",
  1732. "WellformedReadFilter"
  1733. ]
  1734. }
  1735. }
  1736. ],
  1737. "inputBinding": {
  1738. "shellQuote": false,
  1739. "position": 4,
  1740. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--read-filter', self[i]);\n }\n return cmd.join(' ');\n }\n \n}"
  1741. },
  1742. "label": "Read filter",
  1743. "doc": "Read filters to be applied before analysis. This argument may be specified 0 or more times."
  1744. },
  1745. {
  1746. "sbg:altPrefix": "-VS",
  1747. "sbg:category": "Optional Arguments",
  1748. "sbg:toolDefaultValue": "SILENT",
  1749. "id": "read_validation_stringency",
  1750. "type": [
  1751. "null",
  1752. {
  1753. "type": "enum",
  1754. "symbols": [
  1755. "STRICT",
  1756. "LENIENT",
  1757. "SILENT"
  1758. ],
  1759. "name": "read_validation_stringency"
  1760. }
  1761. ],
  1762. "inputBinding": {
  1763. "prefix": "--read-validation-stringency",
  1764. "shellQuote": false,
  1765. "position": 4
  1766. },
  1767. "label": "Read validation stringency",
  1768. "doc": "Validation stringency for all SAM/BAM/CRAM/SRA files read by this program. The default stringency value SILENT can improve performance when processing a BAM file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded."
  1769. },
  1770. {
  1771. "sbg:altPrefix": "-R",
  1772. "sbg:category": "Required Arguments",
  1773. "sbg:toolDefaultValue": "FASTA, FA",
  1774. "id": "in_reference",
  1775. "type": "File",
  1776. "inputBinding": {
  1777. "prefix": "--reference",
  1778. "shellQuote": false,
  1779. "position": 4
  1780. },
  1781. "label": "Reference",
  1782. "doc": "Reference sequence file.",
  1783. "sbg:fileTypes": "FASTA, FA",
  1784. "secondaryFiles": [
  1785. ".fai",
  1786. "^.dict"
  1787. ]
  1788. },
  1789. {
  1790. "sbg:altPrefix": "-ALIAS",
  1791. "sbg:category": "Optional Arguments",
  1792. "sbg:toolDefaultValue": "null",
  1793. "id": "sample_name",
  1794. "type": "string?",
  1795. "inputBinding": {
  1796. "prefix": "--sample-name",
  1797. "shellQuote": false,
  1798. "position": 4
  1799. },
  1800. "label": "Sample name",
  1801. "doc": "Name of single sample to use from a multi-sample BAM."
  1802. },
  1803. {
  1804. "sbg:altPrefix": "-ploidy",
  1805. "sbg:category": "Optional Arguments",
  1806. "sbg:toolDefaultValue": "2",
  1807. "id": "sample_ploidy",
  1808. "type": "int?",
  1809. "inputBinding": {
  1810. "prefix": "--sample-ploidy",
  1811. "shellQuote": false,
  1812. "position": 4
  1813. },
  1814. "label": "Sample ploidy",
  1815. "doc": "Ploidy (number of chromosomes) per sample. For pooled data, set to (number of samples in each pool x Sample Ploidy)."
  1816. },
  1817. {
  1818. "sbg:altPrefix": "-sequence-dictionary",
  1819. "sbg:category": "Optional Arguments",
  1820. "id": "sequence_dictionary",
  1821. "type": "File?",
  1822. "inputBinding": {
  1823. "prefix": "--sequence-dictionary",
  1824. "shellQuote": false,
  1825. "position": 4
  1826. },
  1827. "label": "Sequence dictionary",
  1828. "doc": "Use the given sequence dictionary as the master/canonical sequence dictionary. Must be a .dict file.",
  1829. "sbg:fileTypes": "DICT"
  1830. },
  1831. {
  1832. "sbg:category": "Optional Arguments",
  1833. "sbg:toolDefaultValue": "false",
  1834. "id": "sites_only_vcf_output",
  1835. "type": "boolean?",
  1836. "inputBinding": {
  1837. "prefix": "--sites-only-vcf-output",
  1838. "shellQuote": false,
  1839. "position": 4
  1840. },
  1841. "label": "Sites only VCF output",
  1842. "doc": "If true, don't emit genotype fields when writing VCF file output."
  1843. },
  1844. {
  1845. "sbg:category": "Advanced Arguments",
  1846. "sbg:toolDefaultValue": "JAVA",
  1847. "id": "smith_waterman",
  1848. "type": [
  1849. "null",
  1850. {
  1851. "type": "enum",
  1852. "symbols": [
  1853. "FASTEST_AVAILABLE",
  1854. "AVX_ENABLED",
  1855. "JAVA"
  1856. ],
  1857. "name": "smith_waterman"
  1858. }
  1859. ],
  1860. "inputBinding": {
  1861. "prefix": "--smith-waterman",
  1862. "shellQuote": false,
  1863. "position": 4
  1864. },
  1865. "label": "Smith waterman",
  1866. "doc": "Which Smith-Waterman implementation to use; generally FASTEST_AVAILABLE is the right choice."
  1867. },
  1868. {
  1869. "sbg:altPrefix": "-stand-call-conf",
  1870. "sbg:category": "Optional Arguments",
  1871. "sbg:toolDefaultValue": "30.0",
  1872. "id": "standard_min_confidence_threshold_for_calling",
  1873. "type": "float?",
  1874. "inputBinding": {
  1875. "prefix": "--standard-min-confidence-threshold-for-calling",
  1876. "shellQuote": false,
  1877. "position": 4
  1878. },
  1879. "label": "Standard min confidence threshold for calling",
  1880. "doc": "The minimum phred-scaled confidence threshold at which variants should be called. When HaplotypeCaller is used in GVCF mode (using either -ERC GVCF or -ERC BP_RESOLUTION) the call threshold is automatically set to zero. Call confidence thresholding will then be performed in the subsequent GenotypeGVCFs command."
  1881. },
  1882. {
  1883. "sbg:category": "Advanced Arguments",
  1884. "sbg:toolDefaultValue": "false",
  1885. "id": "use_alleles_trigger",
  1886. "type": "boolean?",
  1887. "inputBinding": {
  1888. "prefix": "--use-alleles-trigger",
  1889. "shellQuote": false,
  1890. "position": 4
  1891. },
  1892. "label": "Use alleles trigger",
  1893. "doc": "Use additional trigger on variants found in an external alleles file."
  1894. },
  1895. {
  1896. "sbg:category": "Advanced Arguments",
  1897. "sbg:toolDefaultValue": "false",
  1898. "id": "use_filtered_reads_for_annotations",
  1899. "type": "boolean?",
  1900. "inputBinding": {
  1901. "prefix": "--use-filtered-reads-for-annotations",
  1902. "shellQuote": false,
  1903. "position": 4
  1904. },
  1905. "label": "Use filtered reads for annotations",
  1906. "doc": "Use the contamination-filtered read maps for the purposes of annotating variants."
  1907. },
  1908. {
  1909. "sbg:altPrefix": "-old-qual",
  1910. "sbg:category": "Optional Arguments",
  1911. "sbg:toolDefaultValue": "false",
  1912. "id": "use_old_qual_calculator",
  1913. "type": "boolean?",
  1914. "inputBinding": {
  1915. "prefix": "--use-old-qual-calculator",
  1916. "shellQuote": false,
  1917. "position": 4
  1918. },
  1919. "label": "Use old qual calculator",
  1920. "doc": "Use the old AF model."
  1921. },
  1922. {
  1923. "sbg:altPrefix": "-XL",
  1924. "sbg:category": "Optional Arguments",
  1925. "id": "exclude_intervals_file",
  1926. "type": "File?",
  1927. "inputBinding": {
  1928. "prefix": "--exclude-intervals",
  1929. "shellQuote": false,
  1930. "position": 4
  1931. },
  1932. "label": "Exclude intervals file",
  1933. "doc": "One or more genomic intervals to exclude from processing.",
  1934. "sbg:fileTypes": "INTERVAL_LIST, LIST, BED"
  1935. },
  1936. {
  1937. "sbg:category": "Platform Options",
  1938. "sbg:toolDefaultValue": "1",
  1939. "id": "cpu_per_job",
  1940. "type": "int?",
  1941. "label": "CPU per job",
  1942. "doc": "Number of CPUs to be used per job."
  1943. },
  1944. {
  1945. "sbg:category": "Conditional Arguments for readFilter",
  1946. "sbg:toolDefaultValue": "null",
  1947. "id": "ambig_filter_bases",
  1948. "type": "int?",
  1949. "inputBinding": {
  1950. "prefix": "--ambig-filter-bases",
  1951. "shellQuote": false,
  1952. "position": 5
  1953. },
  1954. "label": "Ambig filter bases",
  1955. "doc": "Valid only if \"AmbiguousBaseReadFilter\" is specified:\nThreshold number of ambiguous bases. If null, uses threshold fraction; otherwise, overrides threshold fraction. Cannot be used in conjunction with argument(s) ambig-filter-frac."
  1956. },
  1957. {
  1958. "sbg:category": "Conditional Arguments for readFilter",
  1959. "sbg:toolDefaultValue": "0.05",
  1960. "id": "ambig_filter_frac",
  1961. "type": "float?",
  1962. "inputBinding": {
  1963. "prefix": "--ambig-filter-frac",
  1964. "shellQuote": false,
  1965. "position": 5
  1966. },
  1967. "label": "Ambig filter frac",
  1968. "doc": "Valid only if \"AmbiguousBaseReadFilter\" is specified:\nThreshold fraction of ambiguous bases. Cannot be used in conjunction with argument(s) ambig-filter-bases."
  1969. },
  1970. {
  1971. "sbg:category": "Conditional Arguments for readFilter",
  1972. "sbg:toolDefaultValue": "1000000",
  1973. "id": "max_fragment_length",
  1974. "type": "int?",
  1975. "inputBinding": {
  1976. "prefix": "--max-fragment-length",
  1977. "shellQuote": false,
  1978. "position": 5
  1979. },
  1980. "label": "Max fragment length",
  1981. "doc": "Valid only if \"FragmentLengthReadFilter\" is specified:\nMaximum length of fragment (insert size)."
  1982. },
  1983. {
  1984. "sbg:category": "Conditional Arguments for readFilter",
  1985. "sbg:altPrefix": "-library",
  1986. "id": "library",
  1987. "type": "string[]?",
  1988. "inputBinding": {
  1989. "shellQuote": false,
  1990. "position": 5,
  1991. "valueFrom": "${\n if (inputs.library)\n {\n var lib = [].concat(inputs.library);\n var cmd = [];\n for (var i = 0; i < lib.length; i++) \n {\n cmd.push('--library', lib[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  1992. },
  1993. "label": "Library",
  1994. "doc": "Valid only if \"LibraryReadFilter\" is specified:\nName of the library to keep. This argument must be specified at least once. Required."
  1995. },
  1996. {
  1997. "sbg:category": "Conditional Arguments for readFilter",
  1998. "sbg:toolDefaultValue": "false",
  1999. "id": "dont_require_soft_clips_both_ends",
  2000. "type": "boolean?",
  2001. "inputBinding": {
  2002. "prefix": "--dont-require-soft-clips-both-ends",
  2003. "shellQuote": false,
  2004. "position": 5
  2005. },
  2006. "label": "Do not require soft clips",
  2007. "doc": "Valid only if \"OverclippedReadFilter\" is specified:\nAllow a read to be filtered out based on having only 1 soft-clipped block. By default, both ends must have a soft-clipped block; setting this flag requires only 1 soft-clipped block."
  2008. },
  2009. {
  2010. "sbg:category": "Conditional Arguments for readFilter",
  2011. "sbg:toolDefaultValue": "30",
  2012. "id": "filter_too_short",
  2013. "type": "int?",
  2014. "inputBinding": {
  2015. "prefix": "--filter-too-short",
  2016. "shellQuote": false,
  2017. "position": 5
  2018. },
  2019. "label": "Filter too short",
  2020. "doc": "Valid only if \"OverclippedReadFilter\" is specified:\nMinimum number of aligned bases."
  2021. },
  2022. {
  2023. "sbg:category": "Conditional Arguments for readFilter",
  2024. "id": "platform_filter_name",
  2025. "type": "string[]?",
  2026. "inputBinding": {
  2027. "shellQuote": false,
  2028. "position": 5,
  2029. "valueFrom": "${\n if (inputs.platform_filter_name)\n {\n var pfn = [].concat(inputs.platform_filter_name);\n var cmd = [];\n for (var i = 0; i < pfn.length; i++) \n {\n cmd.push('--platform-filter-name', pfn[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  2030. },
  2031. "label": "Platform filter name",
  2032. "doc": "Valid only if \"PlatformReadFilter\" is specified:\nPlatform attribute (PL) to match. This argument must be specified at least once. Required."
  2033. },
  2034. {
  2035. "sbg:category": "Conditional Arguments for readFilter",
  2036. "id": "black_listed_lanes",
  2037. "type": "string[]?",
  2038. "inputBinding": {
  2039. "shellQuote": false,
  2040. "position": 5,
  2041. "valueFrom": "${\n if (inputs.black_listed_lanes)\n {\n var bl_lanes = [].concat(inputs.black_listed_lanes);\n var cmd = [];\n for (var i = 0; i < bl_lanes.length; i++) \n {\n cmd.push('--black-listed-lanes', bl_lanes[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  2042. },
  2043. "label": "Black listed lanes",
  2044. "doc": "Valid only if \"PlatformUnitReadFilter\" is specified:\nPlatform unit (PU) to filter out. This argument must be specified at least once. Required."
  2045. },
  2046. {
  2047. "sbg:category": "Conditional Arguments for readFilter",
  2048. "id": "read_group_black_list",
  2049. "type": "string[]?",
  2050. "inputBinding": {
  2051. "shellQuote": false,
  2052. "position": 5,
  2053. "valueFrom": "${\n if (inputs.read_group_black_list)\n {\n var rgbl = [].concat(inputs.read_group_black_list);\n var cmd = [];\n for (var i = 0; i < rgbl.length; i++) \n {\n cmd.push('--read-group-black-list', rgbl[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  2054. },
  2055. "label": "Read group black list",
  2056. "doc": "Valid only if \"ReadGroupBlackListReadFilter\" is specified:\nThe name of the read group to filter out. This argument must be specified at least once. Required."
  2057. },
  2058. {
  2059. "sbg:category": "Conditional Arguments for readFilter",
  2060. "id": "keep_read_group",
  2061. "type": "string?",
  2062. "inputBinding": {
  2063. "prefix": "--keep-read-group",
  2064. "shellQuote": false,
  2065. "position": 5
  2066. },
  2067. "label": "Keep read group",
  2068. "doc": "Valid only if \"ReadGroupReadFilter\" is specified:\nThe name of the read group to keep. Required."
  2069. },
  2070. {
  2071. "sbg:category": "Conditional Arguments for readFilter",
  2072. "id": "max_read_length",
  2073. "type": "int?",
  2074. "inputBinding": {
  2075. "prefix": "--max-read-length",
  2076. "shellQuote": false,
  2077. "position": 5
  2078. },
  2079. "label": "Max read length",
  2080. "doc": "Valid only if \"ReadLengthReadFilter\" is specified:\nKeep only reads with length at most equal to the specified value. Required."
  2081. },
  2082. {
  2083. "sbg:category": "Conditional Arguments for readFilter",
  2084. "sbg:toolDefaultValue": "1",
  2085. "id": "min_read_length",
  2086. "type": "int?",
  2087. "inputBinding": {
  2088. "prefix": "--min-read-length",
  2089. "shellQuote": false,
  2090. "position": 5
  2091. },
  2092. "label": "Min read length",
  2093. "doc": "Valid only if \"ReadLengthReadFilter\" is specified:\nKeep only reads with length at least equal to the specified value."
  2094. },
  2095. {
  2096. "sbg:category": "Conditional Arguments for readFilter",
  2097. "id": "read_name",
  2098. "type": "string?",
  2099. "inputBinding": {
  2100. "prefix": "--read-name",
  2101. "shellQuote": false,
  2102. "position": 5
  2103. },
  2104. "label": "Read name",
  2105. "doc": "Valid only if \"ReadNameReadFilter\" is specified:\nKeep only reads with this read name. Required."
  2106. },
  2107. {
  2108. "sbg:category": "Conditional Arguments for readFilter",
  2109. "id": "keep_reverse_strand_only",
  2110. "type": [
  2111. "null",
  2112. {
  2113. "type": "enum",
  2114. "symbols": [
  2115. "true",
  2116. "false"
  2117. ],
  2118. "name": "keep_reverse_strand_only"
  2119. }
  2120. ],
  2121. "inputBinding": {
  2122. "prefix": "--keep-reverse-strand-only",
  2123. "shellQuote": false,
  2124. "position": 5
  2125. },
  2126. "label": "Keep reverse strand only",
  2127. "doc": "Valid only if \"ReadStrandFilter\" is specified:\nKeep only reads on the reverse strand. Required."
  2128. },
  2129. {
  2130. "sbg:category": "Conditional Arguments for readFilter",
  2131. "id": "sample",
  2132. "type": "string[]?",
  2133. "inputBinding": {
  2134. "shellQuote": false,
  2135. "position": 5,
  2136. "valueFrom": "${\n if (inputs.sample)\n {\n var samp = [].concat(inputs.sample);\n var cmd = [];\n for (var i = 0; i < samp.length; i++) \n {\n cmd.push('--sample', samp[i]);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  2137. },
  2138. "label": "Sample",
  2139. "doc": "Valid only if \"SampleReadFilter\" is specified:\nThe name of the sample(s) to keep, filtering out all others. This argument must be specified at least once. Required."
  2140. },
  2141. {
  2142. "sbg:category": "Config Inputs",
  2143. "sbg:toolDefaultValue": "vcf.gz",
  2144. "id": "output_extension",
  2145. "type": [
  2146. "null",
  2147. {
  2148. "type": "enum",
  2149. "symbols": [
  2150. "vcf",
  2151. "vcf.gz"
  2152. ],
  2153. "name": "output_extension"
  2154. }
  2155. ],
  2156. "label": "Output VCF extension",
  2157. "doc": "Output VCF extension.",
  2158. "default": "vcf.gz"
  2159. },
  2160. {
  2161. "sbg:toolDefaultValue": "10.00",
  2162. "sbg:category": "Optional Arguments",
  2163. "sbg:altPrefix": "-seconds-between-progress-updates",
  2164. "id": "seconds_between_progress_updates",
  2165. "type": "float?",
  2166. "inputBinding": {
  2167. "prefix": "--seconds-between-progress-updates",
  2168. "shellQuote": false,
  2169. "position": 4
  2170. },
  2171. "label": "Seconds between progress updates",
  2172. "doc": "Output traversal statistics every time this many seconds elapse."
  2173. },
  2174. {
  2175. "sbg:category": "Optional Arguments",
  2176. "sbg:altPrefix": "-read-index",
  2177. "id": "read_index",
  2178. "type": "File[]?",
  2179. "inputBinding": {
  2180. "shellQuote": false,
  2181. "position": 4,
  2182. "valueFrom": "${\n if (inputs.read_index)\n {\n var r_index = [].concat(inputs.read_index);\n var cmd = [];\n for (var i = 0; i < r_index.length; i++) \n {\n cmd.push('--read-index', r_index[i].path);\n }\n return cmd.join(' ');\n }\n return '';\n}"
  2183. },
  2184. "label": "Read index",
  2185. "doc": "Indices to use for the read inputs. If specified, an index must be provided for every read input and in the same order as the read inputs. If this argument is not specified, the path to the index for each input will be inferred automatically.",
  2186. "sbg:fileTypes": "BAI, CRAI"
  2187. }
  2188. ],
  2189. "outputs": [
  2190. {
  2191. "id": "out_variants",
  2192. "doc": "A raw, unfiltered, highly sensitive callset in VCF format.",
  2193. "label": "VCF output",
  2194. "type": "File?",
  2195. "outputBinding": {
  2196. "glob": "${\n var output_ext = inputs.output_extension ? inputs.output_extension : \"vcf.gz\";\n return \"*.\" + output_ext;\n}",
  2197. "outputEval": "$(inheritMetadata(self, inputs.in_alignments))"
  2198. },
  2199. "secondaryFiles": [
  2200. "${\n var output_ext = inputs.output_extension ? inputs.output_extension : \"vcf.gz\";\n if (output_ext == \"vcf\") {\n return self.basename + \".idx\";\n }\n else if (output_ext == \"vcf.gz\") {\n return self.basename + \".tbi\";\n }\n else {\n return null;\n }\n}"
  2201. ],
  2202. "sbg:fileTypes": "VCF, VCF.GZ"
  2203. },
  2204. {
  2205. "id": "out_alignments",
  2206. "doc": "Assembled haplotypes.",
  2207. "label": "BAM output",
  2208. "type": "File?",
  2209. "outputBinding": {
  2210. "glob": "${\n if(inputs.bam_output) {\n var tmp = inputs.bam_output.slice(-4).toLowerCase();\n if(tmp == \".bam\") {\n return inputs.bam_output;\n }\n else {\n return inputs.bam_output + '.bam';\n }\n }\n else {\n return null;\n }\n}",
  2211. "outputEval": "$(inheritMetadata(self, inputs.in_alignments))"
  2212. },
  2213. "secondaryFiles": [
  2214. "${\n\n if (self.nameext == '.bam' || self.nameext == '.BAM')\n {\n return self.nameroot + \".bai\";\n }\n return '';\n}"
  2215. ],
  2216. "sbg:fileTypes": "BAM"
  2217. },
  2218. {
  2219. "id": "out_graph",
  2220. "doc": "Assembly graph information.",
  2221. "label": "Graph output",
  2222. "type": "File?",
  2223. "outputBinding": {
  2224. "glob": "${\n if(inputs.graph_output) {\n return inputs.graph_output + '*';\n }\n else {\n return null;\n }\n}",
  2225. "outputEval": "$(inheritMetadata(self, inputs.in_alignments))"
  2226. },
  2227. "sbg:fileTypes": "TXT"
  2228. },
  2229. {
  2230. "id": "out_activity_profile",
  2231. "doc": "Output the raw activity profile results in IGV format.",
  2232. "label": "Raw activity profile",
  2233. "type": "File?",
  2234. "outputBinding": {
  2235. "glob": "${\n if(inputs.activity_profile_out) {\n return inputs.activity_profile_out + '*';\n }\n else {\n return null;\n }\n}",
  2236. "outputEval": "$(inheritMetadata(self, inputs.in_alignments))"
  2237. },
  2238. "sbg:fileTypes": "IGV"
  2239. },
  2240. {
  2241. "id": "out_assembly_region",
  2242. "doc": "Output the assembly region to this IGV formatted file.",
  2243. "label": "Assembly region",
  2244. "type": "File?",
  2245. "outputBinding": {
  2246. "glob": "${\n if(inputs.assembly_region_out) {\n return inputs.assembly_region_out + '*';\n }\n else {\n return null;\n }\n}",
  2247. "outputEval": "$(inheritMetadata(self, inputs.in_alignments))"
  2248. },
  2249. "sbg:fileTypes": "IGV"
  2250. }
  2251. ],
  2252. "doc": "**GATK HaplotypeCaller** calls germline SNPs and indels from input BAM file(s) via local re-assembly of haplotypes [1].\n\n**GATK HaplotypeCaller** is capable of calling SNPs and indels simultaneously via local de-novo assembly of haplotypes in an active region. In other words, whenever the program encounters a region showing signs of variation, it discards the existing mapping information and completely reassembles the reads in that region. Reassembled reads are realigned to the reference. This allows **GATK HaplotypeCaller** to be more accurate when calling regions that are traditionally difficult to call, for example when they contain different types of variants close to each other. It also makes **GATK HaplotypeCaller** much better at calling indels than position-based callers like UnifiedGenotyper [1].\n\nIn the GVCF workflow used for scalable variant calling in DNA sequence data, **GATK HaplotypeCaller** runs per-sample to generate an intermediate GVCF (not to be used in final analysis), which can then be used in GenotypeGVCFs for joint genotyping of multiple samples in a very efficient way. The GVCF workflow enables rapid incremental processing of samples as they roll off the sequencer, as well as scaling to very large cohort sizes [1].\n\nIn addition, **HaplotypeCaller** is able to handle non-diploid organisms as well as pooled experiment data. Note however that the algorithms used to calculate variant likelihoods are not well suited to extreme allele frequencies (relative to ploidy) so its use is not recommended for somatic (cancer) variant discovery. 
For that purpose, use **Mutect2** instead [1].\n\nFinally, **GATK HaplotypeCaller** is also able to correctly handle splice junctions that make RNAseq a challenge for most variant callers, on the condition that the input read data has previously been processed according to [GATK RNAseq short variant discovery (SNPs + Indels)](https://gatk.broadinstitute.org/hc/en-us/articles/360035531192?id=4067) [1].\n\n*A list of **all inputs and parameters** with corresponding descriptions can be found at the bottom of this page.*\n\n### Common Use Cases\n\n- Call variants individually on each sample in GVCF mode\n\n```\n gatk --java-options \"-Xmx4g\" HaplotypeCaller \\\n -R Homo_sapiens_assembly38.fasta \\\n -I input.bam \\\n -O output.g.vcf.gz \\\n -ERC GVCF\n```\n\n\n- Call variants individually on each sample in GVCF mode with allele-specific annotations. [Here](https://software.broadinstitute.org/gatk/documentation/article?id=9622) you can read more details about allele-specific annotation and filtering.\n\n```\ngatk --java-options \"-Xmx4g\" HaplotypeCaller \\\n -R Homo_sapiens_assembly38.fasta \\\n -I input.bam \\\n -O output.g.vcf.gz \\\n -ERC GVCF \\\n -G Standard \\\n -G AS_Standard\n```\n\n\n- Call variants with bamout to show realigned reads. After performing a local reassembly and realignment, the reads' mapping positions are different than in the original file. 
This option could be used to visualize what rearrangements **HaplotypeCaller** has made.\n\n```\n gatk --java-options \"-Xmx4g\" HaplotypeCaller \\\n -R Homo_sapiens_assembly38.fasta \\\n -I input.bam \\\n -O output.vcf.gz \\\n -bamout bamout.bam\n```\n\n### Changes Introduced by Seven Bridges\n\n- **Include intervals** (`--intervals`) option is divided into **Include intervals string** and **Include intervals file** options.\n- **Exclude intervals** (`--exclude-intervals`) option is divided into **Exclude intervals string** and **Exclude intervals file** options.\n- **VCF output** will be prefixed using the **Output name prefix** parameter. If this value is not set, the output name will be generated based on **Sample ID** metadata value from the **Input alignments** file. If **Sample ID** value is not set, the name will be inherited from the **Input alignments** file name. In case there are multiple files on the **Input alignments** input, the files will be sorted by name and output file name will be generated based on the first file in the sorted file list, following the rules defined in the previous case.\n- The user can specify the output file format using the **Output VCF extension** argument. Otherwise, the output will be in the compressed VCF file format.\n- The following parameters were excluded from the tool wrapper: `--arguments_file`, `--cloud-index-prefetch-buffer`, `--cloud-prefetch-buffer`, `--create-output-bam-md5`, `--create-output-variant-md5`, `--gatk-config-file`, `--gcs-max-retries`, `--gcs-project-for-requester-pays`, `--help`, `--QUIET`, `--recover-dangling-heads` (deprecated), `--showHidden`, `--tmp-dir`, `--use-jdk-deflater`, `--use-jdk-inflater`, `--use-new-qual-calculator` (deprecated), `--verbosity`, `--version`\n\n### Common Issues and Important Notes\n\n- **Memory per job** (`mem_per_job`) input allows a user to set the desired memory requirement when running a tool or adding it to a workflow. This input should be defined in MB. 
It is propagated to the Memory requirements part and “-Xmx” parameter of the tool. The default value is 4000MB.\n- **Memory overhead per job** (`mem_overhead_per_job`) input allows a user to set the desired overhead memory when running a tool or adding it to a workflow. This input should be defined in MB. This amount will be added to the Memory per job in the Memory requirements section but it will not be added to the “-Xmx” parameter. The default value is 100MB. \n- Note: GATK tools that take in mapped read data expect a BAM file as the primary format [2]. More on GATK requirements for mapped sequence data formats can be found [here](https://gatk.broadinstitute.org/hc/en-us/articles/360035890791-SAM-or-BAM-or-CRAM-Mapped-sequence-data-formats).\n- Note: **Alleles**, **Comparison VCF**, **dbSNP**, **Input alignments**, **Population callset** should have corresponding index files in the same folder. \n- Note: **Reference** FASTA file should have corresponding .fai (FASTA index) and .dict (FASTA dictionary) files in the same folder. \n- Note: When working with PCR-free data, be sure to set **PCR indel model** (`--pcr-indel-model`) to NONE [1].\n- Note: When running **Emit ref confidence** (`--emit-ref-confidence`) in GVCF or in BP_RESOLUTION mode, the confidence threshold is automatically set to 0. This cannot be overridden by the command line. The threshold can be set manually to the desired level when using **GenotypeGVCFs** [1].\n- Note: It is recommended to use a list of intervals to speed up the analysis. See [this document](https://software.broadinstitute.org/gatk/documentation/article?id=4133) for details [1].\n- Note: **HaplotypeCaller** is able to handle many non-diploid use cases; the desired ploidy can be specified using the `-ploidy` argument. 
Note however that very high ploidies (such as are encountered in large pooled experiments) may cause performance challenges including excessive slowness [1].\n- Note: These **Read Filters** (`--read-filter`) are automatically applied to the data by the Engine before processing by **HaplotypeCaller** [1]: **NotSecondaryAlignmentReadFilter**, **GoodCigarReadFilter**, **NonZeroReferenceLengthAlignmentReadFilter**, **PassesVendorQualityCheckReadFilter**, **MappedReadFilter**, **MappingQualityAvailableReadFilter**, **NotDuplicateReadFilter**, **MappingQualityReadFilter**, **WellformedReadFilter**\n- If the **Read filter** (`--read-filter`) option is set to \"LibraryReadFilter\", the **Library** (`--library`) option must be set to some value.\n- If the **Read filter** (`--read-filter`) option is set to \"PlatformReadFilter\", the **Platform filter name** (`--platform-filter-name`) option must be set to some value.\n- If the **Read filter** (`--read-filter`) option is set to \"PlatformUnitReadFilter\", the **Black listed lanes** (`--black-listed-lanes`) option must be set to some value. 
\n- If the **Read filter** (`--read-filter`) option is set to \"ReadGroupBlackListReadFilter\", the **Read group black list** (`--read-group-black-list`) option must be set to some value.\n- If the **Read filter** (`--read-filter`) option is set to \"ReadGroupReadFilter\", the **Keep read group** (`--keep-read-group`) option must be set to some value.\n- If the **Read filter** (`--read-filter`) option is set to \"ReadLengthReadFilter\", the **Max read length** (`--max-read-length`) option must be set to some value.\n- If the **Read filter** (`--read-filter`) option is set to \"ReadNameReadFilter\", the **Read name** (`--read-name`) option must be set to some value.\n- If the **Read filter** (`--read-filter`) option is set to \"ReadStrandFilter\", the **Keep reverse strand only** (`--keep-reverse-strand-only`) option must be set to some value.\n- If the **Read filter** (`--read-filter`) option is set to \"SampleReadFilter\", the **Sample** (`--sample`) option must be set to some value.\n- The following options are valid only if an appropriate **Read filter** (`--read-filter`) is specified: **Ambig filter bases** (`--ambig-filter-bases`), **Ambig filter frac** (`--ambig-filter-frac`), **Max fragment length** (`--max-fragment-length`), **Maximum mapping quality** (`--maximum-mapping-quality`), **Minimum mapping quality** (`--minimum-mapping-quality`), **Do not require soft clips** (`--dont-require-soft-clips-both-ends`), **Filter too short** (`--filter-too-short`), **Min read length** (`--min-read-length`). 
See the description of each parameter for information on the associated **Read filter**.\n- Note: Allele-specific annotations are not yet supported in the VCF mode.\n- Note: The wrapper has not been tested for the SAM file type on the **Input alignments** input port.\n\n### Performance Benchmarking\n\nBelow is a table describing the runtimes and task costs for several samples with different file sizes.\n\n| Experiment type | Input size | Paired-end | # of reads | Read length | Duration | Cost (on-demand) | AWS instance type |\n|:--------------:|:------------:|:--------:|:-------:|:---------:|:----------:|:------:|:------:|\n| RNA-Seq | 2.6 GB | Yes | 16M | 101 | 50min | $0.45 | c4.2xlarge |\n| RNA-Seq | 7.7 GB | Yes | 50M | 101 | 1h31min | $0.82 | c4.2xlarge |\n| RNA-Seq | 12.7 GB | Yes | 82M | 101 | 2h19min | $1.26 | c4.2xlarge |\n| RNA-Seq | 25 GB | Yes | 164M | 101 | 4h5min | $2.21 | c4.2xlarge |\n\n*Cost can be significantly reduced by using **spot instances**. Visit the [Knowledge Center](https://docs.sevenbridges.com/docs/about-spot-instances) for more details.*\n\n### References\n[1] [GATK HaplotypeCaller](https://gatk.broadinstitute.org/hc/en-us/articles/360036359552-HaplotypeCaller)\n\n[2] [GATK Mapped sequence data formats](https://gatk.broadinstitute.org/hc/en-us/articles/360035890791-SAM-or-BAM-or-CRAM-Mapped-sequence-data-formats)",
  2253. "label": "GATK HaplotypeCaller CWL1.0",
  2254. "arguments": [
  2255. {
  2256. "prefix": "",
  2257. "shellQuote": false,
  2258. "position": 1,
  2259. "valueFrom": "${\n if (inputs.mem_per_job) {\n return '\\\"-Xmx'.concat(inputs.mem_per_job, 'M') + '\\\"';\n } else {\n return '\\\"-Xmx4000M\\\"';\n }\n}"
  2260. },
  2261. {
  2262. "prefix": "",
  2263. "shellQuote": false,
  2264. "position": 2,
  2265. "valueFrom": "HaplotypeCaller"
  2266. },
  2267. {
  2268. "prefix": "--output",
  2269. "shellQuote": false,
  2270. "position": 3,
  2271. "valueFrom": "${\n //sort list of input files by nameroot\n function sortNameroot(x, y) {\n if (x.nameroot < y.nameroot) {\n return -1;\n }\n if (x.nameroot > y.nameroot) {\n return 1;\n }\n return 0;\n }\n \n var output_prefix;\n var output_ext;\n var in_num = [].concat(inputs.in_alignments).length;\n var tmp_ext = inputs.output_extension ? inputs.output_extension : \"vcf.gz\";\n var in_align = [].concat(inputs.in_alignments);\n \n if(inputs.emit_ref_confidence == 'GVCF' || inputs.emit_ref_confidence == 'BP_RESOLUTION'){\n output_ext = '.g.' + tmp_ext;\n } \n else {\n output_ext = '.' + tmp_ext;\n }\n \n //if input_prefix is provided by the user\n if (inputs.prefix) {\n output_prefix = inputs.prefix;\n if (in_num > 1) {\n output_prefix = output_prefix + '.' + in_num;\n }\n }\n else {\n //if there is only one input file\n if(in_num == 1){\n // check if the sample_id metadata value is defined for the input file\n if(in_align[0].metadata && in_align[0].metadata.sample_id) {\n output_prefix = in_align[0].metadata.sample_id;\n // if sample_id is not defined\n } else {\n output_prefix = in_align[0].path.split('/').pop().split('.')[0];\n }\n }\n //if there are more than 1 input files\n //sort list of input file objects alphabetically by file name \n //take the first element from that list, and generate output file name as if that file is the only file on the input. \n else if(in_num > 1) {\n //sort list of input files by nameroot\n in_align.sort(sortNameroot);\n //take the first alphabetically sorted file\n var first_file = in_align[0];\n //check if the sample_id metadata value is defined for the input file\n if(first_file.metadata && first_file.metadata.sample_id) {\n output_prefix = first_file.metadata.sample_id + '.' + in_num;\n // if sample_id is not defined\n } else {\n output_prefix = first_file.path.split('/').pop().split('.')[0] + '.' + in_num;\n }\n }\n }\n var output_full = output_prefix + output_ext;\n return output_full;\n}"
  2272. }
  2273. ],
  2274. "requirements": [
  2275. {
  2276. "class": "ShellCommandRequirement"
  2277. },
  2278. {
  2279. "class": "ResourceRequirement",
  2280. "ramMin": "${\n var memory = 4000;\n \n if(inputs.mem_per_job) {\n \t memory = inputs.mem_per_job;\n }\n if(inputs.mem_overhead_per_job) {\n\tmemory += inputs.mem_overhead_per_job;\n }\n else {\n memory += 100;\n }\n return memory;\n}",
  2281. "coresMin": "${\n return inputs.cpu_per_job ? inputs.cpu_per_job : 1;\n}"
  2282. },
  2283. {
  2284. "class": "DockerRequirement",
  2285. "dockerPull": "images.sbgenomics.com/marijeta_slavkovic/gatk-4-1-0-0:0"
  2286. },
  2287. {
  2288. "class": "InitialWorkDirRequirement",
  2289. "listing": []
  2290. },
  2291. {
  2292. "class": "InlineJavascriptRequirement",
  2293. "expressionLib": [
  2294. "\nvar setMetadata = function(file, metadata) {\n if (!('metadata' in file)) {\n file['metadata'] = {}\n }\n for (var key in metadata) {\n file['metadata'][key] = metadata[key];\n }\n return file\n};\nvar inheritMetadata = function(o1, o2) {\n var commonMetadata = {};\n if (!o2) {\n return o1;\n };\n if (!Array.isArray(o2)) {\n o2 = [o2]\n }\n for (var i = 0; i < o2.length; i++) {\n var example = o2[i]['metadata'];\n for (var key in example) {\n if (i == 0)\n commonMetadata[key] = example[key];\n else {\n if (!(commonMetadata[key] == example[key])) {\n delete commonMetadata[key]\n }\n }\n }\n for (var key in commonMetadata) {\n if (!(key in example)) {\n delete commonMetadata[key]\n }\n }\n }\n if (!Array.isArray(o1)) {\n o1 = setMetadata(o1, commonMetadata)\n if (o1.secondaryFiles) {\n o1.secondaryFiles = inheritMetadata(o1.secondaryFiles, o2)\n }\n } else {\n for (var i = 0; i < o1.length; i++) {\n o1[i] = setMetadata(o1[i], commonMetadata)\n if (o1[i].secondaryFiles) {\n o1[i].secondaryFiles = inheritMetadata(o1[i].secondaryFiles, o2)\n }\n }\n }\n return o1;\n};"
  2295. ]
  2296. }
  2297. ],
  2298. "sbg:categories": [
  2299. "Genomics",
  2300. "Variant Calling",
  2301. "CWL1.0"
  2302. ],
  2303. "sbg:image_url": null,
  2304. "sbg:license": "BSD 3-Clause License",
  2305. "sbg:links": [
  2306. {
  2307. "id": "https://www.broadinstitute.org/gatk/index.php",
  2308. "label": "Homepage"
  2309. },
  2310. {
  2311. "id": "https://github.com/broadinstitute/gatk",
  2312. "label": "Source Code"
  2313. },
  2314. {
  2315. "id": "https://github.com/broadinstitute/gatk/releases/download/4.1.0.0/gatk-4.1.0.0.zip",
  2316. "label": "Download"
  2317. },
  2318. {
  2319. "id": "https://www.biorxiv.org/content/10.1101/201178v3",
  2320. "label": "Publication"
  2321. },
  2322. {
  2323. "id": "https://gatk.broadinstitute.org/hc/en-us/articles/360036359552-HaplotypeCaller",
  2324. "label": "Documentation"
  2325. }
  2326. ],
  2327. "sbg:projectName": "GATK 4.1.0.0 - Demo",
  2328. "sbg:revisionsInfo": [
  2329. {
  2330. "sbg:revision": 0,
  2331. "sbg:modifiedBy": "uros_sipetic",
  2332. "sbg:modifiedOn": 1553086627,
  2333. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/4"
  2334. },
  2335. {
  2336. "sbg:revision": 1,
  2337. "sbg:modifiedBy": "uros_sipetic",
  2338. "sbg:modifiedOn": 1553105347,
  2339. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/8"
  2340. },
  2341. {
  2342. "sbg:revision": 2,
  2343. "sbg:modifiedBy": "veliborka_josipovic",
  2344. "sbg:modifiedOn": 1554720901,
  2345. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/13"
  2346. },
  2347. {
  2348. "sbg:revision": 3,
  2349. "sbg:modifiedBy": "veliborka_josipovic",
  2350. "sbg:modifiedOn": 1554730721,
  2351. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/14"
  2352. },
  2353. {
  2354. "sbg:revision": 4,
  2355. "sbg:modifiedBy": "veliborka_josipovic",
  2356. "sbg:modifiedOn": 1554999234,
  2357. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/15"
  2358. },
  2359. {
  2360. "sbg:revision": 5,
  2361. "sbg:modifiedBy": "nemanja.vucic",
  2362. "sbg:modifiedOn": 1559736399,
  2363. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/17"
  2364. },
  2365. {
  2366. "sbg:revision": 6,
  2367. "sbg:modifiedBy": "veliborka_josipovic",
  2368. "sbg:modifiedOn": 1559746054,
  2369. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/18"
  2370. },
  2371. {
  2372. "sbg:revision": 7,
  2373. "sbg:modifiedBy": "nemanja.vucic",
  2374. "sbg:modifiedOn": 1559750439,
  2375. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/19"
  2376. },
  2377. {
  2378. "sbg:revision": 8,
  2379. "sbg:modifiedBy": "nemanja.vucic",
  2380. "sbg:modifiedOn": 1581091724,
  2381. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/20"
  2382. },
  2383. {
  2384. "sbg:revision": 9,
  2385. "sbg:modifiedBy": "uros_sipetic",
  2386. "sbg:modifiedOn": 1584999154,
  2387. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-haplotypecaller-4-1-0-0/21"
  2388. },
  2389. {
  2390. "sbg:revision": 10,
  2391. "sbg:modifiedBy": "marijeta_slavkovic",
  2392. "sbg:modifiedOn": 1593698947,
  2393. "sbg:revisionNotes": "New wrapper"
  2394. },
  2395. {
  2396. "sbg:revision": 11,
  2397. "sbg:modifiedBy": "marijeta_slavkovic",
  2398. "sbg:modifiedOn": 1593705205,
  2399. "sbg:revisionNotes": "Description review suggestions added"
  2400. },
  2401. {
  2402. "sbg:revision": 12,
  2403. "sbg:modifiedBy": "marijeta_slavkovic",
  2404. "sbg:modifiedOn": 1593790342,
  2405. "sbg:revisionNotes": "typo in description"
  2406. },
  2407. {
  2408. "sbg:revision": 13,
  2409. "sbg:modifiedBy": "marijeta_slavkovic",
  2410. "sbg:modifiedOn": 1594067972,
  2411. "sbg:revisionNotes": "naming description and benchmarking cost review"
  2412. },
  2413. {
  2414. "sbg:revision": 14,
  2415. "sbg:modifiedBy": "marijeta_slavkovic",
  2416. "sbg:modifiedOn": 1594725360,
  2417. "sbg:revisionNotes": "added CRAM and SAM to suggested types for in_alignments"
  2418. },
  2419. {
  2420. "sbg:revision": 15,
  2421. "sbg:modifiedBy": "marijeta_slavkovic",
  2422. "sbg:modifiedOn": 1594725548,
  2423. "sbg:revisionNotes": "removed SAM as file suggestion"
  2424. },
  2425. {
  2426. "sbg:revision": 16,
  2427. "sbg:modifiedBy": "marijeta_slavkovic",
  2428. "sbg:modifiedOn": 1597682601,
  2429. "sbg:revisionNotes": "some description added"
  2430. },
  2431. {
  2432. "sbg:revision": 17,
  2433. "sbg:modifiedBy": "marijeta_slavkovic",
  2434. "sbg:modifiedOn": 1597693090,
  2435. "sbg:revisionNotes": "some description added"
  2436. },
  2437. {
  2438. "sbg:revision": 18,
  2439. "sbg:modifiedBy": "marijeta_slavkovic",
  2440. "sbg:modifiedOn": 1598132604,
  2441. "sbg:revisionNotes": "added [].concat for arrays"
  2442. },
  2443. {
  2444. "sbg:revision": 19,
  2445. "sbg:modifiedBy": "marijeta_slavkovic",
  2446. "sbg:modifiedOn": 1603200852,
  2447. "sbg:revisionNotes": "description edited (memory in description etc)"
  2448. },
  2449. {
  2450. "sbg:revision": 20,
  2451. "sbg:modifiedBy": "marijeta_slavkovic",
  2452. "sbg:modifiedOn": 1603280040,
  2453. "sbg:revisionNotes": "small parameter description"
  2454. },
  2455. {
  2456. "sbg:revision": 21,
  2457. "sbg:modifiedBy": "marijeta_slavkovic",
  2458. "sbg:modifiedOn": 1603297605,
  2459. "sbg:revisionNotes": "secondary files alleles, comp, dbsnp, population_callset (return basename.idx instead of '' when not VCF or VCF.GZ)"
  2460. }
  2461. ],
  2462. "sbg:toolAuthor": "Broad Institute",
  2463. "sbg:toolkit": "GATK",
  2464. "sbg:toolkitVersion": "4.1.0.0",
  2465. "sbg:appVersion": [
  2466. "v1.0"
  2467. ],
  2468. "sbg:id": "h-1bfe14f6/h-4a23c02d/h-083f1b40/0",
  2469. "sbg:revision": 21,
  2470. "sbg:revisionNotes": "secondary files alleles, comp, dbsnp, population_callset (return basename.idx instead of '' when not VCF or VCF.GZ)",
  2471. "sbg:modifiedOn": 1603297605,
  2472. "sbg:modifiedBy": "marijeta_slavkovic",
  2473. "sbg:createdOn": 1553086627,
  2474. "sbg:createdBy": "uros_sipetic",
  2475. "sbg:project": "uros_sipetic/gatk-4-1-0-0-demo",
  2476. "sbg:sbgMaintained": false,
  2477. "sbg:validationErrors": [],
  2478. "sbg:contributors": [
  2479. "marijeta_slavkovic",
  2480. "veliborka_josipovic",
  2481. "uros_sipetic",
  2482. "nemanja.vucic"
  2483. ],
  2484. "sbg:latestRevision": 21,
  2485. "sbg:publisher": "sbg",
  2486. "sbg:content_hash": "a4996c97275c9ef8556188f054f4741e586d5cf2b5f8a1a071d6049bb3d7cc929"
  2487. },
  2488. "label": "GATK HaplotypeCaller",
  2489. "scatter": [
  2490. "include_intervals_file"
  2491. ],
  2492. "sbg:x": -370.4007873535156,
  2493. "sbg:y": -133.78988647460938
  2494. },
  2495. {
  2496. "id": "gatk_mergevcfs_4_1_0_0",
  2497. "in": [
  2498. {
  2499. "id": "in_variants",
  2500. "source": [
  2501. "gatk_haplotypecaller_4_1_0_0/out_variants"
  2502. ]
  2503. },
  2504. {
  2505. "id": "output_file_format",
  2506. "default": "vcf",
  2507. "source": "output_file_format"
  2508. },
  2509. {
  2510. "id": "output_prefix",
  2511. "source": "output_prefix"
  2512. }
  2513. ],
  2514. "out": [
  2515. {
  2516. "id": "out_variants"
  2517. }
  2518. ],
  2519. "run": {
  2520. "class": "CommandLineTool",
  2521. "cwlVersion": "v1.0",
  2522. "$namespaces": {
  2523. "sbg": "https://sevenbridges.com"
  2524. },
  2525. "id": "uros_sipetic/gatk-4-1-0-0-demo/gatk-mergevcfs-4-1-0-0/8",
  2526. "baseCommand": [],
  2527. "inputs": [
  2528. {
  2529. "sbg:altPrefix": "-I",
  2530. "sbg:category": "Required Arguments",
  2531. "id": "in_variants",
  2532. "type": "File[]",
  2533. "inputBinding": {
  2534. "shellQuote": false,
  2535. "position": 4,
  2536. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--INPUT', self[i].path);\n \n }\n return cmd.join(' ');\n }\n}"
  2537. },
  2538. "label": "Input variant files",
  2539. "doc": "VCF or BCF input files (file format is determined by file extension).",
  2540. "sbg:fileTypes": "VCF, VCF.GZ, BCF",
  2541. "secondaryFiles": [
  2542. "${\n if (self.nameext == \".vcf\")\n {\n return self.basename + \".idx\";\n }\n else\n {\n return self.basename + \".tbi\";\n }\n}"
  2543. ]
  2544. },
  2545. {
  2546. "sbg:category": "Optional Arguments",
  2547. "sbg:toolDefaultValue": "2",
  2548. "id": "compression_level",
  2549. "type": "int?",
  2550. "inputBinding": {
  2551. "prefix": "--COMPRESSION_LEVEL",
  2552. "shellQuote": false,
  2553. "position": 4
  2554. },
  2555. "label": "Compression level",
  2556. "doc": "Compression level for all compressed files created (e.g. BAM and VCF)."
  2557. },
  2558. {
  2559. "sbg:category": "Optional Arguments",
  2560. "sbg:toolDefaultValue": "500000",
  2561. "id": "max_records_in_ram",
  2562. "type": "int?",
  2563. "inputBinding": {
  2564. "prefix": "--MAX_RECORDS_IN_RAM",
  2565. "shellQuote": false,
  2566. "position": 4
  2567. },
  2568. "label": "Max records in RAM",
  2569. "doc": "When writing files that need to be sorted, this will specify the number of records stored in RAM before spilling to disk. Increasing this number reduces the number of file handles needed to sort the file, and increases the amount of RAM needed."
  2570. },
  2571. {
  2572. "sbg:category": "Platform Options",
  2573. "id": "memory_overhead_per_job",
  2574. "type": "int?",
  2575. "label": "Memory overhead per job",
  2576. "doc": "This input allows a user to set the desired overhead memory when running a tool or adding it to a workflow. This amount will be added to the Memory per job in the Memory requirements section but it will not be added to the -Xmx parameter leaving some memory not occupied which can be used as stack memory (-Xmx parameter defines heap memory). This input should be defined in MB (for both the platform part and the -Xmx part if Java tool is wrapped)."
  2577. },
  2578. {
  2579. "sbg:category": "Platform Options",
  2580. "sbg:toolDefaultValue": "2048 MB",
  2581. "id": "memory_per_job",
  2582. "type": "int?",
  2583. "label": "Memory per job",
  2584. "doc": "This input allows a user to set the desired memory requirement when running a tool or adding it to a workflow. This value should be propagated to the -Xmx parameter too. This input should be defined in MB (for both the platform part and the -Xmx part if Java tool is wrapped)."
  2585. },
  2586. {
  2587. "sbg:altPrefix": "-D",
  2588. "sbg:category": "Optional Arguments",
  2589. "sbg:toolDefaultValue": "null",
  2590. "id": "sequence_dictionary",
  2591. "type": "File?",
  2592. "inputBinding": {
  2593. "prefix": "--SEQUENCE_DICTIONARY",
  2594. "shellQuote": false,
  2595. "position": 4
  2596. },
  2597. "label": "Sequence dictionary",
  2598. "doc": "The index sequence dictionary to use instead of the sequence dictionary in the input files.",
  2599. "sbg:fileTypes": "DICT"
  2600. },
  2601. {
  2602. "sbg:category": "Platform Options",
  2603. "sbg:toolDefaultValue": "1",
  2604. "id": "cpu_per_job",
  2605. "type": "int?",
  2606. "label": "CPU per job",
  2607. "doc": "This input allows a user to set the desired CPU requirement when running a tool or adding it to a workflow."
  2608. },
  2609. {
  2610. "sbg:category": "Optional Arguments",
  2611. "id": "output_file_format",
  2612. "type": [
  2613. "null",
  2614. {
  2615. "type": "enum",
  2616. "symbols": [
  2617. "vcf",
  2618. "bcf",
  2619. "vcf.gz"
  2620. ],
  2621. "name": "output_file_format"
  2622. }
  2623. ],
  2624. "label": "Output file format",
  2625. "doc": "Output file format."
  2626. },
  2627. {
  2628. "sbg:category": "Optional Arguments",
  2629. "id": "output_prefix",
  2630. "type": "string?",
  2631. "label": "Output prefix",
  2632. "doc": "Output file name prefix."
  2633. }
  2634. ],
  2635. "outputs": [
  2636. {
  2637. "id": "out_variants",
  2638. "doc": "The merged VCF or BCF file. File format is determined by file extension.",
  2639. "label": "Output merged VCF or BCF file",
  2640. "type": "File?",
  2641. "outputBinding": {
  2642. "glob": "${\n var in_variants = [].concat(inputs.in_variants);\n \n var vcf_count = 0;\n var vcf_gz_count = 0;\n var bcf_count = 0;\n var gvcf_count = 0;\n var gvcf_gz_count = 0;\n \n for (var i = 0; i < in_variants.length; i++)\n {\n if (in_variants[i].path.endsWith('vcf') && !(in_variants[i].path.endsWith('g.vcf')) )\n vcf_count += 1\n else if (in_variants[i].path.endsWith('vcf.gz') && !(in_variants[i].path.endsWith('g.vcf.gz')))\n vcf_gz_count += 1\n else if (in_variants[i].path.endsWith('bcf'))\n bcf_count += 1\n else if (in_variants[i].path.endsWith('g.vcf'))\n gvcf_count += 1\n else if (in_variants[i].path.endsWith('g.vcf.gz'))\n gvcf_gz_count += 1\n \n }\n \n var max_ext = Math.max(vcf_count, vcf_gz_count, bcf_count, gvcf_count, gvcf_gz_count)\n var most_frequent_ext = (max_ext == vcf_count) ? \"vcf\" : (max_ext == vcf_gz_count) ? \"vcf.gz\" : (max_ext == bcf_count) ? \"bcf\" : (max_ext == gvcf_count) ? \"g.vcf\" : \"g.vcf.gz\";\n var out_format = inputs.output_file_format;\n var out_ext = \"\";\n if (out_format)\n {\n out_ext = ((most_frequent_ext == \"g.vcf\" || most_frequent_ext == \"g.vcf.gz\") && (out_format == \"vcf\" || out_format == \"vcf.gz\")) ? \"g.\" + out_format : ((most_frequent_ext == \"g.vcf\" || most_frequent_ext == \"g.vcf.gz\") && (out_format == \"bcf\" )) ? most_frequent_ext : out_format; \n }\n else\n {\n out_ext = most_frequent_ext;\n }\n return \"*\" + out_ext;\n \n}",
  2643. "outputEval": "$(inheritMetadata(self, inputs.in_variants))"
  2644. },
  2645. "secondaryFiles": [
  2646. "${\n return self.basename + (self.nameext == \".vcf\" ? \".idx\" : \".tbi\");\n}\n"
  2647. ],
  2648. "sbg:fileTypes": "VCF, VCF.GZ, BCF"
  2649. }
  2650. ],
  2651. "doc": "The **GATK MergeVcfs** tool combines multiple variant files into a single variant file. \n\n*A list of **all inputs and parameters** with corresponding descriptions can be found at the bottom of the page.*\n\n### Common Use Cases\n\n* The **MergeVcfs** tool requires one or more variant files on its **Input variant files** (`--INPUT`) input. The input files can be plain VCF, gzipped VCF (ending in \".vcf.gz\"), or binary compressed BCF (ending in \".bcf\"). The tool writes the merged file to its **Output merged VCF or BCF file** output.\n\n* The **MergeVcfs** tool accepts a sequence dictionary file (name typically ending in .dict) on its **Sequence dictionary** (`--SEQUENCE_DICTIONARY`) input. It is needed if the input VCF does not contain a complete contig list and the output index is to be created (true by default).\n\n* The output file is sorted (i) according to the dictionary and (ii) by coordinate.\n\n* Usage example:\n\n```\ngatk MergeVcfs \\\n --INPUT input_variants.01.vcf \\\n --INPUT input_variants.02.vcf.gz \\\n --OUTPUT output_variants.vcf.gz\n```\n\n### Changes Introduced by Seven Bridges\n\n* The output file name starts with the value of the **Output prefix** parameter. If **Output prefix** is not provided, the files on the **Input variant files** input are sorted alphabetically by name, and the prefix is taken from the Sample ID metadata of the first file in that list, if it exists. Otherwise, the prefix is inferred from the file name of the first file in the list. When more than one input file is provided, the number of input files is appended to the inferred prefix. In all cases the tool-specific suffix **merged** follows, which avoids identical output file names between runs.\n\n* The user can specify the output file format using the **Output file format** argument. 
If it is not set, the output takes the most frequent extension among the input files.\n\n### Common Issues and Important Notes\n\n* Note 1: If running this tool on multi-sample input files (originating from e.g. some scatter-gather runs), the input files must contain the same sample names in the same column order. \n\n* Note 2: Input file headers must contain compatible declarations for common annotations (INFO, FORMAT fields) and filters.\n\n* Note 3: Variant records in the input files must be sorted by contig and position, following the sequence dictionary provided or the header contig list.\n\n### Performance Benchmarking\n\nThe tool typically finishes in less than a minute on the default AWS c4.2xlarge instance.\n\n### References\n\n[1] [GATK MergeVcfs](https://software.broadinstitute.org/gatk/documentation/tooldocs/4.1.0.0/picard_vcf_MergeVcfs.php)",
  2652. "label": "GATK MergeVcfs",
  2653. "arguments": [
  2654. {
  2655. "prefix": "",
  2656. "shellQuote": false,
  2657. "position": 0,
  2658. "valueFrom": "/opt/gatk-4.1.0.0/gatk"
  2659. },
  2660. {
  2661. "shellQuote": false,
  2662. "position": 1,
  2663. "valueFrom": "--java-options"
  2664. },
  2665. {
  2666. "prefix": "",
  2667. "shellQuote": false,
  2668. "position": 2,
  2669. "valueFrom": "${\n if (inputs.memory_per_job) {\n return '\\\"-Xmx'.concat(inputs.memory_per_job, 'M') + '\\\"';\n }\n return '\\\"-Xms2000m\\\"';\n}"
  2670. },
  2671. {
  2672. "shellQuote": false,
  2673. "position": 3,
  2674. "valueFrom": "MergeVcfs"
  2675. },
  2676. {
  2677. "prefix": "",
  2678. "shellQuote": false,
  2679. "position": 4,
  2680. "valueFrom": "${\n var in_variants = [].concat(inputs.in_variants);\n var output_prefix = \"\";\n \n var vcf_count = 0;\n var vcf_gz_count = 0;\n var bcf_count = 0;\n var gvcf_count = 0;\n var gvcf_gz_count = 0;\n \n for (var i = 0; i < in_variants.length; i++)\n {\n if (in_variants[i].path.endsWith('vcf') && !(in_variants[i].path.endsWith('g.vcf')) )\n vcf_count += 1\n else if (in_variants[i].path.endsWith('vcf.gz') && !(in_variants[i].path.endsWith('g.vcf.gz')))\n vcf_gz_count += 1\n else if (in_variants[i].path.endsWith('bcf'))\n bcf_count += 1\n else if (in_variants[i].path.endsWith('g.vcf'))\n gvcf_count += 1\n else if (in_variants[i].path.endsWith('g.vcf.gz'))\n gvcf_gz_count += 1\n \n }\n \n var max_ext = Math.max(vcf_count, vcf_gz_count, bcf_count, gvcf_count, gvcf_gz_count)\n var most_frequent_ext = (max_ext == vcf_count) ? \"vcf\" : (max_ext == vcf_gz_count) ? \"vcf.gz\" : (max_ext == bcf_count) ? \"bcf\" : (max_ext == gvcf_count) ? \"g.vcf\" : \"g.vcf.gz\";\n var out_format = inputs.output_file_format;\n var out_ext = \"\";\n if (out_format)\n {\n out_ext = ((most_frequent_ext == \"g.vcf\" || most_frequent_ext == \"g.vcf.gz\") && (out_format == \"vcf\" || out_format == \"vcf.gz\")) ? \"g.\" + out_format : ((most_frequent_ext == \"g.vcf\" || most_frequent_ext == \"g.vcf.gz\") && (out_format == \"bcf\" )) ? 
most_frequent_ext : out_format;\n }\n else\n {\n out_ext = most_frequent_ext;\n }\n \n if (inputs.output_prefix)\n {\n output_prefix = inputs.output_prefix;\n }\n else\n {\n if (in_variants.length > 1)\n {\n in_variants.sort(function(file1, file2) {\n var file1_name = file1.basename.toUpperCase();\n var file2_name = file2.basename.toUpperCase();\n if (file1_name < file2_name) {\n return -1;\n }\n if (file1_name > file2_name) {\n return 1;\n }\n // names must be equal\n return 0;\n });\n }\n \n var in_variants_first = in_variants[0];\n if (in_variants_first.metadata && in_variants_first.metadata.sample_id)\n {\n output_prefix = in_variants_first.metadata.sample_id;\n\n }\n else\n {\n output_prefix = in_variants_first.basename.split('.')[0];\n }\n \n if (in_variants.length > 1)\n {\n output_prefix = output_prefix + \".\" + in_variants.length;\n }\n }\n \n return \"--OUTPUT \" + output_prefix + \".merged.\" + out_ext;\n}"
  2681. }
  2682. ],
  2683. "requirements": [
  2684. {
  2685. "class": "ShellCommandRequirement"
  2686. },
  2687. {
  2688. "class": "ResourceRequirement",
  2689. "ramMin": "${\n var memory = 3500;\n if (inputs.memory_per_job) \n {\n memory = inputs.memory_per_job;\n }\n if (inputs.memory_overhead_per_job)\n {\n memory += inputs.memory_overhead_per_job;\n }\n return memory;\n}",
  2690. "coresMin": "${\n return inputs.cpu_per_job ? inputs.cpu_per_job : 1\n}"
  2691. },
  2692. {
  2693. "class": "DockerRequirement",
  2694. "dockerPull": "images.sbgenomics.com/marijeta_slavkovic/gatk-4-1-0-0:0"
  2695. },
  2696. {
  2697. "class": "InitialWorkDirRequirement",
  2698. "listing": []
  2699. },
  2700. {
  2701. "class": "InlineJavascriptRequirement",
  2702. "expressionLib": [
  2703. "var updateMetadata = function(file, key, value) {\n file['metadata'][key] = value;\n return file;\n};\n\n\nvar setMetadata = function(file, metadata) {\n if (!('metadata' in file))\n file['metadata'] = metadata;\n else {\n for (var key in metadata) {\n file['metadata'][key] = metadata[key];\n }\n }\n return file\n};\n\nvar inheritMetadata = function(o1, o2) {\n var commonMetadata = {};\n if (!Array.isArray(o2)) {\n o2 = [o2]\n }\n for (var i = 0; i < o2.length; i++) {\n var example = o2[i]['metadata'];\n for (var key in example) {\n if (i == 0)\n commonMetadata[key] = example[key];\n else {\n if (!(commonMetadata[key] == example[key])) {\n delete commonMetadata[key]\n }\n }\n }\n }\n if (!Array.isArray(o1)) {\n o1 = setMetadata(o1, commonMetadata)\n } else {\n for (var i = 0; i < o1.length; i++) {\n o1[i] = setMetadata(o1[i], commonMetadata)\n }\n }\n return o1;\n};\n\nvar toArray = function(file) {\n return [].concat(file);\n};\n\nvar groupBy = function(files, key) {\n var groupedFiles = [];\n var tempDict = {};\n for (var i = 0; i < files.length; i++) {\n var value = files[i]['metadata'][key];\n if (value in tempDict)\n tempDict[value].push(files[i]);\n else tempDict[value] = [files[i]];\n }\n for (var key in tempDict) {\n groupedFiles.push(tempDict[key]);\n }\n return groupedFiles;\n};\n\nvar orderBy = function(files, key, order) {\n var compareFunction = function(a, b) {\n if (a['metadata'][key].constructor === Number) {\n return a['metadata'][key] - b['metadata'][key];\n } else {\n var nameA = a['metadata'][key].toUpperCase();\n var nameB = b['metadata'][key].toUpperCase();\n if (nameA < nameB) {\n return -1;\n }\n if (nameA > nameB) {\n return 1;\n }\n return 0;\n }\n };\n\n files = files.sort(compareFunction);\n if (order == undefined || order == \"asc\")\n return files;\n else\n return files.reverse();\n};",
  2704. "\nvar setMetadata = function(file, metadata) {\n if (!('metadata' in file))\n file['metadata'] = metadata;\n else {\n for (var key in metadata) {\n file['metadata'][key] = metadata[key];\n }\n }\n return file\n};\n\nvar inheritMetadata = function(o1, o2) {\n var commonMetadata = {};\n if (!Array.isArray(o2)) {\n o2 = [o2]\n }\n for (var i = 0; i < o2.length; i++) {\n var example = o2[i]['metadata'];\n for (var key in example) {\n if (i == 0)\n commonMetadata[key] = example[key];\n else {\n if (!(commonMetadata[key] == example[key])) {\n delete commonMetadata[key]\n }\n }\n }\n }\n if (!Array.isArray(o1)) {\n o1 = setMetadata(o1, commonMetadata)\n } else {\n for (var i = 0; i < o1.length; i++) {\n o1[i] = setMetadata(o1[i], commonMetadata)\n }\n }\n return o1;\n};"
  2705. ]
  2706. }
  2707. ],
  2708. "sbg:categories": [
  2709. "Utilities",
  2710. "VCF Processing"
  2711. ],
  2712. "sbg:license": "Open source BSD (3-clause) license",
  2713. "sbg:toolAuthor": "Broad Institute",
  2714. "sbg:toolkit": "GATK",
  2715. "sbg:toolkitVersion": "4.1.0.0",
  2716. "sbg:projectName": "GATK 4.1.0.0 - Demo",
  2717. "sbg:revisionsInfo": [
  2718. {
  2719. "sbg:revision": 0,
  2720. "sbg:modifiedBy": "uros_sipetic",
  2721. "sbg:modifiedOn": 1552929960,
  2722. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/7"
  2723. },
  2724. {
  2725. "sbg:revision": 1,
  2726. "sbg:modifiedBy": "veliborka_josipovic",
  2727. "sbg:modifiedOn": 1554493122,
  2728. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/14"
  2729. },
  2730. {
  2731. "sbg:revision": 2,
  2732. "sbg:modifiedBy": "veliborka_josipovic",
  2733. "sbg:modifiedOn": 1554720843,
  2734. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/15"
  2735. },
  2736. {
  2737. "sbg:revision": 3,
  2738. "sbg:modifiedBy": "veliborka_josipovic",
  2739. "sbg:modifiedOn": 1554999276,
  2740. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/16"
  2741. },
  2742. {
  2743. "sbg:revision": 4,
  2744. "sbg:modifiedBy": "veliborka_josipovic",
  2745. "sbg:modifiedOn": 1559740771,
  2746. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/18"
  2747. },
  2748. {
  2749. "sbg:revision": 5,
  2750. "sbg:modifiedBy": "veliborka_josipovic",
  2751. "sbg:modifiedOn": 1559746042,
  2752. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/19"
  2753. },
  2754. {
  2755. "sbg:revision": 6,
  2756. "sbg:modifiedBy": "nemanja.vucic",
  2757. "sbg:modifiedOn": 1559750444,
  2758. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/20"
  2759. },
  2760. {
  2761. "sbg:revision": 7,
  2762. "sbg:modifiedBy": "nens",
  2763. "sbg:modifiedOn": 1565776372,
  2764. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/26"
  2765. },
  2766. {
  2767. "sbg:revision": 8,
  2768. "sbg:modifiedBy": "nens",
  2769. "sbg:modifiedOn": 1605879889,
  2770. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/27"
  2771. }
  2772. ],
  2773. "sbg:image_url": null,
  2774. "sbg:links": [
  2775. {
  2776. "id": "https://software.broadinstitute.org/gatk/",
  2777. "label": "Homepage"
  2778. },
  2779. {
  2780. "id": "https://github.com/broadinstitute/gatk/",
  2781. "label": "Source Code"
  2782. },
  2783. {
  2784. "id": "https://github.com/broadinstitute/gatk/releases/download/4.1.0.0/gatk-4.1.0.0.zip",
  2785. "label": "Download"
  2786. },
  2787. {
  2788. "id": "https://www.ncbi.nlm.nih.gov/pubmed?term=20644199",
  2789. "label": "Publications"
  2790. },
  2791. {
  2792. "id": "https://software.broadinstitute.org/gatk/documentation/tooldocs/4.1.0.0/picard_vcf_MergeVcfs.php",
  2793. "label": "Documentation"
  2794. }
  2795. ],
  2796. "sbg:appVersion": [
  2797. "v1.0"
  2798. ],
  2799. "sbg:id": "h-04854109/h-5e77b903/h-b934ddb9/0",
  2800. "sbg:revision": 8,
  2801. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/27",
  2802. "sbg:modifiedOn": 1605879889,
  2803. "sbg:modifiedBy": "nens",
  2804. "sbg:createdOn": 1552929960,
  2805. "sbg:createdBy": "uros_sipetic",
  2806. "sbg:project": "uros_sipetic/gatk-4-1-0-0-demo",
  2807. "sbg:sbgMaintained": false,
  2808. "sbg:validationErrors": [],
  2809. "sbg:contributors": [
  2810. "nens",
  2811. "veliborka_josipovic",
  2812. "uros_sipetic",
  2813. "nemanja.vucic"
  2814. ],
  2815. "sbg:latestRevision": 8,
  2816. "sbg:publisher": "sbg",
  2817. "sbg:content_hash": "adc2b5150b732c9c308626feefafdc34f6cef615baad97af6b2e5769b265c9491",
  2818. "sbg:copyOf": "veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-mergevcfs-4-1-0-0/27"
  2819. },
  2820. "label": "GATK MergeVcfs",
  2821. "sbg:x": -159.89105224609375,
  2822. "sbg:y": -181.2996063232422
  2823. },
  2824. {
  2825. "id": "gatk_intervallisttools_4_1_0_0",
  2826. "in": [
  2827. {
  2828. "id": "in_intervals",
  2829. "source": [
  2830. "in_intervals"
  2831. ]
  2832. },
  2833. {
  2834. "id": "memory_per_job",
  2835. "default": 2000
  2836. },
  2837. {
  2838. "id": "scatter_count",
  2839. "default": 36
  2840. },
  2841. {
  2842. "id": "sort",
  2843. "default": true
  2844. },
  2845. {
  2846. "id": "subdivision_mode",
  2847. "default": "BALANCING_WITHOUT_INTERVAL_SUBDIVISION_WITH_OVERFLOW"
  2848. },
  2849. {
  2850. "id": "unique",
  2851. "default": true
  2852. }
  2853. ],
  2854. "out": [
  2855. {
  2856. "id": "output_interval_list"
  2857. }
  2858. ],
  2859. "run": {
  2860. "class": "CommandLineTool",
  2861. "cwlVersion": "v1.0",
  2862. "$namespaces": {
  2863. "sbg": "https://sevenbridges.com"
  2864. },
  2865. "id": "uros_sipetic/gatk-4-1-0-0-demo/gatk-intervallisttools-4-1-0-0/5",
  2866. "baseCommand": [],
  2867. "inputs": [
  2868. {
  2869. "sbg:category": "Optional Arguments",
  2870. "sbg:toolDefaultValue": "CONCAT",
  2871. "id": "action",
  2872. "type": [
  2873. "null",
  2874. {
  2875. "type": "enum",
  2876. "symbols": [
  2877. "CONCAT",
  2878. "UNION",
  2879. "INTERSECT",
  2880. "SUBTRACT",
  2881. "SYMDIFF",
  2882. "OVERLAPS"
  2883. ],
  2884. "name": "action"
  2885. }
  2886. ],
  2887. "inputBinding": {
  2888. "prefix": "--ACTION",
  2889. "shellQuote": false,
  2890. "position": 4
  2891. },
  2892. "label": "Action",
  2893. "doc": "Action to take on inputs. Possible values: { CONCAT (the concatenation of all the intervals in all the inputs, no sorting or merging of overlapping/abutting intervals implied. Will result in a possibly unsorted list unless requested otherwise.) UNION (like concatenate but with UNIQUE and SORT implied, the result being the set-wise union of all inputs, with overlapping and abutting intervals merged into one.) INTERSECT (the sorted and merged set of all loci that are contained in all of the inputs.) SUBTRACT (subtracts the intervals in second_input from those in input. The resulting loci are those in input that are not in second_input.) SYMDIFF (results in loci that are in input or second_input but are not in both.) OVERLAPS (outputs the entire intervals from input that have bases which overlap any interval from second_input. Note that this is different from INTERSECT in that each original interval is either emitted in its entirety, or not at all.) }."
  2894. },
  2895. {
  2896. "sbg:altPrefix": "-BRK",
  2897. "sbg:category": "Optional Arguments",
  2898. "sbg:toolDefaultValue": "0",
  2899. "id": "break_bands_at_multiples_of",
  2900. "type": "int?",
  2901. "inputBinding": {
  2902. "prefix": "--BREAK_BANDS_AT_MULTIPLES_OF",
  2903. "shellQuote": false,
  2904. "position": 4
  2905. },
  2906. "label": "Break bands at multiples of",
  2907. "doc": "If set to a positive value will create a new interval list with the original intervals broken up at integer multiples of this value. Set to 0 to not break up intervals."
  2908. },
  2909. {
  2910. "sbg:category": "Optional Arguments",
  2911. "sbg:toolDefaultValue": "null",
  2912. "id": "comment",
  2913. "type": "string[]?",
  2914. "inputBinding": {
  2915. "prefix": "",
  2916. "shellQuote": false,
  2917. "position": 4,
  2918. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--COMMENT', self[i]);\n }\n return cmd.join(' ');\n }\n \n}"
  2919. },
  2920. "label": "Comment",
  2921. "doc": "One or more lines of comment to add to the header of the output file (as @CO lines in the SAM header). This argument may be specified 0 or more times."
  2922. },
  2923. {
  2924. "sbg:category": "Optional Arguments",
  2925. "sbg:toolDefaultValue": "2",
  2926. "id": "compression_level",
  2927. "type": "int?",
  2928. "inputBinding": {
  2929. "prefix": "--COMPRESSION_LEVEL",
  2930. "shellQuote": false,
  2931. "position": 4
  2932. },
  2933. "label": "Compression level",
  2934. "doc": "Compression level for all compressed files created (e.g. BAM and VCF)."
  2935. },
  2936. {
  2937. "sbg:category": "Platform options",
  2938. "sbg:toolDefaultValue": "1",
  2939. "id": "cpu_per_job",
  2940. "type": "int?",
  2941. "label": "CPU per job",
  2942. "doc": "Number of CPUs to be used per job."
  2943. },
  2944. {
  2945. "sbg:category": "Optional Arguments",
  2946. "sbg:toolDefaultValue": "false",
  2947. "id": "include_filtered",
  2948. "type": "boolean?",
  2949. "inputBinding": {
  2950. "prefix": "--INCLUDE_FILTERED",
  2951. "shellQuote": false,
  2952. "position": 4
  2953. },
  2954. "label": "Include filtered",
  2955. "doc": "Whether to include filtered variants in the VCF when generating an interval list from VCF."
  2956. },
  2957. {
  2958. "sbg:altPrefix": "-I",
  2959. "sbg:category": "Required Arguments",
  2960. "id": "in_intervals",
  2961. "type": "File[]",
  2962. "inputBinding": {
  2963. "prefix": "",
  2964. "shellQuote": false,
  2965. "position": 4,
  2966. "valueFrom": "${\n if (self)\n {\n var cmd = [];\n for (var i = 0; i < self.length; i++) \n {\n cmd.push('--INPUT', self[i].path);\n }\n return cmd.join(' ');\n }\n \n}"
  2967. },
  2968. "label": "Interval list",
  2969. "doc": "One or more interval lists. If multiple interval lists are provided the output is the result of merging the inputs. Supported formats are interval_list and VCF. This argument must be specified at least once.",
  2970. "sbg:fileTypes": "VCF, INTERVAL_LIST"
  2971. },
  2972. {
  2973. "sbg:category": "Optional Arguments",
  2974. "sbg:toolDefaultValue": "false",
  2975. "id": "invert",
  2976. "type": "boolean?",
  2977. "inputBinding": {
  2978. "prefix": "--INVERT",
  2979. "shellQuote": false,
  2980. "position": 4
  2981. },
  2982. "label": "Invert",
  2983. "doc": "Produce the inverse list of intervals, that is, the regions in the genome that are not covered by any of the input intervals. Will merge abutting intervals first. Output will be sorted."
  2984. },
  2985. {
  2986. "sbg:category": "Optional Arguments",
  2987. "sbg:toolDefaultValue": "500000",
  2988. "id": "max_records_in_ram",
  2989. "type": "int?",
  2990. "inputBinding": {
  2991. "prefix": "--MAX_RECORDS_IN_RAM",
  2992. "shellQuote": false,
  2993. "position": 4
  2994. },
  2995. "label": "Max records in ram",
  2996. "doc": "When writing files that need to be sorted, this will specify the number of records stored in RAM before spilling to disk. Increasing this number reduces the number of file handles needed to sort the file, and increases the amount of RAM needed."
  2997. },
  2998. {
  2999. "sbg:category": "Platform options",
  3000. "sbg:toolDefaultValue": "7",
  3001. "id": "memory_overhead_per_job",
  3002. "type": "int?",
  3003. "label": "Memory overhead per job",
  3004. "doc": "This input allows a user to set the desired overhead memory when running a tool or adding it to a workflow. This amount is added to the Memory per job in the Memory requirements section but not to the -Xmx parameter, which leaves some memory unoccupied for use as stack memory (the -Xmx parameter defines heap memory). This input should be defined in MB (for both the platform part and the -Xmx part if a Java tool is wrapped)."
  3005. },
  3006. {
  3007. "sbg:category": "Platform options",
  3008. "sbg:toolDefaultValue": "2048",
  3009. "id": "memory_per_job",
  3010. "type": "int?",
  3011. "label": "Memory per job",
  3012. "doc": "This input allows a user to set the desired memory requirement when running a tool or adding it to a workflow. This value should be propagated to the -Xmx parameter too. This input should be defined in MB (for both the platform part and the -Xmx part if a Java tool is wrapped)."
  3013. },
  3014. {
  3015. "sbg:category": "Optional Arguments",
  3016. "sbg:toolDefaultValue": "NONE",
  3017. "id": "output_value",
  3018. "type": [
  3019. "null",
  3020. {
  3021. "type": "enum",
  3022. "symbols": [
  3023. "NONE",
  3024. "BASES",
  3025. "INTERVALS"
  3026. ],
  3027. "name": "output_value"
  3028. }
  3029. ],
  3030. "inputBinding": {
  3031. "prefix": "--OUTPUT_VALUE",
  3032. "shellQuote": false,
  3033. "position": 4
  3034. },
  3035. "label": "Output value",
  3036. "doc": "What value (if anything) to output to stdout (for scripting)."
  3037. },
  3038. {
  3039. "sbg:category": "Optional Arguments",
  3040. "sbg:toolDefaultValue": "0",
  3041. "id": "padding",
  3042. "type": "int?",
  3043. "inputBinding": {
  3044. "prefix": "--PADDING",
  3045. "shellQuote": false,
  3046. "position": 4
  3047. },
  3048. "label": "Padding",
  3049. "doc": "The amount to pad each end of the intervals by before other operations are undertaken. Negative numbers are allowed and indicate intervals should be shrunk. Resulting intervals < 0 bases long will be removed. Padding is applied to the interval lists (both INPUT and SECOND_INPUT, if provided) before the ACTION is performed."
  3050. },
  3051. {
  3052. "sbg:altPrefix": "-R",
  3053. "sbg:category": "Optional Arguments",
  3054. "sbg:toolDefaultValue": "null",
  3055. "id": "in_reference",
  3056. "type": "File?",
  3057. "inputBinding": {
  3058. "prefix": "--REFERENCE_SEQUENCE",
  3059. "shellQuote": false,
  3060. "position": 4
  3061. },
  3062. "label": "Reference sequence",
  3063. "doc": "Reference sequence file.",
  3064. "sbg:fileTypes": "FASTA, FA"
  3065. },
  3066. {
  3067. "sbg:category": "Optional Arguments",
  3068. "sbg:toolDefaultValue": "null",
  3069. "id": "scatter_content",
  3070. "type": "int?",
  3071. "inputBinding": {
  3072. "prefix": "--SCATTER_CONTENT",
  3073. "shellQuote": false,
  3074. "position": 4
  3075. },
  3076. "label": "Scatter content",
  3077. "doc": "When scattering with this argument, each of the resultant files will (ideally) have this amount of 'content', which means either base-counts or interval-counts depending on SUBDIVISION_MODE. When provided, overrides SCATTER_COUNT."
  3078. },
  3079. {
  3080. "sbg:category": "Optional Arguments",
  3081. "sbg:toolDefaultValue": "1",
  3082. "id": "scatter_count",
  3083. "type": "int?",
  3084. "inputBinding": {
  3085. "prefix": "--SCATTER_COUNT",
  3086. "shellQuote": false,
  3087. "position": 4
  3088. },
  3089. "label": "Scatter count",
  3090. "doc": "The number of files into which to scatter the resulting list by locus; in some situations, fewer intervals may be emitted."
  3091. },
  3092. {
  3093. "sbg:altPrefix": "-SI",
  3094. "sbg:category": "Optional Arguments",
  3095. "sbg:toolDefaultValue": "null",
  3096. "id": "second_input",
  3097. "type": "File[]?",
  3098. "inputBinding": {
  3099. "prefix": "--SECOND_INPUT",
  3100. "shellQuote": false,
  3101. "position": 4
  3102. },
  3103. "label": "Second input",
  3104. "doc": "Second set of intervals for SUBTRACT and DIFFERENCE operations. This argument may be specified 0 or more times.",
  3105. "sbg:fileTypes": "VCF, INTERVAL_LIST"
  3106. },
  3107. {
  3108. "sbg:category": "Optional Arguments",
  3109. "sbg:toolDefaultValue": "true",
  3110. "id": "sort",
  3111. "type": "boolean?",
  3112. "inputBinding": {
  3113. "prefix": "--SORT",
  3114. "shellQuote": false,
  3115. "position": 4
  3116. },
  3117. "label": "Sort",
  3118. "doc": "If true, sort the resulting interval list by coordinate."
  3119. },
  3120. {
  3121. "sbg:altPrefix": "-M",
  3122. "sbg:category": "Optional Arguments",
  3123. "sbg:toolDefaultValue": "INTERVAL_SUBDIVISION",
  3124. "id": "subdivision_mode",
  3125. "type": [
  3126. "null",
  3127. {
  3128. "type": "enum",
  3129. "symbols": [
  3130. "INTERVAL_SUBDIVISION",
  3131. "BALANCING_WITHOUT_INTERVAL_SUBDIVISION",
  3132. "BALANCING_WITHOUT_INTERVAL_SUBDIVISION_WITH_OVERFLOW",
  3133. "INTERVAL_COUNT"
  3134. ],
  3135. "name": "subdivision_mode"
  3136. }
  3137. ],
  3138. "inputBinding": {
  3139. "prefix": "--SUBDIVISION_MODE",
  3140. "shellQuote": false,
  3141. "position": 4
  3142. },
  3143. "label": "Subdivision mode",
  3144. "doc": "The mode used to scatter the interval list. Possible values: { INTERVAL_SUBDIVISION (scatter the interval list into similarly sized interval lists (by base count), breaking up intervals as needed.) BALANCING_WITHOUT_INTERVAL_SUBDIVISION (scatter the interval list into similarly sized interval lists (by base count), but without breaking up intervals.) BALANCING_WITHOUT_INTERVAL_SUBDIVISION_WITH_OVERFLOW (scatter the interval list into similarly sized interval lists (by base count), but without breaking up intervals. Will overflow current interval list so that the remaining lists will not have too many bases to deal with.) INTERVAL_COUNT (scatter the interval list into similarly sized interval lists (by interval count, not by base count). Resulting interval lists will contain similar number of intervals.) }."
  3145. },
  3146. {
  3147. "sbg:category": "Optional Arguments",
  3148. "sbg:toolDefaultValue": "false",
  3149. "id": "unique",
  3150. "type": "boolean?",
  3151. "inputBinding": {
  3152. "prefix": "--UNIQUE",
  3153. "shellQuote": false,
  3154. "position": 4
  3155. },
  3156. "label": "Unique",
  3157. "doc": "If true, merge overlapping and adjacent intervals to create a list of unique intervals. Implies SORT=true."
  3158. },
  3159. {
  3160. "sbg:category": "Optional Arguments",
  3161. "sbg:toolDefaultValue": "STRICT",
  3162. "id": "validation_stringency",
  3163. "type": [
  3164. "null",
  3165. {
  3166. "type": "enum",
  3167. "symbols": [
  3168. "STRICT",
  3169. "LENIENT",
  3170. "SILENT"
  3171. ],
  3172. "name": "validation_stringency"
  3173. }
  3174. ],
  3175. "inputBinding": {
  3176. "prefix": "--VALIDATION_STRINGENCY",
  3177. "shellQuote": false,
  3178. "position": 4
  3179. },
  3180. "label": "Validation stringency",
  3181. "doc": "Validation stringency for all SAM files read by this program. Setting stringency to silent can improve performance when processing a bam file in which variable-length data (read, qualities, tags) do not otherwise need to be decoded."
  3182. }
  3183. ],
  3184. "outputs": [
  3185. {
  3186. "id": "output_interval_list",
  3187. "doc": "Output list of intervals, processed per the tool's specifications (union, intersection, split list... ).",
  3188. "label": "Output interval list",
  3189. "type": "File[]?",
  3190. "outputBinding": {
  3191. "glob": "${\n var scatter_count = inputs.scatter_count ? inputs.scatter_count : 1;\n if (scatter_count > 1)\n {\n return \"out/*/*.interval_list\";\n }\n else\n {\n return \"test.interval_list\";\n }\n}",
  3192. "outputEval": "$(inheritMetadata(self, inputs.in_intervals))"
  3193. },
  3194. "sbg:fileTypes": "INTERVAL_LIST"
  3195. }
  3196. ],
  3197. "doc": "This tool offers multiple interval list file manipulation capabilities including sorting, merging, subtracting, padding, and other set-theoretic operations. \n\nThe default action is to merge and sort genomic intervals provided as the input. Compatible input files are INTERVAL_LIST and VCF files. **IntervalListTools** can also \"scatter\" the output into many interval files. This can be useful for creating multiple interval lists for scattering an analysis execution.\n\n###Common Use Cases\n\n- Combine the intervals from two interval lists:\n```\njava -jar picard.jar IntervalListTools \\\n ACTION=CONCAT \\\n I=input.interval_list \\\n I=input_2.interval_list \\\n O=new.interval_list\n```\n- Combine the intervals from two interval lists, sorting and merging overlapping and abutting intervals:\n```\n java -jar picard.jar IntervalListTools \\\n ACTION=CONCAT \\\n SORT=true \\\n UNIQUE=true \\\n I=input.interval_list \\\n I=input_2.interval_list \\\n O=new.interval_list \n```\n- Subtract the intervals in **second_input** (`SECOND_INPUT`) from those in **in_intervals** (`INPUT`):\n```\n java -jar picard.jar IntervalListTools \\\n ACTION=SUBTRACT \\\n I=input.interval_list \\\n SI=input_2.interval_list \\\n O=new.interval_list \n```\n- Find bases that are in either *input1.interval_list* or *input2.interval_list*, and also in *input3.interval_list*:\n```\n java -jar picard.jar IntervalListTools \\\n ACTION=INTERSECT \\\n I=input1.interval_list \\\n I=input2.interval_list \\\n SI=input3.interval_list \\\n O=new.interval_list \n```\n- Split an interval list file using the **scatter_count** (`SCATTER_COUNT`) option:\n```\n java -jar picard.jar IntervalListTools \\\n I=input.interval_list \\\n SCATTER_COUNT=2 \n```\n\n\n###Common Issues and Important Notes\n\n- A SAM style header must be present at the top of the *interval_list* file. After the header, the file then contains records, one per line in text format with the following tab-separated values. 
Example of the *interval_list* file: \n```\n@HD VN:1.0\n@SQ SN:chr1 LN:501\n@SQ SN:chr2 LN:401\nchr1 1 100 + starts at the first base of the contig and covers 100 bases\nchr2 100 100 + interval with exactly one base\n```\n- The coordinate system is 1-based, closed-ended so that the first base in a sequence has position 1, and both the start and the end positions are included in an interval.\n- The **Interval list** input file should have the extension INTERVAL_LIST.\n\n\n###Changes Introduced by Seven Bridges\n\nIf no additional parameter is set, the app will output the INTERVAL_LIST file given on the input.\n\n\n###Performance Benchmarking\nExecution takes several minutes on the default instance. Unless specified otherwise, the default AWS instance used to run the **IntervalListTools** will be c4.2xlarge (8 CPUs and 16 GB RAM).",
  3198. "label": "GATK IntervalListTools",
  3199. "arguments": [
  3200. {
  3201. "prefix": "",
  3202. "shellQuote": false,
  3203. "position": 0,
  3204. "valueFrom": "/opt/gatk-4.1.0.0/gatk"
  3205. },
  3206. {
  3207. "shellQuote": false,
  3208. "position": 1,
  3209. "valueFrom": "--java-options"
  3210. },
  3211. {
  3212. "prefix": "",
  3213. "shellQuote": false,
  3214. "position": 2,
  3215. "valueFrom": "${\n if (inputs.memory_per_job) {\n return '\\\"-Xmx'.concat(inputs.memory_per_job, 'M') + '\\\"';\n }\n return '\\\"-Xms1g\\\"';\n}"
  3216. },
  3217. {
  3218. "shellQuote": false,
  3219. "position": 3,
  3220. "valueFrom": "IntervalListTools"
  3221. },
  3222. {
  3223. "prefix": "",
  3224. "shellQuote": false,
  3225. "position": 4,
  3226. "valueFrom": "${\n var scatter_count = inputs.scatter_count ? inputs.scatter_count : 1;\n if (scatter_count > 1)\n {\n return \"--OUTPUT out\";\n }\n else\n {\n return \"--OUTPUT test.interval_list\";\n }\n \n}"
  3227. },
  3228. {
  3229. "prefix": "",
  3230. "shellQuote": false,
  3231. "position": -1,
  3232. "valueFrom": "${\n return \"mkdir out && \";\n}"
  3233. },
  3234. {
  3235. "prefix": "&& python",
  3236. "shellQuote": false,
  3237. "position": 99,
  3238. "valueFrom": "rename_intervals.py"
  3239. }
  3240. ],
  3241. "requirements": [
  3242. {
  3243. "class": "ShellCommandRequirement"
  3244. },
  3245. {
  3246. "class": "ResourceRequirement",
  3247. "ramMin": "${\n var memory = 2048;\n if (inputs.memory_per_job) \n {\n memory = inputs.memory_per_job;\n }\n if (inputs.memory_overhead_per_job)\n {\n memory += inputs.memory_overhead_per_job;\n }\n return memory;\n}",
  3248. "coresMin": "${\n return inputs.cpu_per_job ? inputs.cpu_per_job : 1\n}"
  3249. },
  3250. {
  3251. "class": "DockerRequirement",
  3252. "dockerPull": "images.sbgenomics.com/marijeta_slavkovic/gatk-4-1-0-0:0"
  3253. },
  3254. {
  3255. "class": "InitialWorkDirRequirement",
  3256. "listing": [
  3257. {
  3258. "entryname": "rename_intervals.py",
  3259. "entry": "import glob, os\n# Works around a JES limitation where multiple files with the same name overwrite each other when globbed\nintervals = sorted(glob.glob(\"out/*/*.interval_list\"))\nfor i, interval in enumerate(intervals):\n (directory, filename) = os.path.split(interval)\n newName = os.path.join(directory, str(i + 1) + filename)\n os.rename(interval, newName)\nprint(len(intervals))",
  3260. "writable": false
  3261. }
  3262. ]
  3263. },
  3264. {
  3265. "class": "InlineJavascriptRequirement",
  3266. "expressionLib": [
  3267. "var updateMetadata = function(file, key, value) {\n file['metadata'][key] = value;\n return file;\n};\n\n\nvar setMetadata = function(file, metadata) {\n if (!('metadata' in file))\n file['metadata'] = metadata;\n else {\n for (var key in metadata) {\n file['metadata'][key] = metadata[key];\n }\n }\n return file\n};\n\nvar inheritMetadata = function(o1, o2) {\n var commonMetadata = {};\n if (!Array.isArray(o2)) {\n o2 = [o2]\n }\n for (var i = 0; i < o2.length; i++) {\n var example = o2[i]['metadata'];\n for (var key in example) {\n if (i == 0)\n commonMetadata[key] = example[key];\n else {\n if (!(commonMetadata[key] == example[key])) {\n delete commonMetadata[key]\n }\n }\n }\n }\n if (!Array.isArray(o1)) {\n o1 = setMetadata(o1, commonMetadata)\n } else {\n for (var i = 0; i < o1.length; i++) {\n o1[i] = setMetadata(o1[i], commonMetadata)\n }\n }\n return o1;\n};\n\nvar toArray = function(file) {\n return [].concat(file);\n};\n\nvar groupBy = function(files, key) {\n var groupedFiles = [];\n var tempDict = {};\n for (var i = 0; i < files.length; i++) {\n var value = files[i]['metadata'][key];\n if (value in tempDict)\n tempDict[value].push(files[i]);\n else tempDict[value] = [files[i]];\n }\n for (var key in tempDict) {\n groupedFiles.push(tempDict[key]);\n }\n return groupedFiles;\n};\n\nvar orderBy = function(files, key, order) {\n var compareFunction = function(a, b) {\n if (a['metadata'][key].constructor === Number) {\n return a['metadata'][key] - b['metadata'][key];\n } else {\n var nameA = a['metadata'][key].toUpperCase();\n var nameB = b['metadata'][key].toUpperCase();\n if (nameA < nameB) {\n return -1;\n }\n if (nameA > nameB) {\n return 1;\n }\n return 0;\n }\n };\n\n files = files.sort(compareFunction);\n if (order == undefined || order == \"asc\")\n return files;\n else\n return files.reverse();\n};",
  3268. "\nvar setMetadata = function(file, metadata) {\n if (!('metadata' in file))\n file['metadata'] = metadata;\n else {\n for (var key in metadata) {\n file['metadata'][key] = metadata[key];\n }\n }\n return file\n};\n\nvar inheritMetadata = function(o1, o2) {\n var commonMetadata = {};\n if (!Array.isArray(o2)) {\n o2 = [o2]\n }\n for (var i = 0; i < o2.length; i++) {\n var example = o2[i]['metadata'];\n for (var key in example) {\n if (i == 0)\n commonMetadata[key] = example[key];\n else {\n if (!(commonMetadata[key] == example[key])) {\n delete commonMetadata[key]\n }\n }\n }\n }\n if (!Array.isArray(o1)) {\n o1 = setMetadata(o1, commonMetadata)\n } else {\n for (var i = 0; i < o1.length; i++) {\n o1[i] = setMetadata(o1[i], commonMetadata)\n }\n }\n return o1;\n};"
  3269. ]
  3270. }
  3271. ],
  3272. "sbg:categories": [
  3273. "Utilities",
  3274. "BED Processing"
  3275. ],
  3276. "sbg:license": "Open source BSD (3-clause) license",
  3277. "sbg:toolAuthor": "Broad Institute",
  3278. "sbg:toolkit": "GATK",
  3279. "sbg:toolkitVersion": "4.1.0.0",
  3280. "sbg:projectName": "GATK 4.1.0.0 - Demo",
  3281. "sbg:revisionsInfo": [
  3282. {
  3283. "sbg:revision": 0,
  3284. "sbg:modifiedBy": "uros_sipetic",
  3285. "sbg:modifiedOn": 1553015434,
  3286. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/14"
  3287. },
  3288. {
  3289. "sbg:revision": 1,
  3290. "sbg:modifiedBy": "uros_sipetic",
  3291. "sbg:modifiedOn": 1553023178,
  3292. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/16"
  3293. },
  3294. {
  3295. "sbg:revision": 2,
  3296. "sbg:modifiedBy": "veliborka_josipovic",
  3297. "sbg:modifiedOn": 1554720910,
  3298. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/17"
  3299. },
  3300. {
  3301. "sbg:revision": 3,
  3302. "sbg:modifiedBy": "veliborka_josipovic",
  3303. "sbg:modifiedOn": 1554999245,
  3304. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/18"
  3305. },
  3306. {
  3307. "sbg:revision": 4,
  3308. "sbg:modifiedBy": "veliborka_josipovic",
  3309. "sbg:modifiedOn": 1559740786,
  3310. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/19"
  3311. },
  3312. {
  3313. "sbg:revision": 5,
  3314. "sbg:modifiedBy": "nens",
  3315. "sbg:modifiedOn": 1605879900,
  3316. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/20"
  3317. }
  3318. ],
  3319. "sbg:image_url": null,
  3320. "sbg:wrapperAuthor": "nemanja.vucic, veliborka_josipovic",
  3321. "sbg:links": [
  3322. {
  3323. "id": "https://software.broadinstitute.org/gatk/documentation/tooldocs/4.1.0.0/picard_util_IntervalListTools.php",
  3324. "label": "Homepage"
  3325. }
  3326. ],
  3327. "sbg:appVersion": [
  3328. "v1.0"
  3329. ],
  3330. "sbg:id": "h-1f87b475/h-80dfb463/h-dccfb347/0",
  3331. "sbg:revision": 5,
  3332. "sbg:revisionNotes": "Copy of veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/20",
  3333. "sbg:modifiedOn": 1605879900,
  3334. "sbg:modifiedBy": "nens",
  3335. "sbg:createdOn": 1553015434,
  3336. "sbg:createdBy": "uros_sipetic",
  3337. "sbg:project": "uros_sipetic/gatk-4-1-0-0-demo",
  3338. "sbg:sbgMaintained": false,
  3339. "sbg:validationErrors": [],
  3340. "sbg:contributors": [
  3341. "nens",
  3342. "veliborka_josipovic",
  3343. "uros_sipetic"
  3344. ],
  3345. "sbg:latestRevision": 5,
  3346. "sbg:publisher": "sbg",
  3347. "sbg:content_hash": "aab91a10937ed903774a8d9c5029169c30945c927c94b851024dd3bdbab22f9ec",
  3348. "sbg:copyOf": "veliborka_josipovic/gatk-4-1-0-0-toolkit-dev/gatk-intervallisttools-4-1-0-0/20"
  3349. },
  3350. "label": "GATK IntervalListTools",
  3351. "sbg:x": -584,
  3352. "sbg:y": -342
  3353. }
  3354. ],
  3355. "hints": [
  3356. {
  3357. "class": "sbg:AWSInstanceType",
  3358. "value": "c5.9xlarge;ebs-gp2;1024"
  3359. }
  3360. ],
  3361. "requirements": [
  3362. {
  3363. "class": "ScatterFeatureRequirement"
  3364. },
  3365. {
  3366. "class": "InlineJavascriptRequirement"
  3367. },
  3368. {
  3369. "class": "StepInputExpressionRequirement"
  3370. }
  3371. ],
  3372. "sbg:projectName": "SBG Public data",
  3373. "sbg:revisionsInfo": [
  3374. {
  3375. "sbg:revision": 0,
  3376. "sbg:modifiedBy": "admin",
  3377. "sbg:modifiedOn": 1572002745,
  3378. "sbg:revisionNotes": null
  3379. },
  3380. {
  3381. "sbg:revision": 1,
  3382. "sbg:modifiedBy": "admin",
  3383. "sbg:modifiedOn": 1572002745,
  3384. "sbg:revisionNotes": "v25 dev"
  3385. },
  3386. {
  3387. "sbg:revision": 2,
  3388. "sbg:modifiedBy": "admin",
  3389. "sbg:modifiedOn": 1572002745,
  3390. "sbg:revisionNotes": "v32 - dev"
  3391. },
  3392. {
  3393. "sbg:revision": 3,
  3394. "sbg:modifiedBy": "admin",
  3395. "sbg:modifiedOn": 1572002745,
  3396. "sbg:revisionNotes": "secondary files added"
  3397. },
  3398. {
  3399. "sbg:revision": 4,
  3400. "sbg:modifiedBy": "admin",
  3401. "sbg:modifiedOn": 1572002745,
  3402. "sbg:revisionNotes": "Description update"
  3403. },
  3404. {
  3405. "sbg:revision": 5,
  3406. "sbg:modifiedBy": "admin",
  3407. "sbg:modifiedOn": 1572002745,
  3408. "sbg:revisionNotes": "Description improved - performance benchmarking results added"
  3409. },
  3410. {
  3411. "sbg:revision": 6,
  3412. "sbg:modifiedBy": "admin",
  3413. "sbg:modifiedOn": 1584986756,
  3414. "sbg:revisionNotes": "dev v36 - with requirements for cwltool"
  3415. },
  3416. {
  3417. "sbg:revision": 7,
  3418. "sbg:modifiedBy": "admin",
  3419. "sbg:modifiedOn": 1584986756,
  3420. "sbg:revisionNotes": "dev - v40"
  3421. },
  3422. {
  3423. "sbg:revision": 8,
  3424. "sbg:modifiedBy": "admin",
  3425. "sbg:modifiedOn": 1584986756,
  3426. "sbg:revisionNotes": "dev v41 - haplotypecaller update, some parameters pre-defined, some parameters are not exposed any more"
  3427. },
  3428. {
  3429. "sbg:revision": 9,
  3430. "sbg:modifiedBy": "admin",
  3431. "sbg:modifiedOn": 1584986756,
  3432. "sbg:revisionNotes": "some parameters exposed, some put to default value"
  3433. },
  3434. {
  3435. "sbg:revision": 10,
  3436. "sbg:modifiedBy": "admin",
  3437. "sbg:modifiedOn": 1584986756,
  3438. "sbg:revisionNotes": "dev v43"
  3439. },
  3440. {
  3441. "sbg:revision": 11,
  3442. "sbg:modifiedBy": "admin",
  3443. "sbg:modifiedOn": 1612280770,
  3444. "sbg:revisionNotes": "CATEGORIES: GATK added"
  3445. },
  3446. {
  3447. "sbg:revision": 12,
  3448. "sbg:modifiedBy": "admin",
  3449. "sbg:modifiedOn": 1612280771,
  3450. "sbg:revisionNotes": "Lighter docker images used; WF name contains GATK"
  3451. }
  3452. ],
  3453. "sbg:image_url": "https://cgc.sbgenomics.com/ns/brood/images/admin/sbg-public-data/gatk-best-practice-generic-germline-short-variant-per-sample-cal/12.png",
  3454. "sbg:license": "BSD 3-Clause License",
  3455. "sbg:wrapperAuthor": "nevena.ilic.raicevic@sbgenomics.com",
  3456. "sbg:toolAuthor": "Broad Institute",
  3457. "sbg:categories": [
  3458. "Genomics",
  3459. "Variant Calling",
  3460. "GATK"
  3461. ],
  3462. "sbg:links": [
  3463. {
  3464. "id": "https://github.com/gatk-workflows/gatk4-germline-snps-indels",
  3465. "label": "Homepage"
  3466. },
  3467. {
  3468. "id": "https://github.com/gatk-workflows/gatk4-germline-snps-indels/blob/master/haplotypecaller-gvcf-gatk4.wdl",
  3469. "label": "Source Code"
  3470. },
  3471. {
  3472. "id": "https://github.com/broadinstitute/gatk/releases/download/4.1.0.0/gatk-4.1.0.0.zip",
  3473. "label": "Download"
  3474. },
  3475. {
  3476. "id": "https://www.ncbi.nlm.nih.gov/pubmed?term=20644199",
  3477. "label": "Publication"
  3478. },
  3479. {
  3480. "id": "https://software.broadinstitute.org/gatk/documentation/tooldocs/current/",
  3481. "label": "Documentation"
  3482. }
  3483. ],
  3484. "sbg:expand_workflow": false,
  3485. "sbg:appVersion": [
  3486. "v1.0"
  3487. ],
  3488. "id": "https://cgc-api.sbgenomics.com/v2/apps/admin/sbg-public-data/gatk-best-practice-generic-germline-short-variant-per-sample-cal/12/raw/",
  3489. "sbg:id": "admin/sbg-public-data/gatk-best-practice-generic-germline-short-variant-per-sample-cal/12",
  3490. "sbg:revision": 12,
  3491. "sbg:revisionNotes": "Lighter docker images used; WF name contains GATK",
  3492. "sbg:modifiedOn": 1612280771,
  3493. "sbg:modifiedBy": "admin",
  3494. "sbg:createdOn": 1572002745,
  3495. "sbg:createdBy": "admin",
  3496. "sbg:project": "admin/sbg-public-data",
  3497. "sbg:sbgMaintained": false,
  3498. "sbg:validationErrors": [],
  3499. "sbg:contributors": [
  3500. "admin"
  3501. ],
  3502. "sbg:latestRevision": 12,
  3503. "sbg:publisher": "sbg",
  3504. "sbg:content_hash": "aea62e43789ffc2afb810948c704de718fb2511170604b057d4afe8fb24161da5"
  3505. }
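
The **GATK IntervalListTools** step above stages a small `rename_intervals.py` helper that runs after the scatter, so that the per-shard files picked up by the `out/*/*.interval_list` glob no longer share one file name. The following is a minimal Python sketch of that renaming logic, made callable on any directory. The `rename_scattered` and `demo` names, the `temp_0001_of_3` style shard directories and the `scattered.interval_list` file name are assumptions used only for illustration; the embedded script itself works on `out/` in the job's working directory.

```python
import glob
import os
import tempfile


def rename_scattered(root="out"):
    """Prefix each scattered interval list with a 1-based index.

    Mirrors the logic of the embedded rename_intervals.py, but takes the
    output directory as a parameter (an assumption made for testability).
    """
    intervals = sorted(glob.glob(os.path.join(root, "*", "*.interval_list")))
    renamed = []
    for i, interval in enumerate(intervals):
        directory, filename = os.path.split(interval)
        new_name = os.path.join(directory, str(i + 1) + filename)
        os.rename(interval, new_name)
        renamed.append(new_name)
    return renamed


def demo(shards=3):
    """Create a throwaway scatter layout (names are hypothetical) and rename it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "out")
        for n in range(1, shards + 1):
            shard_dir = os.path.join(root, "temp_%04d_of_%d" % (n, shards))
            os.makedirs(shard_dir)
            path = os.path.join(shard_dir, "scattered.interval_list")
            with open(path, "w") as handle:
                handle.write("@HD\tVN:1.0\n")
        # Return only the file names so the result does not depend on tmp paths.
        return [os.path.basename(p) for p in rename_scattered(root)]


print(demo())
```

In the tool definition the `mkdir out` prefix and the trailing `python rename_intervals.py` call are unconditional arguments, so when **scatter_count** is 1 the script finds no matching files and renames nothing.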
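
The memory handling in the **GATK IntervalListTools** tool is spread over two JavaScript expressions: `ramMin` in the `ResourceRequirement` and the `--java-options` argument. As a rough illustration, the sketch below mirrors both expressions in Python; the function names `ram_min_mb` and `java_options` are hypothetical and do not appear in the workflow.

```python
def ram_min_mb(memory_per_job=None, memory_overhead_per_job=None):
    """Mirror of the ramMin expression: MB requested from the platform."""
    memory = 2048  # fallback used when memory_per_job is not set
    if memory_per_job:
        memory = memory_per_job
    if memory_overhead_per_job:
        # Overhead is added to the platform request only, never to -Xmx.
        memory += memory_overhead_per_job
    return memory


def java_options(memory_per_job=None):
    """Mirror of the --java-options expression: the quoted JVM flag."""
    if memory_per_job:
        return '"-Xmx' + str(memory_per_job) + 'M"'
    return '"-Xms1g"'


# The workflow step sets memory_per_job to 2000 and leaves the overhead unset.
print(ram_min_mb(2000), java_options(2000))
```

With the step default of 2000 MB, the platform request and the Java heap limit are the same number. Setting **memory_overhead_per_job** raises only the former, which is what the input's description calls leaving room for stack memory.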