Light1992

Script file

Sep 10th, 2019
(base) [genomica@localhost RSEM]$ sudo ./rsem-generate-ngvector
Invalid number of arguments!
NAME
    rsem-generate-ngvector - Create Ng vector for EBSeq based only on
    transcript sequences.

SYNOPSIS
    rsem-generate-ngvector [options] input_fasta_file output_name

ARGUMENTS
    input_fasta_file
        The fasta file containing all reference transcripts. The transcripts
        must be in the same order as those in expression value files. Thus,
        'reference_name.transcripts.fa' generated by
        'rsem-prepare-reference' should be used.

    output_name
        The name of all output files. The Ng vector will be stored as
        'output_name.ngvec'.

OPTIONS
    -k <int>
        k mer length. See description section. (Default: 25)

    -h/--help
        Show help information.

DESCRIPTION
    This program generates the Ng vector required by EBSeq for isoform level
    differential expression analysis based on reference sequences only.
    EBSeq can take variance due to read mapping ambiguity into consideration
    by grouping isoforms with parent gene's number of isoforms. However, for
    de novo assembled transcriptome, it is hard to obtain an accurate
    gene-isoform relationship. Instead, this program groups isoforms by
    using measures on read mapping ambiguity directly. First, it calculates
    the 'unmappability' of each transcript. The 'unmappability' of a
    transcript is the ratio between the number of k mers with at least one
    perfect match to other transcripts and the total number of k mers of
    this transcript, where k is a parameter. Then, Ng vector is generated by
    applying Kmeans algorithm to the 'unmappability' values with number of
    clusters set as 3. 'rsem-generate-ngvector' will make sure the mean
    'unmappability' scores for clusters are in ascending order. All
    transcripts whose lengths are less than k are assigned to cluster 3.

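The 'unmappability' ratio and the 3-cluster grouping described above can be sketched in a few lines of Python. This is only an illustration of the description, not RSEM's actual implementation: the function names (`kmers`, `unmappability`, `kmeans_1d`, `ng_vector`) are made up here, reverse-complement matches are ignored, and the k-means start points (spread evenly over the observed range) are an assumption, so real output may differ.

```python
def kmers(seq, k):
    """Return all k-mers of seq in order (empty list if len(seq) < k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def unmappability(transcripts, k=25):
    """Map transcript name -> fraction of its k-mers that also occur in at
    least one other transcript. Transcripts shorter than k map to None."""
    # For every k-mer, remember which transcripts contain it.
    owners = {}
    for name, seq in transcripts.items():
        for km in set(kmers(seq, k)):
            owners.setdefault(km, set()).add(name)
    scores = {}
    for name, seq in transcripts.items():
        kms = kmers(seq, k)
        if not kms:
            scores[name] = None
            continue
        shared = sum(1 for km in kms if len(owners[km]) > 1)
        scores[name] = shared / len(kms)
    return scores


def kmeans_1d(values, n_clusters=3, n_iter=100):
    """Plain Lloyd's algorithm on scalars; returns centers in ascending order."""
    lo, hi = min(values), max(values)
    # Assumed deterministic start: centers spread evenly over the range.
    centers = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(n_clusters), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def ng_vector(transcripts, k=25):
    """Cluster label (1, 2 or 3) per transcript, in input order.
    Clusters are numbered by ascending mean score; transcripts shorter
    than k go to cluster 3, as the help text states."""
    scores = unmappability(transcripts, k)
    known = [s for s in scores.values() if s is not None]
    centers = kmeans_1d(known) if known else []
    labels = []
    for name in transcripts:
        s = scores[name]
        if s is None:
            labels.append(3)
        else:
            labels.append(1 + min(range(3), key=lambda c: abs(s - centers[c])))
    return labels
```

With a toy input such as `ng_vector({"t1": "AAAAAA", "t2": "AAAAAA", "t3": "ACGTAC", "t4": "GG"}, k=3)`, the two identical transcripts share every k-mer and land in the highest cluster, the unique one lands in the lowest, and the too-short one is forced into cluster 3.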
    If your reference is a de novo assembled transcript set, you should run
    'rsem-generate-ngvector' first. Then load the resulting
    'output_name.ngvec' into R. For example, you can use

        NgVec <- scan(file="output_name.ngvec", what=0, sep="\n")

    After that, replace 'IsoNgTrun' with 'NgVec' in the second line of
    section 3.2.5 (Page 10) of EBSeq's vignette:

        IsoEBres=EBTest(Data=IsoMat, NgVector=NgVec, ...)

    This program only needs to run once per RSEM reference.

OUTPUT
    output_name.ump
        'unmappability' scores for each transcript. This file contains two
        columns. The first column is transcript name and the second column
        is 'unmappability' score.

    output_name.ngvec
        Ng vector generated by this program.

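For inspecting the two output files outside R, a small Python reader might look like the following. It is a sketch under assumptions the help text does not confirm: that the two '.ump' columns are separated by whitespace, that '.ngvec' holds one value per line (suggested by the `scan(..., sep="\n")` call above), and that both files list transcripts in the same order. The function names are hypothetical.

```python
def parse_ump(lines):
    """Return [(transcript_name, score), ...] from 'output_name.ump' lines.
    Assumes whitespace-separated columns: name first, score last."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pairs.append((fields[0], float(fields[-1])))
    return pairs


def parse_ngvec(lines):
    """Return the Ng vector as a list of ints, one value per non-empty line."""
    return [int(float(line)) for line in lines if line.strip()]


def label_by_transcript(ump_lines, ngvec_lines):
    """Pair each transcript name with its cluster label.
    Assumes both files follow the same transcript order."""
    names = [name for name, _ in parse_ump(ump_lines)]
    labels = parse_ngvec(ngvec_lines)
    if len(names) != len(labels):
        raise ValueError("'.ump' and '.ngvec' have different lengths")
    return dict(zip(names, labels))
```

In practice the line lists would come from `open("output_name.ump")` and `open("output_name.ngvec")`; the length check is there to catch a mismatched pair of files early.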
EXAMPLES
    Suppose the reference sequences file is
    '/ref/mouse_125/mouse_125.transcripts.fa' and we set the output_name as
    'mouse_125':

        rsem-generate-ngvector /ref/mouse_125/mouse_125.transcripts.fa mouse_125
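
The "Invalid number of arguments!" message at the top of this log appears because the program was started with no arguments, while the synopsis requires both input_fasta_file and output_name. When the call is scripted, a small wrapper can check this before launching anything. The sketch below is an assumption-laden illustration: `build_ngvector_command` is a hypothetical helper, and only the `-k` option and the two positional arguments shown in the help text are used.

```python
import shlex


def build_ngvector_command(input_fasta, output_name, k=None,
                           exe="rsem-generate-ngvector"):
    """Build the argument list following the synopsis:
    rsem-generate-ngvector [options] input_fasta_file output_name"""
    if not input_fasta or not output_name:
        raise ValueError("both input_fasta_file and output_name are required")
    cmd = [exe]
    if k is not None:
        cmd += ["-k", str(int(k))]  # k-mer length; the tool's default is 25
    cmd += [input_fasta, output_name]
    return cmd


cmd = build_ngvector_command(
    "/ref/mouse_125/mouse_125.transcripts.fa", "mouse_125")
print(shlex.join(cmd))
# To actually run it (requires RSEM to be installed and on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems if the FASTA path contains spaces.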