Untitled

a guest
Sep 19th, 2019
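Get the sequence length of every contig in the reference FASTA

```{bash}
# Print "<contig name><TAB><length>" per record; emitting each record at the next header (or END) avoids a blank first line.
awk '/^>/ {if (name != "") print name "\t" len; name = substr($0, 2, 100); len = 0; next} {len += length($0)} END {if (name != "") print name "\t" len}' data/REF/reference.fasta > results/fasta.length
```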
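
To make the output format of the length table concrete, here is a minimal sketch that runs the same kind of awk logic on a toy FASTA. The function name `fasta_lengths` and the toy contigs are hypothetical, introduced only for illustration, and this variant keeps the full header rather than capping it at 100 characters.

```shell
# Hypothetical helper (name is illustrative): per-record sequence length of a FASTA
# read from the given files, or from stdin when no file is passed.
fasta_lengths() {
  awk '
    /^>/ { if (name != "") print name "\t" len; name = substr($0, 2); len = 0; next }
         { len += length($0) }
    END  { if (name != "") print name "\t" len }
  ' "$@"
}

# Toy input (made up): two contigs, the first wrapped over two sequence lines.
toy_lengths=$(printf '>contig_1\nACGT\nAC\n>contig_2\nGGG\n' | fasta_lengths)
printf '%s\n' "$toy_lengths"
```

Each output line holds a contig name and its length separated by a tab, so `contig_1` is reported with length 6 (4 + 2 bases) and `contig_2` with length 3.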
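
Import the length table and compute summary statistics

```{r}
library(readr)  # provides read_delim()

ref_stats <- read_delim("results/fasta.length", delim = "\t",
                        col_names = c("CONTIG", "LENGTH"))

# Base R has no statistical Mode() (mode() returns the storage type), so a small
# helper is defined here; this is an assumption about what the original call meant.
Mode <- function(x) { ux <- unique(x); ux[which.max(tabulate(match(x, ux)))] }

sum(ref_stats$LENGTH)     # total length of all contigs
length(ref_stats$CONTIG)  # number of contigs
mean(ref_stats$LENGTH)    # mean contig length
Mode(ref_stats$LENGTH)    # most frequent contig length
min(ref_stats$LENGTH)     # shortest contig
```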
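
As a quick cross-check of the numbers reported in R, the total, count, mean, and minimum can also be computed directly from the tab-separated table in the shell. This is only a sketch: the function name `length_summary` and the toy rows are hypothetical, and the mode is not covered here.

```shell
# Hypothetical helper (name is illustrative): summarise a CONTIG<TAB>LENGTH table
# read from the given files, or from stdin when no file is passed.
length_summary() {
  awk -F '\t' '
    NF >= 2 {
      len = $2 + 0                       # force numeric interpretation
      n++; total += len
      if (n == 1 || len < min) min = len
    }
    END {
      if (n > 0)
        printf "total=%.0f contigs=%d mean=%.1f min=%.0f\n", total, n, total / n, min
    }
  ' "$@"
}

# Toy table (made up) in the same layout as results/fasta.length.
toy_summary=$(printf 'contig_1\t6\ncontig_2\t3\n' | length_summary)
printf '%s\n' "$toy_summary"
```

On the real data the call would be `length_summary results/fasta.length`; the total and minimum are printed with `%.0f` so that large genome sizes are not limited by an awk implementation's integer formatting.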