Untitled

a guest, Sep 19th, 2019
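Get sequence length for all reference contigs

```{bash}

# Write one "<contig>\t<length>" line per FASTA record (no leading blank line; header is never used as a printf format)
awk '/^>/ { if (seen) print c; seen = 1; c = 0; printf "%s\t", substr($0, 2, 100); next } { c += length($0) } END { if (seen) print c }' data/REF/reference.fasta > results/fasta.length

```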
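
As a quick sanity check, the same awk logic can be run on a tiny FASTA file whose lengths are known in advance. This is only a sketch: the function name `fasta_lengths` and the demo file contents are made up for illustration and are not part of the original paste.

```shell
# Hypothetical helper wrapping the awk program:
# prints "<contig>\t<length>" for every record in the FASTA file given as $1.
fasta_lengths() {
  awk '/^>/ { if (name != "") print name "\t" len   # flush the previous record
              name = substr($0, 2, 100); len = 0; next }
       { len += length($0) }                        # sequence lines may wrap
       END { if (name != "") print name "\t" len }' "$1"
}

# Made-up two-contig FASTA; contig_1 is split over two lines (6 + 2 bases).
demo=$(mktemp)
printf '>contig_1\nACGTAC\nGT\n>contig_2\nACG\n' > "$demo"

fasta_lengths "$demo"   # prints contig_1 with length 8 and contig_2 with length 3
rm -f "$demo"
```

If the lengths printed for a file like this match what you count by hand, the one-liner is summing wrapped sequence lines correctly.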

Import and analyze to get summary stats

```{r}

library(readr)  # provides read_delim()

ref_stats <- read_delim("results/fasta.length", delim = "\t",
                        col_names = c("CONTIG", "LENGTH"))

sum(ref_stats$LENGTH)     # total reference length (bp)

length(ref_stats$CONTIG)  # number of contigs

mean(ref_stats$LENGTH)    # mean contig length

# Mode() is not base R; this assumes a package that provides it (e.g. DescTools) is loaded
Mode(ref_stats$LENGTH)

min(ref_stats$LENGTH)     # shortest contig

```
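
The totals from the R chunk can also be cross-checked on the command line. The sketch below assumes the two-column, tab-separated layout of `results/fasta.length`; the function name `length_summary` and the demo values are hypothetical, and the mode is left out for brevity.

```shell
# Hypothetical helper: prints total, count, mean and minimum of column 2
# of a tab-separated "<contig>\t<length>" file given as $1.
length_summary() {
  awk -F '\t' '
    NF >= 2 {
      len = $2 + 0                      # force numeric comparison
      n++; total += len
      if (n == 1 || len < min) min = len
    }
    END {
      if (n == 0) { print "no records"; exit 1 }
      printf "total\t%.0f\ncontigs\t%d\nmean\t%.1f\nmin\t%d\n", total, n, total / n, min
    }' "$1"
}

# Made-up lengths standing in for results/fasta.length
demo=$(mktemp)
printf 'contig_1\t8\ncontig_2\t3\ncontig_3\t10\n' > "$demo"
length_summary "$demo"
rm -f "$demo"
```

On the real data this would be called as `length_summary results/fasta.length`, and the total, count, mean and minimum should agree with the values R reports.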