- Get sequence length for all reference contigs
- ```{bash}
- awk '/^>/ { if (name != "") print name "\t" len; name = substr($0, 2, 100); len = 0; next } { len += length($0) } END { if (name != "") print name "\t" len }' data/REF/reference.fasta > results/fasta.length
- ```
- Import and analyze to get summary stats
- ```{r}
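- library(readr)  # provides read_delim()
- ref_stats <- read_delim("results/fasta.length", delim = "\t",
-                         col_names = c("CONTIG", "LENGTH"))
- sum(ref_stats$LENGTH)     # total sequence length across all contigs
- length(ref_stats$CONTIG)  # number of contigs
- mean(ref_stats$LENGTH)    # mean contig length
- Mode(ref_stats$LENGTH)    # most frequent length; Mode() is not base R (e.g. DescTools::Mode)
- min(ref_stats$LENGTH)     # shortest contig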
- ```
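The length-counting step can be sanity-checked on a tiny input before running it on the real reference. The following is only a sketch: the toy FASTA, its contig names, and the temporary file names are made up for illustration. It counts bases per record in the same way, and is written so that nothing is printed before the first header line.

```bash
# Hypothetical two-contig FASTA written to a temporary directory
tmp=$(mktemp -d)
printf '>ctg1 toy\nACGT\nAC\n>ctg2\nGGG\n' > "$tmp/toy.fasta"

# One "name<TAB>length" line per record; sequence lines are summed per header
awk '/^>/ { if (name != "") print name "\t" len; name = substr($0, 2, 100); len = 0; next }
     { len += length($0) }
     END { if (name != "") print name "\t" len }' "$tmp/toy.fasta" > "$tmp/toy.length"

cat "$tmp/toy.length"

# Total length, for comparison with sum(ref_stats$LENGTH) in R
awk -F '\t' '{ total += $2 } END { print total }' "$tmp/toy.length"
```

For this toy input the table should hold two rows (lengths 6 and 3) and the total should be 9. A blank or header-less first row in the output would point to a problem in the counting logic.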