I recently received my NGS sequencing data and have been analysing it on Ubuntu for a few days. However, I lack the basics of shell scripting, and I feel overwhelmed by this whole new language.
I have managed to follow pipelines, but I still run into beginner issues.
Specifically, I have a folder with 96 files that I want to rename. They are typically of the form:
AD18_S1_R2_cat_trimmed.fastq.gz
AD19_S26_R2_cat_trimmed.fastq.gz
Basically, I am trying to delete the sample ID, for instance _S1 and _S26.
I recently discovered asterisks (wildcards) and used them successfully in a previous command, but I have trouble seeing how to use them here.
What I think would work is to extract the expression between _S and _R and remove it, while keeping the R.
If the sample ID always had the same length, I would have used [5-7] to remove the characters at those positions from the name. But that won't work here, because the ID is longer for some samples (_S1 versus _S26).
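A minimal sketch of that idea, assuming the sample ID is always _S followed by one or more digits and is immediately followed by _R. The function name strip_sample_id is made up for illustration, and the loop is a dry run: it only prints the mv commands instead of renaming anything.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the new name for one file name.
strip_sample_id() {
    # sed -E 's/_S[0-9]+_R/_R/' reads as:
    #   _S       the literal characters "_S"
    #   [0-9]+   one or more digits (so both S1 and S26 match)
    #   _R       the literal characters "_R"
    # The whole match is replaced by "_R", which keeps the read number.
    printf '%s\n' "$1" | sed -E 's/_S[0-9]+_R/_R/'
}

# Dry run over the files in the current folder.
# The * in the glob matches any run of characters, so this selects
# every .fastq.gz file whose name contains "_S" and later "_R".
for f in *_S*_R*.fastq.gz; do
    [ -e "$f" ] || continue   # skip if the glob matched nothing
    echo mv -- "$f" "$(strip_sample_id "$f")"
done
```

If the printed commands look right, removing the word echo from the loop would perform the actual renaming. Testing on a copy of the folder first is safer, since mv overwrites an existing target without asking.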
I want to understand how to do this, more than just having the answer. So, could you kindly explain to me how to make this change and, if you are willing to share a solution, what your code means?