Tuesday, May 14, 2024
3
rated 0 times [  3] [ 0]  / answers: 1 / hits: 8039  / 1 Year ago, sat, february 18, 2023, 1:16:12

I want to read parts of a big csv file between rows n and m and between columns p and q.
Is there an easy way to do this easily with the shell? (Are there commands I should read the doc's? otherwise, I'll write a python script)


More From » command-line

 Answers
7

I had a script that I adjusted (good idea the (N+1)q part!) thanks to @chronitis comment and the SO answer:



#! /bin/bash 
#
N=10
M=20
P=2
Q=3
sed -n "$N,${M}p; $((M+1))q" $1 | cut -d, -f$P-$Q


Save the file as for example cut_csv, make it executable and use as



cut_csv file


It can be made fancier by accepting the N,M,P,Q parameters as input etc, but I use it seldomly so I normally simply edit the file.



How it works:



The main command is the following (let's suppose N=10, M=20, P=2, Q=3); the shell substitutes the variables and the last line will become: (1)



sed -n "10,20p ; 21q" file | cut -d, -f2-3


Let's start with the first command:



sed -n "10,20p ; 21q" file 


This call sed (stream editor, man sed) in no-print mode (-n) and execute the following commands on the file:




  1. print (p) the lines between 10 and 20 (this is the 10,20p part)

  2. quit (q) when reading line 21 (21q) so that discard the rest of the file



The output of sed is piped (|) to cut:



cut -d, -f2-3


This command (man cut) selects fields of a line (and repeat for each line). In this case, I am telling it that the separator between fields (columns) is a commad (-d,), and to print out the columns between 2 and 3.



As another more complex example I often use this one:



sed -n "1p; 10,14p; 21q" data.csv | cut -d, -f1,4-8


This will select row 1 (where I have titles :-)) and rows from 10 to 14 (5 lines); then select columns 1 (time in my data...) and column from 4 to 8. It is really powerful once you get grips with it.



(1) one great way to see what the shell is doing is change the first line (which is called a shebang) like that:



#! /bin/bash -xv


The shell will now print every command it reads and the result of the substitutions:



(0)asus-rmano: part_of_csv.sh p20dedo.csv
#! /bin/bash -xv
#
N=10
+ N=10
M=20
+ M=20
P=2
+ P=2
Q=3
+ Q=3
sed -n "$N,${M}p; $((M+1))q" $1 | cut -d, -f$P-$Q
+ cut -d, -f2-3
+ sed -n '10,20p; 21q' p20dedo.csv
16:05:49,000
16:05:51,000
[...]

[#27043] Saturday, February 18, 2023, 1 Year  [reply] [flag answer]
Only authorized users can answer the question. Please sign in first, or register a free account.
admin

Total Points: 459
Total Questions: 112
Total Answers: 109

Location: Equatorial Guinea
Member since Sun, Feb 14, 2021
3 Years ago
admin questions
Wed, Aug 24, 22, 07:25, 2 Years ago
Wed, Mar 2, 22, 07:09, 2 Years ago
Thu, May 13, 21, 00:32, 3 Years ago
Mon, Jan 17, 22, 14:49, 2 Years ago
;