Friday, May 3, 2024
Sat, May 20, 2023, 3:07:40

I want to print all lines of the input except the last three, using awk only. Note that my file can contain any number of lines.



For example,



file.txt contains,



foo
bar
foobar
barfoo
last
line


I want the output to be,



foo
bar
foobar


I know it's possible by combining tac with sed, or tac with awk:



$ tac file | sed '1,3d' | tac
foo
bar
foobar

$ tac file | awk 'NR==1{next}NR==2{next}NR==3{next}1' | tac
foo
bar
foobar


But I want to produce this output using awk only.



 Answers

It's ever so clunky, but you can add every line to an array and, at the end, when you know the length, output everything but the last 3 lines.



... | awk '{l[NR] = $0} END {for (i=1; i<=NR-3; i++) print l[i]}'


Another (more efficient here) approach is manually stacking in three variables:



... | awk 'NR>3 {print a} {a=b; b=c; c=$0}'


a only prints once a line has moved from c to b and then into a, which happens from the fourth input line onward, so output always trails input by three lines. Guarding on NR>3 rather than on the truthiness of a matters: a test like if (a) would silently drop empty lines and lines consisting only of 0. The immediate upsides are that it doesn't store all the content in memory and it shouldn't cause buffering issues (call fflush() after printing if it does). The downside is that it's not simple to scale up: if you want to skip the last 100 lines, you need 100 variables and 100 variable juggles.



If awk had push and pop operators for arrays, it would be easier.
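Lacking push and pop, one way to get a similar effect is to index an array modulo n so that it behaves like a fixed-size ring. The following is only a sketch under that assumption: drop_last is a hypothetical helper name, buf is an arbitrary array name, and n must be at least 1 (n=0 would divide by zero in NR % n).

```shell
# drop_last N: print stdin minus its last N lines (hypothetical helper, N >= 1)
drop_last() {
    awk -v n="$1" '
        NR > n { print buf[NR % n] }   # this slot still holds the line read n lines ago
        { buf[NR % n] = $0 }           # then overwrite it with the current line
    '
}

# Demo on six lines of input
printf '%s\n' 1 2 3 4 5 6 | drop_last 3
```

Only n lines are held in memory at any time, and changing how many trailing lines to drop is just a matter of passing a different argument.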



Or we could pre-calculate the number of lines and how far we actually want to go with $(($(wc -l < file) - 3)). This is relatively useless for streamed content but on a file, works pretty well:



awk -v n="$(( $(wc -l < file) - 3 ))" 'NR<=n' file
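If the counting has to happen inside awk as well, the same idea can be sketched by naming the file twice, so that awk counts lines on the first pass and prints on the second. This is an illustrative sketch, not a quoted solution: drop_last3_twopass, total, and the scratch file are all made-up names, and like the wc variant it only works on a regular file, not on a pipe.

```shell
# Scratch file holding the sample input from the question (demo only)
tmp=$(mktemp)
printf '%s\n' foo bar foobar barfoo last line > "$tmp"

# Hypothetical helper: reads the file named in $1 twice
drop_last3_twopass() {
    awk 'NR == FNR { total++; next }   # pass 1: only count the lines
         FNR <= total - 3              # pass 2: print all but the last 3
    ' "$1" "$1"
}

drop_last3_twopass "$tmp"
```

NR == FNR is true only while awk is reading the first file argument, which is what separates the counting pass from the printing pass.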


Typically you'd just use head though (the negative line count is a GNU extension, so it may not work with other head implementations):



$ seq 6 | head -n-3
1
2
3





Using terdon's benchmark we can actually see how these compare. I thought I'd offer a full comparison though:




  • head: 0.018s (me)

  • awk + wc: 0.169s (me)

  • awk 3 variables: 0.178s (me)

  • awk double-file: 0.322s (terdon)

  • awk circular buffer: 0.355s (Scrutinizer)

  • awk for-loop: 0.693s (me)



The fastest solution is to let a C-optimised utility like head or wc handle the heavy lifting, but in pure awk, the manually rotating stack is king for now.


[#24853] Sunday, May 21, 2023
termetalli
