Saturday, April 27, 2024
Asked Friday, October 1, 2021, 3:48:28

To be precise, suppose I have the following text:



Some text
begin
Some text goes here.
end
Some more text


and I want to extract the entire block that starts at "begin" and runs through "end".



With awk we can do this with awk '/begin/,/end/' text.
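For reference, the awk range form mentioned above can be tried on the sample text as follows. This is only a minimal sketch: the scratch file and the variable names sample and awk_block are made up for illustration.

```shell
# Hypothetical scratch file holding the sample text from the question.
sample=$(mktemp)
printf 'Some text\nbegin\nSome text goes here.\nend\nSome more text\n' > "$sample"

# The range pattern /begin/,/end/ prints every line from the first line
# matching "begin" through the next line matching "end", inclusive.
awk_block=$(awk '/begin/,/end/' "$sample")
printf '%s\n' "$awk_block"

rm -f "$sample"
```

Note that both patterns are unanchored here, so any line merely containing "begin" or "end" would also start or stop the range.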



How can I do this with grep?



 Answers

Updated 18-Nov-2016 (grep's behavior has changed: grep with the -P option no longer supports the ^ and $ anchors [on Ubuntu 16.04 with kernel 4.4.0-21-generic]) (wrong (non-)fix)


$ grep -Pzo "begin(.|\n)*\nend" file
begin
Some text goes here.
end
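The newline-based command above can be reproduced end to end with a small script. This is a sketch under a few assumptions: GNU grep built with PCRE support (-P), a scratch file created with mktemp, and single quotes around the pattern instead of the double quotes used in the answer. The names sample and grep_block are illustrative only.

```shell
# Hypothetical scratch file holding the sample text from the question.
sample=$(mktemp)
printf 'Some text\nbegin\nSome text goes here.\nend\nSome more text\n' > "$sample"

# -z makes grep read the whole file as one record, so \n can be matched.
# -o prints only the matched part; with -z each match ends in a NUL byte,
# which tr strips so the shell can capture the text cleanly.
grep_block=$(grep -Pzo 'begin(.|\n)*\nend' "$sample" | tr -d '\0')
printf '%s\n' "$grep_block"

rm -f "$sample"
```

Because (.|\n)* is greedy, a file containing several begin/end blocks would be matched from the first "begin" to the last "end" in one piece.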

Note: for the other commands below, just replace the '^' and '$' anchors with the newline escape '\n'.

______________________________


With grep command:


grep -Pzo "^begin\$(.|\n)*^end$" file

If you don't want to include the patterns "begin" and "end" in the result, use grep with lookbehind and lookahead support.


grep -Pzo "(?<=^begin$\n)(.|\n)*(?=\n^end$)" file

You can also use the \K notation instead of the lookbehind assertion.


grep -Pzo "^begin$\n\K(.|\n)*(?=\n^end$)" file

The \K escape ignores everything matched before it, so the preceding pattern itself is excluded from the result.

The \n is used to avoid printing empty lines in the output.
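To see the effect of \K and the lookahead in isolation, here is a sketch that uses the newline form of the pattern (no ^ or $ anchors), so it does not depend on how a given grep version treats anchors under -z. It assumes GNU grep with PCRE support; the names sample and inner are made up for illustration.

```shell
# Hypothetical scratch file holding the sample text from the question.
sample=$(mktemp)
printf 'Some text\nbegin\nSome text goes here.\nend\nSome more text\n' > "$sample"

# "begin\n" must match, but \K drops it from the reported match.
# (?=\nend) requires "\nend" to follow without consuming it.
# The NUL byte that -z appends to each match is stripped with tr.
inner=$(grep -Pzo 'begin\n\K(.|\n)*(?=\nend)' "$sample" | tr -d '\0')
printf '%s\n' "$inner"

rm -f "$sample"
```

Only the text between the two marker lines is printed, without a leading or trailing empty line, because the newlines next to "begin" and "end" sit outside the reported match.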


Or, as @AvinashRaj suggests, there are simpler grep commands, as follows:


grep -Pzo "(?s)^begin$.*?^end$" file

grep -Pzo "^begin\$[\s\S]*?^end$" file

(?s) tells grep to allow the dot to match newline characters.

[\s\S] matches any character that is either whitespace or non-whitespace.
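The two ways of letting a match cross line boundaries can be compared side by side. The sketch below again uses \n in place of the ^ and $ anchors so that it behaves the same regardless of anchor handling under -z; it assumes GNU grep with PCRE support, and the names sample, dotall and either are illustrative.

```shell
# Hypothetical scratch file holding the sample text from the question.
sample=$(mktemp)
printf 'Some text\nbegin\nSome text goes here.\nend\nSome more text\n' > "$sample"

# (?s) turns on "dot matches newline", so .*? can span several lines.
dotall=$(grep -Pzo '(?s)begin\n.*?\nend' "$sample" | tr -d '\0')

# [\s\S] is a character class that matches any character, newline included.
either=$(grep -Pzo 'begin\n[\s\S]*?\nend' "$sample" | tr -d '\0')

printf '%s\n' "$dotall"
printf '%s\n' "$either"

rm -f "$sample"
```

Both commands use the lazy quantifier *?, so each match stops at the first "end" line after "begin" rather than the last one.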


And their versions that exclude "begin" and "end" from the output are as follows:


grep -Pzo "^begin$\n\K[\s\S]*?(?=\n^end$)" file # or grep -Pzo "(?<=^begin$\n)[\s\S]*?(?=\n^end$)"

grep -Pzo "(?s)(?<=^begin$\n).*?(?=\n^end$)" file

See the full test of all commands here (outdated, since grep's behavior with the -P option has changed).


Note:


^ matches the beginning of a line and $ matches the end of a line. They are added around "begin" and "end" so that these match only when they stand alone on a line.

In two commands I escaped $ because it is also used for "Command Substitution" ($(command)), which allows the output of a command to replace the command itself.
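The escaping issue is purely a matter of shell quoting and can be checked without running grep at all. In this sketch the variable names pattern_dq and pattern_sq are made up; it only shows that an escaped $ inside double quotes and a plain $ inside single quotes hand the same pattern text to the command.

```shell
# Inside double quotes, "$(" would start a command substitution,
# so the $ in front of the parenthesis has to be written as \$.
pattern_dq="^begin\$(.|\n)*^end$"

# Inside single quotes the shell expands nothing, so no escaping is needed.
pattern_sq='^begin$(.|\n)*^end$'

printf '%s\n' "$pattern_dq"
printf '%s\n' "$pattern_sq"
```

Both lines print the same text, so single-quoting the pattern is an alternative to escaping $ by hand.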


From man grep:


-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.

-P, --perl-regexp
Interpret PATTERN as a Perl compatible regular expression (PCRE)

-z, --null-data
Treat the input as a set of lines, each terminated by a zero byte (the ASCII
NUL character) instead of a newline. Like the -Z or --null option, this option
can be used with commands like sort -z to process arbitrary file names.
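The effect of -z can be seen on a two-line input. This is a minimal sketch assuming GNU grep; the variable names by_line, whole and nul_count are illustrative only.

```shell
# Without -z, each newline-terminated line is a separate record,
# so only the line containing "a" is printed.
by_line=$(printf 'a\nb\n' | grep 'a')

# With -z, records end at NUL bytes. The input has none, so both lines
# form a single record, and that whole record is printed.
whole=$(printf 'a\nb\n' | grep -z 'a' | tr -d '\0')

# The printed record is terminated by a NUL byte rather than a newline;
# count the NUL bytes in the raw output to confirm.
nul_count=$(printf 'a\nb\n' | grep -z 'a' | tr -cd '\0' | wc -c | tr -d ' ')

printf '%s\n' "$by_line"
printf '%s\n' "$whole"
printf '%s\n' "$nul_count"
```

This is why the multi-line patterns above need -z: without it, grep never sees "begin" and "end" within the same record.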
