Saturday, December 2, 2023
Asked Thursday, April 14, 2022, 12:46:40 / answers: 1 / hits: 1191

I have this pattern; the whole thing is on one line:

<img  itemprop="image"  class="hovered__image jsOpenGallery lazyload" data-src="//" alt="Drain King Plumbers - Plumbers & Plumbing Contractors"/><img  itemprop="image"  class="jsMerchantLogo lazyload" data-src="" alt="Drain King Plumbers - Plumbers & Plumbing Contractors"/>

I am searching for the string alt= to locate the attribute, and I need to get the business name that follows it. From the code above, I want this:

alt="Drain King Plumbers - Plumbers & Plumbing Contractors"

The name can be anything, but it is always enclosed in double quotes. Can I use grep to return something like alt="business name"?




You can use htmlq (like jq, but for HTML). Install it, for example with Homebrew (brew install htmlq), and pipe your string to it:

| htmlq --attribute alt img

See also pup for HTML, and xq for XML.

grep (PCREs)

A less elegant way (you can't really parse [X]HTML with regex) is to just use grep with --perl-regexp and --only-matching, with a regex using lookbehind:

| grep -Po '(?<= alt=")[^"]*'

See also ripgrep.
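
The question asks for the whole attribute, alt="business name", rather than just the value. As a rough sketch, and not part of the original answer, this can be done with plain grep -o and no PCRE support, with sed stripping the wrapper when only the name is wanted. The variable names and the shortened sample line are made up for illustration.

```shell
# Sample input: a single line with two <img> tags, shortened from the question.
html='<img itemprop="image" data-src="//" alt="Drain King Plumbers - Plumbers & Plumbing Contractors"/><img itemprop="image" data-src="" alt="Drain King Plumbers - Plumbers & Plumbing Contractors"/>'

# -o prints only the matched text, one match per output line.
# [^"]* matches everything up to the next double quote.
# Note: this pattern would also match inside an attribute such as data-alt="...".
with_attr=$(printf '%s\n' "$html" | grep -o 'alt="[^"]*"')

# Strip the leading alt=" and the trailing " to keep only the value.
value_only=$(printf '%s\n' "$with_attr" | sed -e 's/^alt="//' -e 's/"$//')

printf '%s\n' "$with_attr"
printf '%s\n' "$value_only"
```

Because the sample line contains two img tags, each variable holds two lines, one per match. Pipe through sort -u or head -n 1 if only one copy is needed.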

[#522] Friday, April 15, 2022