Friday, May 3, 2024
Mon, May 10, 2021, 9:06:12

How do I extract only



http://www.youtube.com/watch?v=qdRaf3-OEh4


from a URL like



http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1&list=PL4367CEDBC117AEC6&feature=results_main


I am only interested in the "v" parameter.



 Answers

Update:


The better ones would be:



sed -E 's/^.+(\/|&|\?)v=([^&]*).*/\2/'
awk 'match($0,/((\/|&|\?)v=)([^&]*)/,x){print x[3]}'
grep -Po '(?<=(/|&|\?)v=)[^&]*'
# Saying: match /, & or ? and then v=

RFC 3986 states:



URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

query = *( pchar / "/" / "?" )
fragment = *( pchar / "/" / "?" )

pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="


So to be safe, put


 | sed 's/#.*//' |

in front, to remove the #fragment part.


I.e.


| sed 's/#.*//' | grep -Po '(?<=(/|&|\?)v=)[^&]*'
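
Why the fragment should go first can be sketched as follows. This is only an illustration: it assumes GNU grep built with -P support, `extract_v` is a hypothetical helper name, and the second URL with its #fragment is made up.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "v" query parameter.
# Step 1 (sed): drop everything from "#" on, i.e. the fragment.
# Step 2 (grep): print what follows "/v=", "&v=" or "?v=" up to the next "&".
# Assumes GNU grep with PCRE support (-P).
extract_v() {
    printf '%s\n' "$1" | sed 's/#.*//' | grep -Po '(?<=(/|&|\?)v=)[^&]*'
}

# URL from the question
extract_v 'http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1&list=PL4367CEDBC117AEC6&feature=results_main'

# Made-up URL whose fragment also contains "&v="
extract_v 'http://www.youtube.com/watch?v=qdRaf3-OEh4#t=10&v=other'
```

Without the sed step, the second call would print two lines, `qdRaf3-OEh4#t=10` and `other`, because `[^&]*` happily runs through the `#`.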



SED (2):



echo 'http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1&list=PL4367CEDBC117AEC6&feature=results_main' |
sed -E 's/^.+\Wv=([^&]*).*/\1/'

Explanation:




s/…/…/          s/THIS/WITH THIS/

substitute/MATCH 0 or MORE THINGS and GROUP them in ()/WITH THIS/

+-------------------------- s     _s_ubstitute
|+------------------------- /     START MATCH
||                +-------- /     END MATCH
||                |+------- \1    REPLACE WITH - \1 == group 1, i.e. the first ().
||                || +----- /     End of SUBSTITUTE
s/^.+\Wv=([^&]*).*/\1/
  ++++-+-++----++---------- ^     Match from beginning of line
   +++-+-++----++---------- .     Match any character
    ++-+-++----++---------- +     One or more times (greedy; cf. +, *, *? etc.)
     +-+-++----++---------- \W    Non-word character
       +-++----++---------- v=    Literally match "v="
         ++----++---------- (     Start MATCH GROUP
          +----++---------- [^&]* Match any character BUT & - as many as possible
               ++---------- )     End MATCH GROUP
                +---------- .*    Match anything, as many times as possible
                                  - i.e. to the end of the line

[abc]  would match a OR b OR c
[abc]* would match a AND/OR b AND/OR c - as many times as possible
[^abc] would match anything BUT a, b or c

/\1/ Replace the ENTIRE match with MATCH GROUP number 1.
That would be everything between ( and ), which is anything but "&"
after the literal string "v=", which in turn has a non-word character in
front of it.

That also means that no match means no substitution, which ultimately results in
no change.


Result: qdRaf3-OEh4


Note: If there is no match, the entire string is returned unchanged.
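
If returning the whole string on a non-match is not wanted, one common way around it is sed's -n option together with the p flag of the s command. The following is a sketch, not part of the commands above: `extract_v_or_nothing` is a hypothetical helper name, the second URL is made up, and a sed that understands -E and \W (such as GNU sed) is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print the "v" value, or nothing if there is no match.
# -n suppresses sed's default printing of every line; the trailing p flag
# prints a line only when the substitution actually took place.
extract_v_or_nothing() {
    printf '%s\n' "$1" | sed -n -E 's/^.+\Wv=([^&]*).*/\1/p'
}

# Matches: prints the video id
extract_v_or_nothing 'http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1'

# Made-up URL without "v=": prints nothing instead of the whole URL
extract_v_or_nothing 'http://www.youtube.com/feed/trending'
```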




(G)AWK:



echo 'http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1&list=PL4367CEDBC117AEC6&feature=results_main' |
awk 'match($0,/(\Wv=)([^&]*)/,v){print v[2]}'

Result: qdRaf3-OEh4


Explanation:


In Awk, match(string, regexp) is a function that searches for the leftmost, longest match of regexp in string. Here I have used an extension that comes with Gawk (see Awk, Gawk, Mawk etc.): a third argument, an array, in which the individual matches, that is, what is between the parentheses, are placed.


The pattern is much like the Perl/grep one below.




+------------------------------------------- Built-in function
|     +------------------------------------- Entire input ($1 would have been field 1 etc.,
|     |                                      using the default delimiters " "*)
|     |
|     |    (....)(.....) ------------------- Places \Wv= in group 1 and [^&]* in group 2.
match($0, /(\Wv=)([^&]*)/, v){print v[2]}
                           |  |     | |
                           |  |     +-+----- Use "v" from /, v; v is a user-defined name
                           |  |       +----- 2 specifies the index in v, which is the group
                           |  |              from what is between the ()'s in /…/
                           |  |
                           |  +------------- print is another built-in.
                           +---------------- Array name that one can use in print.
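
For an awk without Gawk's three-argument match(), a rough equivalent can be sketched with the standard two-argument form, which sets the built-in variables RSTART and RLENGTH. This is an alternative under stated assumptions rather than the command above: `extract_v_awk` is a hypothetical helper name, and the bracket expression `[?&/]` stands in for `\W`.

```shell
#!/bin/sh
# Hypothetical helper using only the two-argument match().
# RSTART is the position where the match starts, RLENGTH its length.
# The match includes the 3-character prefix (e.g. "?v="), so substr()
# skips those 3 characters and shortens the length accordingly.
extract_v_awk() {
    printf '%s\n' "$1" |
    awk 'match($0, "[?&/]v=[^&]*") { print substr($0, RSTART + 3, RLENGTH - 3) }'
}

extract_v_awk 'http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1&list=PL4367CEDBC117AEC6&feature=results_main'
```

The regexp is passed as a string here so that the "/" inside the brackets needs no escaping.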






GREP (Using Perl-compatible):



echo 'http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1&list=PL4367CEDBC117AEC6&feature=results_main' |
grep -Po '(?<=\Wv=)[^&]*'

Result: qdRaf3-OEh4


Explanation:




-P  Use Perl-compatible regular expressions.
-o  Only print the match of the expression.
    - That means: of our pattern, only print/return what it matches.
      If nothing matches, return nothing.

          +-------- ^    Negate match - do not match (ONLY as it is FIRST between [])
          |+------- &    A literal "&" character
          ||
(?<=\Wv=)[^&]*
|   | |  |  ||
|   | |  |  |+----- *    Greedy; as many times as possible.
|   | |  +--+------ []   Any of what is inside [], in any order
|   | +------------ v=   Literal v=
|   +-------------- \W   Non-word character
+------------------ (?<= What follows should be (immediately) preceded by.
                         ? = Huh, < = left, = = Equals to

So: Match a literal "v=" where the "v" is preceded by a non-word character. Then match
anything, as many times as possible, until we are at the end of the line or we meet an "&".

As an unencoded "&" cannot appear inside a value, since it separates the key/value pairs of a URL, this should be OK.
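
Since grep prints nothing when there is no match, a script can rely on its exit status (0 if something matched, 1 if not) to tell the two cases apart. A minimal sketch, assuming GNU grep with -P support; the variable names `url` and `id` and the shortened URL are only for illustration.

```shell
#!/bin/sh
# The exit status of a pipeline is that of its last command, here grep:
# 0 when -o printed a match, 1 when nothing matched.
url='http://www.youtube.com/watch?v=qdRaf3-OEh4&playnext=1'

if id=$(printf '%s\n' "$url" | grep -Po '(?<=\Wv=)[^&]*'); then
    echo "video id: $id"
else
    echo "no v parameter found" >&2
fi
```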


[#33618] Tuesday, May 11, 2021
weamp
