Regular Expressions
- 1. Intro
- 2. Using simple patterns
- 3. Using special characters
- 3.1 \ –> Indicate next character is special
- 3.2 ^ –> Matches beginning of input
- 3.3 $ –> matches end of input
- 3.4 * –> matches the preceding expression 0 or more times. Equal to {0,}
- 3.5 + –> matches the preceding expression 1 or more times. Equivalent to {1,}
- 3.6 ? –> matches the preceding expression 0 or 1 time. Equivalent to {0,1}
- 3.7 . –> matches any single character except the newline character
- 3.8 (x) –> Matches ‘x’ and remembers the match, as the following example shows.
- !!! 3.9 (?:x) –> Matches ‘x’ but does not remember the match
- 3.10 x(?=y) –> matches ‘x’ only if ‘x’ is followed by ‘y’
- 3.11 x(?!y) –> matches ‘x’ only if ‘x’ is not followed by ‘y’
- 3.12 x|y –> matches ‘x’, or ‘y’ (if there is no match for ‘x’)
- 3.13 {n} –> matches exactly n occurrences of the preceding expression.
- 3.14 {n,} –> matches at least n occurrences of the preceding expression.
- 3.15 {n,m} –> matches at least n and at most m occurrences of the preceding expression.
- 3.16 [xyz] –> matches any one of the characters in the brackets
- 3.17 [^xyz] –> matches any single character that is not enclosed in the brackets
- 3.18 [\b] –> matches a backspace
- 3.19 \b –> matches a word boundary
- 3.20 \B –> matches a non-word boundary
- 3.21 \d –> matches a digit character
- 3.22 \D –> matches a non-digit character
- 3.23 \s –> matches a white space character
- 3.24 \S –> matches a character other than white space
- 3.25 \w –> matches any alphanumeric character including the underscore
- 3.26 \W –> matches any non-word character
- 3.27 \n –> Where n is a positive integer, a back reference to the last substring matching the nth parenthetical in the regular expression (counting left parentheses).
- 4. Rethink for some cool things
- 5. Reference
1. Intro
A regular expression pattern is composed of simple characters, or a combination of simple and special characters.
2. Using simple patterns
Simple patterns are constructed of characters for which you want to find a direct match. For example, the pattern /abc/ matches character combinations in strings only when exactly the characters ‘abc’ occur together and in that order. Such a match would succeed in the strings “Hi, do you know your abc’s?” and “The latest airplane designs evolved from slabcraft.” In both cases the match is with the substring ‘abc’. There is no match in the string ‘Grab crab’ because while it contains the substring ‘ab c’, it does not contain the exact substring ‘abc’.
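The three strings discussed above can be checked directly. This is a minimal sketch; the variable names are only illustrative.

```javascript
// Plain pattern: the characters a, b, c must appear together and in this order.
const abcPattern = /abc/;

const simpleResults = [
  "Hi, do you know your abc's?",
  "The latest airplane designs evolved from slabcraft.",
  "Grab crab", // contains "ab c", not "abc"
].map((text) => abcPattern.test(text));

console.log(simpleResults); // [ true, true, false ]
```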
3. Using special characters
Use special characters when your search needs more than a direct match. The special characters in regular expressions are detailed below:
3.1 \ –> Indicate next character is special
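- A backslash that precedes a non-special character indicates that the next character is special and is not to be interpreted literally.
- A backslash that precedes a special character indicates that the next character is not special and should be interpreted literally.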
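A small sketch of both directions of escaping, using the dot and the letter ‘d’ as examples chosen for illustration (they are not from the text above):

```javascript
// "." is special (any character); "\." is a literal dot.
const anyChar = /a.c/;
const literalDot = /a\.c/;

// "d" is an ordinary letter; "\d" is the special digit class.
const plainD = /d/;
const digitClass = /\d/;

const escapeResults = {
  anyCharOnAbc: anyChar.test("abc"),         // true
  literalDotOnAbc: literalDot.test("abc"),   // false
  literalDotOnADotC: literalDot.test("a.c"), // true
  plainDOn7: plainD.test("7"),               // false
  digitClassOn7: digitClass.test("7"),       // true
};

console.log(escapeResults);
```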
3.2 ^ –> Matches beginning of input
For example, /^A/ does not match the ‘A’ in “an A”, but does match the ‘A’ in “An E”.
3.3 $ –> matches end of input
For example, /t$/ does not match the ‘t’ in “eater”, but does match it in “eat”.
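The ^ and $ examples can be verified with a short sketch like the one below (the object keys are just labels for the test strings):

```javascript
const startsWithA = /^A/; // "A" must be the first character of the input
const endsWithT = /t$/;   // "t" must be the last character of the input

const anchorResults = {
  anA: startsWithA.test("an A"),  // false: the "A" is not at the start
  AnE: startsWithA.test("An E"),  // true
  eater: endsWithT.test("eater"), // false: the "t" is not at the end
  eat: endsWithT.test("eat"),     // true
};

console.log(anchorResults);
```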
3.4 * –> matches the preceding expression 0 or more times. Equal to {0,}
For example, /bo*/ matches ‘boooo’ in “A ghost booooed” and ‘b’ in “A bird warbled” but nothing in “A goat grunted”.
3.5 + –> matches the preceding expression 1 or more times. Equivalent to {1,}
For example, /a+/ matches the ‘a’ in “candy” and all the a’s in “caaaaaaandy”, but nothing in “cndy”.
3.6 ? –> matches the preceding expression 0 or 1 time. Equivalent to {0,1}
For example, /e?le?/ matches the ‘el’ in “angel” and the ‘le’ in “angle” and also the ‘l’ in “oslo”.
If used immediately after any of the quantifiers *, +, ?, or {}, it makes the quantifier non-greedy (matching the fewest possible characters), as opposed to the default, which is greedy (matching as many characters as possible). For example, applying /\d+/ to “123abc” matches “123”. But applying /\d+?/ to that same string matches only the “1”.
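The quantifier examples from 3.4 to 3.6, including the greedy versus non-greedy contrast, can be reproduced as follows. The helper firstMatch is a hypothetical convenience function introduced only for this sketch.

```javascript
// Hypothetical helper: returns the first matched substring, or null if there is none.
const firstMatch = (regex, text) => {
  const m = text.match(regex);
  return m === null ? null : m[0];
};

const quantifierResults = {
  star: firstMatch(/bo*/, "A ghost booooed"),    // "boooo"
  starZero: firstMatch(/bo*/, "A bird warbled"), // "b" (zero o's is allowed)
  starNone: firstMatch(/bo*/, "A goat grunted"), // null (no "b" at all)
  plusNone: firstMatch(/a+/, "cndy"),            // null (at least one "a" required)
  optAngel: firstMatch(/e?le?/, "angel"),        // "el"
  optAngle: firstMatch(/e?le?/, "angle"),        // "le"
  optOslo: firstMatch(/e?le?/, "oslo"),          // "l"
  greedy: firstMatch(/\d+/, "123abc"),           // "123"
  nonGreedy: firstMatch(/\d+?/, "123abc"),       // "1"
};

console.log(quantifierResults);
```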
3.7 . –> matches any single character except the newline character
For example, /.n/ matches ‘an’ and ‘on’ in “nay, an apple is on the tree”, but not ‘nay’.
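A quick check of the dot, using the global flag (an addition for this sketch) to collect every match, plus a case showing that the dot skips the newline character:

```javascript
const sentence = "nay, an apple is on the tree";

// The g flag makes match() return all matches instead of only the first one.
const dotMatches = sentence.match(/.n/g);
console.log(dotMatches); // [ 'an', 'on' ]

// The leading "n" of "nay" has no character before it, so it is never matched.
// Without the s flag, "." does not match a newline.
const dotOnNewline = /a.c/.test("a\nc");
console.log(dotOnNewline); // false
```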
3.8 (x) –> Matches ‘x’ and remembers the match, as the following example shows.
The parentheses are called capturing parentheses.
The ‘(foo)’ and ‘(bar)’ in the pattern /(foo) (bar) \1 \2/ match and remember the first two words in the string “foo bar foo bar”. The \1 and \2 denote the first and second parenthesized substring matches, foo and bar, and match the string’s last two words. Note that \1, \2, …, \n are used in the matching part of the regex; for more information, see \n below. In the replacement part of a regex the syntax $1, $2, …, $n must be used, e.g. 'bar foo'.replace(/(...) (...)/, '$2 $1'). $& means the whole matched string.
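As a sketch of the capturing behaviour described above: the match array exposes the remembered groups, and the replacement string can reorder them. The counter-example string "foo bar bar foo" is an assumption added for illustration.

```javascript
const pairPattern = /(foo) (bar) \1 \2/;

const pairMatch = "foo bar foo bar".match(pairPattern);
console.log(pairMatch[0]); // foo bar foo bar  (the whole match)
console.log(pairMatch[1]); // foo              (group 1)
console.log(pairMatch[2]); // bar              (group 2)

// \1 and \2 must repeat the captured text in the same order.
const wrongOrder = pairPattern.test("foo bar bar foo");
console.log(wrongOrder); // false

// In the replacement string, groups are written $1, $2, ...
const swapped = "bar foo".replace(/(...) (...)/, "$2 $1");
console.log(swapped); // foo bar
```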
!!! 3.9 (?:x) –> Matches ‘x’ but does not remember the match
Matches ‘x’ but does not remember the match. The parentheses are called non-capturing parentheses, and let you define subexpressions for regular expression operators to work with. Consider the sample expression /(?:foo){1,2}/. If the expression was /foo{1,2}/, the {1,2} characters would apply only to the last ‘o’ in ‘foo’. With the non-capturing parentheses, the {1,2} applies to the entire word ‘foo’.
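The difference between /(?:foo){1,2}/ and /foo{1,2}/ can be seen on a couple of test strings. The strings and the capturing variant /(foo){1,2}/ are assumptions added for comparison.

```javascript
const grouped = /(?:foo){1,2}/; // {1,2} applies to the whole word "foo"
const ungrouped = /foo{1,2}/;   // {1,2} applies only to the last "o"

const groupResults = {
  groupedOnFoofoo: "foofoo".match(grouped)[0],     // "foofoo"
  ungroupedOnFoofoo: "foofoo".match(ungrouped)[0], // "foo"
  groupedOnFooo: "fooo".match(grouped)[0],         // "foo"
  ungroupedOnFooo: "fooo".match(ungrouped)[0],     // "fooo"
};
console.log(groupResults);

// A non-capturing group adds no entry to the match array; a capturing group does.
const nonCapturingLength = "foofoo".match(grouped).length;     // 1
const capturingLength = "foofoo".match(/(foo){1,2}/).length;   // 2
console.log(nonCapturingLength, capturingLength);
```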
3.10 x(?=y) –> matches ‘x’ only if ‘x’ is followed by ‘y’
For example, /Jack(?=Sprat)/ matches ‘Jack’ only if it is followed by ‘Sprat’. /Jack(?=Sprat|Frost)/ matches ‘Jack’ only if it is followed by ‘Sprat’ or ‘Frost’. However, neither ‘Sprat’ nor ‘Frost’ is part of the match results.
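A minimal sketch of the lookahead examples; the input strings "JackSprat" and "JackFrost" are assumptions chosen to fit the patterns:

```javascript
const beforeSprat = /Jack(?=Sprat)/;
const beforeEither = /Jack(?=Sprat|Frost)/;

const lookaheadResults = {
  // The lookahead is checked but not consumed, so the match is only "Jack".
  sprat: "JackSprat".match(beforeSprat)[0],  // "Jack"
  frost: beforeSprat.test("JackFrost"),      // false
  either: "JackFrost".match(beforeEither)[0] // "Jack"
};

console.log(lookaheadResults);
```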
3.11 x(?!y) –> matches ‘x’ only if ‘x’ is not followed by ‘y’
For example, /\d+(?!\.)/ matches a number only if it is not followed by a decimal point. The regular expression /\d+(?!\.)/.exec("3.141") matches ‘141’ but not ‘3.141’.
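The exec call above can be run as is. The second input, "12.5", is an assumption added to show a caveat: because \d+ can backtrack, the pattern may match a shorter run of digits rather than failing outright.

```javascript
const notBeforeDot = /\d+(?!\.)/;

const piMatch = notBeforeDot.exec("3.141");
console.log(piMatch[0], piMatch.index); // 141 2

// Caveat: "12" is followed by ".", so the engine backtracks to "1",
// which is followed by "2" and therefore satisfies the negative lookahead.
const backtracked = notBeforeDot.exec("12.5")[0];
console.log(backtracked); // 1
```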
3.12 x|y –> matches ‘x’, or ‘y’ (if there is no match for ‘x’)
For example, /green|red/ matches ‘green’ in “green apple” and ‘red’ in “red apple.” The order of ‘x’ and ‘y’ matters. For example a*|b matches the empty string in “b”, but b|a* matches “b” in the same string.
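The effect of alternative order can be checked with a sketch like this one, where the patterns a*|b and b|a* are tried against the string "b":

```javascript
const colorResults = {
  green: "green apple".match(/green|red/)[0], // "green"
  red: "red apple".match(/green|red/)[0],     // "red"
};
console.log(colorResults);

// Alternatives are tried left to right at each position.
// a* succeeds immediately with zero a's, so "b" is never tried.
const emptyFirst = "b".match(/a*|b/)[0];
// With "b" listed first, the literal "b" wins.
const bFirst = "b".match(/b|a*/)[0];

console.log(JSON.stringify(emptyFirst), JSON.stringify(bFirst)); // "" "b"
```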
3.13 {n} –> matches exactly n occurrences of the preceding expression.
For example, /a{2}/ doesn’t match the ‘a’ in “candy,” but it does match all of the a’s in “caandy,” and the first two a’s in “caaandy.”
3.14 {n,} –> matches at least n occurrences of the preceding expression.
For example, /a{2,}/ will match “aa”, “aaaa” and “aaaaa” but not “a”.
3.15 {n,m} –> matches at least n and at most m occurrences of the preceding expression.
For example, /a{1,3}/ matches nothing in “cndy”, the ‘a’ in “candy,” the first two a’s in “caandy,” and the first three a’s in “caaaaaaandy”. Notice that when matching “caaaaaaandy”, the match is “aaa”, even though the original string had more a’s in it.
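The three counted quantifiers from 3.13 to 3.15 side by side. The helper matchOrNull is hypothetical and exists only for this sketch.

```javascript
// Hypothetical helper: first matched substring, or null when nothing matches.
const matchOrNull = (regex, text) => {
  const m = text.match(regex);
  return m === null ? null : m[0];
};

const countResults = {
  exactCandy: matchOrNull(/a{2}/, "candy"),      // null (only one "a")
  exactCaandy: matchOrNull(/a{2}/, "caandy"),    // "aa"
  exactCaaandy: matchOrNull(/a{2}/, "caaandy"),  // "aa" (first two a's)
  atLeastOne: matchOrNull(/a{2,}/, "a"),         // null
  atLeastFour: matchOrNull(/a{2,}/, "aaaa"),     // "aaaa"
  rangeNone: matchOrNull(/a{1,3}/, "cndy"),      // null
  rangeOne: matchOrNull(/a{1,3}/, "candy"),      // "a"
  rangeCapped: matchOrNull(/a{1,3}/, "caaaaaaandy"), // "aaa" (capped at three)
};

console.log(countResults);
```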
3.16 [xyz] –> matches any one of the characters in the brackets
The pattern [a-d], which performs the same match as [abcd], matches the ‘b’ in “brisket” and the ‘c’ in “city”. The patterns /[a-z.]+/ and /[\w.]+/ match the entire string “test.i.ng”.
3.17 [^xyz] –> matches any single character that is not enclosed in the brackets
For example, [^abc] is the same as [^a-c]. They initially match ‘r’ in “brisket” and ‘h’ in “chop.”
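The character set and negated character set examples from 3.16 and 3.17, written as a small runnable sketch:

```javascript
const classResults = {
  rangeBrisket: "brisket".match(/[a-d]/)[0],    // "b"
  listBrisket: "brisket".match(/[abcd]/)[0],    // "b" (same set, written out)
  rangeCity: "city".match(/[a-d]/)[0],          // "c"
  negatedBrisket: "brisket".match(/[^abc]/)[0], // "r" (first character outside the set)
  negatedChop: "chop".match(/[^a-c]/)[0],       // "h"
  // Inside brackets the dot is a literal dot, not "any character".
  lettersAndDots: "test.i.ng".match(/[a-z.]+/)[0], // "test.i.ng"
  wordAndDots: "test.i.ng".match(/[\w.]+/)[0],     // "test.i.ng"
};

console.log(classResults);
```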
3.18 [\b] –> matches a backspace
You need to use square brackets if you want to match a literal backspace character. (Not to be confused with \b.)
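To make the distinction concrete, the sketch below tests [\b] and \b against a backspace character (written here as "\u0008") and against an ordinary word; both inputs are assumptions chosen for illustration.

```javascript
const backspace = "\u0008"; // the backspace character

const backspaceResults = {
  bracketOnBackspace: /[\b]/.test(backspace), // true: [\b] is a literal backspace
  bareOnBackspace: /\b/.test(backspace),      // false: no word character, so no word boundary
  bracketOnWord: /[\b]/.test("moon"),         // false: the word contains no backspace
  bareOnWord: /\b/.test("moon"),              // true: boundary at the start of the word
};

console.log(backspaceResults);
```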
3.19 \b –> matches a word boundary
A word boundary is a position where a word character is not followed or not preceded by another word character: between a word character and a non-word character (in either order), or at the start or end of the string next to a word character.
Examples using the input string “moon”:
/\bm/ matches, because the ‘\b’ is at the beginning of the string;
the ‘\b’ in /oo\b/ does not match, because the ‘\b’ is both preceded and followed by word characters;
the ‘\b’ in /oon\b/ matches, because it appears at the end of the string;
the ‘\b’ in /\w\b\w/ will never match anything, because it is both preceded and followed by a word character.
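The four “moon” examples translate directly into test calls:

```javascript
const word = "moon";

const boundaryResults = {
  start: /\bm/.test(word),     // true: boundary before the first letter
  middle: /oo\b/.test(word),   // false: "oo" is followed by "n", a word character
  end: /oon\b/.test(word),     // true: boundary after the last letter
  never: /\w\b\w/.test(word),  // false: a boundary cannot sit between two word characters
};

console.log(boundaryResults);
```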
3.20 \B –> matches a non-word boundary
It matches in the following cases:
- Before the first character of the string, if that character is not a word character
- After the last character of the string, if that character is not a word character
- Between two word characters
- Between two non-word characters
- The empty string
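A sketch exercising each case in the list. The test strings are assumptions. Note that the start and end of the string count as non-word positions, so the first two cases only hold when the adjacent character is itself a non-word character.

```javascript
const nonBoundaryResults = {
  // Between two word characters: the "on" inside "noon".
  betweenWordChars: "at noon".match(/\Bon/).index, // 5
  // Between two non-word characters.
  betweenNonWordChars: /-\B-/.test("--"),          // true
  // Start of string: a non-word boundary only if the first character is a non-word character.
  startBeforeNonWord: /^\B/.test("-a"),            // true
  startBeforeWord: /^\B/.test("a"),                // false
  // End of string: same rule for the last character.
  endAfterNonWord: /\B$/.test("a-"),               // true
  endAfterWord: /\B$/.test("a"),                   // false
  // The empty string.
  emptyString: /\B/.test(""),                      // true
};

console.log(nonBoundaryResults);
```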
3.21 \d –> matches a digit character
Equivalent to [0-9].
For example, /\d/ or /[0-9]/ matches ‘2’ in “B2 is the suite number.”
3.22 \D –> matches a non-digit character
Equivalent to [^0-9].
For example, /\D/ or /[^0-9]/ matches ‘B’ in “B2 is the suite number.”
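The \d and \D examples, together with their bracket equivalents, as a short sketch. The final replace call is an added illustration of a common use, not something from the text above.

```javascript
const suite = "B2 is the suite number.";

const digitResults = {
  digit: suite.match(/\d/)[0],            // "2"
  digitBracket: suite.match(/[0-9]/)[0],  // "2"
  nonDigit: suite.match(/\D/)[0],         // "B"
  nonDigitBracket: suite.match(/[^0-9]/)[0], // "B"
  // Removing every non-digit leaves only the digits.
  digitsOnly: suite.replace(/\D/g, ""),   // "2"
};

console.log(digitResults);
```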
3.23 \s –> matches a white space character
This includes space, tab, form feed, and line feed.
3.24 \S –> matches a character other than white space
3.25 \w –> matches any alphanumeric character including the underscore
Equivalent to [A-Za-z0-9_]
For example, /\w/ matches ‘a’ in “apple,” ‘5’ in “$5.28,” and ‘3’ in “3D.”
3.26 \W –> matches any non-word character
Equivalent to [^A-Za-z0-9_].
For example, /\W/ or /[^A-Za-z0-9_]/ matches ‘%’ in “50%.”
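The word-character and white-space classes from 3.23 to 3.26 in one sketch. The \s and \S inputs ("foo bar") are assumptions, since the text gives no example for them.

```javascript
const wordClassResults = {
  wordApple: "apple".match(/\w/)[0],      // "a"
  wordPrice: "$5.28".match(/\w/)[0],      // "5" ("$" is not a word character)
  word3D: "3D".match(/\w/)[0],            // "3"
  underscore: /\w/.test("_"),             // true: the underscore counts as a word character
  nonWord: "50%".match(/\W/)[0],          // "%"
  spaceThenWord: "foo bar".match(/\s\w*/)[0], // " bar"
  nonSpaceRun: "foo bar".match(/\S+/)[0],     // "foo"
};

console.log(wordClassResults);
```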
3.27 \n –> Where n is a positive integer, a back reference to the last substring matching the nth parenthetical in the regular expression (counting left parentheses).
For example, /apple(,)\sorange\1/ matches ‘apple, orange,’ in “apple, orange, cherry, peach.”
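The back reference example can be run as follows. The second input, with a semicolon after “orange”, is an assumption added to show that \1 must repeat exactly what group 1 captured.

```javascript
const listPattern = /apple(,)\sorange\1/;

const fruitMatch = "apple, orange, cherry, peach.".match(listPattern);
console.log(fruitMatch[0]); // apple, orange,
console.log(fruitMatch[1]); // ,

// Group 1 captured ",", so a ";" after "orange" does not satisfy \1.
const mismatched = listPattern.test("apple, orange; cherry");
console.log(mismatched); // false
```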
4. Rethink for some cool things
4.1 \n
Selector (alternation combined with a back reference)
(a|b)\1 –> aa or bb
(1|2)(3|4)\1\2 –> 1313 or 1414 or 2323 or 2424
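Both patterns can be checked against a list of candidates. The ^ and $ anchors and the candidate strings are additions for this sketch, so that each test covers the whole string.

```javascript
// Anchored versions of the two patterns above.
const doubled = /^(a|b)\1$/;
const twoGroups = /^(1|2)(3|4)\1\2$/;

const doubledHits = ["aa", "ab", "ba", "bb"].filter((s) => doubled.test(s));
console.log(doubledHits); // [ 'aa', 'bb' ]

const twoGroupHits = ["1313", "1314", "1324", "1414", "2323", "2413", "2424"]
  .filter((s) => twoGroups.test(s));
console.log(twoGroupHits); // [ '1313', '1414', '2323', '2424' ]
```

The back reference repeats whichever alternative the group actually matched, which is why “ab” and “1324” are rejected.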
4.2 (x)
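Used for grouping; the groups are then referred to as $0, $1, $2, and so on. Note that JavaScript replacement strings use $& rather than $0 for the whole match.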
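A short sketch of group references in a JavaScript replacement string. The inputs are assumptions; the last line shows that "$0" is not special in JavaScript and is inserted literally, whereas some other tools use it for the whole match.

```javascript
// $1 and $2 refer to the first and second capturing groups.
const reordered = "John Smith".replace(/(\w+)\s(\w+)/, "$2, $1");
console.log(reordered); // Smith, John

// $& refers to the whole matched text.
const wholeMatch = "abc".replace(/b/, "[$&]");
console.log(wholeMatch); // a[b]c

// "$0" has no special meaning in JavaScript replacement strings.
const dollarZero = "abc".replace(/b/, "[$0]");
console.log(dollarZero); // a[$0]c
```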
5. Reference
2. Juejin: regular expression summary
3. Stack Overflow: What’s the meaning of a number after a backslash in a regular expression?
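Author: Leilei Chen
Published: 2020-01-31, 15:25:28
Last updated: 2020-02-02, 14:06:58
Original link: https://www.llchen60.com/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/
Copyright notice: "Attribution-NonCommercial-ShareAlike 4.0". Please keep the original link and author when reposting.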