Appendix B. Regular Expressions

The following tables summarize the regular expression grammar and syntax supported by the regular expression classes in System.Text.RegularExpressions. Each of the modifiers and qualifiers in the tables can substantially change the behavior of the matching and searching patterns. For further information on regular expressions, we recommend the definitive Mastering Regular Expressions by Jeffrey E. F. Friedl (O’Reilly).

All the syntax described in Table B-1 through Table B-10 should match the Perl5 syntax, with specific exceptions noted.

Table B-1. Character escapes

Escape code sequence  Meaning                     Hexadecimal equivalent
\a                    Bell                        \u0007
\b                    Backspace                   \u0008
\t                    Tab                         \u0009
\r                    Carriage return             \u000D
\v                    Vertical tab                \u000B
\f                    Form feed                   \u000C
\n                    Newline                     \u000A
\e                    Escape                      \u001B
\040                  ASCII character as octal
\x20                  ASCII character as hex
\cC                   ASCII control character
\u0020                Unicode character as hex
\non-escape           A nonescape character

Special case: within a regular expression, \b means word boundary, except in a [] set, in which \b means the backspace character.
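The two meanings of \b are easy to confuse, so a small sketch may help. The class and method names below are hypothetical, chosen only for this illustration; the only library calls used are Regex.Matches, Regex.IsMatch, and Regex.Escape.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Outside a [] set, \b is a zero-width word boundary.
    public static int CountWholeWord(string input, string word) =>
        Regex.Matches(input, @"\b" + Regex.Escape(word) + @"\b").Count;

    // Inside a [] set, \b is the backspace character (\u0008).
    public static bool ContainsBackspace(string input) =>
        Regex.IsMatch(input, @"[\b]");

    public static void Main()
    {
        Console.WriteLine(CountWholeWord("cat concat cat", "cat")); // 2
        Console.WriteLine(ContainsBackspace("ab\bc"));               // True
        Console.WriteLine(ContainsBackspace("a b c"));               // False
    }
}
```

The "cat" inside "concat" is not counted because no word boundary precedes it.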

Table B-2. Substitutions

Expression      Meaning
$group-number   Substitutes last substring matched by group-number
${group-name}   Substitutes last substring matched by (?<group-name>)
$$              Substitutes a literal “$”
$&              Substitutes a copy of the entire match
$`              Substitutes text of the input string preceding match
$'              Substitutes text of the input string following match
$+              Substitutes the last captured group
$_              Substitutes the entire input string

Substitutions are specified only within a replacement pattern.
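As a rough illustration of how substitutions behave in a replacement pattern, the following sketch passes several of them to Regex.Replace. The class and method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionDemo
{
    // ${group-name}: reorder two named captures.
    public static string SwapNames(string input) =>
        Regex.Replace(input, @"(?<first>\w+) (?<last>\w+)", "${last}, ${first}");

    // $$ emits a literal dollar sign; $1 is the first numbered group.
    public static string AddDollarSign(string input) =>
        Regex.Replace(input, @"(\d+)", "$$$1");

    // $& is the whole match; $` and $' are the input text before and after it.
    public static string Bracket(string input) => Regex.Replace(input, "b", "[$&]");
    public static string Before(string input) => Regex.Replace(input, "b", "$`");
    public static string After(string input) => Regex.Replace(input, "b", "$'");

    public static void Main()
    {
        Console.WriteLine(SwapNames("Jane Doe"));    // Doe, Jane
        Console.WriteLine(AddDollarSign("cost 15")); // cost $15
        Console.WriteLine(Bracket("abc"));           // a[b]c
        Console.WriteLine(Before("abc"));            // aac
        Console.WriteLine(After("abc"));             // acc
    }
}
```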

Table B-3. Character sets

Expression        Meaning
.                 Matches any character except \n
[characterlist]   Matches a single character in the list
[^characterlist]  Matches a single character not in the list
[char0-char1]     Matches a single character in a range
\w                Matches a word character; same as [a-zA-Z_0-9]
\W                Matches a nonword character
\s                Matches a space character; same as [ \n\r\t\v\f]
\S                Matches a nonspace character
\d                Matches a decimal digit; same as [0-9]
\D                Matches a nondigit
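A short sketch of a few character sets in use, assuming plain ASCII input. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterSetDemo
{
    // [^characterlist]: delete every character that is not a lowercase letter a-z.
    public static string KeepLowercase(string input) =>
        Regex.Replace(input, "[^a-z]", "");

    // \d: count the decimal digits in the input.
    public static int CountDigits(string input) =>
        Regex.Matches(input, @"\d").Count;

    // \s+: collapse each run of whitespace into a single space.
    public static string CollapseWhitespace(string input) =>
        Regex.Replace(input, @"\s+", " ");

    public static void Main()
    {
        Console.WriteLine(KeepLowercase("a1-B2c"));        // ac
        Console.WriteLine(CountDigits("x9 y10"));          // 3
        Console.WriteLine(CollapseWhitespace("a \t\n b")); // a b
    }
}
```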

Table B-4. Positioning assertions

Expression  Meaning
^           Beginning of line
$           End of line
\A          Beginning of string
\Z          End of string, or just before a newline at the end of the string
\z          Exactly the end of string
\G          Where search started
\b          On a word boundary
\B          Not on a word boundary
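The difference between the two end-of-string assertions shows up only when the input ends with a newline. The sketch below, with hypothetical class and method names, probes that case along with \A.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // \Z tolerates one trailing newline after the match.
    public static bool EndsLoosely(string input) => Regex.IsMatch(input, @"abc\Z");

    // \z requires the match to end at the very last character.
    public static bool EndsExactly(string input) => Regex.IsMatch(input, @"abc\z");

    // \A anchors the match at the start of the string.
    public static bool StartsWithAbc(string input) => Regex.IsMatch(input, @"\Aabc");

    public static void Main()
    {
        Console.WriteLine(EndsLoosely("abc\n"));   // True
        Console.WriteLine(EndsExactly("abc\n"));   // False
        Console.WriteLine(EndsExactly("abc"));     // True
        Console.WriteLine(StartsWithAbc("xabc"));  // False
    }
}
```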

Table B-5. Quantifiers

Quantifier  Meaning
*           0 or more matches
+           1 or more matches
?           0 or 1 matches
{n}         Exactly n matches
{n,}        At least n matches
{n,m}       At least n, but no more than m matches
*?          Lazy *, finds first match that has minimum repeats
+?          Lazy +, minimum repeats, but at least 1
??          Lazy ?, zero or minimum repeats
{n}?        Lazy {n}, exactly n matches
{n,}?       Lazy {n,}, minimum repeats, but at least n
{n,m}?      Lazy {n,m}, minimum repeats, but at least n, and no more than m
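To make the greedy versus lazy distinction concrete, the following sketch runs the same input through both forms of a quantifier. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Greedy +: takes as many characters as the rest of the pattern allows.
    public static string GreedyTag(string input) => Regex.Match(input, "<.+>").Value;

    // Lazy +?: stops at the first position where the rest of the pattern matches.
    public static string LazyTag(string input) => Regex.Match(input, "<.+?>").Value;

    // {n,m} prefers the maximum count; {n,m}? prefers the minimum.
    public static string GreedyRange(string input) => Regex.Match(input, "a{2,3}").Value;
    public static string LazyRange(string input) => Regex.Match(input, "a{2,3}?").Value;

    public static void Main()
    {
        Console.WriteLine(GreedyTag("<b>x</b>"));  // <b>x</b>
        Console.WriteLine(LazyTag("<b>x</b>"));    // <b>
        Console.WriteLine(GreedyRange("aaaaa"));   // aaa
        Console.WriteLine(LazyRange("aaaaa"));     // aa
    }
}
```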

Table B-6. Grouping constructs

Syntax             Meaning
( )                Capture matched substring
(?<name>)          Capture matched substring into group name[a]
(?<number>)        Capture matched substring into group number[a]
(?<name1-name2>)   Undefine name2 and store interval and current group into name1; if name2 is undefined, matching backtracks; name1 is optional[a]
(?: )              Noncapturing group
(?imnsx-imnsx: )   Apply or disable matching options
(?= )              Continue matching only if subexpression matches on right[c]
(?! )              Continue matching only if subexpression doesn’t match on right[c]
(?<= )             Continue matching only if subexpression matches on left[b][c]
(?<! )             Continue matching only if subexpression doesn’t match on left[b][c]
(?> )              Subexpression is matched once, but isn’t backtracked

[a] Single quotes may be used instead of angle brackets, for example (?'name').

[b] This construct doesn’t backtrack; this is to remain compatible with Perl5.

[c] Zero-width assertion; does not consume any characters.

Tip

The named capturing group syntax follows a suggestion made by Jeffrey Friedl in Mastering Regular Expressions (O’Reilly). All other grouping constructs use the Perl5 syntax.
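A few of the grouping constructs can be sketched as follows: a named capture read back through Match.Groups, and the zero-width lookbehind and lookahead assertions. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupingDemo
{
    // (?<name> ): capture into a named group and read it back by name.
    public static string Year(string date)
    {
        Match m = Regex.Match(date, @"(?<year>\d{4})-(?<month>\d{2})");
        return m.Groups["year"].Value;
    }

    // (?<= ): digits must be preceded by "$", which is not part of the match.
    public static string AmountAfterDollar(string input) =>
        Regex.Match(input, @"(?<=\$)\d+").Value;

    // (?= ): digits must be followed by "kg", which is not part of the match.
    public static string NumberBeforeKg(string input) =>
        Regex.Match(input, @"\d+(?=kg)").Value;

    public static void Main()
    {
        Console.WriteLine(Year("2001-03"));                  // 2001
        Console.WriteLine(AmountAfterDollar("pay $42 now")); // 42
        Console.WriteLine(NumberBeforeKg("10cm 25kg"));      // 25
    }
}
```

Because the assertions consume no characters, the "$" and "kg" never appear in the returned values.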

Table B-7. Back references

Parameter syntax  Meaning
\count            Back reference to capture group number count
\k<name>          Named back reference
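Back references are commonly used to find repeated text. The sketch below, with hypothetical class and method names, shows one numbered and one named back reference.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackReferenceDemo
{
    // \1 refers to whatever the first capturing group matched.
    public static string FirstDoubledLetter(string input) =>
        Regex.Match(input, @"(\w)\1").Value;

    // \k<word> refers to the named group "word".
    public static string FirstDoubledWord(string input) =>
        Regex.Match(input, @"\b(?<word>\w+) \k<word>\b").Value;

    public static void Main()
    {
        Console.WriteLine(FirstDoubledLetter("bookkeeper"));      // oo
        Console.WriteLine(FirstDoubledWord("this is is a test")); // is is
    }
}
```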

Table B-8. Alternation

Expression syntax       Meaning
|                       Logical OR
(?(expression)yes|no)   Matches yes if expression matches, else no; the no is optional
(?(name)yes|no)         Matches yes if named string has a match, else no; the no is optional
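As one possible use of the conditional form, the following sketch accepts a number that is either bare or fully parenthesized, requiring the closing parenthesis only when the opening one was captured. The class, method, and group names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // |: either alternative may match.
    public static bool IsCatOrDog(string input) =>
        Regex.IsMatch(input, "^(cat|dog)$");

    // (?(open)\)): require ")" only if the group named "open" captured "(".
    // The "no" branch is omitted, so nothing extra is required otherwise.
    public static bool IsOptionallyParenthesized(string input) =>
        Regex.IsMatch(input, @"^(?<open>\()?\d+(?(open)\))$");

    public static void Main()
    {
        Console.WriteLine(IsCatOrDog("dog"));                   // True
        Console.WriteLine(IsOptionallyParenthesized("(123)"));  // True
        Console.WriteLine(IsOptionallyParenthesized("123"));    // True
        Console.WriteLine(IsOptionallyParenthesized("(123"));   // False
        Console.WriteLine(IsOptionallyParenthesized("123)"));   // False
    }
}
```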

Table B-9. Miscellaneous constructs

Expression syntax    Meaning
(?imnsx-imnsx)       Set or disable options in midpattern
(?# )                Inline comment
# [to end of line]   X-mode comment (requires x option or IgnorePatternWhitespace)
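The two comment styles might be used as in the sketch below. The x option is switched on with an inline (?x) at the start of the pattern; the class name and the sample phone-number pattern are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentDemo
{
    // With the x option on, unescaped whitespace is ignored and
    // "#" starts a comment that runs to the end of the line.
    private const string PhonePattern = @"(?x)
        \d{3}   # three digits
        -       # a literal hyphen
        \d{4}   # four digits";

    public static bool LooksLikePhone(string input) =>
        Regex.IsMatch(input, PhonePattern);

    // (?# ) is an inline comment and works without the x option.
    public static bool HasDigits(string input) =>
        Regex.IsMatch(input, @"\d+(?# one or more digits)");

    public static void Main()
    {
        Console.WriteLine(LooksLikePhone("555-1234")); // True
        Console.WriteLine(LooksLikePhone("5551234"));  // False
        Console.WriteLine(HasDigits("room 42"));       // True
    }
}
```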

Table B-10. Regular expression options

Option  RegexOptions value       Meaning
i       IgnoreCase               Case-insensitive match
m       Multiline                Multiline mode; changes ^ and $ so they match the beginning and end of any line
n       ExplicitCapture          Capture only explicitly named or numbered groups
(none)  Compiled                 Compile to MSIL
s       Singleline               Single-line mode; changes meaning of "." so it matches every character
x       IgnorePatternWhitespace  Eliminates unescaped whitespace from the pattern
(none)  RightToLeft              Search from right to left; can’t be specified in midstream
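Options can be supplied either as RegexOptions flags or inline in the pattern. The sketch below, with hypothetical class and method names, shows the effect of three of them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // IgnoreCase passed as a flag, and the equivalent inline (?i).
    public static bool MatchesHelloFlag(string input) =>
        Regex.IsMatch(input, "hello", RegexOptions.IgnoreCase);
    public static bool MatchesHelloInline(string input) =>
        Regex.IsMatch(input, "(?i)hello");

    // Singleline lets "." match \n as well.
    public static bool DotSpansNewline(string input, bool singleline) =>
        Regex.IsMatch(input, "a.b", singleline ? RegexOptions.Singleline : RegexOptions.None);

    // Multiline lets ^ and $ match at the start and end of every line.
    public static int CountWordLines(string input, bool multiline) =>
        Regex.Matches(input, @"^\w+$", multiline ? RegexOptions.Multiline : RegexOptions.None).Count;

    public static void Main()
    {
        Console.WriteLine(MatchesHelloFlag("HELLO"));          // True
        Console.WriteLine(MatchesHelloInline("HELLO"));        // True
        Console.WriteLine(DotSpansNewline("a\nb", false));     // False
        Console.WriteLine(DotSpansNewline("a\nb", true));      // True
        Console.WriteLine(CountWordLines("one\ntwo", false));  // 0
        Console.WriteLine(CountWordLines("one\ntwo", true));   // 2
    }
}
```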
