reg.hlp


Last change on this file was 19575, checked in by westram, 3 weeks ago
reintegrates 'help' into 'trunk' preformatted text gets checked for width now (to enforce it fits into the arb help window). fixed help following these checks, using the following steps: ignore problems in foreign documentation. increase default help window width. introduce control comments to accept oversized preformatted sections. enforce preformatted style for whole sections. simply define single-line preformatted sections Used intensive for definition of internal script languages. fixed several non-related problems found in documentation. minor layout changes for HTML version of arb help (more compacted; highlight anchored/all sections). refactor system interface (GUI version) and use it from help module. adds: log:branches/help@19532:19574
Property svn:eol-style set to `native` Property svn:keywords set to `Author Date Id Revision`
File size: 7.8 KB

# main topics:
UP arb.hlp
UP glossary.hlp

# sub topics:
SUB srt.hlp
SUB aci.hlp

# format described in ../help.readme


TITLE Regular Expressions (REG)

OCCURRENCE Many places

SECTION Ways to use regular expressions

There are two ways to use regular expressions:

[1] /Search Regexpr/Replace String/
[2] /Search Regexpr/

[1] searches the input for occurrences of 'Search Regexpr' and
replaces every occurrence with 'Replace String'.

[2] searches the input for the FIRST occurrence of 'Search
Regexpr' and returns the found match.
If nothing matches, it returns an empty string.

Notes:

* You can use regular expressions wherever you can use
ACI and SRT expressions.
* In some places only [2] is available (e.g. in Search&Query).
* By default, regular expressions are case-sensitive. To make them
case-insensitive, simply append an 'i' to the
expression (i.e. '/expr/i' or '/expr/repl/i').
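
The two forms [1] and [2] can be mimicked outside ARB. The following is a minimal sketch in Python, not ARB's implementation: the helper names 'reg_match' and 'reg_replace' are hypothetical, and Python's 're' module is not a POSIX engine (for instance, it writes alternatives as '|' rather than '\|'). It only illustrates the behaviour described above, including the optional 'i' modifier.

```python
import re


def reg_match(search: str, text: str, ignore_case: bool = False) -> str:
    """Mimic form [2]: return the FIRST match, or an empty string if none."""
    flags = re.IGNORECASE if ignore_case else 0
    found = re.search(search, text, flags)
    return found.group(0) if found else ""


def reg_replace(search: str, replace: str, text: str,
                ignore_case: bool = False) -> str:
    """Mimic form [1]: replace every occurrence of 'search' by 'replace'."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.sub(search, replace, text, flags=flags)


if __name__ == "__main__":
    # roughly corresponds to '/pseu/i' and '/pseu/'
    print(repr(reg_match("pseu", "Pseudomonas", ignore_case=True)))
    print(repr(reg_match("pseu", "Pseudomonas")))
    # roughly corresponds to '/o/0/'
    print(reg_replace("o", "0", "foo boo"))
```

Keeping the two forms as separate functions avoids having to parse the '/.../.../' notation, which is left out of this sketch.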

SECTION Syntax of POSIX extended regular expressions as used in ARB

A regular expression specifies a set of character strings,
e.g. the expression '/pseu/i' specifies all strings containing
"pseu", "Pseu" or "pSeu" and so on. We say the expression "matches"
(a part of) these strings.

Several characters have special meanings in regular expressions.
All other characters just match against themselves.

Special characters:

'.' matches any character (e.g. '/h.s/' matches "has" and "his")
'[xyz]' matches 'x', 'y' or 'z'
'[a-z]' matches any lower case letter
'^' matches the beginning of the string
(e.g. '/^pseu/i' matches all strings starting with "pseu")
'$' matches the end of the string
(e.g. '/cens$/i' matches all strings ending in "cens")

'*' matches the preceding element zero or more times
(e.g. '/th*is/' matches "tis", "this", "thhhhhhiss", ..)
'?' matches the preceding element zero or one time
(e.g. '/th?is/' matches "tis" or "this", but not "thhis")
'+' matches the preceding element one or more times
(e.g. '/th+is/' matches "this" or "thhhis", but not "tis")
'{mi,ma}' matches the preceding element at least 'mi' and at most 'ma' times
(e.g. '/th{2,4}is/' matches "thhis", "thhhis" or "thhhhis")

'\|' marks an alternative

Example: '/bacter\|spiri/i' matches all strings containing
either "bacter" or "spiri".

'()' marks a subexpression.

Subexpressions can be used to separate alternatives or to mark parts
for reference in the replace expression (see section about
replacement below).

Examples:
* '/bact\|spiri.*cens/'

matches '/bact/' or '/spiri.*cens/'.

* whereas '/(bact\|spiri).*cens/'

matches '/bact.*cens/' or '/spiri.*cens/'.

To match against special characters themselves, escape them
using a '\' (e.g. '/\*/' matches the character "*", '/\\/' matches "\")
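
The quantifier and subexpression examples above can be checked with a short script. This is only a sketch using Python's 're' module as a stand-in (an assumption; it is not the engine ARB uses), so alternatives are written '|' instead of '\|'. The helper 'matches' is hypothetical; it reports whether the pattern matches any part of the input.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if 'pattern' matches any part of 'text'."""
    return re.search(pattern, text) is not None


# (pattern, input) pairs taken from or modelled on the examples above
checks = [
    ("th*is", "tis"),
    ("th?is", "thhis"),
    ("th+is", "tis"),
    ("th{2,4}is", "thhhis"),
    ("bact|spiri.*cens", "bacteria"),
    ("(bact|spiri).*cens", "bacteria"),
    ("(bact|spiri).*cens", "spiri_xx_cens"),
]
results = {pair: matches(*pair) for pair in checks}

for (pattern, text), found in results.items():
    print(f"/{pattern}/ on {text!r}: {'match' if found else 'no match'}")
```

Note how "bacteria" matches the ungrouped alternative (the 'bact' branch alone suffices) but not the grouped one, which additionally requires "cens" later in the string.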


Character classes:

[...] is called a character class. It matches against any of the characters
listed in between the brackets.
[^...] If the character class starts with '^' it matches against any character
NOT listed (e.g. '[^78]' matches all but '7' or '8')
[5-9] When the character class contains a '-', it will be interpreted as
"range of characters". Here '5-9' is equivalent to '56789'.
You may mix ranges and single characters,
e.g. '14-79' is the same as '145679', '7-91-3' is the same as '789123'.

To add special characters to a character class, escape them using '\'.

There are several special predefined character classes like
* [:alpha:] = [a-zA-Z]
* [:digit:] = [0-9]
* [:alnum:] = [[:alpha:][:digit:]]
* [:punct:] = Punctuation characters
* [:print:] = Visible characters and the space character
* [:blank:] = Space and tab
* [:space:] = Whitespace characters (including newlines)
* [:cntrl:] = Control characters

Use these inside brackets (e.g. '/[[:cntrl:]]//' will remove all control characters).
See links below for details.
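
The range rules for character classes can be verified by listing which digits a class accepts. The sketch below uses Python's 're' module as a stand-in (an assumption, not ARB's engine); the helper 'members' is hypothetical. The predefined classes such as [:digit:] are left out because Python's 're' module does not implement them.

```python
import re

DIGITS = "0123456789"


def members(char_class: str) -> str:
    """Return the digits accepted by a bracket expression such as '[5-9]'."""
    return "".join(re.findall(char_class, DIGITS))


for char_class in ("[5-9]", "[14-79]", "[7-91-3]", "[^78]"):
    print(char_class, "accepts", members(char_class))
```

The digits are reported in ascending order, so '[7-91-3]' is listed as '123789'; this is the same set of characters as '789123'.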


Links:

* A more in-depth explanation of POSIX extended regular expressions can be
found at LINK{http://en.wikipedia.org/wiki/Regular_expression#POSIX}.
* Many examples are given in this guide: LINK{http://www.digitalamit.com/article/regular_expression.phtml}

Notes:

* If an expression matches one string multiple times, the longest leftmost
match is used (e.g. '/a+e+/' matches 'aaeee' at position 3 of the
string 'bbaaeeeffaegg', not 'ae' at position 10).
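
The note on the longest leftmost match can be reproduced with the input string given above. This sketch assumes the pattern 'a+e+' (chosen for illustration) and uses Python's 're' module, which is a backtracking engine and does not guarantee the longest match in general; for this pattern and input it agrees with the behaviour described.

```python
import re

text = "bbaaeeeffaegg"

# search() reports the leftmost match; '+' takes as many characters as it can
found = re.search("a+e+", text)
matched = found.group(0)
position = found.start() + 1  # convert the 0-based index to a 1-based position

print(f"matched {matched!r} at position {position}")
```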


SECTION Special syntax for search and replace

Syntax: '/regexp/replace/'

The part of the input string matched by 'regexp' gets replaced by 'replace'.

Simple example:

Input string: 'The quick brown fox jumps over the lazy dog'
Search&replace: '/fox\|dog/cat/'
Result: 'The quick brown cat jumps over the lazy cat'

Additionally the match (or parts of it) can be referenced in the replace string:

\0 refers to the whole match
\1 refers to the first subexpression
\2 refers to the second subexpression
...
\9 refers to the ninth subexpression

Example using refs:

Input string: 'The quick brown fox jumps over the lazy dog'
Search&replace: '/(brown\|lazy)\s+(fox\|dog)/\2 \1/'
Result: 'The quick fox brown jumps over the dog lazy'
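
Both replacement examples above can be reproduced as follows. This is a sketch with Python's 're' module standing in for ARB's engine (an assumption). Two differences in notation apply: alternatives are written '|' instead of '\|', and the whole match is referenced as '\g<0>' because Python does not read '\0' as a match reference.

```python
import re

text = "The quick brown fox jumps over the lazy dog"

# simple example: every match is replaced by a fixed string
simple = re.sub(r"fox|dog", "cat", text)

# example using refs: swap the two subexpressions
swapped = re.sub(r"(brown|lazy)\s+(fox|dog)", r"\2 \1", text)

# reference to the whole match (what '\0' means in the list above)
marked = re.sub(r"fox|dog", r"<\g<0>>", text)

print(simple)
print(swapped)
print(marked)
```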

WARNINGS POSIX extended regular expressions always use the leftmost possible
match, even if it is empty, i.e. an expression like '_*' normally matches
an empty string (if used without context).

This makes some replacements difficult, e.g. if your data contains
runs of consecutive underscores and you'd like to replace each run with a single one.
The expression "/_*/_/" does not work as expected and reports
an error: "regular expression '_*' matched an empty string".

A workaround is the following expression:
"/(_+)([^_]\|$)/_\2/"

Other, simpler workarounds use the BOL/EOL operators ('^'/'$'),
e.g. to remove all trailing underscores:
"/_*$//"

Or all leading underscores:
"/^_*//"
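
The effect of the workaround expressions can be tried out on sample data. The sketch below uses Python's 're' module (an assumption, not ARB's engine) and hypothetical helper names. Unlike ARB, Python does not report an error when a pattern matches an empty string, so only the results of the workarounds are shown, not the error case.

```python
import re


def squeeze_underscores(text: str) -> str:
    """Replace each run of underscores by one underscore: '/(_+)([^_]\\|$)/_\\2/'."""
    return re.sub(r"(_+)([^_]|$)", r"_\2", text)


def strip_trailing_underscores(text: str) -> str:
    """Remove all trailing underscores: '/_*$//'."""
    return re.sub(r"_*$", "", text)


def strip_leading_underscores(text: str) -> str:
    """Remove all leading underscores: '/^_*//'."""
    return re.sub(r"^_*", "", text)


print(squeeze_underscores("a__b___c_"))
print(strip_trailing_underscores("abc___"))
print(strip_leading_underscores("__abc"))
```

The first workaround succeeds because '_+' requires at least one underscore, so the expression can never match an empty string.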

BUGS No bugs known

