Diffstat (limited to 'doc/html/ntqregexp.html')
-rw-r--r-- | doc/html/ntqregexp.html | 1037 |
1 files changed, 1037 insertions, 0 deletions
diff --git a/doc/html/ntqregexp.html b/doc/html/ntqregexp.html new file mode 100644 index 000000000..16ee5d8c0 --- /dev/null +++ b/doc/html/ntqregexp.html @@ -0,0 +1,1037 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> +<!-- /home/espenr/tmp/qt-3.3.8-espenr-2499/qt-x11-free-3.3.8/src/tools/qregexp.cpp:77 --> +<html> +<head> +<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> +<title>TQRegExp Class</title> +<style type="text/css"><!-- +fn { margin-left: 1cm; text-indent: -1cm; } +a:link { color: #004faf; text-decoration: none } +a:visited { color: #672967; text-decoration: none } +body { background: #ffffff; color: black; } +--></style> +</head> +<body> + +<table border="0" cellpadding="0" cellspacing="0" width="100%"> +<tr bgcolor="#E5E5E5"> +<td valign=center> + <a href="index.html"> +<font color="#004faf">Home</font></a> + | <a href="classes.html"> +<font color="#004faf">All Classes</font></a> + | <a href="mainclasses.html"> +<font color="#004faf">Main Classes</font></a> + | <a href="annotated.html"> +<font color="#004faf">Annotated</font></a> + | <a href="groups.html"> +<font color="#004faf">Grouped Classes</font></a> + | <a href="functions.html"> +<font color="#004faf">Functions</font></a> +</td> +<td align="right" valign="center"><img src="logo32.png" align="right" width="64" height="32" border="0"></td></tr></table><h1 align=center>TQRegExp Class Reference</h1> + +<p>The TQRegExp class provides pattern matching using regular expressions. 
+<a href="#details">More...</a> +<p>All the functions in this class are <a href="threads.html#reentrant">reentrant</a> when TQt is built with thread support.</p> +<p><tt>#include <<a href="qregexp-h.html">ntqregexp.h</a>></tt> +<p><a href="qregexp-members.html">List of all member functions.</a> +<h2>Public Members</h2> +<ul> +<li class=fn>enum <a href="#CaretMode-enum"><b>CaretMode</b></a> { CaretAtZero, CaretAtOffset, CaretWontMatch }</li> +<li class=fn><a href="#TQRegExp"><b>TQRegExp</b></a> ()</li> +<li class=fn><a href="#TQRegExp-2"><b>TQRegExp</b></a> ( const TQString & pattern, bool caseSensitive = TRUE, bool wildcard = FALSE )</li> +<li class=fn><a href="#TQRegExp-3"><b>TQRegExp</b></a> ( const TQRegExp & rx )</li> +<li class=fn><a href="#~TQRegExp"><b>~TQRegExp</b></a> ()</li> +<li class=fn>TQRegExp & <a href="#operator-eq"><b>operator=</b></a> ( const TQRegExp & rx )</li> +<li class=fn>bool <a href="#operator-eq-eq"><b>operator==</b></a> ( const TQRegExp & rx ) const</li> +<li class=fn>bool <a href="#operator!-eq"><b>operator!=</b></a> ( const TQRegExp & rx ) const</li> +<li class=fn>bool <a href="#isEmpty"><b>isEmpty</b></a> () const</li> +<li class=fn>bool <a href="#isValid"><b>isValid</b></a> () const</li> +<li class=fn>TQString <a href="#pattern"><b>pattern</b></a> () const</li> +<li class=fn>void <a href="#setPattern"><b>setPattern</b></a> ( const TQString & pattern )</li> +<li class=fn>bool <a href="#caseSensitive"><b>caseSensitive</b></a> () const</li> +<li class=fn>void <a href="#setCaseSensitive"><b>setCaseSensitive</b></a> ( bool sensitive )</li> +<li class=fn>bool <a href="#wildcard"><b>wildcard</b></a> () const</li> +<li class=fn>void <a href="#setWildcard"><b>setWildcard</b></a> ( bool wildcard )</li> +<li class=fn>bool <a href="#minimal"><b>minimal</b></a> () const</li> +<li class=fn>void <a href="#setMinimal"><b>setMinimal</b></a> ( bool minimal )</li> +<li class=fn>bool <a href="#exactMatch"><b>exactMatch</b></a> ( const TQString & str ) 
const</li> +<li class=fn>int match ( const TQString & str, int index = 0, int * len = 0, bool indexIsStart = TRUE ) const <em>(obsolete)</em></li> +<li class=fn>int <a href="#search"><b>search</b></a> ( const TQString & str, int offset = 0, CaretMode caretMode = CaretAtZero ) const</li> +<li class=fn>int <a href="#searchRev"><b>searchRev</b></a> ( const TQString & str, int offset = -1, CaretMode caretMode = CaretAtZero ) const</li> +<li class=fn>int <a href="#matchedLength"><b>matchedLength</b></a> () const</li> +<li class=fn>int <a href="#numCaptures"><b>numCaptures</b></a> () const</li> +<li class=fn>TQStringList <a href="#capturedTexts"><b>capturedTexts</b></a> ()</li> +<li class=fn>TQString <a href="#cap"><b>cap</b></a> ( int nth = 0 )</li> +<li class=fn>int <a href="#pos"><b>pos</b></a> ( int nth = 0 )</li> +<li class=fn>TQString <a href="#errorString"><b>errorString</b></a> ()</li> +</ul> +<h2>Static Public Members</h2> +<ul> +<li class=fn>TQString <a href="#escape"><b>escape</b></a> ( const TQString & str )</li> +</ul> +<hr><a name="details"></a><h2>Detailed Description</h2> + + + +The TQRegExp class provides pattern matching using regular expressions. +<p> + + + +<!-- index regular expression --><a name="regular-expression"></a> +<p> Regular expressions, or "regexps", provide a way to find patterns +within text. This is useful in many contexts, for example: +<p> <center><table cellpadding="4" cellspacing="2" border="0"> +<tr bgcolor="#f0f0f0"> <td valign="top">Validation +<td valign="top">A regexp can be used to check whether a piece of text +meets some criteria, e.g. is an integer or contains no +whitespace. +<tr bgcolor="#d0d0d0"> <td valign="top">Searching +<td valign="top">Regexps provide a much more powerful means of searching +text than simple string matching does. For example we can +create a regexp which says "find one of the words 'mail', +'letter' or 'correspondence' but not any of the words +'email', 'mailman' 'mailer', 'letterbox' etc." 
+<tr bgcolor="#f0f0f0"> <td valign="top">Search and Replace +<td valign="top">A regexp can be used to replace a pattern with a piece of +text, for example replace all occurrences of '&' with +'&amp;' except where the '&' is already followed by 'amp;'. +<tr bgcolor="#d0d0d0"> <td valign="top">String Splitting +<td valign="top">A regexp can be used to identify where a string should be +split into its component fields, e.g. splitting tab-delimited +strings. +</table></center> +<p> We present a very brief introduction to regexps, a description of +TQt's regexp language, some code examples, and finally the function +documentation itself. TQRegExp is modeled on Perl's regexp +language, and also fully supports Unicode. TQRegExp can also be +used in the weaker 'wildcard' (globbing) mode which works in a +similar way to command shells. A good text on regexps is <em>Mastering Regular Expressions: Powerful Techniques for Perl and Other Tools</em> by Jeffrey E. Friedl, ISBN 1565922573. +<p> Experienced regexp users may prefer to skip the introduction and +go directly to the relevant information. +<p> In case of multi-threaded programming, note that TQRegExp depends on +<a href="ntqthreadstorage.html">TQThreadStorage</a> internally. For that reason, TQRegExp should only be +used with threads started with <a href="ntqthread.html">TQThread</a>, i.e. not with threads +started with platform-specific APIs. 
+<p> <!-- toc --> +<ul> +<li><a href="#1"> Introduction +</a> +<li><a href="#1-1"> Characters and Abbreviations for Sets of Characters +</a> +<li><a href="#1-2"> Sets of Characters +</a> +<li><a href="#1-3"> Quantifiers +</a> +<li><a href="#1-4"> Capturing Text +</a> +<li><a href="#1-5"> Assertions +</a> +<li><a href="#1-6"> Wildcard Matching (globbing) +</a> +<li><a href="#1-7"> Notes for Perl Users +</a> +<li><a href="#1-8"> Code Examples +</a> +</ul> +<!-- endtoc --> + +<p> <h3> Introduction +</h3> +<a name="1"></a><p> Regexps are built up from expressions, quantifiers, and assertions. +The simplest form of expression is simply a character, e.g. +<b>x</b> or <b>5</b>. An expression can also be a set of +characters. For example, <b>[ABCD]</b> will match an <b>A</b> or +a <b>B</b> or a <b>C</b> or a <b>D</b>. As a shorthand we could +write this as <b>[A-D]</b>. If we want to match any of the +capital letters in the English alphabet we can write +<b>[A-Z]</b>. A quantifier tells the regexp engine how many +occurrences of the expression we want, e.g. <b>x{1,1}</b> means +match an <b>x</b> which occurs at least once and at most once. +We'll look at assertions and more complex expressions later. +<p> Note that in general regexps cannot be used to check for balanced +brackets or tags. For example if you want to match an opening HTML +<tt>&lt;b&gt;</tt> and its closing <tt>&lt;/b&gt;</tt> you can only use a regexp if you +know that these tags are not nested; the HTML fragment <tt>&lt;b&gt;bold &lt;b&gt;bolder&lt;/b&gt;&lt;/b&gt;</tt> will not match as expected. If you know the +maximum level of nesting it is possible to create a regexp that +will match correctly, but for an unknown level of nesting, regexps +will fail. +<p> We'll start by writing a regexp to match integers in the range 0 +to 99. We will require at least one digit so we will start with +<b>[0-9]{1,1}</b> which means match a digit exactly once. This +regexp alone will match integers in the range 0 to 9. 
To match one +or two digits we can increase the maximum number of occurrences so +the regexp becomes <b>[0-9]{1,2}</b> meaning match a digit at +least once and at most twice. However, this regexp as it stands +will not match correctly. This regexp will match one or two digits +<em>within</em> a string. To ensure that we match against the whole +string we must use the anchor assertions. We need <b>^</b> (caret) +which when it is the first character in the regexp means that the +regexp must match from the beginning of the string. And we also +need <b>$</b> (dollar) which when it is the last character in the +regexp means that the regexp must match until the end of the +string. So now our regexp is <b>^[0-9]{1,2}$</b>. Note that +assertions, such as <b>^</b> and <b>$</b>, do not match any +characters. +<p> If you've seen regexps elsewhere they may have looked different from +the ones above. This is because some sets of characters and some +quantifiers are so common that they have special symbols to +represent them. <b>[0-9]</b> can be replaced with the symbol +<b>\d</b>. The quantifier to match exactly one occurrence, +<b>{1,1}</b>, can be replaced with the expression itself. This means +that <b>x{1,1}</b> is exactly the same as <b>x</b> alone. So our 0 +to 99 matcher could be written <b>^\d{1,2}$</b>. Another way of +writing it would be <b>^\d\d{0,1}$</b>, i.e. from the start of the +string match a digit followed by zero or one digits. In practice +most people would write it <b>^\d\d?$</b>. The <b>?</b> is a +shorthand for the quantifier <b>{0,1}</b>, i.e. a minimum of no +occurrences a maximum of one occurrence. This is used to make an +expression optional. The regexp <b>^\d\d?$</b> means "from the +beginning of the string match one digit followed by zero or one +digits and then the end of the string". +<p> Our second example is matching the words 'mail', 'letter' or +'correspondence' but without matching 'email', 'mailman', +'mailer', 'letterbox' etc. 
We'll start by just matching 'mail'. In +full the regexp is <b>m{1,1}a{1,1}i{1,1}l{1,1}</b>, but since +each expression itself is automatically quantified by <b>{1,1}</b> +we can simply write this as <b>mail</b>; an 'm' followed by an 'a' +followed by an 'i' followed by an 'l'. The symbol '|' (bar) is +used for <em>alternation</em>, so our regexp now becomes +<b>mail|letter|correspondence</b> which means match 'mail' <em>or</em> +'letter' <em>or</em> 'correspondence'. Whilst this regexp will find the +words we want it will also find words we don't want such as +'email'. We will start by putting our regexp in parentheses, +<b>(mail|letter|correspondence)</b>. Parentheses have two effects: +firstly they group expressions together and secondly they identify +parts of the regexp that we wish to <a href="#capturing-text">capture</a>. Our regexp still matches any of the three words but now +they are grouped together as a unit. This is useful for building +up more complex regexps. It is also useful because it allows us to +examine which of the words actually matched. We need to use +another assertion, this time <b>\b</b> "word boundary": +<b>\b(mail|letter|correspondence)\b</b>. This regexp means "match +a word boundary followed by the expression in parentheses followed +by another word boundary". The <b>\b</b> assertion matches at a <em>position</em> in the string, not a <em>character</em> in the string. A word +boundary occurs next to any non-word character, such as a space or a newline, or at +the beginning or end of the string. +<p> For our third example we want to replace ampersands with the HTML +entity '&amp;'. The regexp to match is simple: <b>&</b>, i.e. +match one ampersand. Unfortunately this will mess up our text if +some of the ampersands have already been turned into HTML +entities. So what we really want to say is replace an ampersand +provided it is not followed by 'amp;'. For this we need the +negative lookahead assertion and our regexp becomes: +<b>&(?!amp;)</b>. 
The negative lookahead assertion is introduced +with '(?!' and finishes at the ')'. It means that the text it +contains, 'amp;' in our example, must <em>not</em> follow the expression +that precedes it. +<p> Regexps provide a rich language that can be used in a variety of +ways. For example suppose we want to count all the occurrences of +'Eric' and 'Eirik' in a string. Two valid regexps to match these +are <b>\b(Eric|Eirik)\b</b> and <b>\bEi?ri[ck]\b</b>. We need +the word boundary '\b' so we don't get 'Ericsson' etc. The second +regexp actually matches more than we want: 'Eric', 'Erik', 'Eiric' +and 'Eirik'. +<p> We will implement some of the examples above in the +<a href="#code-examples">code examples</a> section. +<p> <a name="characters-and-abbreviations-for-sets-of-characters"></a> +<h3> Characters and Abbreviations for Sets of Characters +</h3> +<a name="1-1"></a><p> <center><table cellpadding="4" cellspacing="2" border="0"> +<tr bgcolor="#a2c511"> <th valign="top">Element <th valign="top">Meaning +<tr bgcolor="#f0f0f0"> <td valign="top"><b>c</b> +<td valign="top">Any character represents itself unless it has a special +regexp meaning. Thus <b>c</b> matches the character <em>c</em>. +<tr bgcolor="#d0d0d0"> <td valign="top"><b>\c</b> +<td valign="top">A character that follows a backslash matches the character +itself except where mentioned below. For example if you +wished to match a literal caret at the beginning of a string +you would write <b>\^</b>. +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\a</b> +<td valign="top">This matches the ASCII bell character (BEL, 0x07). +<tr bgcolor="#d0d0d0"> <td valign="top"><b>\f</b> +<td valign="top">This matches the ASCII form feed character (FF, 0x0C). +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\n</b> +<td valign="top">This matches the ASCII line feed character (LF, 0x0A, Unix newline). +<tr bgcolor="#d0d0d0"> <td valign="top"><b>\r</b> +<td valign="top">This matches the ASCII carriage return character (CR, 0x0D). 
+<tr bgcolor="#f0f0f0"> <td valign="top"><b>\t</b> +<td valign="top">This matches the ASCII horizontal tab character (HT, 0x09). +<tr bgcolor="#d0d0d0"> <td valign="top"><b>\v</b> +<td valign="top">This matches the ASCII vertical tab character (VT, 0x0B). +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\xhhhh</b> +<td valign="top">This matches the Unicode character corresponding to the +hexadecimal number hhhh (between 0x0000 and 0xFFFF). \0ooo +(i.e., \zero ooo) matches the ASCII/Latin-1 character +corresponding to the octal number ooo (between 0 and 0377). +<tr bgcolor="#d0d0d0"> <td valign="top"><b>. (dot)</b> +<td valign="top">This matches any character (including newline). +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\d</b> +<td valign="top">This matches a digit (<a href="qchar.html#isDigit">TQChar::isDigit</a>()). +<tr bgcolor="#d0d0d0"> <td valign="top"><b>\D</b> +<td valign="top">This matches a non-digit. +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\s</b> +<td valign="top">This matches a whitespace (<a href="qchar.html#isSpace">TQChar::isSpace</a>()). +<tr bgcolor="#d0d0d0"> <td valign="top"><b>\S</b> +<td valign="top">This matches a non-whitespace. +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\w</b> +<td valign="top">This matches a word character (<a href="qchar.html#isLetterOrNumber">TQChar::isLetterOrNumber</a>() or '_'). +<tr bgcolor="#d0d0d0"> <td valign="top"><b>\W</b> +<td valign="top">This matches a non-word character. +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\n</b> +<td valign="top">The n-th <a href="#capturing-text">backreference</a>, +e.g. \1, \2, etc. +</table></center> +<p> <em>Note that the C++ compiler transforms backslashes in strings so to include a <b>\</b> in a regexp you will need to enter it twice, i.e. <b>\\</b>.</em> +<p> <a name="sets-of-characters"></a> +<h3> Sets of Characters +</h3> +<a name="1-2"></a><p> Square brackets are used to match any character in the set of +characters contained within the square brackets. 
All the character +set abbreviations described above can be used within square +brackets. Apart from the character set abbreviations and the +following two exceptions no characters have special meanings in +square brackets. +<p> <center><table cellpadding="4" cellspacing="2" border="0"> +<tr bgcolor="#d0d0d0"> <td valign="top"><b>^</b> +<td valign="top">The caret negates the character set if it occurs as the +first character, i.e. immediately after the opening square +bracket. For example, <b>[abc]</b> matches 'a' or 'b' or 'c', +but <b>[^abc]</b> matches anything <em>except</em> 'a' or 'b' or +'c'. +<tr bgcolor="#f0f0f0"> <td valign="top"><b>-</b> +<td valign="top">The dash is used to indicate a range of characters, for +example <b>[W-Z]</b> matches 'W' or 'X' or 'Y' or 'Z'. +</table></center> +<p> Using the predefined character set abbreviations is more portable +than using character ranges across platforms and languages. For +example, <b>[0-9]</b> matches a digit in Western alphabets but +<b>\d</b> matches a digit in <em>any</em> alphabet. +<p> Note that in most regexp literature sets of characters are called +"character classes". +<p> <a name="quantifiers"></a> +<h3> Quantifiers +</h3> +<a name="1-3"></a><p> By default an expression is automatically quantified by +<b>{1,1}</b>, i.e. it should occur exactly once. In the following +list <b><em>E</em></b> stands for any expression. An expression is a +character or an abbreviation for a set of characters or a set of +characters in square brackets or any parenthesised expression. +<p> <center><table cellpadding="4" cellspacing="2" border="0"> +<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>?</b> +<td valign="top">Matches zero or one occurrence of <em>E</em>. This quantifier +means "the previous expression is optional" since it will +match whether or not the expression occurs in the string. It +is the same as <b><em>E</em>{0,1}</b>. For example <b>dents?</b> +will match 'dent' and 'dents'. 
+<tr bgcolor="#f0f0f0"> <td valign="top"><b><em>E</em>+</b> +<td valign="top">Matches one or more occurrences of <em>E</em>. This is the same +as <b><em>E</em>{1,MAXINT}</b>. For example, <b>0+</b> will match +'0', '00', '000', etc. +<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>*</b> +<td valign="top">Matches zero or more occurrences of <em>E</em>. This is the same +as <b><em>E</em>{0,MAXINT}</b>. The <b>*</b> quantifier is often +used by mistake. Since it matches <em>zero</em> or more +occurrences it will also match when there are no occurrences at all. For example +if we want to match strings that end in whitespace and use +the regexp <b>\s*$</b> we would get a match on every string. +This is because we have said find zero or more whitespace characters +followed by the end of string, so even strings that don't end +in whitespace will match. The regexp we want in this case is +<b>\s+$</b> to match strings that have at least one +whitespace character at the end. +<tr bgcolor="#f0f0f0"> <td valign="top"><b><em>E</em>{n}</b> +<td valign="top">Matches exactly <em>n</em> occurrences of the expression. This +is the same as repeating the expression <em>n</em> times. For +example, <b>x{5}</b> is the same as <b>xxxxx</b>. It is also +the same as <b><em>E</em>{n,n}</b>, e.g. <b>x{5,5}</b>. +<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>{n,}</b> +<td valign="top">Matches at least <em>n</em> occurrences of the expression. This +is the same as <b><em>E</em>{n,MAXINT}</b>. +<tr bgcolor="#f0f0f0"> <td valign="top"><b><em>E</em>{,m}</b> +<td valign="top">Matches at most <em>m</em> occurrences of the expression. This +is the same as <b><em>E</em>{0,m}</b>. +<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>{n,m}</b> +<td valign="top">Matches at least <em>n</em> occurrences of the expression and at +most <em>m</em> occurrences of the expression. +</table></center> +<p> (MAXINT is implementation dependent but will not be smaller than +1024.) 
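+<p> The difference between <b>*</b> and <b>+</b> described in the table can be checked with a small sketch. It is an illustration only: it uses the standard library's <tt>std::regex</tt> (ECMAScript syntax) as a stand-in for TQRegExp so that it compiles without TQt, and the helper names <tt>foundIn</tt>, <tt>endsInWhitespaceWrong</tt> and <tt>endsInWhitespace</tt> are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the regexp `re` matches anywhere in `text`.
// std::regex is an assumption here, standing in for TQRegExp.
bool foundIn(const std::string &text, const std::string &re)
{
    return std::regex_search(text, std::regex(re));
}

// '*' allows zero occurrences, so this matches every string.
bool endsInWhitespaceWrong(const std::string &text)
{
    return foundIn(text, "\\s*$");
}

// '+' demands at least one whitespace character before the end.
bool endsInWhitespace(const std::string &text)
{
    return foundIn(text, "\\s+$");
}
```

+<p> With these helpers, <tt>endsInWhitespaceWrong("no trailing space")</tt> is true although the string does not end in whitespace, whereas <tt>endsInWhitespace()</tt> is true only for a string such as <tt>"trailing space "</tt>.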
+<p> If we wish to apply a quantifier to more than just the preceding +character we can use parentheses to group characters together in +an expression. For example, <b>tag+</b> matches a 't' followed by +an 'a' followed by at least one 'g', whereas <b>(tag)+</b> matches +at least one occurrence of 'tag'. +<p> Note that quantifiers are "greedy". They will match as much text +as they can. For example, <b>0+</b> will match as many zeros as it +can from the first zero it finds, e.g. '2.<u>000</u>5'. +Quantifiers can be made non-greedy, see <a href="#setMinimal">setMinimal</a>(). +<p> <a name="capturing-text"></a> +<h3> Capturing Text +</h3> +<a name="1-4"></a><p> Parentheses allow us to group elements together so that we can +quantify and capture them. For example if we have the expression +<b>mail|letter|correspondence</b> that matches a string we know +that <em>one</em> of the words matched but not which one. Using +parentheses allows us to "capture" whatever is matched within +their bounds, so if we used <b>(mail|letter|correspondence)</b> +and matched this regexp against the string "I sent you some email" +we can use the <a href="#cap">cap</a>() or <a href="#capturedTexts">capturedTexts</a>() functions to extract the +matched characters, in this case 'mail'. +<p> We can use captured text within the regexp itself. To refer to the +captured text we use <em>backreferences</em> which are indexed from 1, +the same as for cap(). For example we could search for duplicate +words in a string using <b>\b(\w+)\W+\1\b</b> which means match a +word boundary followed by one or more word characters followed by +one or more non-word characters followed by the same text as the +first parenthesised expression followed by a word boundary. +<p> If we want to use parentheses purely for grouping and not for +capturing we can use the non-capturing syntax, e.g. +<b>(?:green|blue)</b>. Non-capturing parentheses begin '(?:' and +end ')'. 
In this example we match either 'green' or 'blue' but we +do not capture the match so we only know whether or not we matched +but not which color we actually found. Using non-capturing +parentheses is more efficient than using capturing parentheses +since the regexp engine has to do less book-keeping. +<p> Both capturing and non-capturing parentheses may be nested. +<p> <a name="assertions"></a> +<h3> Assertions +</h3> +<a name="1-5"></a><p> Assertions make some statement about the text at the point where +they occur in the regexp but they do not match any characters. In +the following list <b><em>E</em></b> stands for any expression. +<p> <center><table cellpadding="4" cellspacing="2" border="0"> +<tr bgcolor="#f0f0f0"> <td valign="top"><b>^</b> +<td valign="top">The caret signifies the beginning of the string. If you +wish to match a literal <tt>^</tt> you must escape it by +writing <b>\^</b>. For example, <b>^#include</b> will only +match strings which <em>begin</em> with the characters '#include'. +(When the caret is the first character of a character set it +has a special meaning, see <a href="#sets-of-characters">Sets of + Characters</a>.) +<tr bgcolor="#d0d0d0"> <td valign="top"><b>$</b> +<td valign="top">The dollar signifies the end of the string. For example +<b>\d\s*$</b> will match strings which end with a digit +optionally followed by whitespace. If you wish to match a +literal <tt>$</tt> you must escape it by writing +<b>\$</b>. +<tr bgcolor="#f0f0f0"> <td valign="top"><b>\b</b> +<td valign="top">A word boundary. For example the regexp +<b>\bOK\b</b> means match immediately after a word +boundary (e.g. start of string or whitespace) the letter 'O' +then the letter 'K' immediately before another word boundary +(e.g. end of string or whitespace). But note that the +assertion does not actually match any whitespace so if we +write <b>(\bOK\b)</b> and we have a match it will only +contain 'OK' even if the string is "Its <u>OK</u> now". 
+<tr bgcolor="#d0d0d0"> <td valign="top"><b>\B</b> +<td valign="top">A non-word boundary. This assertion is true wherever +<b>\b</b> is false. For example if we searched for +<b>\Bon\B</b> in "Left on" the match would fail (space +and end of string aren't non-word boundaries), but it would +match in "t<u>on</u>ne". +<tr bgcolor="#f0f0f0"> <td valign="top"><b>(?=<em>E</em>)</b> +<td valign="top">Positive lookahead. This assertion is true if the +expression matches at this point in the regexp. For example, +<b>const(?=\s+char)</b> matches 'const' whenever it is +followed by 'char', as in 'static <u>const</u> char *'. +(Compare with <b>const\s+char</b>, which matches 'static +<u>const char</u> *'.) +<tr bgcolor="#d0d0d0"> <td valign="top"><b>(?!<em>E</em>)</b> +<td valign="top">Negative lookahead. This assertion is true if the +expression does not match at this point in the regexp. For +example, <b>const(?!\s+char)</b> matches 'const' <em>except</em> +when it is followed by 'char'. +</table></center> +<p> <a name="wildcard-matching"></a> +<h3> Wildcard Matching (globbing) +</h3> +<a name="1-6"></a><p> Most command shells such as <em>bash</em> or <em>cmd.exe</em> support "file +globbing", the ability to identify a group of files by using +wildcards. The <a href="#setWildcard">setWildcard</a>() function is used to switch between +regexp and wildcard mode. Wildcard matching is much simpler than +full regexps and has only four features: +<p> <center><table cellpadding="4" cellspacing="2" border="0"> +<tr bgcolor="#f0f0f0"> <td valign="top"><b>c</b> +<td valign="top">Any character represents itself apart from those mentioned +below. Thus <b>c</b> matches the character <em>c</em>. +<tr bgcolor="#d0d0d0"> <td valign="top"><b>?</b> +<td valign="top">This matches any single character. It is the same as +<b>.</b> in full regexps. +<tr bgcolor="#f0f0f0"> <td valign="top"><b>*</b> +<td valign="top">This matches zero or more of any characters. 
It is the +same as <b>.*</b> in full regexps. +<tr bgcolor="#d0d0d0"> <td valign="top"><b>[...]</b> +<td valign="top">Sets of characters can be represented in square brackets, +similar to full regexps. Within the character class, like +outside, backslash has no special meaning. +</table></center> +<p> For example if we are in wildcard mode and have strings which +contain filenames we could identify HTML files with <b>*.html</b>. +This will match zero or more characters followed by a dot followed +by 'h', 't', 'm' and 'l'. +<p> <a name="perl-users"></a> +<h3> Notes for Perl Users +</h3> +<a name="1-7"></a><p> Most of the character class abbreviations supported by Perl are +supported by TQRegExp, see <a href="#characters-and-abbreviations-for-sets-of-characters">characters + and abbreviations for sets of characters</a>. +<p> In TQRegExp, apart from within character classes, <tt>^</tt> always +signifies the start of the string, so carets must always be +escaped unless used for that purpose. In Perl the meaning of caret +varies automagically depending on where it occurs so escaping it +is rarely necessary. The same applies to <tt>$</tt> which in +TQRegExp always signifies the end of the string. +<p> TQRegExp's quantifiers are the same as Perl's greedy quantifiers. +Non-greedy matching cannot be applied to individual quantifiers, +but can be applied to all the quantifiers in the pattern. For +example, to match the Perl regexp <b>ro+?m</b> requires: +<pre> + TQRegExp rx( "ro+m" ); + rx.<a href="#setMinimal">setMinimal</a>( TRUE ); + </pre> + +<p> The equivalent of Perl's <tt>/i</tt> option is +<a href="#setCaseSensitive">setCaseSensitive</a>(FALSE). +<p> Perl's <tt>/g</tt> option can be emulated using a <a href="#cap_in_a_loop">loop</a>. +<p> In TQRegExp <b>.</b> matches any character, therefore all TQRegExp +regexps have the equivalent of Perl's <tt>/s</tt> option. 
TQRegExp +does not have an equivalent to Perl's <tt>/m</tt> option, but this +can be emulated in various ways for example by splitting the input +into lines or by looping with a regexp that searches for newlines. +<p> Because TQRegExp is string oriented there are no \A, \Z or \z +assertions. The \G assertion is not supported but can be emulated +in a loop. +<p> Perl's $& is <a href="#cap">cap</a>(0) or <a href="#capturedTexts">capturedTexts</a>()[0]. There are no TQRegExp +equivalents for $`, $' or $+. Perl's capturing variables, $1, $2, +... correspond to cap(1) or capturedTexts()[1], cap(2) or +capturedTexts()[2], etc. +<p> To substitute a pattern use <a href="ntqstring.html#replace">TQString::replace</a>(). +<p> Perl's extended <tt>/x</tt> syntax is not supported, nor are +directives, e.g. (?i), or regexp comments, e.g. (?#comment). On +the other hand, C++'s rules for literal strings can be used to +achieve the same: +<pre> + TQRegExp mark( "\\b" // word boundary + "[Mm]ark" // the word we want to match + ); + </pre> + +<p> Both zero-width positive and zero-width negative lookahead +assertions (?=pattern) and (?!pattern) are supported with the same +syntax as Perl. Perl's lookbehind assertions, "independent" +subexpressions and conditional expressions are not supported. +<p> Non-capturing parentheses are also supported, with the same +(?:pattern) syntax. +<p> See <a href="ntqstringlist.html#split">TQStringList::split</a>() and <a href="ntqstringlist.html#join">TQStringList::join</a>() for equivalents +to Perl's split and join functions. +<p> Note: because C++ transforms \'s they must be written <em>twice</em> in +code, e.g. <b>\b</b> must be written <b>\\b</b>. 
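+<p> The doubling of backslashes can be seen in a short sketch. It assumes a C++11 compiler, which also offers raw string literals as an alternative that avoids the doubling, and it uses <tt>std::regex</tt> as a stand-in for TQRegExp; the names <tt>escapedPattern</tt>, <tt>rawPattern</tt> and <tt>containsWholeWordOK</tt> are hypothetical.

```cpp
#include <regex>
#include <string>

// The same six-character pattern \bOK\b written two ways:
// with doubled backslashes in an ordinary string literal ...
const std::string escapedPattern = "\\bOK\\b";
// ... and with a C++11 raw string literal (an assumption: needs C++11).
const std::string rawPattern = R"(\bOK\b)";

// Hypothetical helper; std::regex stands in for TQRegExp here.
bool containsWholeWordOK(const std::string &text)
{
    return std::regex_search(text, std::regex(escapedPattern));
}
```

+<p> Both literals produce the same pattern text, so either can be passed to the regexp constructor.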
+<p> <a name="code-examples"></a> +<h3> Code Examples +</h3> +<a name="1-8"></a><p> <pre> + TQRegExp rx( "^\\d\\d?$" ); // match integers 0 to 99 + rx.<a href="#search">search</a>( "123" ); // returns -1 (no match) + rx.<a href="#search">search</a>( "-6" ); // returns -1 (no match) + rx.<a href="#search">search</a>( "6" ); // returns 0 (matched at position 0) + </pre> + +<p> The third string matches '<u>6</u>'. This is a simple validation +regexp for integers in the range 0 to 99. +<p> <pre> + TQRegExp rx( "^\\S+$" ); // match strings without whitespace + rx.<a href="#search">search</a>( "Hello world" ); // returns -1 (no match) + rx.<a href="#search">search</a>( "This_is-OK" ); // returns 0 (matched at position 0) + </pre> + +<p> The second string matches '<u>This_is-OK</u>'. We've used the +character set abbreviation '\S' (non-whitespace) and the anchors +to match strings which contain no whitespace. +<p> In the following example we match strings containing 'mail' or +'letter' or 'correspondence' but only match whole words, i.e. not +'email': +<p> <pre> + TQRegExp rx( "\\b(mail|letter|correspondence)\\b" ); + rx.<a href="#search">search</a>( "I sent you an email" ); // returns -1 (no match) + rx.<a href="#search">search</a>( "Please write the letter" ); // returns 17 + </pre> + +<p> The second string matches "Please write the <u>letter</u>". The +word 'letter' is also captured (because of the parentheses). We +can see what text we've captured like this: +<p> <pre> + <a href="ntqstring.html">TQString</a> captured = rx.cap( 1 ); // captured == "letter" + </pre> + +<p> This retrieves the text captured by the first set of capturing +parentheses (counting capturing left parentheses from left to +right). The parentheses are counted from 1 since <a href="#cap">cap</a>( 0 ) is the +whole matched regexp (equivalent to '&' in most regexp engines). 
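+<p> The relationship between the match position, the whole match and the first capture can be sketched as follows. This is not TQRegExp code: it uses <tt>std::regex</tt> as a stand-in, where <tt>m.position(0)</tt>, <tt>m.str(0)</tt> and <tt>m.str(1)</tt> play roughly the roles of the search() return value, cap(0) and cap(1). The <tt>WordMatch</tt> struct and <tt>findWord</tt> function are hypothetical names.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch.
struct WordMatch {
    int position;          // offset of the whole match, or -1 if none
    std::string whole;     // text of the whole match (group 0)
    std::string captured;  // text of the first parenthesised group
};

// Searches for one of the three whole words; std::regex is an
// assumption standing in for TQRegExp.
WordMatch findWord(const std::string &text)
{
    static const std::regex rx("\\b(mail|letter|correspondence)\\b");
    std::smatch m;
    if (!std::regex_search(text, m, rx))
        return WordMatch{-1, "", ""};
    return WordMatch{static_cast<int>(m.position(0)), m.str(0), m.str(1)};
}
```

+<p> For "Please write the letter" the position is 17 and both the whole match and the first capture are "letter"; for "I sent you an email" the position is -1.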
+<p> <pre> + TQRegExp rx( "&(?!amp;)" ); // match ampersands but not &amp; + <a href="ntqstring.html">TQString</a> line1 = "This & that"; + line1.<a href="ntqstring.html#replace">replace</a>( rx, "&amp;" ); + // line1 == "This &amp; that" + <a href="ntqstring.html">TQString</a> line2 = "His &amp; hers & theirs"; + line2.<a href="ntqstring.html#replace">replace</a>( rx, "&amp;" ); + // line2 == "His &amp; hers &amp; theirs" + </pre> + +<p> Here we've passed the TQRegExp to <a href="ntqstring.html">TQString</a>'s replace() function to +replace the matched text with new text. +<p> <pre> + <a href="ntqstring.html">TQString</a> str = "One Eric another Eirik, and an Ericsson." + " How many Eiriks, Eric?"; + TQRegExp rx( "\\b(Eric|Eirik)\\b" ); // match Eric or Eirik + int pos = 0; // where we are in the string + int count = 0; // how many Erics and Eiriks we've counted + while ( pos >= 0 ) { + pos = rx.<a href="#search">search</a>( str, pos ); + if ( pos >= 0 ) { + pos++; // move along in str + count++; // count our Eric or Eirik + } + } + </pre> + +<p> We've used the <a href="#search">search</a>() function to repeatedly match the regexp in +the string. Note that instead of moving forward by one character +at a time (<tt>pos++</tt>) we could have written <tt>pos += rx.matchedLength()</tt> to skip over the already matched string. The +count will equal 3, matching 'One <u>Eric</u> another +<u>Eirik</u>, and an Ericsson. How many Eiriks, <u>Eric</u>?'; it +doesn't match 'Ericsson' or 'Eiriks' because in those words the +name is not followed by a word boundary. +<p> One common use of regexps is to split lines of delimited data into +their component fields.
+<p> <pre> + str = "Trolltech AS\twww.trolltech.com\tNorway"; + <a href="ntqstring.html">TQString</a> company, web, country; + rx.setPattern( "^([^\t]+)\t([^\t]+)\t([^\t]+)$" ); + if ( rx.search( str ) != -1 ) { + company = rx.cap( 1 ); + web = rx.cap( 2 ); + country = rx.cap( 3 ); + } + </pre> + +<p> In this example our input lines have the format company name, web +address and country. Unfortunately the regexp is rather long and +not very versatile -- the code will break if we add any more +fields. A simpler and better solution is to look for the +separator, '\t' in this case, and take the surrounding text. The +<a href="ntqstringlist.html">TQStringList</a> split() function can take a separator string or regexp +as an argument and split a string accordingly. +<p> <pre> + <a href="ntqstringlist.html">TQStringList</a> field = TQStringList::<a href="ntqstringlist.html#split">split</a>( "\t", str ); + </pre> + +<p> Here field[0] is the company, field[1] the web address and so on. +<p> To imitate the matching of a shell we can use wildcard mode. +<p> <pre> + TQRegExp rx( "*.html" ); // invalid regexp: * doesn't quantify anything + rx.<a href="#setWildcard">setWildcard</a>( TRUE ); // now it's a valid wildcard regexp + rx.<a href="#exactMatch">exactMatch</a>( "index.html" ); // returns TRUE + rx.<a href="#exactMatch">exactMatch</a>( "default.htm" ); // returns FALSE + rx.<a href="#exactMatch">exactMatch</a>( "readme.txt" ); // returns FALSE + </pre> + +<p> Wildcard matching can be convenient because of its simplicity, but +any wildcard regexp can be defined using full regexps, e.g. +<b>.*\.html$</b>. Notice that we can't match both <tt>.html</tt> and <tt>.htm</tt> files with a wildcard unless we use <b>*.htm*</b> which will +also match 'test.html.bak'. A full regexp gives us the precision +we need, <b>.*\.html?$</b>. 
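<p> The claim that any wildcard regexp can be written as a full regexp can be made concrete with a small translation routine. The sketch below is not how TQRegExp implements wildcard mode; <tt>wildcardToRegExp()</tt> is a hypothetical helper written with <tt>std::string</tt> only. It assumes the wildcard rules described on this page (<b>*</b> for any run of characters, <b>?</b> for any single character, <b>[...]</b> for a character set) and copies the contents of a set through unchanged, which is a simplification.

```cpp
#include <string>

// Hypothetical helper, not part of TQRegExp: rewrite a shell-style wildcard
// as an equivalent full regexp string.
std::string wildcardToRegExp(const std::string& wildcard)
{
    // Characters that are special in a full regexp but literal in a wildcard.
    const std::string special = "$()+.\\^{|}";
    std::string rx;
    for (std::string::size_type i = 0; i < wildcard.size(); ++i) {
        const char c = wildcard[i];
        if (c == '*') {
            rx += ".*";            // any run of characters
        } else if (c == '?') {
            rx += '.';             // exactly one character
        } else {
            if (special.find(c) != std::string::npos)
                rx += '\\';        // escape regexp metacharacters
            rx += c;               // '[' and ']' are copied unescaped
        }
    }
    return rx;
}
```

With this sketch, the wildcard <b>*.html</b> becomes <b>.*\.html</b>. No trailing <b>$</b> is appended, on the assumption that the result is used with <a href="#exactMatch">exactMatch</a>(), which anchors the pattern at both ends.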
+<p> TQRegExp can match case insensitively using <a href="#setCaseSensitive">setCaseSensitive</a>(), and +can use non-greedy matching, see <a href="#setMinimal">setMinimal</a>(). By default TQRegExp +uses full regexps but this can be changed with <a href="#setWildcard">setWildcard</a>(). +Searching can be forward with <a href="#search">search</a>() or backward with +<a href="#searchRev">searchRev</a>(). Captured text can be accessed using <a href="#capturedTexts">capturedTexts</a>() +which returns a string list of all captured strings, or using +<a href="#cap">cap</a>() which returns the captured string for the given index. The +<a href="#pos">pos</a>() function takes a match index and returns the position in the +string where the match was made (or -1 if there was no match). +<p> <p>See also <a href="qregexpvalidator.html">TQRegExpValidator</a>, <a href="ntqstring.html">TQString</a>, <a href="ntqstringlist.html">TQStringList</a>, <a href="misc.html">Miscellaneous Classes</a>, <a href="shared.html">Implicitly and Explicitly Shared Classes</a>, and <a href="tools.html">Non-GUI Classes</a>. + +<p> <a name="member-function-documentation"></a> + +<hr><h2>Member Type Documentation</h2> +<h3 class=fn><a name="CaretMode-enum"></a>TQRegExp::CaretMode</h3> + +<p> The CaretMode enum defines the different meanings of the caret +(<b>^</b>) in a <a href="ntqregexp.html#regular-expression">regular expression</a>. The possible values are: +<ul> +<li><tt>TQRegExp::CaretAtZero</tt> - +The caret corresponds to index 0 in the searched string. +<li><tt>TQRegExp::CaretAtOffset</tt> - +The caret corresponds to the start offset of the search. +<li><tt>TQRegExp::CaretWontMatch</tt> - +The caret never matches. +</ul> +<hr><h2>Member Function Documentation</h2> +<h3 class=fn><a name="TQRegExp"></a>TQRegExp::TQRegExp () +</h3> +Constructs an empty regexp. +<p> <p>See also <a href="#isValid">isValid</a>() and <a href="#errorString">errorString</a>(). 
+ +<h3 class=fn><a name="TQRegExp-2"></a>TQRegExp::TQRegExp ( const <a href="ntqstring.html">TQString</a> & pattern, bool caseSensitive = TRUE, bool wildcard = FALSE ) +</h3> +Constructs a <a href="ntqregexp.html#regular-expression">regular expression</a> object for the given <em>pattern</em> +string. The pattern must be given using wildcard notation if <em>wildcard</em> is TRUE (default is FALSE). The pattern is case +sensitive, unless <em>caseSensitive</em> is FALSE. Matching is greedy +(maximal), but can be changed by calling <a href="#setMinimal">setMinimal</a>(). +<p> <p>See also <a href="#setPattern">setPattern</a>(), <a href="#setCaseSensitive">setCaseSensitive</a>(), <a href="#setWildcard">setWildcard</a>(), and <a href="#setMinimal">setMinimal</a>(). + +<h3 class=fn><a name="TQRegExp-3"></a>TQRegExp::TQRegExp ( const <a href="ntqregexp.html">TQRegExp</a> & rx ) +</h3> +Constructs a <a href="ntqregexp.html#regular-expression">regular expression</a> as a copy of <em>rx</em>. +<p> <p>See also <a href="#operator-eq">operator=</a>(). + +<h3 class=fn><a name="~TQRegExp"></a>TQRegExp::~TQRegExp () +</h3> +Destroys the <a href="ntqregexp.html#regular-expression">regular expression</a> and cleans up its internal data. + +<h3 class=fn><a href="ntqstring.html">TQString</a> <a name="cap"></a>TQRegExp::cap ( int nth = 0 ) +</h3> +Returns the text captured by the <em>nth</em> subexpression. The entire +match has index 0 and the parenthesized subexpressions have +indices starting from 1 (excluding non-capturing parentheses). +<p> <pre> + TQRegExp rxlen( "(\\d+)(?:\\s*)(cm|inch)" ); + int pos = rxlen.<a href="#search">search</a>( "Length: 189cm" ); + if ( pos > -1 ) { + <a href="ntqstring.html">TQString</a> value = rxlen.<a href="#cap">cap</a>( 1 ); // "189" + <a href="ntqstring.html">TQString</a> unit = rxlen.<a href="#cap">cap</a>( 2 ); // "cm" + // ... + } + </pre> + +<p> The order of elements matched by <a href="#cap">cap</a>() is as follows. 
The first +element, cap(0), is the entire matching string. Each subsequent +element corresponds to the next capturing left parenthesis. +Thus cap(1) is the text of the first capturing parentheses, cap(2) +is the text of the second, and so on. +<p> <a name="cap_in_a_loop"></a> +Some patterns may lead to a number of matches which cannot be +determined in advance, for example: +<p> <pre> + TQRegExp rx( "(\\d+)" ); + str = "Offsets: 12 14 99 231 7"; + <a href="ntqstringlist.html">TQStringList</a> list; + pos = 0; + while ( pos >= 0 ) { + pos = rx.<a href="#search">search</a>( str, pos ); + if ( pos > -1 ) { + list += rx.<a href="#cap">cap</a>( 1 ); + pos += rx.<a href="#matchedLength">matchedLength</a>(); + } + } + // list contains "12", "14", "99", "231", "7" + </pre> + +<p> <p>See also <a href="#capturedTexts">capturedTexts</a>(), <a href="#pos">pos</a>(), <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>(). + +<p>Examples: <a href="archivesearch-example.html#x479">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2485">regexptester/regexptester.cpp</a>. +<h3 class=fn><a href="ntqstringlist.html">TQStringList</a> <a name="capturedTexts"></a>TQRegExp::capturedTexts () +</h3> +Returns a list of the captured text strings. +<p> The first string in the list is the entire matched string. Each +subsequent list element contains a string that matched a +(capturing) subexpression of the regexp. +<p> For example: +<pre> + TQRegExp rx( "(\\d+)(\\s*)(cm|inch(es)?)" ); + int pos = rx.<a href="#search">search</a>( "Length: 36 inches" ); + <a href="ntqstringlist.html">TQStringList</a> list = rx.<a href="#capturedTexts">capturedTexts</a>(); + // list is now ( "36 inches", "36", " ", "inches", "es" ) + </pre> + +<p> The above example also captures elements that may be present but +which we have no interest in.
This problem can be solved by using +non-capturing parentheses: +<p> <pre> + TQRegExp rx( "(\\d+)(?:\\s*)(cm|inch(?:es)?)" ); + int pos = rx.<a href="#search">search</a>( "Length: 36 inches" ); + <a href="ntqstringlist.html">TQStringList</a> list = rx.<a href="#capturedTexts">capturedTexts</a>(); + // list is now ( "36 inches", "36", "inches" ) + </pre> + +<p> Note that if you want to iterate over the list, you should iterate +over a copy, e.g. +<pre> + <a href="ntqstringlist.html">TQStringList</a> list = rx.capturedTexts(); + TQStringList::Iterator it = list.<a href="ntqvaluelist.html#begin">begin</a>(); + while( it != list.<a href="ntqvaluelist.html#end">end</a>() ) { + myProcessing( *it ); + ++it; + } + </pre> + +<p> Some regexps can match an indeterminate number of times. For +example if the input string is "Offsets: 12 14 99 231 7" and the +regexp, <tt>rx</tt>, is <b>(\d+)+</b>, we would hope to get a list of +all the numbers matched. However, after calling +<tt>rx.search(str)</tt>, <a href="#capturedTexts">capturedTexts</a>() will return the list ( "12", +"12" ), i.e. the entire match was "12" and the first subexpression +matched was "12". The correct approach is to use <a href="#cap">cap</a>() in a <a href="#cap_in_a_loop">loop</a>. +<p> The order of elements in the string list is as follows. The first +element is the entire matching string. Each subsequent element +corresponds to the next capturing left parenthesis. Thus +capturedTexts()[1] is the text of the first capturing parentheses, +capturedTexts()[2] is the text of the second and so on +(corresponding to $1, $2, etc., in some other regexp languages). +<p> <p>See also <a href="#cap">cap</a>(), <a href="#pos">pos</a>(), <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>(). + +<h3 class=fn>bool <a name="caseSensitive"></a>TQRegExp::caseSensitive () const +</h3> +Returns TRUE if case sensitivity is enabled; otherwise returns +FALSE.
The default is TRUE. +<p> <p>See also <a href="#setCaseSensitive">setCaseSensitive</a>(). + +<h3 class=fn><a href="ntqstring.html">TQString</a> <a name="errorString"></a>TQRegExp::errorString () +</h3> +Returns a text string that explains why a regexp pattern is +invalid, if that is the case; otherwise returns "no error occurred". +<p> <p>See also <a href="#isValid">isValid</a>(). + +<p>Example: <a href="regexptester-example.html#x2486">regexptester/regexptester.cpp</a>. +<h3 class=fn><a href="ntqstring.html">TQString</a> <a name="escape"></a>TQRegExp::escape ( const <a href="ntqstring.html">TQString</a> & str )<tt> [static]</tt> +</h3> +Returns the string <em>str</em> with every regexp special character +escaped with a backslash. The special characters are $, (, ), *, +, +., ?, [, \, ], ^, {, | and }. +<p> Example: +<pre> + s1 = TQRegExp::<a href="#escape">escape</a>( "bingo" ); // s1 == "bingo" + s2 = TQRegExp::<a href="#escape">escape</a>( "f(x)" ); // s2 == "f\\(x\\)" + </pre> + +<p> This function is useful to construct regexp patterns dynamically: +<p> <pre> + TQRegExp rx( "(" + TQRegExp::escape(name) + + "|" + TQRegExp::escape(alias) + ")" ); + </pre> + + +<h3 class=fn>bool <a name="exactMatch"></a>TQRegExp::exactMatch ( const <a href="ntqstring.html">TQString</a> & str ) const +</h3> +Returns TRUE if <em>str</em> is matched exactly by this <a href="ntqregexp.html#regular-expression">regular expression</a>; otherwise returns FALSE. You can determine how much of +the string was matched by calling <a href="#matchedLength">matchedLength</a>(). +<p> For a given regexp string, R, <a href="#exactMatch">exactMatch</a>("R") is the equivalent of +<a href="#search">search</a>("^R$") since exactMatch() effectively encloses the regexp +in the start of string and end of string anchors, except that it +sets matchedLength() differently. +<p> For example, if the regular expression is <b>blue</b>, then +exactMatch() returns TRUE only for input <tt>blue</tt>.
For inputs <tt>bluebell</tt>, <tt>blutak</tt> and <tt>lightblue</tt>, exactMatch() returns FALSE +and matchedLength() will return 4, 3 and 0 respectively. +<p> Although const, this function sets matchedLength(), +<a href="#capturedTexts">capturedTexts</a>() and <a href="#pos">pos</a>(). +<p> <p>See also <a href="#search">search</a>(), <a href="#searchRev">searchRev</a>(), and <a href="qregexpvalidator.html">TQRegExpValidator</a>. + +<h3 class=fn>bool <a name="isEmpty"></a>TQRegExp::isEmpty () const +</h3> +Returns TRUE if the pattern string is empty; otherwise returns +FALSE. +<p> If you call <a href="#exactMatch">exactMatch</a>() with an empty pattern on an empty string +it will return TRUE; otherwise it returns FALSE since it operates +over the whole string. If you call <a href="#search">search</a>() with an empty pattern +on <em>any</em> string it will return the start offset (0 by default) +because the empty pattern matches the 'emptiness' at the start of +the string. In this case the length of the match returned by +<a href="#matchedLength">matchedLength</a>() will be 0. +<p> See <a href="ntqstring.html#isEmpty">TQString::isEmpty</a>(). + +<h3 class=fn>bool <a name="isValid"></a>TQRegExp::isValid () const +</h3> +Returns TRUE if the <a href="ntqregexp.html#regular-expression">regular expression</a> is valid; otherwise returns +FALSE. An invalid regular expression never matches. +<p> The pattern <b>[a-z</b> is an example of an invalid pattern, since +it lacks a closing square bracket. +<p> Note that the validity of a regexp may also depend on the setting +of the wildcard flag, for example <b>*.html</b> is a valid +wildcard regexp but an invalid full regexp. +<p> <p>See also <a href="#errorString">errorString</a>(). + +<p>Example: <a href="regexptester-example.html#x2487">regexptester/regexptester.cpp</a>. 
+<h3 class=fn>int <a name="match"></a>TQRegExp::match ( const <a href="ntqstring.html">TQString</a> & str, int index = 0, int * len = 0, bool indexIsStart = TRUE ) const +</h3> <b>This function is obsolete.</b> It is provided to keep old source working. We strongly advise against using it in new code. +<p> Attempts to match in <em>str</em>, starting from position <em>index</em>. +Returns the position of the match, or -1 if there was no match. +<p> The length of the match is stored in <em>*len</em>, unless <em>len</em> is a +null pointer. +<p> If <em>indexIsStart</em> is TRUE (the default), the position <em>index</em> in +the string will match the start of string anchor, <b>^</b>, in the +regexp, if present. Otherwise, position 0 in <em>str</em> will match. +<p> Use <a href="#search">search</a>() and <a href="#matchedLength">matchedLength</a>() instead of this function. +<p> <p>See also <a href="ntqstring.html#mid">TQString::mid</a>() and <a href="qconststring.html">TQConstString</a>. + +<p>Example: <a href="qmag-example.html#x1791">qmag/qmag.cpp</a>. +<h3 class=fn>int <a name="matchedLength"></a>TQRegExp::matchedLength () const +</h3> +Returns the length of the last matched string, or -1 if there was +no match. +<p> <p>See also <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>(). + +<p>Examples: <a href="archivesearch-example.html#x480">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2488">regexptester/regexptester.cpp</a>. +<h3 class=fn>bool <a name="minimal"></a>TQRegExp::minimal () const +</h3> +Returns TRUE if minimal (non-greedy) matching is enabled; +otherwise returns FALSE. +<p> <p>See also <a href="#setMinimal">setMinimal</a>(). + +<h3 class=fn>int <a name="numCaptures"></a>TQRegExp::numCaptures () const +</h3> +Returns the number of captures contained in the <a href="ntqregexp.html#regular-expression">regular expression</a>. 
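<p> What counts as a capture can be illustrated by scanning the pattern text. The following sketch is not TQRegExp's implementation; <tt>countCaptures()</tt> is a hypothetical helper that assumes a left parenthesis is capturing unless it is escaped, inside a character set, or followed by <b>?</b> (as in <b>(?:...)</b>, <b>(?=...)</b> and <b>(?!...)</b>). Its results agree with the <a href="#capturedTexts">capturedTexts</a>() examples on this page.

```cpp
#include <string>

// Hypothetical helper, not part of TQRegExp: count the capturing left
// parentheses in a full regexp pattern.
int countCaptures(const std::string& pattern)
{
    int count = 0;
    bool inSet = false;                       // inside [...] ?
    for (std::string::size_type i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\\') {                      // skip the escaped character
            ++i;
            continue;
        }
        if (inSet) {
            if (c == ']')
                inSet = false;
            continue;
        }
        if (c == '[') {
            inSet = true;
            continue;
        }
        // "(?" introduces a non-capturing group or a lookahead assertion.
        if (c == '(' && !(i + 1 < pattern.size() && pattern[i + 1] == '?'))
            ++count;
    }
    return count;
}
```

For the pattern <b>(\d+)(?:\s*)(cm|inch(?:es)?)</b> the sketch yields 2, which matches the two captured strings that follow the whole match in the corresponding capturedTexts() list.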
+ +<p>Example: <a href="regexptester-example.html#x2489">regexptester/regexptester.cpp</a>. +<h3 class=fn>bool <a name="operator!-eq"></a>TQRegExp::operator!= ( const <a href="ntqregexp.html">TQRegExp</a> & rx ) const +</h3> + +<p> Returns TRUE if this <a href="ntqregexp.html#regular-expression">regular expression</a> is not equal to <em>rx</em>; +otherwise returns FALSE. +<p> <p>See also <a href="#operator-eq-eq">operator==</a>(). + +<h3 class=fn><a href="ntqregexp.html">TQRegExp</a> & <a name="operator-eq"></a>TQRegExp::operator= ( const <a href="ntqregexp.html">TQRegExp</a> & rx ) +</h3> +Copies the <a href="ntqregexp.html#regular-expression">regular expression</a> <em>rx</em> and returns a reference to the +copy. The case sensitivity, wildcard and minimal matching options +are also copied. + +<h3 class=fn>bool <a name="operator-eq-eq"></a>TQRegExp::operator== ( const <a href="ntqregexp.html">TQRegExp</a> & rx ) const +</h3> +Returns TRUE if this <a href="ntqregexp.html#regular-expression">regular expression</a> is equal to <em>rx</em>; +otherwise returns FALSE. +<p> Two TQRegExp objects are equal if they have the same pattern +strings and the same settings for case sensitivity, wildcard and +minimal matching. + +<h3 class=fn><a href="ntqstring.html">TQString</a> <a name="pattern"></a>TQRegExp::pattern () const +</h3> +Returns the pattern string of the <a href="ntqregexp.html#regular-expression">regular expression</a>. The pattern +has either regular expression syntax or wildcard syntax, depending +on <a href="#wildcard">wildcard</a>(). +<p> <p>See also <a href="#setPattern">setPattern</a>(). + +<h3 class=fn>int <a name="pos"></a>TQRegExp::pos ( int nth = 0 ) +</h3> +Returns the position of the <em>nth</em> captured text in the searched +string. If <em>nth</em> is 0 (the default), <a href="#pos">pos</a>() returns the position +of the whole match. 
+<p> Example: +<pre> + TQRegExp rx( "/([a-z]+)/([a-z]+)" ); + rx.<a href="#search">search</a>( "Output /dev/null" ); // returns 7 (position of /dev/null) + rx.<a href="#pos">pos</a>( 0 ); // returns 7 (position of /dev/null) + rx.<a href="#pos">pos</a>( 1 ); // returns 8 (position of dev) + rx.<a href="#pos">pos</a>( 2 ); // returns 12 (position of null) + </pre> + +<p> For zero-length matches, pos() always returns -1. (For example, if +<a href="#cap">cap</a>(4) would return an empty string, pos(4) returns -1.) This is +due to an implementation tradeoff. +<p> <p>See also <a href="#capturedTexts">capturedTexts</a>(), <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>(). + +<h3 class=fn>int <a name="search"></a>TQRegExp::search ( const <a href="ntqstring.html">TQString</a> & str, int offset = 0, <a href="ntqregexp.html#CaretMode-enum">CaretMode</a> caretMode = CaretAtZero ) const +</h3> +Attempts to find a match in <em>str</em> from position <em>offset</em> (0 by +default). If <em>offset</em> is -1, the search starts at the last +character; if -2, at the next to last character; etc. +<p> Returns the position of the first match, or -1 if there was no +match. +<p> The <em>caretMode</em> parameter can be used to instruct whether <b>^</b> +should match at index 0 or at <em>offset</em>. +<p> You might prefer to use <a href="ntqstring.html#find">TQString::find</a>(), <a href="ntqstring.html#contains">TQString::contains</a>() or +even <a href="ntqstringlist.html#grep">TQStringList::grep</a>(). To replace matches use +<a href="ntqstring.html#replace">TQString::replace</a>(). 
+<p> Example: +<pre> + <a href="ntqstring.html">TQString</a> str = "offsets: 1.23 .50 71.00 6.00"; + TQRegExp rx( "\\d*\\.\\d+" ); // primitive floating point matching + int count = 0; + int pos = 0; + while ( (pos = rx.<a href="#search">search</a>(str, pos)) != -1 ) { + count++; + pos += rx.<a href="#matchedLength">matchedLength</a>(); + } + // pos will be 9, 14, 18 and finally 24; count will end up as 4 + </pre> + +<p> Although const, this function sets <a href="#matchedLength">matchedLength</a>(), +<a href="#capturedTexts">capturedTexts</a>() and <a href="#pos">pos</a>(). +<p> <p>See also <a href="#searchRev">searchRev</a>() and <a href="#exactMatch">exactMatch</a>(). + +<p>Examples: <a href="archivesearch-example.html#x481">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2490">regexptester/regexptester.cpp</a>. +<h3 class=fn>int <a name="searchRev"></a>TQRegExp::searchRev ( const <a href="ntqstring.html">TQString</a> & str, int offset = -1, <a href="ntqregexp.html#CaretMode-enum">CaretMode</a> caretMode = CaretAtZero ) const +</h3> +Attempts to find a match backwards in <em>str</em> from position <em>offset</em>. If <em>offset</em> is -1 (the default), the search starts at the +last character; if -2, at the next to last character; etc. +<p> Returns the position of the first match, or -1 if there was no +match. +<p> The <em>caretMode</em> parameter can be used to instruct whether <b>^</b> +should match at index 0 or at <em>offset</em>. +<p> Although const, this function sets <a href="#matchedLength">matchedLength</a>(), +<a href="#capturedTexts">capturedTexts</a>() and <a href="#pos">pos</a>(). +<p> <b>Warning:</b> Searching backwards is much slower than searching +forwards. +<p> <p>See also <a href="#search">search</a>() and <a href="#exactMatch">exactMatch</a>(). + +<h3 class=fn>void <a name="setCaseSensitive"></a>TQRegExp::setCaseSensitive ( bool sensitive ) +</h3> +Sets case sensitive matching to <em>sensitive</em>. 
+<p> If <em>sensitive</em> is TRUE, <b>\.txt$</b> matches <tt>readme.txt</tt> but +not <tt>README.TXT</tt>. +<p> <p>See also <a href="#caseSensitive">caseSensitive</a>(). + +<p>Example: <a href="regexptester-example.html#x2491">regexptester/regexptester.cpp</a>. +<h3 class=fn>void <a name="setMinimal"></a>TQRegExp::setMinimal ( bool minimal ) +</h3> +Enables or disables minimal matching. If <em>minimal</em> is FALSE, +matching is greedy (maximal) which is the default. +<p> For example, suppose we have the input string "We must be +<b>bold</b>, very <b>bold</b>!" and the pattern +<b><b>.*</b></b>. With the default greedy (maximal) matching, +the match is "We must be <u><b>bold</b>, very +<b>bold</b></u>!". But with minimal (non-greedy) matching the +first match is: "We must be <u><b>bold</b></u>, very +<b>bold</b>!" and the second match is "We must be <b>bold</b>, +very <u><b>bold</b></u>!". In practice we might use the pattern +<b><b>[^<]+</b></b> instead, although this will still fail for +nested tags. +<p> <p>See also <a href="#minimal">minimal</a>(). + +<p>Examples: <a href="archivesearch-example.html#x482">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2492">regexptester/regexptester.cpp</a>. +<h3 class=fn>void <a name="setPattern"></a>TQRegExp::setPattern ( const <a href="ntqstring.html">TQString</a> & pattern ) +</h3> +Sets the pattern string to <em>pattern</em>. The case sensitivity, +wildcard and minimal matching options are not changed. +<p> <p>See also <a href="#pattern">pattern</a>(). + +<h3 class=fn>void <a name="setWildcard"></a>TQRegExp::setWildcard ( bool wildcard ) +</h3> +Sets the wildcard mode for the <a href="ntqregexp.html#regular-expression">regular expression</a>. The default is +FALSE. +<p> Setting <em>wildcard</em> to TRUE enables simple shell-like wildcard +matching. (See <a href="#wildcard-matching">wildcard matching + (globbing)</a>.) 
+<p> For example, <b>r*.txt</b> matches the string <tt>readme.txt</tt> in +wildcard mode, but does not match <tt>readme</tt>. +<p> <p>See also <a href="#wildcard">wildcard</a>(). + +<p>Example: <a href="regexptester-example.html#x2493">regexptester/regexptester.cpp</a>. +<h3 class=fn>bool <a name="wildcard"></a>TQRegExp::wildcard () const +</h3> +Returns TRUE if wildcard mode is enabled; otherwise returns FALSE. +The default is FALSE. +<p> <p>See also <a href="#setWildcard">setWildcard</a>(). + +<!-- eof --> +<hr><p> +This file is part of the <a href="index.html">TQt toolkit</a>. +Copyright © 1995-2007 +<a href="http://www.trolltech.com/">Trolltech</a>. All Rights Reserved.<p><address><hr><div align=center> +<table width=100% cellspacing=0 border=0><tr> +<td>Copyright © 2007 +<a href="troll.html">Trolltech</a><td align=center><a href="trademarks.html">Trademarks</a> +<td align=right><div align=right>TQt 3.3.8</div> +</table></div></address></body> +</html> |