<br>Oh, the class prints the file names in its Print method:<br><br><span style="font-family: courier new,monospace;">nameGenerator-&gt;Print(std::cout, 0);</span><br style="font-family: courier new,monospace;"><br>A function to print only the file names (based on the test code):<br>

<br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">void printFileNames(std::vector&lt;std::string&gt; fileNames)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    {</span><br style="font-family: courier new,monospace;">

<span style="font-family: courier new,monospace;">    std::vector&lt;std::string&gt;::iterator nameIdx;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">    std::cout &lt;&lt; &quot;File names --------&quot; &lt;&lt; std::endl;</span><br style="font-family: courier new,monospace;">

<span style="font-family: courier new,monospace;">    for (nameIdx = fileNames.begin();</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        nameIdx != fileNames.end();</span><br style="font-family: courier new,monospace;">

<span style="font-family: courier new,monospace;">        nameIdx++)</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        {</span><br style="font-family: courier new,monospace;">

<span style="font-family: courier new,monospace;">        std::cout &lt;&lt; &quot;File: &quot; &lt;&lt; (*nameIdx).c_str() &lt;&lt; std::endl;</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">        }</span><br style="font-family: courier new,monospace;">

<span style="font-family: courier new,monospace;">    } </span><br style="font-family: courier new,monospace;"><br>Called as a regular function like:<br><br><span style="font-family: courier new,monospace;">printFileNames(nameGenerator-&gt;GetFileNames());</span><br style="font-family: courier new,monospace;">

<br>(BTW, why does the test code use c_str() instead of streaming the string directly, as in<span style="font-family: courier new,monospace;"> std::cout &lt;&lt; &quot;File: &quot; &lt;&lt; *nameIdx &lt;&lt; std::endl;</span>?)<br style="font-family: courier new,monospace;">

<br>As a method implementation, the call could simplify to something like:<br><br><span style="font-family: courier new,monospace;">nameGenerator-&gt;</span><span style="font-family: courier new,monospace;">PrintFileNames();</span><span style="font-family: courier new,monospace;"><br>

</span><br>That is, it would take no input args and extract the file names within the method implementation (as the Print method does).<br><br>Take care,<br>Darren<br><br><br><br><div class="gmail_quote">On Wed, Jun 3, 2009 at 11:59 AM, Darren Weber <span dir="ltr">&lt;<a href="mailto:darren.weber.lists@gmail.com">darren.weber.lists@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>Thanks, Bill.<br><br>It&#39;s not clear from the header what the regex engine is.  Is it a custom regex or does it include a common regex library? e.g.:<br>

<a href="http://www.gnu.org/s/libc/manual/html_node/Pattern-Matching.html" target="_blank">http://www.gnu.org/s/libc/manual/html_node/Pattern-Matching.html</a><br>

<a href="http://www.gnu.org/s/libc/manual/html_node/Regular-Expressions.html" target="_blank">http://www.gnu.org/s/libc/manual/html_node/Regular-Expressions.html</a><br><a href="http://www.pcre.org/" target="_blank">http://www.pcre.org/</a><br>

<br>If the class uses a regex library, the documentation could point to online resources that define the regex language.  Unfortunately, there are subtle differences among regex libraries and it can be difficult to debug a regex without extended documentation and examples.<br>


<br><a href="http://www.regular-expressions.info/reference.html" target="_blank">http://www.regular-expressions.info/reference.html</a><br><a href="http://www.regular-expressions.info/refadv.html" target="_blank">http://www.regular-expressions.info/refadv.html</a><br>


<a href="http://www.regular-expressions.info/refext.html" target="_blank">http://www.regular-expressions.info/refext.html</a><br><a href="http://www.regular-expressions.info/refflavors.html" target="_blank">http://www.regular-expressions.info/refflavors.html</a><br>


<br>Now I understand that the subMatch is a component of the regex.  (Why didn&#39;t I understand that from the description?)  The header comment makes it clear that this is<br><br><span style="font-family: courier new,monospace;">/** The index of the submatch that will be used to sort the matches. */</span><br style="font-family: courier new,monospace;">


<br>May I suggest that this phrase be included in the class description?<br><br>So the sub-regex must be defined within the regex using the () notation.  For example, the test contains:<br><br><span style="font-family: courier new,monospace;">  fit-&gt;SetRegularExpression(&quot;[^.]*.(.*)&quot;);</span><br style="font-family: courier new,monospace;">


<span style="font-family: courier new,monospace;">  fit-&gt;SetSubMatch(1);</span><br><br>It appears that the regex engine doesn&#39;t require escapes for [] and ().  In this example, when the SetSubMatch method is called with the numeric argument, it refers to the sub-regex pattern within &quot;(.*)&quot;, which looks like it might be the filename extension (bmp, gif, png, tif, etc.).  However, the prior . is not escaped, so it&#39;s unclear whether it matches any character (&#39;.&#39;) or a period char (&#39;\.&#39; in most regex engines).<br>
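<br>The difference between the unescaped and the escaped dot can be tried outside ITK. The sketch below uses std::regex (ECMAScript grammar) purely as a stand-in; it is not the engine the class uses, and regex dialects differ, so the results are only indicative. firstSubMatch is a hypothetical helper.<br>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns sub-match 1 of the first match of `pattern`
// in `name`, or "<no match>" when the pattern does not match at all.
std::string firstSubMatch(const std::string& pattern, const std::string& name)
{
  const std::regex re(pattern);
  std::smatch m;
  if (!std::regex_search(name, m, re))
  {
    return "<no match>";
  }
  return m[1].str();
}
```

With this stand-in, the test pattern applied to &quot;image01.png&quot; gives &quot;png&quot; as sub 1, but it also matches a name with no period at all, such as &quot;README&quot; (with an empty sub 1), because the unescaped dot accepts any char. With the dot escaped (written \\. in a C++ string literal), &quot;README&quot; no longer matches.<br>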


<br>So the SetSubMatch method always takes a numeric argument, the index of the sub-regex (starting at 1, not 0).  For example, the following is intended to exclude any files that begin with any number of &#39;.&#39; chars and then split the file name into two sub-regex patterns that capture the file name and the file extension (assuming the file has only one &#39;.&#39; char in it to separate these parts of the full file name).  The period char &#39;.&#39; is not part of either sub-regex (unless the full file name has more than one).<br>


<br><span style="font-family: courier new,monospace;">  fit-&gt;SetRegularExpression(&quot;[^.]*(.*)\\.(.*)$&quot;);</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">  fit-&gt;SetSubMatch(2);</span><br>

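<br><span style="font-family: courier new,monospace;">- [^.]*</span> matches zero or more chars other than &#39;.&#39; (within [ ] the &#39;.&#39; is literal and needs no escape, and the leading ^ negates the set), so it skips a leading run of non-period chars rather than leading &#39;.&#39; chars.<br><span style="font-family: courier new,monospace;">- (.*)\.(.*)$ </span>matches patterns like &quot;abcdef.xyz&quot; at the end of a string; taken on its own, sub 1 is &quot;abcdef&quot; and sub 2 is &quot;xyz&quot;.  With a greedy [^.]* in front, a typical backtracking engine lets that prefix consume &quot;abcdef&quot; first, which leaves sub 1 empty.<br>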


<br>In this pattern, the second subexpression should be the file extension, without a &#39;.&#39; char.  (Although the effect of this regex may depend on how greedy the .* pattern is.)  The &#39;\.&#39; prior to the second sub-expression is used to escape the usual meaning of the &#39;.&#39; char to match any char, so that the file name can be split into the file name and its extension (assuming the full file name has only one &#39;.&#39; char in it).<br>
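<br>How greedy matching distributes the text between the two sub-expressions can be checked with std::regex as a stand-in (ECMAScript grammar; this is not the engine the class uses, so the results are only indicative). subMatches is a hypothetical helper, and the backslash is doubled because the pattern is passed as a C++ string literal.<br>

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: returns (sub 1, sub 2) for the first match of `pattern`
// in `name`; both strings are empty when there is no match or when the
// pattern has fewer than two sub-expressions.
std::pair<std::string, std::string> subMatches(const std::string& pattern,
                                               const std::string& name)
{
  const std::regex re(pattern);
  std::smatch m;
  if (!std::regex_search(name, m, re) || m.size() < 3)
  {
    return std::make_pair(std::string(), std::string());
  }
  return std::make_pair(m[1].str(), m[2].str());
}
```

With this stand-in, the pattern &quot;(.*)\\.(.*)$&quot; splits &quot;abcdef.xyz&quot; into &quot;abcdef&quot; and &quot;xyz&quot;. Adding the leading [^.]* changes the result: that prefix consumes &quot;abcdef&quot;, so sub 1 is empty and sub 2 is &quot;xyz&quot;. For &quot;a.b.c&quot; the prefixed pattern gives &quot;.b&quot; and &quot;c&quot;.<br>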


<br>May I suggest a couple of features?<br><br>First, the SetSubMatch method could take an array of arguments.  It could be possible to sort on more than one sub-regex, with the sort precedence based on the values in the array.  In the example above, a call like the following would sort first by the file extension, then by the file name.<br>


<br><span style="font-family: courier new,monospace;">  unsigned int sub[2] = {2, 1};</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">  fit-&gt;SetSubMatch(sub);</span><br>
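<br>A multi-key sort of this kind can be sketched outside ITK. The code below only illustrates the idea; it is not ITK&#39;s implementation or API. sortBySubMatches is a hypothetical free function, std::regex (ECMAScript grammar) stands in for the class&#39;s regex engine, and names that do not match are simply dropped.<br>

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: keep the names that match `pattern` and sort them by
// the listed sub-match indices, earlier indices taking precedence.
std::vector<std::string> sortBySubMatches(const std::vector<std::string>& names,
                                          const std::string& pattern,
                                          const std::vector<std::size_t>& subs)
{
  const std::regex re(pattern);
  // Each entry pairs the sort key (one string per requested index) with the name.
  std::vector<std::pair<std::vector<std::string>, std::string> > keyed;
  for (const std::string& name : names)
  {
    std::smatch m;
    if (!std::regex_search(name, m, re))
    {
      continue;  // non-matching names are skipped
    }
    std::vector<std::string> key;
    for (std::size_t idx : subs)
    {
      // An index beyond the pattern's sub-expressions contributes an empty key.
      key.push_back(idx < m.size() ? m[idx].str() : std::string());
    }
    keyed.push_back(std::make_pair(key, name));
  }
  // Pairs and vectors compare lexicographically, which yields the precedence order.
  std::sort(keyed.begin(), keyed.end());
  std::vector<std::string> sorted;
  for (const auto& entry : keyed)
  {
    sorted.push_back(entry.second);
  }
  return sorted;
}
```

Called with the pattern &quot;(.*)\\.(.*)$&quot; and the indices {2, 1}, this orders names by extension first and by base name second. The keys are compared as strings, so a numeric ordering would need a different comparison.<br>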


<br>Second, the class might include a convenience method for debugging, to print the file names.  The method might adapt some of the code in the test.  Perhaps call it PrintFileNames, PrintFileNamesSortedAlpha, PrintFileNamesSortedNumeric.<br>


<br>Thanks again!<br><br>Take care,<br><font color="#888888">Darren</font><div><div></div><div class="h5"><br><br><br><br><br><div class="gmail_quote">On Wed, Jun 3, 2009 at 5:06 AM, Bill Lorensen <span dir="ltr">&lt;<a href="mailto:bill.lorensen@gmail.com" target="_blank">bill.lorensen@gmail.com</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">The test Testing/Code/IO/itkRegularExpressionSeriesFileNamesTest.cxx<br>

shows how to sort and print the results.<br>

<div><div></div><div><br>

On Wed, Jun 3, 2009 at 12:59 AM, Darren Weber<br>

&lt;<a href="mailto:darren.weber.lists@gmail.com" target="_blank">darren.weber.lists@gmail.com</a>&gt; wrote:<br>

&gt;<br>

&gt; The software guide and Examples/IO/ImageSeriesReadWrite2.cxx provide some<br>

&gt; explanation of how to work with itk::RegularExpressionSeriesFileNames.<br>

&gt;<br>

&gt; However, I have not been able to find information on how to specify the<br>

&gt; regex and sort command line arguments for it.  The file name regex might be<br>

&gt; compatible with grep or sed, or some other regex engine?  What is the sort<br>

&gt; input, is it another regex?<br>

&gt;<br>

&gt; What is the best way to debug the regex and sort inputs?  What is the<br>

&gt; easiest way to get a std::cout list of the files after they are found and<br>

&gt; sorted?<br>

&gt;<br>

&gt; Thanks in advance,<br>

&gt; Darren<br>

&gt;<br>

&gt;<br>

&gt; PS,<br>

&gt;<br>

&gt; Detailed Description<br>

&gt;<br>

&gt; Generate an ordered sequence of filenames that match a regular expression.<br>

&gt;<br>

&gt; This class generates an ordered sequence of files whose filenames match a<br>

&gt; regular expression.  [What is the regex library?]  The file names are sorted<br>

&gt; using a sub expression match selected by SubMatch.  [What does this mean?]<br>

&gt; Regular expressions are a powerful, compact mechanism for parsing strings.<br>

&gt; Expressions consist of the following metacharacters:<br>

&gt;<br>

&gt; ^ Matches at beginning of a line<br>

&gt;<br>

&gt; $ Matches at end of a line<br>

&gt;<br>

&gt; . Matches any single character<br>

&gt;<br>

&gt; [ ] Matches any character(s) inside the brackets<br>

&gt;<br>

&gt; [^ ] Matches any character(s) not inside the brackets<br>

&gt;<br>

&gt; - Matches any character in range on either side of a dash<br>

&gt;<br>

&gt; * Matches preceding pattern zero or more times<br>

&gt;<br>

&gt; + Matches preceding pattern one or more times<br>

&gt;<br>

&gt; ? Matches preceding pattern zero or once only<br>

&gt;<br>

&gt; () Saves a matched expression and uses it in a later match<br>

&gt;<br>

&gt; Note that more than one of these metacharacters can be used in a single<br>

&gt; regular expression in order to create complex search patterns. For example,<br>

&gt; the pattern [^ab1-9] says to match any character sequence that does not<br>

&gt; begin with the characters &quot;ab&quot; followed by numbers in the series one through<br>

&gt; nine.<br>

&gt;<br>

&gt; Definition at line 72 of file itkRegularExpressionSeriesFileNames.h.<br>

&gt;<br>

</div></div>

</blockquote></div><br>

</div></div></blockquote></div><br>