Welcome to Professional ASP.NET - Chris Love's Official Blog

Chris Love's Official ASP.NET Blog

Chris Love's Helpful tips, tricks and pragmatic development knowledge for the ASP.NET world.


More Fun with Regular Expressions : Word and Paragraph Parsing

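Trolling the ASP.NET forums again this morning (I know, I do it a lot), I found a question about parsing the paragraphs out of a block of text, so I knew I had to answer it. The regular expression needed is '(.+)'. This tells the Regex object to match a run of one or more characters of any kind except a line feed. Because a match stops at the end of each line, every match is one paragraph, as long as the paragraphs are separated by a line feed or a carriage return and line feed pair. One caveat: with Windows-style line endings the '.' also matches the carriage return, so a match can end with a trailing '\r' that you may want to trim. Code for this solution would look like this: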

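// Requires: using System.IO; using System.Text.RegularExpressions;
public static MatchCollection GetParagraphs()
{
    using (StreamReader sr = new StreamReader(@"{Path To Sample File}\SampleText.txt"))
    {
        // Read the whole file so the expression can see every line.
        string textFromFile = sr.ReadToEnd();

        // '.' matches anything except a line feed, so each match is one line.
        Regex rg = new Regex(@"(.+)");

        return rg.Matches(textFromFile);
    }
}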
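To see how '.+' behaves without needing a file on disk, here is a minimal sketch that runs the same expression over an in-memory string. The class name ParagraphSketch and the helper SplitParagraphs are my own names for illustration, not part of the forum answer. The sketch assumes you want the trailing carriage return trimmed and blank lines skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParagraphSketch
{
    // Hypothetical helper: returns each non-empty line of the input as a paragraph.
    public static List<string> SplitParagraphs(string text)
    {
        var paragraphs = new List<string>();

        // '.' matches any character except '\n', so each match is one line.
        foreach (Match m in Regex.Matches(text, @".+"))
        {
            // With "\r\n" line endings the match can end in '\r'; trim it off.
            string paragraph = m.Value.TrimEnd('\r');

            // A blank line in a "\r\n" file matches as a lone "\r"; skip it.
            if (paragraph.Length > 0)
            {
                paragraphs.Add(paragraph);
            }
        }

        return paragraphs;
    }
}
```

Calling ParagraphSketch.SplitParagraphs("First paragraph.\r\n\r\nSecond paragraph.\nThird.") should give three entries, with the blank line in the middle dropped.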

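I thought I would extend this to get a word count as well as all the words. In this case the expression is '(\w+)', which matches a run of one or more word characters (letters, digits and the underscore). Note that punctuation inside a word ends the match, so "don't" is counted as two words.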

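// Requires: using System.IO; using System.Text.RegularExpressions;
public static MatchCollection GetWords()
{
    using (StreamReader sr = new StreamReader(@"{Path To Sample File}\SampleText.txt"))
    {
        // Read the whole file into a single string.
        string textFromFile = sr.ReadToEnd();

        // '\w+' matches a run of letters, digits or underscores: one word per match.
        Regex rg = new Regex(@"(\w+)");

        return rg.Matches(textFromFile);
    }
}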
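If apostrophes matter for your word count, one possible tweak is to allow an apostrophe followed by more word characters inside a match. The sketch below compares the two patterns on a sample string. WordCountSketch, CountWords and the alternative pattern are my own illustration, not something from the original forum question.

```csharp
using System.Text.RegularExpressions;

public static class WordCountSketch
{
    // The expression from the post: runs of letters, digits or underscores.
    public const string SimpleWords = @"\w+";

    // Assumed alternative: also keeps contractions such as "don't" together.
    public const string WordsWithApostrophes = @"\w+(?:'\w+)*";

    // Hypothetical helper: counts the matches of a pattern in the text.
    public static int CountWords(string text, string pattern)
    {
        return Regex.Matches(text, pattern).Count;
    }
}
```

For the text "Don't panic, it's 2009!" the simple pattern finds six matches (Don, t, panic, it, s, 2009), while the alternative finds four (Don't, panic, it's, 2009).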

Calling the Regex.Matches method returns a MatchCollection. Its Count property gives the number of matches, and the collection can be enumerated to get the actual matches.

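// Requires: using System; using System.Text.RegularExpressions;
public static void WriteMatchCollectionResults(MatchCollection mc)
{
    // Number of matches found.
    Console.WriteLine(mc.Count);

    // The text of each individual match.
    foreach (Match m in mc)
    {
        Console.WriteLine(m.Value);
    }

    // Separator between result sets.
    Console.WriteLine("...........................................");
    Console.WriteLine();
}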
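If you would rather work with the matched text as a plain list than write it to the console, a small LINQ sketch along these lines should do it. MatchListSketch and ToValues are hypothetical names of mine. The Cast call is there on the assumption that you may be on an older framework where MatchCollection is only a non-generic collection.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchListSketch
{
    // Hypothetical helper: copies the matched text out of a MatchCollection.
    public static List<string> ToValues(MatchCollection mc)
    {
        // Cast<Match>() turns the collection into a typed sequence so that
        // Select can read each match's Value.
        return mc.Cast<Match>().Select(m => m.Value).ToList();
    }
}
```

For example, MatchListSketch.ToValues(Regex.Matches("one two  three", @"\w+")) should return a list holding "one", "two" and "three".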
Posted: Tuesday, October 13, 2009 12:43 PM

by Chris Love

Comments

Dan (ASP.NET conveyancing software developer) said:

Good Stuff. It seems that regular expressions are very powerful. My first real use of them in anger was when trying to match for URL re-writing. A nice big library of "useful" examples would be very helpful. They are really quite difficult to get your head around.
# October 23, 2009 9:37 AM
