Tuesday, June 10, 2008

Regex: replace multiple strings in a single pass with C#


I wish I could say I was the clever one to think of this, but I ran into it in my copy of the Python Cookbook (the original author is Xavier Defrang; the Python implementation is here). It's cool enough that I ported it today - I know I'll use the C# implementation quite often:

        static string MultipleReplace(string text, Dictionary<string, string> replacements) {
            // Join the keys into one alternation and look up each match's replacement.
            return Regex.Replace(text,
                "(" + String.Join("|", replacements.Keys.ToArray()) + ")",
                delegate(Match m) { return replacements[m.Value]; });
        }

        // somewhere else in code
        string temp = "Jonathan Smith is a developer";
        Dictionary<string, string> adict = new Dictionary<string, string>();
        adict.Add("Jonathan", "David");
        adict.Add("Smith", "Seruyange");
        string rep = MultipleReplace(temp, adict);
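
Because the keys are joined straight into a regex, a key containing a metacharacter (such as "C++" or "a.b") would be read as a pattern rather than as plain text. The following is a minimal sketch of a literal-only variant, not part of the original port: the class name MultiReplaceSketch and the method name ReplaceLiterals are made up for illustration, and it assumes every key is a non-empty literal string. It escapes each key with Regex.Escape and tries longer keys first so that a key which is a prefix of another does not win by accident.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class MultiReplaceSketch
{
    // Hypothetical variant of MultipleReplace: keys are treated as literal text.
    public static string ReplaceLiterals(string text, Dictionary<string, string> replacements)
    {
        // An empty alternation would match the empty string everywhere.
        if (replacements.Count == 0) return text;

        // Escape regex metacharacters; put longer keys first in the alternation.
        string pattern = String.Join("|",
            replacements.Keys
                .OrderByDescending(k => k.Length)
                .Select(k => Regex.Escape(k))
                .ToArray());

        // m.Value is the matched literal, which is exactly a dictionary key.
        return Regex.Replace(text, pattern, m => replacements[m.Value]);
    }
}
```

With the dictionary from the example above, ReplaceLiterals(temp, adict) gives "David Seruyange is a developer", the same result as MultipleReplace, since neither key contains a metacharacter.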



Chris Partridge said...

This is great, thank you!

Fabian said...

Hi! Your code works well with simple replacements but will fail with complex patterns, because the dictionary is keyed by the pattern while the lookup uses the matched string from the input.

SeeNotSoSharp said...

I get a KeyNotFoundException for the following:

private string ApplyReplacements(string text)
{
    Dictionary<string, string> replacements = new Dictionary<string, string>();

    string link = @"http[s]?[^\s]+";
    string reTweet = @"^RT[\s][^\s]+[\s]";
    string viaUser = @"via @[^\s]+[\s]";
    string amperAnd = @"&";
    string hashtagsAtTheEnd = @"((#[^\s]+)\s+){2,12}.{0,7}$";
    string multipleDots = @"(…|[.]{2,})";
    string tooManyAmperands = ".+(amp;){2,50}.*";
    string multipleSpaces = @"\s{2,20}";

    replacements.Add(link, string.Empty);
    replacements.Add(reTweet, string.Empty);
    replacements.Add(viaUser, string.Empty);
    replacements.Add(amperAnd, " und ");
    replacements.Add(hashtagsAtTheEnd, string.Empty);
    replacements.Add(multipleDots, string.Empty);
    replacements.Add(tooManyAmperands, string.Empty);
    replacements.Add(multipleSpaces, " ");

    return MultipleReplace(text, replacements);
}

private void button2_Click(object sender, EventArgs e)
{
    reOutput.Text = ApplyReplacements(reOutput.Text);
}
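
The KeyNotFoundException comes from the lookup replacements[m.Value]: m.Value is the text that matched (an actual URL, say), while the dictionary is keyed by the pattern string. One possible way around this, sketched here under assumptions rather than offered as the post's method, is to wrap each pattern in its own named group and check which group succeeded. The names PatternReplaceSketch and ReplacePatterns and the "r0", "r1", ... group names are invented for this sketch. It takes an ordered list instead of a Dictionary because earlier alternatives win when two patterns match at the same position, and Dictionary does not guarantee enumeration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class PatternReplaceSketch
{
    // Hypothetical helper: each rule is (regex pattern, replacement text).
    public static string ReplacePatterns(string text, IList<KeyValuePair<string, string>> rules)
    {
        if (rules.Count == 0) return text;

        // Wrap pattern i in a named group "r<i>" so the evaluator can tell
        // which rule produced the match.
        string combined = String.Join("|",
            rules.Select((rule, i) => "(?<r" + i + ">" + rule.Key + ")").ToArray());

        return Regex.Replace(text, combined, m =>
        {
            for (int i = 0; i < rules.Count; i++)
            {
                if (m.Groups["r" + i].Success) return rules[i].Value;
            }
            return m.Value; // not expected: one alternative always matches
        });
    }
}
```

Two caveats apply. Patterns that use numbered backreferences or that already define groups named "r0", "r1", and so on would need adjusting, since all patterns share one regex. And because this is a single pass, text consumed by one rule is never seen by another: spaces left behind after a link is removed are not collapsed by a later whitespace rule, unlike running the replacements one after the other.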