Tag Archives: regular expressions

Importacular and Regular Expression Transformations

Some Preamble

Regular Expression are really complicated. Even now I find it difficult to get my head around them. If you are new to them check out these two sites:

https://www.regular-expressions.info/ – a great tutorial and reference

https://regexr.com/ – a really good “playground” for testing your regular expressions

Overview

Importacular offers the user the ability to transform incoming data from one value into another. When we first started out this was simply a “from” value and a “to” value and if the incoming value matched the “from” it would change it to the “to”. That was very simple but effective. We soon realised that more power was needed so Importacular added partial matches or word matches (and clarified that the original was an “exact” match).

We also then added different replacement types too. These were “Complete” and “Partial” and later “Append” and “Prepend. If you selected “Complete” then all of the incoming value was replaced with the replacement value. If you selected “Partial” then only the matched part would be replaced keeping the remainder of the original value. “Append” and “Prepend” would add the replacement text to the end or the beginning of the original respectively.

Then we added RegEx – firstly for matching and then for replacing. The rest of this post describes how that works.

Matching

Importacular loops through each row in the data transformation grid and continues through each row unless the stop processing flag has been set.

If you choose a match type of RegEx you can put your RegEx in the “From Source” cell and Importacular will try and match on it. For example if you use this very simple RegEx:

.*

Importacular will match on any number of any character i.e. it will always match on what is found.

If you use this RegEx:

B[a-z]g

It will match on “Big”, “Bog”, “Bag” and also “Bkg” (as well as every letter from a to z).

If it finds a match it will try to replace the value

Replacing

When you replace using RegEx there are two thing to note.

  1. It does not matter how the match was made. It could be a RegEx, a complete, a partial or a word match. Replace is independent of how the match was made.
  2. Importacular does not use the classic replace mechanism of RegEx i.e. create a capture group often using parenthesis or sometime slash and parenthesis and then reference that group with a dollar e.g. $1 or $2. Importacular does not use this method!

Importacular’s replace works like this. It takes the incoming value and applies the regular expression to it in order to extract a value. That value is then used as the replacement text. For example if the incoming value is:

2022 Annual Appeal

We can extract the year by using the regular expression:

^20[0-9][0-9]

(Note that there are a number of different ways you could get the same information out using a RegEx. This is just one of them)

Say I have a US phone number and I want to get the area code. The phone number is in two different formats e.g. (415)-123-456 or 415-123-456. I can extract the area code using the following:

(?<=()[0-9]{3}|^[0-9]{3}

If I want to be really clever, I can use a second row in my transformation to transform the area code into the city. In this case after extracting “415” I would transform it to San Francisco.

Conclusion

The hardest part of using regular expressions in Importacular really is the regular expression itself. I won’t try to convince you otherwise. Hopefully this post will make it easier to use those regular expressions once you have determined what you need. Use the RegExr site (link at the top of this post) to test your matching and replace extraction before you put it into the transformation grid. Once in the transformation grid you can also check the review screen to see if it has worked as the resulting value will show up there transformed if everything has worked as expected.