September 11, 2019
Regular Expression Anchors with multiple lines

In CFML, you can use anchors to match the start and/or end of the string you are testing against. To match the start of the string, use ^; to match the end of the string, use $.

Here’s an example of using the ^ anchor to match the first word in a string:

s = "Apple line 1 1000";
words = s.reMatchNoCase("[a-z]+");
firstword = s.reMatchNoCase("^[a-z]+");

writeDump(words); // ["Apple","line"]
writeDump(firstword); // ["Apple"]

Here’s an example of using the $ anchor to match the last integer in a string:

s = "Apple line 1 1000";
integers = s.reMatchNoCase("[0-9]+");
lastinteger = s.reMatchNoCase("[0-9]+$");

writeDump(integers); // [1,1000]
writeDump(lastinteger); // [1000]
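
The two anchors can also be combined so that the pattern has to account for the entire string. Here is a minimal sketch using the same reMatchNoCase member function as above; the variable names are only for illustration:

```cfml
// Sketch: with both ^ and $ the pattern must span the whole string.
word = "Apple";
sentence = "Apple line 1 1000";

// The whole string is letters, so it matches.
writeDump(word.reMatchNoCase("^[a-z]+$")); // ["Apple"]

// Spaces and digits sit between the start and the end, so nothing matches.
writeDump(sentence.reMatchNoCase("^[a-z]+$")); // []
```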

Useful stuff, but what happens if you have a multi-line string?

s = "Apple line 1 1000
Banana line 2 2000";
words = s.reMatchNoCase("[a-z]+");
firstword = s.reMatchNoCase("^[a-z]+");

writeDump(words); // ["Apple","line","Banana","line"]
writeDump(firstword); // ["Apple"]

The pattern still matches against the whole string, just as before, which is what you’d expect. But what if you wanted the first word from each line? Enter multi-line mode!

By adding the (?m) flag to the start of the regular expression, we tell it to treat ^ and $ as the start and end of each line rather than of the whole string.

s = "Apple line 1 1000
Banana line 2 2000";
words = s.reMatchNoCase("[a-z]+");
firstwords = s.reMatchNoCase("(?m)^[a-z]+");

writeDump(words); // ["Apple","line","Banana","line"]
writeDump(firstwords); // ["Apple","Banana"]

Now firstwords is an array with two elements: the words at the start of each line.
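
For comparison, here is a rough sketch of getting the same result without the flag, by splitting the string into lines and applying the plain ^ anchor to each one. It assumes the lines are separated by a linefeed, chr(10), and builds the string that way so the result does not depend on how the source file is saved:

```cfml
// Sketch: emulate multi-line matching by testing each line separately.
s = "Apple line 1 1000" & chr(10) & "Banana line 2 2000";
firstwords = [];

// Split on linefeeds, then anchor the pattern to the start of each line.
for (line in listToArray(s, chr(10))) {
    // reMatchNoCase returns an array; the second argument merges it in
    // instead of appending it as a nested array.
    firstwords.append(line.reMatchNoCase("^[a-z]+"), true);
}

writeDump(firstwords); // ["Apple","Banana"]
```

The (?m) flag gets the same result in a single call.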

We can use the same trick to match the end of each line.

s = "Apple line 1 1000
Banana line 2 2000";
integers = s.reMatchNoCase("[0-9]+");
lastintegers = s.reMatchNoCase("(?m)[0-9]+$");

writeDump(integers); // [1,1000,2,2000]
writeDump(lastintegers); // [1000,2000]
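
Both anchors can be used together in multi-line mode to pick out whole lines that fit a given shape. The sketch below is an illustration only: the third line and the variable name matchingLines are made up for the example, and the string is built with chr(10) on the assumption that lines are separated by a linefeed.

```cfml
// Sketch: in multi-line mode, ^ ... $ must cover one complete line.
s = "Apple line 1 1000" & chr(10)
  & "Banana line 2 2000" & chr(10)
  & "Cherry 3000";

// Keep only lines shaped like: word, the word "line", integer, integer.
matchingLines = s.reMatchNoCase("(?m)^[a-z]+ line [0-9]+ [0-9]+$");

writeDump(matchingLines); // ["Apple line 1 1000","Banana line 2 2000"]
```

The "Cherry 3000" line is skipped because it cannot satisfy the pattern between the start and the end of its own line.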

The final thing to consider is what to do when you are in multi-line mode but only want to match at the start or end of the whole string. For that, you can use the \A anchor to match the start of the string and the \Z anchor to match the end of the string. Here is \A in action:

s = "Apple line 1 1000
Banana line 2 2000";
words = s.reMatchNoCase("[a-z]+");
firstwords = s.reMatchNoCase("(?m)\A[a-z]+");

writeDump(words); // ["Apple","line","Banana","line"]
writeDump(firstwords); // ["Apple"]

Now, although we are in multi-line mode, the pattern only matches at the start of the string. The same applies to the \Z anchor, which only matches at the end of the string.

s = "Apple line 1 1000
Banana line 2 2000";
integers = s.reMatchNoCase("[0-9]+");
lastinteger = s.reMatchNoCase("(?m)[0-9]+\Z");

writeDump(integers); // [1,1000,2,2000]
writeDump(lastinteger); // [2000]

Hopefully this will be useful to someone.
