Breaking up character sequences using a regular expression
Someone recently asked a question on the CFML Slack channel, which I answered. I thought I’d share it here in case it’s useful for others.
The user had a string made up of consecutive six-character sequences. As there were no delimiters, the user wanted to know how to extract each six-character ‘chunk’.
Here’s an example of the string they had:
ab12ergr3eba23erca
The desired output is the following array:
["ab12er","gr3eba","23erca"]
We could tackle this using a good old loop, something like this:
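token = "ab12ergr3eba23erca";
result = [];
// CFML strings are 1-indexed, so the chunks start at positions 1, 7, 13, ...
// The condition stops once fewer than six characters remain.
for (i = 1; i + 5 <= token.len(); i += 6) {
    result.append(token.mid(i, 6));
}
writeDump(result);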
That does the job, but I think this is a good example of a problem that regular expressions solve far more concisely:
token = "ab12ergr3eba23erca";
chunks = reMatch(".{6}", token);
writeDump(chunks);
You can run the above code using this link: https://cffiddle.org/app/file?filepath=d15721a5-0f9f-4c77-9ee7-3deae58a3af2/9b5ee3b9-1b76-4359-99fb-c69437bb03ec/144d2084-4c76-4397-b2f9-5fde937aaadf.cfm
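We can see that we do indeed get the desired result. So how does the pattern .{6} work? In regular expressions, a ‘.’ is a special character that matches any single character. The ‘{6}’ is a quantifier that says the preceding item must occur exactly six times, so in this case we are asking for six of any character. reMatch returns every match it finds, working from left to right, so each successive run of six characters becomes one element of the array.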
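One thing worth knowing about the quantifier is what happens when the string length is not a multiple of six. The sketch below is my own illustration rather than part of the original Slack answer: the input with the extra "xy" on the end is made up, and it compares .{6} with the range quantifier .{1,6}.

```cfml
// Hypothetical input: the original 18 characters plus two extra ("xy").
token = "ab12ergr3eba23ercaxy";

// Exactly six of any character: the trailing "xy" cannot form a full
// match, so it is left out of the result.
fullChunks = reMatch(".{6}", token);
// ["ab12er","gr3eba","23erca"]

// Between one and six of any character: the quantifier is greedy, so each
// match takes six characters while it can, and the last match picks up
// whatever remains.
allChunks = reMatch(".{1,6}", token);
// ["ab12er","gr3eba","23erca","xy"]

writeDump(fullChunks);
writeDump(allChunks);
```

Which pattern is the right one depends on whether a short trailing chunk should be kept or treated as invalid input.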
I’m not an expert on regular expressions, but I find them incredibly useful, so I would recommend at least learning the basics.