Regex course – part three. Grouping and using ES6 features.

We have covered quite a few features of regex so far, but there is a lot more. Today we will deal with more advanced concepts, like grouping, and cover more of the RegExp object features in JavaScript. We will also learn how to use some of the features that ES6 brought us. Let’s go!

exec

The exec method executes a search for a match in a string, similar to the test method, but returns a result array (or null). The result has additional properties, like index and input.
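As a minimal sketch (the string and pattern below are made up for illustration), a single exec call returns an array carrying those extra properties:

```javascript
// Hypothetical input and pattern, chosen only for illustration.
const sentence = 'The quick brown fox';
const execResult = /quick/.exec(sentence);

console.log(execResult[0]);    // 'quick' (the matched text)
console.log(execResult.index); // 4 (position where the match starts)
console.log(execResult.input); // 'The quick brown fox'

// When nothing matches, exec returns null instead of an array.
const noMatch = /slow/.exec(sentence);
console.log(noMatch); // null
```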

The index is the position of a match, and input is the provided string. Note that with the global flag (g), mentioned in the first part of the course, we can look for more than one match in our string by calling exec multiple times. Each successful call sets the lastIndex property of the RegExp object to the index at which the next search will start.
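The repeated calls could look like the sketch below. The pattern and the sample text are assumptions made for this example only:

```javascript
// Hypothetical pattern with the global flag, so lastIndex moves after each match.
const digitsRegex = /\d+/g;
const sampleText = 'a1 b22 c333';

const found = [];
let match;
// exec returns null once there are no more matches, which ends the loop.
while ((match = digitsRegex.exec(sampleText)) !== null) {
  found.push({
    value: match[0],
    index: match.index,
    lastIndex: digitsRegex.lastIndex,
  });
}

console.log(found);
// [ { value: '1',   index: 1, lastIndex: 2 },
//   { value: '22',  index: 4, lastIndex: 6 },
//   { value: '333', index: 8, lastIndex: 11 } ]

// After the failed final call, lastIndex is back at 0.
console.log(digitsRegex.lastIndex); // 0
```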

Grouping in regex

With regular expressions, we can not only check the string for matches but also extract certain information while ignoring unnecessary characters. To do this, we will use grouping with round brackets.
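A minimal sketch of the idea, assuming a hypothetical dash-separated date string (both the input and the pattern are illustrative):

```javascript
// Three capturing groups separated by dashes; the dashes are matched but not captured.
const dateRegex = /(\d{4})-(\d{2})-(\d{2})/;
const dateParts = dateRegex.exec('2018-07-02');

console.log(dateParts[0]); // '2018-07-02' (the full match)
console.log(dateParts[1]); // '2018'
console.log(dateParts[2]); // '07'
console.log(dateParts[3]); // '02'
```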

In this case, we extracted three groups of characters and ignored the dashes. Just note that the first element of the result (index 0) will be the full string of characters matched.

There is also a named capture groups proposal, which has reached stage 4 and is part of ES2018. It proves helpful in use cases such as the one above. It is nicely described in an article on the 2ality blog by Axel Rauschmayer.
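With named groups, the same kind of extraction might look as follows. This is a sketch that assumes a runtime with ES2018 support; the group names year, month and day are chosen here for illustration:

```javascript
// Each group gets a name via the (?<name>...) syntax.
const namedDateRegex = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const namedResult = namedDateRegex.exec('2018-07-02');

// Named captures are exposed on the groups property of the result.
const { year, month, day } = namedResult.groups;
console.log(year, month, day); // 2018 07 02

// The numbered groups are still available as well.
console.log(namedResult[1]); // '2018'
```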

Nested groups

You can actually nest groups:
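A minimal sketch, assuming we want both the long and the short form of a year (the pattern is an illustration):

```javascript
// The outer group captures the whole year, the inner group only its last two digits.
// Groups are numbered by the position of their opening bracket.
const yearRegex = /(\d{2}(\d{2}))/;
const yearResult = yearRegex.exec('2018');

console.log(yearResult[1]); // '2018' (outer group)
console.log(yearResult[2]); // '18'   (inner group)
```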

Here, in one part of our pattern, we nest one group inside another. Thanks to that, we get both the long and the short form of the year.

Conditional patterns

There is another useful feature, which is the OR statement. We can use it with the pipe character:
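As a sketch, assume a hypothetical range of years in which the second year may be written in full or shortened to two digits:

```javascript
// The second group accepts either four digits or two digits.
const yearRangeRegex = /^(\d{4})-(\d{4}|\d{2})$/;

const longForm = yearRangeRegex.exec('2016-2017');
console.log(longForm[1], longForm[2]); // 2016 2017

const shortForm = yearRangeRegex.exec('2016-17');
console.log(shortForm[1], shortForm[2]); // 2016 17

// Neither alternative fits a three-digit second year.
console.log(yearRangeRegex.test('2016-201')); // false
```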

In our pattern, the alternation will cause the years to match even if the second one is provided in a short form.

Capture all

While working with groups, there is a particular one that might come in handy: a group that captures everything, such as (.+).

Thanks to the quantifier used in the pattern, it also works if there are additional spaces:
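A sketch of this idea, assuming a hypothetical "Name:" label followed by a value. Both the (.+) group and the \s* in front of it are choices made for this example:

```javascript
// \s* swallows any number of spaces after the label, (.+) captures the rest of the line.
const nameRegex = /^Name:\s*(.+)$/;

console.log(nameRegex.exec('Name: John Smith')[1]);     // 'John Smith'
console.log(nameRegex.exec('Name:     John Smith')[1]); // 'John Smith'
console.log(nameRegex.exec('Name:John Smith')[1]);      // 'John Smith'
```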

Sticky flag

As you’ve already seen, the RegExp object has a property called lastIndex. It is used when the search is global (with the g flag) so that pattern matching continues in the right place. With the sticky flag (y), introduced in ES6, we can force the search to start at a certain index.

Remember that performing a check on a string (for example, with exec) changes the lastIndex property, so if you would like it to stay the same between multiple sticky searches, don’t forget to set it again. If the pattern matching fails, lastIndex is set to 0.
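A minimal sketch of sticky matching, with a made-up pattern and input:

```javascript
// With the y flag, a match is attempted only at lastIndex, nowhere else.
const stickyRegex = /foo/y;
const subject = 'barfoo';

// 'foo' does not start at index 0, so the sticky search fails.
stickyRegex.lastIndex = 0;
const atZero = stickyRegex.test(subject);
console.log(atZero, stickyRegex.lastIndex); // false 0

// 'foo' starts at index 3, so the search succeeds and lastIndex moves past the match.
stickyRegex.lastIndex = 3;
const atThree = stickyRegex.test(subject);
const afterSuccess = stickyRegex.lastIndex;
console.log(atThree, afterSuccess); // true 6

// Searching again from index 6 fails, and lastIndex is reset to 0.
const atSix = stickyRegex.test(subject);
console.log(atSix, stickyRegex.lastIndex); // false 0
```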

It is a good time to note that you can check if the RegExp object has flags enabled.
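For example, a RegExp object exposes a boolean property per flag, plus a flags string. The pattern below is an arbitrary one used for illustration:

```javascript
const flaggedRegex = /abc/gy;

console.log(flaggedRegex.global);     // true
console.log(flaggedRegex.sticky);     // true
console.log(flaggedRegex.ignoreCase); // false
console.log(flaggedRegex.unicode);    // false

// The flags property lists all enabled flags as a string.
console.log(flaggedRegex.flags);      // 'gy'
```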

The same goes for other flags; for more, visit the MDN web docs.

Unicode flag

ES6 brought better support for Unicode, too. Adding the Unicode flag, u, enables additional features related to Unicode. Thanks to it, you can use \u{x} in your patterns, where x is the hexadecimal code of the desired character.
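A short sketch of code point escapes. The characters are picked arbitrarily: U+0061 is the letter a and U+1F600 is the grinning face emoji:

```javascript
const grinningFace = String.fromCodePoint(0x1F600);

// With the u flag, \u{...} stands for the character with that code point.
console.log(/\u{61}/u.test('a'));             // true
console.log(/\u{1F600}/u.test(grinningFace)); // true

// Without the u flag, \u{61} is read as the letter 'u' repeated 61 times.
console.log(/\u{61}/.test('a'));              // false
console.log(/\u{61}/.test('u'.repeat(61)));   // true
```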

It won’t work without the u flag. It is important to know that the flag impacts more than just that, though. It is possible to use some more exotic Unicode characters without the flag:
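For instance, literal non-ASCII characters in a pattern can still match without the flag. The sample strings below are illustrative:

```javascript
// A character from the Basic Multilingual Plane matches as expected.
console.log(/ł/.test('Michał'));    // true

// An emoji literal also matches, because its two code units appear in the same order in the string.
console.log(/😀/.test('Hello 😀')); // true
```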

but it will fail us in more advanced cases:
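A sketch of two such cases, using an emoji that takes up two UTF-16 code units (the examples are chosen for illustration):

```javascript
// Without u, the dot matches a single code unit, so it cannot cover the whole emoji.
console.log(/^.$/.test('😀'));     // false
console.log(/^.$/u.test('😀'));    // true

// Without u, the quantifier applies only to the second code unit of the emoji.
console.log(/😀{2}/.test('😀😀'));  // false
console.log(/😀{2}/u.test('😀😀')); // true
```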

We can easily draw the conclusion that it is a good practice to include the u flag in our patterns, especially if there is any chance that there will be characters other than just the standard ASCII.

If you combine it with the ignore case flag, the pattern will also match for both lowercase and uppercase characters.
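A small sketch of combining the two flags. The characters are picked for illustration: U+044F is the Cyrillic lowercase я, and U+212A is the Kelvin sign, which Unicode case folding maps to the letter k:

```javascript
// Lowercase я written as a code point escape matches the uppercase Я.
console.log(/\u{44F}/iu.test('Я')); // true

// The Kelvin sign is treated as a case variant of 'k' only in Unicode mode.
const kelvinSign = '\u212A';
console.log(/k/i.test(kelvinSign));  // false
console.log(/k/iu.test(kelvinSign)); // true
```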

An interesting note: the pattern attribute of the input element in HTML has Unicode mode enabled by default.

Summary

Today we learned more about the RegExp object in JavaScript and how we can use this knowledge with a great feature of regular expressions: grouping. We’ve also learned two new flags: sticky and Unicode. Hopefully, you now see more and more use cases for regular expressions. Until next time!
