Regular expressions in Java with regex
A regular expression is a string of characters that describes a pattern in a sequence of characters. You can use the java.util.regex API to:
- Validate a sequence of characters, for example, check the validity of an email or password;
- Search in a string;
- Replace a pattern or set of characters in a String.
The java.util.regex API has a single interface (MatchResult) and three classes:
- Pattern class: a compiled representation of a regular expression. To create a pattern, you invoke one of its public static compile methods, which take the regular expression as an argument and return a Pattern object.
- Matcher class: the engine that searches an input string for the pattern. You obtain a Matcher object by calling the matcher() method on a Pattern object. These two classes work together.
- PatternSyntaxException: an unchecked exception thrown when the syntax of the regular expression is invalid.
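As a minimal sketch of how PatternSyntaxException surfaces in practice, the following program tries to compile two expressions. The class name SyntaxCheckDemo and the helper isValidRegex are hypothetical names chosen for this illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class SyntaxCheckDemo {
    // Hypothetical helper: returns true if the regex compiles without error.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Thrown by Pattern.compile when the syntax is invalid
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+")); // true
        System.out.println(isValidRegex("[a-z"));   // false: the character class is never closed
    }
}
```

Because the exception is unchecked, the try/catch is optional; it is mainly useful when the expression comes from user input.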
Matcher class
The main methods of the Matcher class:
No. | Method | Description |
---|---|---|
1 | boolean matches() | returns true if the entire string matches the pattern |
2 | boolean find() | finds the next subsequence that matches the pattern |
3 | boolean find(int start) | finds the next subsequence that matches the pattern, starting at the given index |
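The difference between the three methods can be sketched as follows. The class name MatcherMethodsDemo and its two helpers are hypothetical; only matches(), find(int) and start() come from the Matcher class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsDemo {
    // matches() succeeds only if the whole input matches the pattern.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // find(int) resets the matcher and searches from the given index;
    // returns the index of the match, or -1 if there is none.
    static int firstMatchFrom(String regex, String input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(start) ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("a", "ab"));            // false: "b" is left over
        System.out.println(wholeMatch("ab", "ab"));           // true
        System.out.println(firstMatchFrom("a", "banana", 0)); // 1
        System.out.println(firstMatchFrom("a", "banana", 2)); // 3
        System.out.println(firstMatchFrom("a", "banana", 6)); // -1
    }
}
```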
Pattern class
The Pattern class is the compiled representation of a regular expression.
No. | Method | Description |
---|---|---|
1 | static Pattern compile(String regex) | compiles the regex and returns a Pattern instance |
2 | Matcher matcher(CharSequence input) | creates a Matcher that will match the input sequence against this pattern |
3 | static boolean matches(String regex, CharSequence input) | compiles the regex and checks whether the entire input sequence matches it |
4 | String[] split(CharSequence input) | splits the input sequence around matches of this pattern and returns the resulting substrings |
5 | String pattern() | returns the regular expression string |
Example
import java.util.regex.*;

public class RegexTest {
    public static void main(String[] args) {
        Pattern p;
        Matcher m;
        // compile the regex with the pattern "a"
        p = Pattern.compile("a");
        // create the matcher and associate it with the string "ab"
        m = p.matcher("ab");
        // if the pattern is found
        if (m.find()) {
            System.out.println("pattern found");
        }
    }
}
pattern found
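The other Pattern methods from the table, split(), pattern() and the static matches(), can be sketched in the same way. The class name PatternMethodsDemo, the helper splitOnCommas and the sample strings are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PatternMethodsDemo {
    // A comma with optional whitespace on either side (illustrative pattern)
    static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // split() cuts the input around every match of the pattern.
    static String[] splitOnCommas(String input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        System.out.println(COMMA.pattern());                                    // \s*,\s*
        System.out.println(Arrays.toString(splitOnCommas("red, green ,blue"))); // [red, green, blue]
        // The static matches() compiles and matches in a single call
        System.out.println(Pattern.matches("[a-z]+", "green"));                 // true
    }
}
```

Storing the compiled Pattern in a constant avoids recompiling it on every call, which the static Pattern.matches() shortcut does each time.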
Regular expression syntax
1- Meta characters
Metacharacters are characters that have a special meaning in a pattern and control how it is interpreted. If you precede a metacharacter with a backslash (\), the parser treats it as an ordinary literal character. The metacharacters supported by Java regular expressions are listed in the following table:
Character | Description |
---|---|
[ ] | Defines a character class (a set of characters) |
{ } | Quantifier |
\ | Escapes the next character so that it is not treated as a metacharacter |
^ | Beginning of line |
$ | End of line |
| | OR operator (alternation) |
? | Zero or one occurrence of the preceding expression |
* | Zero or more occurrences of the preceding expression |
+ | One or more occurrences of the preceding expression |
. | Matches any character |
Example:
The matches() method exists in both the Matcher and the Pattern class. It returns true only if the entire string matches the pattern.
import java.util.regex.Pattern;

public class MetaCharactersExample {
    public static void main(String[] args) {
        // false ("." stands for exactly one character, so ".c" needs a two-character string)
        System.out.println(Pattern.matches(".c", "abc"));
        // true (the 3rd character is c)
        System.out.println(Pattern.matches("..c", "abc"));
        // true (the 4th character is c)
        System.out.println(Pattern.matches("...c", "abbc"));
        // true (exactly one digit)
        System.out.println(Pattern.matches("\\d", "2"));
        // false (more than one digit)
        System.out.println(Pattern.matches("\\d", "332"));
        // false (digits followed by letters)
        System.out.println(Pattern.matches("\\d", "123abc"));
        // true (non-digit characters, zero or more times)
        System.out.println(Pattern.matches("\\D*", "geek"));
    }
}
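Escaping a metacharacter with a backslash can be sketched as follows. The class name EscapeDemo and its helper are hypothetical. Pattern.quote() is not covered in this article; it is a standard method of the Pattern class that escapes a whole string at once.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Thin wrapper around the static Pattern.matches(), for readability
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("a.c", "abc"));    // true: "." matches any character
        System.out.println(matches("a\\.c", "abc"));  // false: "\\." matches only a literal dot
        System.out.println(matches("a\\.c", "a.c"));  // true
        System.out.println(matches("1+1=2", "1+1=2")); // false: "+" is read as a quantifier
        // Pattern.quote() turns the whole string into a literal pattern
        System.out.println(matches(Pattern.quote("1+1=2"), "1+1=2")); // true
    }
}
```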
2- Character classes
A character class is a set of characters. The metacharacters [...] define a character class within a regular expression. You can define a range with the hyphen '-'. For example, [0-9] represents the digits from 0 to 9.
Class | Description |
---|---|
[abc] | a, b or c |
[^abc] | Negation: matches any character except a, b and c |
[a-zA-Z] | Range: matches any character from a to z or from A to Z |
[a-d[m-p]] | Union: matches characters from a to d or from m to p: [a-dm-p] |
[a-z&&[abc]] | Intersection: matches a, b or c (the characters common to a-z and abc) |
[a-z&&[^cd]] | Subtraction: matches all characters from a to z except c and d: [abe-z] |
[a-z&&[^m-p]] | Subtraction: matches all characters from a to z except those from m to p: [a-lq-z] |
Example
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharacterClassExample {
    public static void main(String[] args) {
        Pattern p;
        Matcher m;
        // all digits from 0 to 9 except 3, 4 and 5
        p = Pattern.compile("[0-9&&[^345]]");
        m = p.matcher("7");
        boolean b = m.matches();
        System.out.println(b);
    }
}
true
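The negation, union, intersection and subtraction forms from the table can be checked one character at a time. This is a minimal sketch; the class name CharClassDemo and the helper matchesClass are hypothetical.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: tests a single character against a character class.
    static boolean matchesClass(String charClass, char c) {
        return Pattern.matches(charClass, String.valueOf(c));
    }

    public static void main(String[] args) {
        System.out.println(matchesClass("[^abc]", 'd'));        // true: negation
        System.out.println(matchesClass("[a-d[m-p]]", 'n'));    // true: union
        System.out.println(matchesClass("[a-z&&[abc]]", 'b'));  // true: intersection
        System.out.println(matchesClass("[a-z&&[^m-p]]", 'n')); // false: n is subtracted
        System.out.println(matchesClass("[a-z&&[^m-p]]", 'q')); // true
    }
}
```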
Predefined character classes
These are the character classes predefined in Java:
Class | Description |
---|---|
. | Any character |
\d | A digit: [0-9] |
\D | A non-digit: [^0-9] |
\s | A whitespace character (space, tab, line break): [ \t\n\x0B\f\r] |
\S | A non-whitespace character: [^\s] |
\w | A word character: [a-zA-Z_0-9] |
\W | A non-word character: [^\w] |
Example
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharacterClassesExample {
    public static void main(String[] args) {
        // \d: one digit
        // +: one or more times
        String regex = "\\d+";
        Pattern p = Pattern.compile(regex);
        String phrase = "the year 2015 21";
        Matcher m = p.matcher(phrase);
        if (m.find()) {
            // show the first match (group 0)
            System.out.println("group 0:" + m.group(0));
        }
    }
}
group 0:2015
In this example, the regex "\\d+" contains two backslashes because, in a Java string literal, the backslash itself must be escaped with another backslash. \d matches any digit from 0 to 9. If you remove the '+', only the first digit found is matched: 2.
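The example above stops at the first match. To collect every match, find() can be called in a loop, as in this minimal sketch. The class name FindAllNumbers and the helper findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllNumbers {
    // Hypothetical helper: collects every successive find() result.
    static List<String> findAll(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            results.add(m.group()); // group() is the same as group(0)
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "the year 2015 21")); // [2015, 21]
        System.out.println(findAll("\\d", "the year 2015 21"));  // [2, 0, 1, 5, 2, 1]
    }
}
```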
3- Quantifiers
Quantifiers specify how many times a character or expression is repeated.
Quantifier | Description |
---|---|
X? | X occurs once or not at all |
X+ | X occurs one or more times |
X* | X occurs zero or more times |
X{n} | X occurs exactly n times |
X{n,} | X occurs at least n times |
X{y,z} | X occurs at least y times but no more than z times |
Examples
The first three rows show the result of matching the entire string with matches(). The remaining rows list the successive matches returned by find(), where { } denotes an empty match.
Pattern | String | Results |
---|---|---|
[abc]? | a | a |
[abc]? | aa | none |
[abc]? | ab | none |
a? | abdaaaba | {a},{ },{ },{a},{a},{a},{ },{a},{ } |
a* | abdaaaba | {a},{ },{ },{aaa},{ },{a},{ } |
a+ | abdaaaba | {a},{aaa},{a} |
a{3} | aaaa | aaa |
a{3,6} | aaaaaaaa | aaaaaa |
[0-9]{4} | The year 2038 bug is similar to the year 2000 bug | {2038},{2000} |
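Several rows of this table can be reproduced with a short program. This is a sketch under the assumption that results are collected with repeated calls to find(). The class name QuantifierDemo and the helper findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: collects every successive find() result.
    static List<String> findAll(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(findAll("a+", "abdaaaba"));        // [a, aaa, a]
        System.out.println(findAll("a{3}", "aaaa"));          // [aaa]
        System.out.println(findAll("a{3,6}", "aaaaaaaa"));    // [aaaaaa]
        System.out.println(findAll("[0-9]{4}",
                "The year 2038 bug is similar to the year 2000 bug")); // [2038, 2000]
        // a* also matches the empty string, so empty matches are counted too
        System.out.println(findAll("a*", "abdaaaba").size()); // 7
    }
}
```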
4- Capture groups
Capture groups let you treat multiple characters as a single unit or sub-pattern. For example, (abc) creates a single group containing the characters "a", "b" and "c".
Capture groups are numbered by counting their opening parentheses from left to right. In the expression (A(B(C))), there are 4 groups, counting group 0, which always contains the entire expression:
- Group 0: (A(B(C)))
- Group 1: (A(B(C)))
- Group 2: (B(C))
- Group 3: (C)
The substring captured by a group is returned by the group(int) method of the Matcher class.
Example 1:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Group {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("(A(B(C)))");
        Matcher m = p.matcher("ABC");
        if (m.matches()) {
            // groupCount() does not include group 0, hence the <= comparison
            for (int i = 0; i <= m.groupCount(); ++i) {
                System.out.println("group " + i + ":" + m.group(i));
            }
        }
    }
}
group 0:ABC
group 1:ABC
group 2:BC
group 3:C
Example 2:
This program creates a regular expression that reads and checks the validity of a phone number:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTelephone {
    public static void main(String[] args) {
        // four groups of 3, 4, 3 and 3 digits between word boundaries
        String regex = "\\b(\\d{3})(\\d{4})(\\d{3})(\\d{3})\\b";
        Pattern p = Pattern.compile(regex);
        String tel = "2541724156348";
        Matcher m = p.matcher(tel);
        if (m.find()) {
            System.out.println("Phone: ("
                    + m.group(1) + ") " + m.group(2) + "-"
                    + m.group(3) + "-" + m.group(4));
        }
    }
}
Phone: (254) 1724-156-348
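The program above uses find(), which also accepts a phone number embedded in a longer text. For strict validation, a variant based on matches() may be preferable. The following is a minimal sketch. The class name PhoneFormatter, the method format and the choice to return null for invalid input are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhoneFormatter {
    // Same four groups as above; \b is not needed because matches()
    // already requires the whole input to match.
    private static final Pattern PHONE =
            Pattern.compile("(\\d{3})(\\d{4})(\\d{3})(\\d{3})");

    // Returns the formatted number, or null if the input is not valid.
    static String format(String input) {
        Matcher m = PHONE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return "(" + m.group(1) + ") " + m.group(2) + "-"
                + m.group(3) + "-" + m.group(4);
    }

    public static void main(String[] args) {
        System.out.println(format("2541724156348")); // (254) 1724-156-348
        System.out.println(format("12345"));         // null
    }
}
```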
5- Search boundaries
You can make your pattern more precise by specifying where in the input a match must occur, using boundary matchers.
Boundary | Description |
---|---|
^ | Beginning of line |
$ | End of line |
\b | Word boundary |
\B | Non-word boundary |
\A | Beginning of input |
\G | End of the previous match |
\Z | End of input, except for the final line terminator, if any |
\z | End of input |
Examples
Pattern | String | Results |
---|---|---|
^java$ | java | java |
\s*java$ | java | java |
^hello\w* | helloblahblah | helloblahblah |
\bjava\B | javascript is a programming language | java |
\Gtest | test test | test |
\btest\b | this is a test | test |
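The effect of \b and \G can be sketched by counting matches. The class name BoundaryDemo, the helper countMatches and the sample strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Hypothetical helper: counts how many times find() succeeds.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "this is a test, not a contest";
        System.out.println(countMatches("test", text));          // 2: also inside "contest"
        System.out.println(countMatches("\\btest\\b", text));    // 1: whole word only
        // \G requires each match to start where the previous one ended
        System.out.println(countMatches("\\Gtest", "testtest")); // 2
        System.out.println(countMatches("\\Gtest", "test test")); // 1: the space breaks the chain
    }
}
```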
Find a pattern and replace it
The regex API lets you find text and replace it with other text. In Java, you can use two methods of the Matcher class to accomplish this task:
- replaceFirst(String): replaces only the first occurrence;
- replaceAll(String): replaces all occurrences.
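The difference between the two methods only shows when the pattern occurs more than once. This is a minimal sketch with a hypothetical class name ReplaceDemo and a sample sentence chosen for the illustration.

```java
import java.util.regex.Pattern;

public class ReplaceDemo {
    private static final Pattern BUS = Pattern.compile("bus");

    // Replaces only the first occurrence of "bus"
    static String first(String input) {
        return BUS.matcher(input).replaceFirst("train");
    }

    // Replaces every occurrence of "bus"
    static String all(String input) {
        return BUS.matcher(input).replaceAll("train");
    }

    public static void main(String[] args) {
        String input = "bus after bus";
        System.out.println(first(input)); // train after bus
        System.out.println(all(input));   // train after train
    }
}
```

Both methods return a new string; the input itself is left unchanged.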
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Replacement {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("bus");
        Matcher m = p.matcher("I'm traveling by bus");
        // replace every occurrence of "bus" with "train"
        String s = m.replaceAll("train");
        System.out.println(s);
    }
}
I'm traveling by train
References:
Java Doc: Regular Expressions
JavaPoint: Java Regex
TutorialsPoint: Java - Regular Expressions
Expand: Regular expressions with the Java