A regular expression is a sequence of characters that defines a search pattern. It is most commonly used to match patterns in strings, also known as string matching.
In Python, regular expressions are provided by the re module. A pattern describes a set of strings, and the module's functions check whether a given string contains or matches that pattern.
re.search()
This method returns None if the pattern does not match anywhere in the string; otherwise it returns a match object (re.Match)
with information about the matching part of the string. Because it stops after the first match, this method is better suited to checking whether a string contains a pattern than to extracting every occurrence.
import re

regex = r"(1[0-9]{3})|(200[0-9])|(202[1-8])"
match = re.search(regex, "This is 2021")
if match is not None:
    print("Current Year: %s" % match.group(0))
else:
    print("The regex pattern does not match.")
Result:
Current Year: 2021
In the above code, we import the re module and use a regular expression to search the string for a year. As written, the pattern accepts 1000 to 1999, 2000 to 2009, and 2021 to 2028.
If the string contains such a year, the matched text is printed; otherwise the message “The regex pattern does not match.” is printed.
match.group(0) always returns the entire matched substring.
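As a small sketch of how group(0) relates to the numbered groups, the snippet below reuses the year pattern from the example above. The helper name describe_year_match is hypothetical and exists only for illustration.

```python
import re

# Same pattern as in the example above: three alternatives, each in its own group.
YEAR_REGEX = r"(1[0-9]{3})|(200[0-9])|(202[1-8])"


def describe_year_match(text):
    """Hypothetical helper: return (full match, per-group tuple) or None."""
    match = re.search(YEAR_REGEX, text)
    if match is None:
        return None
    # group(0) is the whole match; groups() lists each capturing group,
    # with None for alternatives that did not take part in the match.
    return match.group(0), match.groups()


print(describe_year_match("This is 2021"))
print(describe_year_match("Born in 1999, moved in 2005"))
print(describe_year_match("No year here"))
```

In the second call only 1999 is reported, because re.search() stops at the first match it finds.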
re.findall()
The re.findall()
function finds all non-overlapping occurrences of a pattern in a string. It returns a list of strings, with each string representing one match. When the pattern contains a single capturing group, each string is the text captured by that group.
import re

regex = "(202[0-9])"
match = re.findall(regex, "This is 2021.The previous year was 2020 and next will be 2022")
print(match)
Result:
['2021', '2020', '2022']
We import the re module and use a regular expression to find every year in the string.
The pattern matches each four-digit sequence that starts with 202, and the matches are printed as a list of strings.
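Unlike re.search(), re.findall() returns a plain list rather than a match object, so there is no match.group(0) to call here; each list element is already the matched text.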
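The shape of the list returned by re.findall() depends on how many capturing groups the pattern has. The following is a minimal sketch using the same sentence as above; the two extra patterns (one without parentheses, one with two groups) are assumptions added for comparison and are not part of the original example.

```python
import re

TEXT = "This is 2021.The previous year was 2020 and next will be 2022"

# One capturing group: findall returns the text captured by that group.
one_group = re.findall(r"(202[0-9])", TEXT)

# No capturing group: findall returns the full text of each match.
no_group = re.findall(r"202[0-9]", TEXT)

# Two capturing groups: findall returns one tuple per match.
two_groups = re.findall(r"(202)([0-9])", TEXT)

print(one_group)
print(no_group)
print(two_groups)
```

The first two calls produce the same list of year strings, while the third produces a list of (prefix, last digit) tuples.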