Computer Science
Grade 8
20 min
Regular Expressions
Tutorial Preview
1. Introduction & Learning Objectives
Learning Objectives
Define what a regular expression is and why it is useful.
Identify and explain at least 5 common metacharacters (e.g., `\d`, `\w`, `.`, `+`, `*`).
Use Python's `re.search()` function to find the first occurrence of a pattern in a string.
Use Python's `re.findall()` function to extract all occurrences of a pattern from a string.
Write a simple regular expression to validate a common data format, like a username or a zip code.
Explain the difference between a literal character and a metacharacter in a regex pattern.
Ever wondered how a website instantly knows if your email address is typed in the right format? 🤔 Let's learn the secret power of pattern matching!
In this lesson, you'll learn about Regular Expressions, often called 'regex'...
2. Key Concepts & Vocabulary
| Term | Definition | Example |
| --- | --- | --- |
| Regular Expression (Regex) | A sequence of characters that defines a search pattern. It's used to check if a string contains the specified pattern. | The pattern `\d+` is a regex that looks for one or more digits. |
| Pattern | The string that contains the regular expression rules. | In `re.search('cat', 'the cat is cute')`, the pattern is 'cat'. |
| Literal Character | A character in a regex pattern that matches itself exactly. | The pattern `abc` will only match the exact sequence of characters 'a', then 'b', then 'c'. |
| Metacharacter | A character with a special meaning in a regex pattern. It doesn't match itself. | The metacharacter `.` matches any single character (except a newline). So, the pattern `c.t` would match 'cat... |
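To see the difference between a literal character and a metacharacter in action, here is a small sketch. The sample strings and variable names are made up for illustration and are not part of the lesson.

```python
import re

# Literal pattern: every character matches only itself, in order.
literal_hit = re.search(r'abc', 'xxabcxx')
literal_miss = re.search(r'abc', 'a-b-c')

# Metacharacter pattern: '.' stands for any single character except a newline.
dot_hits = re.findall(r'c.t', 'cat cot c9t ct')

print(literal_hit.group())  # abc
print(literal_miss)         # None
print(dot_hits)             # ['cat', 'cot', 'c9t']
```

Note that 'ct' is not matched: the `.` still needs exactly one character between the 'c' and the 't'.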
3. Core Syntax & Patterns
Finding the First Match: `re.search()`
re.search(pattern, string)
Scans through a string looking for the first location where the regex pattern produces a match. If a match is found, it returns a special 'match object'; otherwise, it returns `None`.
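A minimal sketch of how `re.search()` behaves, assuming a made-up sample sentence. The `.group()` and `.start()` methods on the match object are standard Python, but the variable names are only illustrative.

```python
import re

text = 'Room 12 is on floor 3'

# re.search stops at the first place the pattern matches.
match = re.search(r'\d+', text)

# Always check for None before using the match object.
first_number = match.group() if match else None  # the matched text
start_index = match.start() if match else None   # index where the match begins
print(first_number, start_index)  # 12 5

# When nothing matches, the result is None rather than a match object.
no_match = re.search(r'\d+', 'no digits here')
print(no_match)  # None
```

Even though the text also contains '3', only the first match ('12') is reported.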
Finding All Matches: `re.findall()`
re.findall(pattern, string)
Finds all non-overlapping matches of the pattern in a string and returns them as a list of strings. If no matches are found, it returns an empty list.
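A short sketch of `re.findall()` on some invented sample strings, including a case that shows what "non-overlapping" means in practice:

```python
import re

# Every run of digits is returned, in the order it appears.
all_scores = re.findall(r'\d+', 'Scores: 90, 85, and 100')
print(all_scores)  # ['90', '85', '100']

# No match gives an empty list, not None.
nothing = re.findall(r'\d+', 'no digits here')
print(nothing)  # []

# Non-overlapping: once two letters are used in a match, they are not reused.
pairs = re.findall(r'aa', 'aaaaa')
print(pairs)  # ['aa', 'aa']
```

Because the result is always a list, it can be passed straight to `len()` or looped over without a `None` check.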
Common Metacharacter Shortcuts
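`\d` (any digit), `\w` (any letter, digit, or underscore), `\s` (any whitespace character), `.` (any character except a newline), `+` (one or more of the item before it), `*` (zero or more of the item before it)
These metacharacters make patterns shorter and more readable. For example, instead of writing `[0123456789]`, you can just write `\d` (for ordinary English text, the two match the same characters).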
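The metacharacters above can be tried out with a quick sketch like this one. The sample strings are invented for illustration.

```python
import re

sample = 'Room 7B, floor_2 is open'

digits = re.findall(r'\d', sample)      # each single digit
word_runs = re.findall(r'\w+', sample)  # runs of letters, digits, or underscores
spaces = re.findall(r'\s', sample)      # each whitespace character

print(digits)       # ['7', '2']
print(word_runs)    # ['Room', '7B', 'floor_2', 'is', 'open']
print(len(spaces))  # 4

# '*' allows zero 'b's after the 'a'; '+' requires at least one.
star = re.findall(r'ab*', 'a ab abb')
plus = re.findall(r'ab+', 'a ab abb')

print(star)  # ['a', 'ab', 'abb']
print(plus)  # ['ab', 'abb']
```

The comma in the sample is skipped by `\w+` because it is not a letter, digit, or underscore.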
Sample Practice Questions
Challenging
You need a regex pattern to validate a product ID that must start with 'P', be followed by exactly 3 digits, and have no other characters. Which pattern is correct?
A. `r'P\d+'`
B. `r'P\d{3}'`
C. `r'^P\d{3}$'`
D. `r'[P]\d\d\d'`
Challenging
A student mistakenly uses `re.search(r'\d+', 'ID: 123, Code: 456')` to get all numbers. What does the returned match object specifically represent?
A. It represents both '123' and '456'.
B. It represents only the first match found, which is '123'.
C. It represents the entire string since a match was found.
D. It returns the list `['123']`.
Challenging
How would you modify the username pattern `r'^\w{4,12}$'` to also allow hyphens (`-`) as a valid character?
A. `r'^\w-{4,12}$'`
B. `r'^[\w-]{4,12}$'`
C. `r'^\w+\-{4,12}$'`
D. `r'^(\w|-){4,12}$'`