Regex for Beginners: 10 Patterns Every Developer Should Know

How to Read a Regex Pattern (60-Second Crash Course)

You only need eight symbols to understand every pattern in this guide. Here they are, nothing more:

^ marks the start of a string. $ marks the end. Together they mean "this is the whole string, not just part of it."

\d matches any digit 0-9. \w matches any word character - letter, digit, or underscore. \s matches any whitespace character including spaces, tabs, and newlines.

[abc] matches any single character inside the brackets. [a-z] is a range - any lowercase letter. [^abc] is a negation - any character that is NOT a, b, or c.

+ means one or more of the preceding thing. * means zero or more. ? means zero or one - it makes the preceding character optional.

{3} means exactly 3 times. {2,5} means between 2 and 5 times. {2,} means at least 2 times.

() is a capture group - it saves whatever it matches so you can reference it later.

| is OR. /cat|dog/ matches either "cat" or "dog."

\ is an escape. Since . means "any character" in regex, \. means a literal dot. Same principle applies to (, ), +, *, ?, $, ^, and |.

Quick reference:

^ / $ - start and end anchors
\d - any digit 0-9
\w - letter, digit, or underscore
\s - whitespace
[abc] - any listed character
+ / * / ? - repetition controls
{3} / {2,5} - exact or ranged counts
() / | - capture group and OR

Quick example - break down /^\d{3}-\d{4}$/ character by character:

^ - start of string
\d{3} - exactly three digits
- - a literal hyphen
\d{4} - exactly four digits
$ - end of string

Matches "555-1234". Does not match "55-1234" (only two digits before the dash) or "555-12345" (five digits after it).
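To see the breakdown above in action, here is a minimal sketch that runs the same pattern against the three strings just mentioned. The variable names are illustrative only.

```javascript
// The quick-example pattern: three digits, a hyphen, four digits, nothing else.
const shortPhone = /^\d{3}-\d{4}$/;

const samples = ["555-1234", "55-1234", "555-12345"];

// test() returns true only when the whole string fits the pattern,
// because ^ and $ pin the match to both ends.
const results = samples.map((s) => shortPhone.test(s));

console.log(results); // [ true, false, false ]
```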

That is the entire theory section. Everything else is application.

10 Regex Patterns Developers Actually Reuse

Pattern 1 - Email Address (Basic Validation)

/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

Breakdown:

^ - start of string
[a-zA-Z0-9._%+-]+ - one or more characters that can be letters (upper or lower), digits, dots, underscores, percent signs, plus signs, or hyphens - this is the local part before the @
@ - literal at sign
[a-zA-Z0-9.-]+ - one or more characters for the domain name (letters, digits, dots, hyphens)
\. - literal dot separating domain from TLD
[a-zA-Z]{2,} - at least two letters for the TLD (.com, .org, .uk, .museum)
$ - end of string

Matches: user@example.com, first.last@company.co.uk, dev+tag@gmail.com

Does not match: user@ (no domain), @example.com (no local part)

Note: This catches obviously malformed strings. It will not tell you whether the address actually exists or whether the mailbox accepts mail. For anything important, send a verification email - regex cannot do that job.

const isValidEmail = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(input);
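As a rough illustration of what "basic validation" means here, the sketch below wraps the pattern in a hypothetical helper (looksLikeEmail is not a standard function) and includes one malformed address that the pattern still accepts.

```javascript
const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Hypothetical helper: trims surrounding whitespace before testing.
function looksLikeEmail(input) {
  return emailPattern.test(input.trim());
}

console.log(looksLikeEmail(" dev+tag@gmail.com ")); // true
console.log(looksLikeEmail("user@"));               // false: no domain
console.log(looksLikeEmail("user@localhost"));      // false: no dot before a TLD
console.log(looksLikeEmail("user@example..com"));   // true, although the domain has an empty label
```

The last case shows why the pattern is a first filter rather than a guarantee: the domain character class allows consecutive dots.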

Pattern 2 - URL (HTTP/HTTPS)

/https?:\/\/[^\s/$.?#].[^\s]*/

Breakdown:

https? - matches "http" or "https" - the s is made optional by ?
:\/\/ - literal "://" with the slashes escaped because / is the regex delimiter
[^\s/$.?#] - the first domain character, which cannot be whitespace or URL-structural characters
. - any single character (the dot here is unescaped, intentionally matching anything)
[^\s]* - zero or more characters that are not whitespace - this captures the rest of the URL until a space

Matches: https://example.com, http://blog.site.co/path?q=1, https://api.example.com/v2/users

Does not match: ftp://files.example.com (protocol does not start with http), example.com (no protocol prefix)

const urls = text.match(/https?:\/\/[^\s/$.?#].[^\s]*/g);
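One detail worth handling when extracting URLs this way: String.prototype.match returns null, not an empty array, when nothing matches. A minimal sketch, using a hypothetical extractUrls helper:

```javascript
// Hypothetical helper: returns every http/https URL found in a block of text.
function extractUrls(text) {
  const urlPattern = /https?:\/\/[^\s/$.?#].[^\s]*/g;
  // match() with the g flag yields an array of matches, or null if there are none.
  return text.match(urlPattern) ?? [];
}

const found = extractUrls("Docs: https://example.com/guide and http://blog.site.co/path?q=1");
const none = extractUrls("no links here, just example.com");

console.log(found.length); // 2
console.log(none.length);  // 0
```

Because the pattern runs until the next whitespace, a URL at the end of a sentence keeps its trailing punctuation (for example the final period), so you may want to strip that afterwards.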

Pattern 3 - Phone Number (US Format)

/^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/

Breakdown:

^ - start of string
\(? - optional opening parenthesis (escaped because ( is a special character)
\d{3} - exactly three digits for the area code
\)? - optional closing parenthesis
[-.\s]? - optional separator: hyphen, dot, or whitespace
\d{3} - three-digit exchange
[-.\s]? - another optional separator
\d{4} - four-digit subscriber number
$ - end of string

Matches: (555) 123-4567, 555-123-4567, 555.123.4567, 5551234567

Does not match: 55-123-4567 (only two digits in area code), 555-123-45678 (five digits in the last group)

const isValidPhone = /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/.test(phone);
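If you also want to store the number in one consistent format, the same pattern can be extended with capture groups around each run of digits. This is a sketch under that assumption; normalizePhone and the 555-123-4567 output format are illustrative choices, not part of the pattern above.

```javascript
// Same structure as the validation pattern, with a capture group around each digit run.
const phonePattern = /^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$/;

// Hypothetical helper: returns "AAA-EEE-SSSS" or null if the input does not match.
function normalizePhone(input) {
  const m = phonePattern.exec(input.trim());
  if (m === null) return null;
  const [, area, exchange, subscriber] = m; // m[0] is the full match, so skip it
  return `${area}-${exchange}-${subscriber}`;
}

console.log(normalizePhone("(555) 123-4567")); // "555-123-4567"
console.log(normalizePhone("5551234567"));     // "555-123-4567"
console.log(normalizePhone("55-123-4567"));    // null
```

Note that each parenthesis is optional on its own, so an unbalanced input such as "(555 123-4567" also passes. If that matters, add a separate check.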

Pattern 4 - Date (YYYY-MM-DD)

/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/

Breakdown:

\d{4} - four-digit year
- - literal hyphen
(0[1-9]|1[0-2]) - month from 01 to 12: either 0 followed by 1-9, or 1 followed by 0-2
- - literal hyphen
(0[1-9]|[12]\d|3[01]) - day from 01 to 31: either 0 + 1-9, or 1 or 2 + any digit, or 3 + 0 or 1
$ - end of string

Matches: 2026-06-04, 2025-12-31, 2024-01-01

Does not match: 2026-13-01 (month 13 does not exist), 2026-00-15 (month 00 does not exist), 06-04-2026 (wrong order, year is not four digits at the start)

Note: This validates format and plausible ranges, not calendar logic. It will accept 2026-02-30 without complaint. For actual date validation, parse with a library like date-fns or dayjs after the format check passes.

const isValidDate = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/.test(date);
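The note above recommends a date library for the calendar check. As a dependency-free alternative, here is a sketch that uses the built-in Date object instead: it builds the date and confirms that the year, month, and day survive the round trip. The isRealDate name is hypothetical.

```javascript
// Same pattern as above, with the year also wrapped in a capture group.
const isoDatePattern = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

function isRealDate(value) {
  const m = isoDatePattern.exec(value);
  if (m === null) return false; // fails the format check

  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);

  // Date.UTC rolls impossible dates forward (Feb 30 becomes early March),
  // so a real date is one that comes back unchanged.
  // Caveat: Date.UTC maps years 0-99 to 1900-1999, so those years are rejected here.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

console.log(isRealDate("2026-06-04")); // true
console.log(isRealDate("2026-02-30")); // false: passes the regex, fails the calendar
console.log(isRealDate("2024-02-29")); // true: 2024 is a leap year
```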

Pattern 5 - IP Address (IPv4)

/^(\d{1,3}\.){3}\d{1,3}$/

Breakdown:

^ - start of string
(\d{1,3}\.) - a group: one to three digits followed by a literal dot
{3} - that group repeated exactly three times (covers the first three octets)
\d{1,3} - one to three digits for the fourth octet, no trailing dot
$ - end of string

Matches: 192.168.1.1, 10.0.0.1, 255.255.255.0

Does not match: 192.168.1 (only three octets), 192.168.1.1.1 (five octets)

Note: 999.999.999.999 will pass this pattern because the regex validates structure, not value ranges. If you need to confirm each octet is between 0 and 255, add a post-match check in JavaScript - splitting on . and running parseInt on each segment is cleaner than extending the regex into an unreadable wall of alternation.

const isValidIP = /^(\d{1,3}\.){3}\d{1,3}$/.test(ip);
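The post-match range check described in the note could look something like this. It is a minimal sketch, and isValidIPv4 is a hypothetical name.

```javascript
function isValidIPv4(ip) {
  // Step 1: the regex confirms the structure (four dot-separated groups of 1-3 digits).
  if (!/^(\d{1,3}\.){3}\d{1,3}$/.test(ip)) return false;

  // Step 2: plain JavaScript confirms each octet is in the 0-255 range.
  return ip.split(".").every((part) => {
    const n = parseInt(part, 10);
    return n >= 0 && n <= 255;
  });
}

console.log(isValidIPv4("192.168.1.1"));     // true
console.log(isValidIPv4("999.999.999.999")); // false
console.log(isValidIPv4("192.168.1"));       // false
```

This version still accepts leading zeros such as 010.0.0.1. Whether that should count as valid depends on your application.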

Pattern 6 - HTML Tag (Extracting Tag Names)

/<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g

Breakdown:

< - literal opening angle bracket
\/? - optional forward slash for closing tags like </p>
([a-zA-Z][a-zA-Z0-9]*) - capture group: tag name must start with a letter, followed by zero or more letters or digits
\b - word boundary, prevents partial matches bleeding into attribute text
[^>]* - any characters that are not > - this captures attributes like class="foo" without consuming the closing bracket
> - literal closing angle bracket
/g - global flag, find all matches in the string

Matches: <div>, </p>, <input type="text">, <img src="photo.jpg" />

Does not match: < div> (space after opening bracket), plain text with no angle brackets

Warning: Do not use regex to parse HTML documents. Use document.querySelector, DOMParser, or a proper HTML parser library. Regex cannot handle nested tags, optional attributes, or malformed markup reliably. This pattern is for log file scanning, quick extraction from controlled strings, or template processing - not for building anything that touches arbitrary HTML from the web.

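// With the g flag, match() returns the whole tags (or null if there are none), not the captured tag names.
const tags = html.match(/<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g);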
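To get just the tag names from the capture group, one option is String.prototype.matchAll, which exposes the groups for every match. A sketch, assuming a small controlled string as input; extractTagNames is a hypothetical helper.

```javascript
const tagPattern = /<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;

// Hypothetical helper: returns the lowercased name of every tag, in document order.
function extractTagNames(html) {
  // matchAll requires the g flag; each result's [1] is the first capture group.
  return Array.from(html.matchAll(tagPattern), (m) => m[1].toLowerCase());
}

const names = extractTagNames('<div><p>Hi</p><input type="text"><img src="photo.jpg" /></div>');

console.log(names); // [ 'div', 'p', 'p', 'input', 'img', 'div' ]
```

Opening and closing tags are both counted, which is why p and div appear twice.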

Pattern 7 - Whitespace Cleanup (Multiple Spaces to One)

/\s{2,}/g - collapse runs of two or more whitespace characters

/^\s+|\s+$/g - trim leading and trailing whitespace

Breakdown of /\s{2,}/g:

\s - any whitespace character: space, tab, newline, carriage return
{2,} - two or more consecutive occurrences
/g - replace all instances, not just the first

Breakdown of /^\s+|\s+$/g:

^\s+ - one or more whitespace characters at the very start of the string
| - OR
\s+$ - one or more whitespace characters at the very end
/g - apply globally

Use case: Users paste text from Word documents, PDFs, or Slack messages. That content arrives with double spaces, non-breaking spaces, tab indentation, and trailing newlines. Before you store or display any user-submitted text, clean it.

Note: JavaScript's built-in .trim() handles leading and trailing whitespace fine, so the second pattern is mainly useful when you also need to handle the multi-space collapse in the same pipeline. Chaining both handles everything in two operations.

const cleaned = input.replace(/\s{2,}/g, ' ').replace(/^\s+|\s+$/g, '');
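Here is a small sketch of the chained cleanup on a messy input, plus a stricter variant. Both function names are hypothetical. The stricter variant is an assumption about what you may want: it also converts a single tab or newline to a space, which the {2,} version leaves alone.

```javascript
// The two-step cleanup from above.
function cleanWhitespace(input) {
  return input.replace(/\s{2,}/g, " ").replace(/^\s+|\s+$/g, "");
}

// Stricter variant: every run of whitespace, even a single tab, becomes one space.
function normalizeWhitespace(input) {
  return input.replace(/\s+/g, " ").trim();
}

console.log(cleanWhitespace("  hello   world \n\n")); // "hello world"
console.log(cleanWhitespace("a\tb") === "a\tb");      // true: a lone tab is not a run of two
console.log(normalizeWhitespace("a\tb"));             // "a b"
```

In JavaScript, \s also matches the non-breaking space (U+00A0), so pasted text containing it is handled by both versions.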

Pattern 8 - Password Strength Check

/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$/

Breakdown:

^ - start of string
(?=.*[a-z]) - lookahead: somewhere in the string there must be at least one lowercase letter
(?=.*[A-Z]) - lookahead: at least one uppercase letter
(?=.*\d) - lookahead: at least one digit
(?=.*[!@#$%^&*]) - lookahead: at least one of these specific special characters
.{8,} - any character, at least 8 times - the actual length requirement
$ - end of string

Matches: Str0ng!Pass, MyP@ss1234, Ab1!efgh

Does not match: password (no uppercase, no digit, no special character), SHORT1! (only 7 characters, and no lowercase letter)

Key concept: Lookaheads (?=...) assert that a condition is true at the current position without actually consuming characters. All four lookaheads run from the start of the string before .{8,} does its job. This is how you enforce multiple independent requirements in a single pattern without a mess of nested conditions.

const isStrongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$/.test(pw);
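The single pattern answers yes or no, but it cannot say which rule failed. If you want to show that feedback to the user, one approach is to split each lookahead into its own small pattern. The sketch below assumes the same five requirements; the rules array and missingRequirements are hypothetical names.

```javascript
// One entry per requirement encoded in the combined pattern.
const rules = [
  { label: "lowercase letter", pattern: /[a-z]/ },
  { label: "uppercase letter", pattern: /[A-Z]/ },
  { label: "digit", pattern: /\d/ },
  { label: "special character (!@#$%^&*)", pattern: /[!@#$%^&*]/ },
  { label: "at least 8 characters", pattern: /^.{8,}$/ },
];

// Returns the labels of every rule the password fails; an empty array means it passes.
function missingRequirements(pw) {
  return rules.filter((rule) => !rule.pattern.test(pw)).map((rule) => rule.label);
}

console.log(missingRequirements("Str0ng!Pass")); // []
console.log(missingRequirements("SHORT1!"));     // [ 'lowercase letter', 'at least 8 characters' ]
```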

Pattern 9 - File Extension Check

/\.(jpg|jpeg|png|gif|webp|svg)$/i

Breakdown:

\. - literal dot (escaped because unescaped . means "any character")
(jpg|jpeg|png|gif|webp|svg) - one of these six extensions, matched by the | OR operator
$ - end of string - only matches if the extension is at the very end of the filename
/i - case-insensitive flag - .JPG, .Png, and .WebP all match

Matches: photo.jpg, image.PNG, icon.svg, banner.WebP

Does not match: document.pdf (extension not in the list), noextension (no dot at all)

Important: Client-side extension checking is a user experience feature, not a security feature. A malicious user can rename malware.exe to malware.jpg and your regex will happily accept it. Always validate file type on the server using actual MIME type detection - read the file's magic bytes, not its name.

const isValidImageFile = /\.(jpg|jpeg|png|gif|webp|svg)$/i.test(filename);
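Because the extension list sits inside a capture group, the same pattern can also report which extension matched. A minimal sketch with a hypothetical imageExtension helper:

```javascript
const imageExtPattern = /\.(jpg|jpeg|png|gif|webp|svg)$/i;

// Hypothetical helper: returns the lowercased extension, or null if it is not in the list.
function imageExtension(filename) {
  const m = imageExtPattern.exec(filename);
  return m === null ? null : m[1].toLowerCase();
}

console.log(imageExtension("banner.WebP"));   // "webp"
console.log(imageExtension("document.pdf"));  // null
console.log(imageExtension("photo.jpg.exe")); // null: $ requires the extension at the very end
```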

Pattern 10 - URL Slug Validation

/^[a-z0-9]+(-[a-z0-9]+)*$/

Breakdown:

^ - start of string
[a-z0-9]+ - one or more lowercase letters or digits - the slug must start with this, which prevents a leading hyphen
(-[a-z0-9]+)* - zero or more groups of: a hyphen followed by one or more lowercase letters or digits - this structure prevents trailing hyphens and consecutive hyphens because every hyphen must be followed by at least one letter or digit
$ - end of string

Matches: hello-world, my-blog-post-2026, regex, section-4b

Does not match: Hello-World (uppercase letters fail the [a-z0-9] set), -leading-dash (starts with a hyphen, fails [a-z0-9]+ at the beginning), trailing-dash- (the final hyphen has no letter or digit after it), double--dash (two consecutive hyphens - the first hyphen is followed by another hyphen instead of a letter or digit)

const isValidSlug = /^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug);
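A common companion to the validation pattern is a function that produces slugs which pass it. The sketch below is one simple way to do that. toSlug is a hypothetical helper, and it assumes ASCII input: accented or non-Latin letters are treated as separators here.

```javascript
const slugPattern = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Hypothetical helper: turns a title into a slug that satisfies slugPattern.
function toSlug(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // any run of other characters becomes one hyphen
    .replace(/^-+|-+$/g, "");    // drop hyphens left over at either end
}

const slug = toSlug("My Blog Post 2026!");

console.log(slug);                   // "my-blog-post-2026"
console.log(slugPattern.test(slug)); // true
```

An input with no letters or digits at all produces an empty string, which the validation pattern rejects, so check for that case before saving.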

Three Regex Mistakes That Bite Every Beginner

Greedy vs lazy matching

By default, .* is greedy - it grabs as many characters as possible before yielding. If you try to extract content between quotes using ".*" on the string "first" and "second", the match runs from the opening quote of "first" all the way to the closing quote of "second", swallowing the word "and" in between. Use ".*?" instead - the ? after * switches to lazy matching, grabbing as few characters as possible. The lazy version stops at the first closing quote it finds rather than the last. In practice, whenever you are extracting content between delimiters, your default should be lazy until you have a specific reason to be greedy.
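The difference is easiest to see side by side. This sketch runs both versions against the string from the paragraph above. It also shows a third option, a negated character class, which is often used for the same job.

```javascript
const text = '"first" and "second"';

const greedy = text.match(/".*"/)[0];       // runs to the LAST closing quote
const lazy = text.match(/".*?"/)[0];        // stops at the FIRST closing quote
const allLazy = text.match(/".*?"/g);       // every quoted chunk, one per match
const allNegated = text.match(/"[^"]*"/g);  // same result: "anything that is not a quote"

console.log(greedy);  // "first" and "second"
console.log(lazy);    // "first"
console.log(allLazy); // [ '"first"', '"second"' ]
```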

Forgetting to escape special characters

This produces bugs that are baffling to debug because the pattern technically runs without errors - it just matches the wrong things. The dot . in regex means "any single character except newline." If you write /3.14/ expecting to match the number 3.14, it also matches 3X14, 3!14, and 3 14. Write /3\.14/ to mean a literal dot. The characters that require escaping when you mean them literally are: . ^ $ * + ? { } [ ] \ | ( ), plus / inside a JavaScript regex literal. When in doubt about a punctuation character, escape it; do not escape letters or digits, because sequences like \d and \b have their own meanings.
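The sketch below shows the 3.14 bug directly. It also includes a small helper for the related case where the text to match comes from a variable and has to be escaped before it goes into new RegExp. escapeRegExp is a widely used helper you write yourself, and is defined here for illustration.

```javascript
const unescaped = /3.14/;
const escaped = /3\.14/;

console.log(unescaped.test("3X14")); // true: the dot matched "X"
console.log(escaped.test("3X14"));   // false
console.log(escaped.test("3.14"));   // true

// Illustrative helper: backslash-escapes every regex metacharacter in a string
// so the string can be matched literally inside new RegExp(...).
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // $& inserts the matched character
}

const literalPi = new RegExp(escapeRegExp("3.14"));

console.log(literalPi.test("3X14")); // false
console.log(literalPi.test("3.14")); // true
```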

Not using anchors when validating

There is a significant difference between searching a string and validating it. The pattern /\d{3}/ searches for any sequence of three digits anywhere in the input - it matches in abc123def, phone: 555-1234, and 999 bottles. If you use this to validate that a user entered exactly three digits and nothing else, you will accept all of those strings and your validation is broken. Add ^ and $ to make it /^\d{3}$/ and now it only matches a string that is three digits from start to finish. Every validation pattern in this guide uses anchors for exactly this reason.
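A quick comparison makes the search-versus-validate distinction concrete. The input list is illustrative.

```javascript
const search = /\d{3}/;     // "contains three digits somewhere"
const validate = /^\d{3}$/; // "is exactly three digits"

const inputs = ["123", "abc123def", "999 bottles", "1234"];

const searchHits = inputs.filter((s) => search.test(s));
const validHits = inputs.filter((s) => validate.test(s));

console.log(searchHits); // [ '123', 'abc123def', '999 bottles', '1234' ]
console.log(validHits);  // [ '123' ]
```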

Test your regex patterns instantly with JavaScript in Tooliest's browser-based developer tools - your code stays in your browser, nothing is sent to a server.

About the Author

Anurag is the founder of Tooliest and reviews the site's browser tools, AI-assisted workflows, and editorial guides with a focus on privacy, practical clarity, and real-world usefulness.

Frequently Asked Questions

What do ^ and $ mean in regex?

They are anchors. ^ marks the start of the string and $ marks the end; with the m (multiline) flag they match at the start and end of each line instead. They are useful when you want to match the whole input rather than just any substring.

Why does .* match too much?

Because it is greedy by default. It will keep consuming characters as long as the pattern can still succeed, which is why non-greedy matching often matters.

Can one regex fully validate every email address?

Not reliably in one short beginner-friendly pattern. Simple regexes are fine for basic client-side checks, but real-world email validation often needs broader business logic.

When should I avoid regex?

Avoid it when a simple parser, built-in string method, or explicit rule set would be clearer and easier to maintain than a dense pattern.

Related Tooliest Tools

Regex Tester - Test matches, flags, and capture groups with live feedback.
Slug Generator - See a practical example of string cleanup without writing regex by hand.
Remove Duplicate Lines - Clean text before or after regex-based processing.
