Using PHP with Regex
 


Regular Expressions in PHP


PHP is a powerful server-side scripting language that allows you to dynamically alter the content of your website. This brief tutorial gives a short introduction to regular expressions in PHP.

Regular expressions are tools provided by scripting languages to identify and edit strings and sections of strings. They are more powerful than the usual string functions as they allow you to identify patterns rather than fixed strings, and then work with those patterns.

For example, a pattern might represent "any complete word (with no hyphens) that starts with R and ends with D". This pattern would match "red", "read", "removed", "realised" and many other words.

To do this kind of pattern matching, we work out what every matching string has in common and express that as a rule. In the above example, we are looking for a complete word, so it must be preceded by a space, and followed by a space or one of a small group of punctuation marks, such as a full stop or a comma. We also know that the word must start with the letter R, and finish with the letter D.

So we know that the pattern must represent: {space}R{unknown letters}D{space or punctuation}

Regular expressions use a range of special characters that allow us to translate this pattern into something that PHP will understand. (Note that Perl and other languages sometimes use variations of these characters, so if you are using regular expressions elsewhere, remember to check whether the pattern matching uses the same characters.)

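Some of the important characters to remember in regular expressions, including the ones used in this tutorial, are:

[a-z] matches any single character in the range a to z
[^a-z] matches any single character that is not in the range a to z
* matches zero or more of the item that comes before it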

There are others, which you can find at various advanced tutorials for PHP, but I won’t complicate matters here.

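Repeating the example above…
"{space}R" is represented by " R"
"D followed by space or punctuation" is represented by "D[^a-z0-9]" (which actually means D followed by anything that is not alphanumeric; the ^ must sit inside the brackets to negate them)
and "unknown letters" is represented by "[a-z]*" (note the * which means zero or more letters).

Our complete pattern, which begins with a space, is therefore:
" R[a-z]*D[^a-z0-9]"

Because the pattern is written with a capital R and D but lower-case ranges, it only matches words such as "red" when it is used case-insensitively.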

This of course doesn’t tell you if the match is a real word, but it will match real or pseudo-words in the context of a string of text or sentence.
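As a rough illustration, the pattern can be tried against a sentence. This is only a sketch under a few assumptions: it uses the preg family (preg_match_all) rather than ereg, because the ereg functions are not available in PHP 7 and later; the pattern is written as " (R[a-z]*D)[^a-z0-9]", with the ^ inside the brackets so that it negates them; the parentheses, the / delimiters, the i (case-insensitive) modifier and the sample sentence are additions made for this example.

```php
<?php

// Assumption: PCRE (preg_*) stands in for ereg, which PHP 7+ no longer provides.
// The parentheses capture just the word, without the leading space
// or the trailing space/punctuation character.
$pattern = '/ (R[a-z]*D)[^a-z0-9]/i';
$text = 'The red car was removed, then read about.';

// preg_match_all returns the number of full matches it found.
$count = preg_match_all($pattern, $text, $matches);

// $matches[1] holds the text captured by the first parenthesised group.
$words = $matches[1];

foreach ($words as $word) {
    echo "Matched word: $word\n";
}
// Prints:
// Matched word: red
// Matched word: removed
// Matched word: read
```

Note that each match consumes the character after the D, so two qualifying words separated by a single space (for example "red read") would only report the first of the pair.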

Now that we know how to put a simple regular expression together, we need to see how to use it.

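PHP has two families of regular expression functions, preg (Perl-compatible) and ereg (POSIX). We will use ereg here. Note that the ereg functions were deprecated in PHP 5.3 and removed in PHP 7, so the ereg examples in this tutorial only run on older versions of PHP.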

There are two main regular expression commands you need to know, each with a case-insensitive variation.

The commands are ereg and ereg_replace. Their variations are eregi and eregi_replace. The extra "i" means case insensitive, so our pattern could match capital letters, lower case, or a mixture.
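For comparison, the preg family has no separate case-insensitive functions; the same effect comes from adding an i modifier after the closing delimiter of the pattern. A minimal sketch, in which the pattern and test string are made up for illustration:

```php
<?php

// Without the i modifier the match is case sensitive.
// preg_match returns 1 when the pattern matches and 0 when it does not.
$caseSensitive = preg_match('/red/', 'RED');

// With the i modifier, upper and lower case letters are treated alike.
$caseInsensitive = preg_match('/red/i', 'RED');

echo "Case sensitive: $caseSensitive\n";     // Case sensitive: 0
echo "Case insensitive: $caseInsensitive\n"; // Case insensitive: 1
```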

The ereg command checks whether a needle pattern matches anywhere in a haystack string, and optionally stores the matched text in an array. Its syntax is:

<?php

ereg($needle, $haystack, $array);

?>

where $needle is the regular expression pattern,
$haystack is the string you are examining,
and $array is the array you want to create with the results.

If the pattern does not match, ereg returns FALSE. This is important because you can check whether a match was found by putting ereg in the condition of an if statement.

<?php

if (eregi($needle, $haystack, $array)) {
    // TRUE, so the matched text is available in $array
    foreach ($array as $value) {
        echo "This match is $value<br>";
    }
} else {
    // FALSE, so no match was found and $array holds no results
}

?>

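The code above checks for $needle in $haystack (with no case sensitivity). Note that only the first match is reported: $array[0] holds the matched text, and any further elements hold the parts matched by parenthesised groups in the pattern.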
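On PHP 7 and later, where eregi is not available, the same check can be sketched with preg_match. This is an illustration under stated assumptions rather than a drop-in replacement: the pattern needs delimiters (here /), the i modifier takes the place of the "i" in eregi, and the values of $needle and $haystack are hypothetical.

```php
<?php

// Hypothetical example values, not taken from the tutorial.
$needle = '/(R[a-z]*D)[^a-z0-9]/i';
$haystack = 'I read the page.';

// preg_match returns 1 on a match, so it can sit in an if condition like ereg.
if (preg_match($needle, $haystack, $matches)) {
    // $matches[0] is the whole matched text, including the trailing space.
    echo "Full match: '{$matches[0]}'\n";    // Full match: 'read '
    // $matches[1] is the part captured by the first pair of parentheses.
    echo "Captured word: '{$matches[1]}'\n"; // Captured word: 'read'
} else {
    echo "No match found\n";
}
```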

ereg_replace is slightly different. It will replace every pattern match with a pre-defined replacement string. This is similar to str_replace, conceptually.

The syntax is:

<?php

$newString = ereg_replace($needle, $replacement, $haystack);

?>

where $newString is the string that is the result,
$needle is the pattern to be matched,
$replacement is the string to replace any pattern matches with,
and $haystack is the original string before any replacements.

An example of its use is:

<?php

$num = '4';
$string = "This string has four words.";
$string = ereg_replace('four', $num, $string);
echo $string;  /* Output: 'This string has 4 words.' */

?>
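The same replacement can be sketched with preg_replace, the preg counterpart that is available on current PHP versions. The second call is an added illustration of replacing a pattern rather than a fixed string, which str_replace cannot do; its sample sentence is made up.

```php
<?php

$string = 'This string has four words.';

// Same argument order as ereg_replace: pattern, replacement, original string.
$result = preg_replace('/four/i', '4', $string);
echo $result . "\n"; // This string has 4 words.

// Replace every run of one or more digits with a # character.
$masked = preg_replace('/[0-9]+/', '#', 'Order 123 shipped in 4 days');
echo $masked . "\n"; // Order # shipped in # days
```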
