Replace eregi with preg_match

A couple of weeks ago, I wrote a small PHP code to validate if the url being passed is in fact valid. I used a regular expression and PHP’s eregi function to find a pattern match. Looking back, the problem is, the eregi function is deprecated in PHP 5.3.0.

For compatibility going forward, we need to replace eregi with the preg_match function which is better suited for regular expression matches. We will use the same example as last time. As you recall, the function was:

Code With eregi

$urlregex = "^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*$";
 
if (eregi($urlregex, $_POST['url'])) {
//  url is valid. shorten long url
} else {
// reject. url is invalid
}

Code with preg_match

$urlregex = "^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*$";
 
if(!preg_match($urlregex, $_POST['url']))
// reject. url is invalid //
} else {
url is valid. shorten long url
}

Here, we’ve sucessfully replaced the eregi function with preg_match.

Regex For URL Validation

I just wanted to share a regular expression that works for me. I was just trying to shorten a URL from Nike’s website using my own URL shortener. There was a slight problem. My script was rejecting the long URL because it was not in the right format, but there was nothing wrong with Nike’s URL. My regex was faulty. Time for a change. So, I searched online for a better regex to work with my URL shortener script. I found this code from the regexlib.com library that works for me.

RegEx

^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\’\\\+&%\$#\=~_\-]+))*$

It’s a handful. All in one line by the way.

URL Validation

Assuming we are using a HTML form to submit the long URL, here’s how to validate the URL with the help of PHP’s eregi function. The eregi function is a case insensitive regular expression matching function. If the posted URL matches the regular expression, then the statement is true. We can then shorten the URL. If the URL is invalid, we reject the URL and send out a message saying the URL is not in the correct format.

$urlregex = “^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|me|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\’\\\+&%\$#\=~_\-]+))*$”;

if (eregi($urlregex, $_POST[‘url’])) {
//  url is valid. shorten long url
} else {
// reject. url is invalid
}

Matches

http://www.sysrage.net | https://64.81.85.161/site/file.php?cow=moo’s | ftp://user:pass@host.com:123 | http://www.sysrage.net | https://64.81.85.161/site/file.php?cow=moo’s | ftp://user:pass@host.com:123

One last thing, You can add more TLDs and country code to the regular expression. I added the .me domain since one of my domains is a .me.