Regular Expressions


Regular Expressions with PHP:

Regular expressions are a powerful tool for matching a string against a defined set of rules. You can also match and replace part of a given string; replacing is optional.

In short, Regular Expressions are very useful in:

  1. Validating forms and end results.
  2. Searching for a particular string or value.
  3. Searching for and replacing a particular string or value.
  4. Getting precise results.

 

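PHP has supported two different types of regular expressions: POSIX-extended and Perl-Compatible Regular Expressions (PCRE). The PCRE functions are more powerful than the POSIX ones, and faster too. Note that the POSIX (ereg) functions were deprecated in PHP 5.3 and removed in PHP 7.0, so new code should use the PCRE (preg_) functions.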

 

For example:

If you search for the regular expression “dance” in the string “Sam dances to the street beats!”, you get a match because “dance” occurs in that string, so the search reports success.
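
As a minimal sketch of this search in PHP, the snippet below uses preg_match() from the PCRE family. The variable names are made up for illustration, and the slashes around the pattern are the delimiters that PCRE functions require.

```php
<?php
// Hypothetical variable names; the pattern and subject come from the example above.
$subject = "Sam dances to the street beats!";

// PCRE patterns need delimiters; slashes are used here.
$found = preg_match('/dance/', $subject);

// preg_match() returns 1 on a match, 0 on no match, and false on error.
echo $found === 1 ? "Match found\n" : "No match\n";
```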

 

Let’s now move on to the more advanced part. Many special characters give a different meaning to a regular expression. They let us match a string using wildcards, and we can mix and match them to describe almost any string we like.

 

The characters that match themselves are called literals. The characters that have special meanings are called metacharacters.

 

Let us understand them better:

  1. caret (^): A caret at the beginning of a regular expression means the pattern must match at the beginning of the string.
  2. dollar sign ($): A dollar sign at the end of a regular expression means the pattern must match at the end of the string.
  3. dot (.): A dot matches any single character (by default, any character except a newline).
  4. vertical pipe (|): A vertical pipe separates alternatives in a regular expression. It's just like an ‘or’ condition.
  5. backslash (\): To match a metacharacter literally in a pattern, escape it with a backslash.
  6. plus (+): The plus sign means match one or more of the preceding expression.
  7. asterisk (*): The asterisk means match zero or more of the preceding expression.
  8. question mark (?): The question mark means match zero or one of the preceding expression.
  9. curly braces ({}): Curly braces serve more than one purpose:

  • {x}: match exactly x occurrences of the preceding expression.
  • {x,}: match x or more occurrences of the preceding expression.
  • {x,y}: match at least x but no more than y occurrences of the preceding expression.
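
The metacharacters above can be tried out with a small table of test cases. This is only a sketch: the sample patterns and subjects are invented for illustration, and each case lists the result one would expect from preg_match() (1 for a match, 0 for no match).

```php
<?php
// Each entry: pattern, subject, expected preg_match() result. All samples are made up.
$cases = array(
    array('/^Sam/',          'Sam dances',   1), // ^ anchors at the start
    array('/beats$/',        'street beats', 1), // $ anchors at the end
    array('/d.nce/',         'dunce',        1), // . matches any single character
    array('/cat|dog/',       'hot dog',      1), // | means "or"
    array('/3\.14/',         '3x14',         0), // \. matches only a literal dot
    array('/^ab+c$/',        'ac',           0), // + needs at least one b
    array('/^ab*c$/',        'ac',           1), // * allows zero b's
    array('/^colou?r$/',     'color',        1), // ? makes the u optional
    array('/^[0-9]{2,4}$/',  '12345',        0), // {2,4} allows 2 to 4 digits only
);

$results = array();
foreach ($cases as list($pattern, $subject, $expected)) {
    $result = preg_match($pattern, $subject);
    $results[] = $result;
    printf(
        "%-16s %-14s %-9s (expected %s)\n",
        $pattern,
        $subject,
        $result === 1 ? 'match' : 'no match',
        $expected === 1 ? 'match' : 'no match'
    );
}
```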

 

 

 

We can group the characters inside a pattern like this:

  • Normal characters which match themselves, like hello
  • Start and end indicators: ^ and $
  • Count indicators like +, *, ? and {}
  • Logical operators like |
  • Grouping with () and character classes with []

 

Impressive!!!!!

So now you are ready to mix and match some regular expressions. Let's start with some simple examples:

 

Regular Expression   Meaning

 A                   Matches a string containing an uppercase A
 a                   Matches a string containing a lowercase a
 [a-z]               Matches a string containing any lowercase letter from a to z
 [0-9]               Matches a string containing any digit from 0 to 9
 ^[a-z]              Matches a string that starts with a lowercase letter from a to z
 [a-z]$              Matches a string that ends with a lowercase letter from a to z
 [a-z]+              Matches a run of one or more lowercase letters, so the string must contain at least one
 [a-z]?              Matches zero or one lowercase letter, so it also succeeds (with an empty match) when there is none
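
To see how the table behaves in practice, here is a short sketch that runs several of these patterns through preg_match(). The sample subjects and variable names are assumptions chosen for illustration.

```php
<?php
// Sample subjects are hypothetical; each comment states what the pattern checks.
$anyLower    = preg_match('/[a-z]/', 'HELLO');    // no lowercase letter anywhere
$someLower   = preg_match('/[a-z]/', 'Hello');    // "e" is lowercase
$startsLower = preg_match('/^[a-z]/', 'Hello');   // first character is uppercase
$endsLower   = preg_match('/[a-z]$/', 'Hello');   // last character is lowercase
$hasDigit    = preg_match('/[0-9]/', 'Room 42');  // contains a digit

// + grabs the first whole run of lowercase letters.
preg_match('/[a-z]+/', 'ABC def GHI', $run);

// ? also succeeds on zero letters, so the match here is an empty string.
$optional = preg_match('/[a-z]?/', 'HELLO', $maybe);

var_dump($anyLower, $someLower, $startsLower, $endsLower, $hasDigit, $run[0], $optional, $maybe[0]);
```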

 

Programming PHP and Regular Expression

PHP provides functions to find matches in text, to replace each match with other text (a la search and replace), and to find matches among the elements of a list. The functions are:

  • preg_match()
  • preg_match_all()
  • preg_replace()
  • preg_replace_callback()
  • preg_grep()
  • preg_split()
  • preg_last_error()
  • preg_quote()
[Note: We will take the examples based on preg_match() function.]
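
Before focusing on preg_match(), here is a brief sketch of what the other functions in the list do. The sample text and variable names are invented for illustration; the comments describe the values each call should produce.

```php
<?php
$text = "apple, banana;cherry";

// preg_match_all(): find every match and return how many were found.
$count = preg_match_all('/[a-z]+/', $text, $all);   // $all[0] holds the three words

// preg_replace(): replace each separator with " - ".
$dashed = preg_replace('/[,;] */', ' - ', $text);

// preg_split(): split the text on the same separators.
$parts = preg_split('/[,;] */', $text);

// preg_grep(): keep only the array elements that match (keys are preserved).
$startsWithA = preg_grep('/^a/', $parts);

// preg_replace_callback(): compute each replacement in a function.
$capitalized = preg_replace_callback('/[a-z]+/', function ($m) {
    return ucfirst($m[0]);
}, $text);

// preg_quote(): escape metacharacters so user text can be used literally in a pattern.
$quoted = preg_quote('3.14', '/');

// preg_last_error(): check whether the last PCRE call hit an execution error.
$noError = (preg_last_error() === PREG_NO_ERROR);

print_r(array($count, $dashed, $parts, $startsWithA, $capitalized, $quoted, $noError));
```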

Let us first understand the basic meaning and usage of preg_match:

preg_match() performs a regular expression match (PHP 4, PHP 5, and later).

Syntax:

int preg_match ( string pattern, string subject [, array &matches [, int flags [, int offset]]] )

Searches subject for a match to the regular expression given in pattern. It returns 1 if the pattern matches, 0 if it does not, and FALSE if an error occurred.

Let us go a little deeper in understanding the parameters passed:

pattern

         The pattern to search for, as a string.

subject

        The input string.

matches

        If matches is provided, it is filled with the results of the search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

flags

       flags can be the following flag:

       PREG_OFFSET_CAPTURE

       If this flag is passed, the string offset of every match is returned along with the match itself.

       Note that this changes the value of matches into an array where every element is an array consisting of the matched string at index 0 and its string offset into subject at index 1.

offset

        Normally, the search starts from the beginning of the subject string.

        The optional parameter offset can be used to specify the alternate place from which to start the search (in bytes).
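
To make the matches and flags parameters more concrete, here is a minimal sketch with two parenthesized subpatterns. The subject string and variable names are assumptions made up for this illustration.

```php
<?php
$subject = "Release date: 2024-05-17";
$pattern = '/([0-9]{4})-([0-9]{2})/';   // group 1: year, group 2: month

// Without flags, $matches holds plain strings.
preg_match($pattern, $subject, $matches);
// $matches[0] is the full match, $matches[1] and $matches[2] are the groups.

// With PREG_OFFSET_CAPTURE, every element becomes array(text, byte offset).
preg_match($pattern, $subject, $withOffsets, PREG_OFFSET_CAPTURE);

print_r($matches);
print_r($withOffsets);
```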

 

Now let’s look at a small example that uses the flags and offset parameters:

<?php
$subject = "abcdef";
$pattern = '/^def/';

// Start searching at byte offset 3 (the "d") and capture match offsets.
preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3);

print_r($matches);
?>

 

The above example will output:

Array
(
)

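The array is empty because ^ still anchors at the real start of the subject string, not at the offset, so the pattern cannot match from position 3. While this example, which passes substr($subject, 3) as the subject instead,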

 

<?php
$subject = "abcdef";
$pattern = '/^def/';

// Search only the substring "def", so ^ anchors at the "d".
preg_match($pattern, substr($subject, 3), $matches, PREG_OFFSET_CAPTURE);

print_r($matches);
?>

 

Will produce

Array
(
   [0] => Array
       (
           [0] => def
           [1] => 0
       ) 

)
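
As a complement to the two examples above, the sketch below shows one way to keep the offset parameter and still anchor at the starting position: the \G assertion, which matches at the offset where the search begins. The variable names mirror the examples above; note that the reported offset is relative to the full subject.

```php
<?php
$subject = "abcdef";

// \G asserts "the position where this search started", which is byte 3 here.
$pattern = '/\Gdef/';

$result = preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3);

// The match is "def", found at offset 3 of the original subject.
print_r($matches);
```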

 


 

There is another surprise for all of you!

We have all faced difficulties in implementing or understanding complex regular expressions, and have had to test them over and over again to get them right.

So, to lessen the burden, I have written a script that tests a regular expression against a string and returns a Boolean (true / false) along with the first occurrence of the match.

It has really helped me sharpen my regular expression skills and techniques.

So do try it out.

Also, share your own tips and techniques for regular expressions and PHP.

Requirements:

  • Any server having PHP installed
  • That’s it!!!!

You can find the script here:

regular_expression_php
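
The linked script is not reproduced here. As a rough illustration of the same idea, the sketch below tests a pattern against a string and reports whether it matched, along with the first occurrence. The function name test_regex() and the shape of its return value are hypothetical, not taken from the author's script.

```php
<?php
/**
 * Hypothetical helper, not the author's script: tests $pattern against $subject.
 * Returns whether the pattern compiled, whether it matched, and the first match.
 */
function test_regex($pattern, $subject)
{
    // @ hides the warning PHP raises for a malformed pattern; false signals the error.
    $result = @preg_match($pattern, $subject, $matches);

    if ($result === false) {
        return array('valid' => false, 'matched' => false, 'first' => null);
    }

    return array(
        'valid'   => true,
        'matched' => $result === 1,
        'first'   => $result === 1 ? $matches[0] : null,
    );
}

$good   = test_regex('/dance/', 'Sam dances to the street beats!');
$miss   = test_regex('/waltz/', 'Sam dances to the street beats!');
$broken = test_regex('/[a-z/', 'unterminated character class');

print_r($good);
print_r($miss);
print_r($broken);
```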

 

Preview: Regular Expression and Preg Match

Secure PHP Applications (PHP Security)


To understand PHP security better, let us first understand what security and PHP are.

Security is a process, not a product, and adopting a sound approach to security during the process of application development will allow you to produce tighter, more robust code.

PHP (PHP: Hypertext Preprocessor) is a scripting language used to create dynamic Web pages. With syntax borrowed from C, Java and Perl, PHP code is embedded within HTML pages for server-side execution. It is commonly used to extract data from a database and present it on a Web page.

PHP is a powerful scripting language for building web applications, but carelessly written PHP scripts are also one of the easiest ways for attackers to gain access to your web server. Developers need to understand how their scripts can be exploited in order to protect them.

PHP is widely used in many high-end applications, which may be web-based (Internet) or intranet applications. As a rough estimate, about 80% of PHP use is in web-based (Internet) applications and about 20% in intranet applications.

IBM has suggested a few basic principles that we can follow to make our website secure and guard our application against vulnerabilities:

  1. Validate input
  2. Guard your file system
  3. Guard your database
  4. Guard your session data
  5. Guard against Cross-Site Scripting (XSS) vulnerabilities
  6. Verify form posts
  7. Protect against Cross-Site Request Forgeries (CSRF)
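
As a small illustration of the first and fifth principles, the sketch below validates input against a whitelist pattern with preg_match() and escapes output with htmlspecialchars(). The helper names and the username rule (3 to 16 letters, digits, or underscores) are assumptions made up for this example, not rules taken from the IBM material.

```php
<?php
// Hypothetical helpers; the username rule is an assumed example policy.
function is_valid_username($input)
{
    // Whitelist: 3 to 16 letters, digits, or underscores and nothing else.
    // The D modifier stops $ from matching before a trailing newline.
    return is_string($input)
        && preg_match('/^[A-Za-z0-9_]{3,16}$/D', $input) === 1;
}

function escape_html($text)
{
    // Convert <, >, &, and both quote styles so the text cannot inject markup.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$okName   = is_valid_username('sam_01');
$badName  = is_valid_username('<script>');
$newline  = is_valid_username("sam\n");
$safeText = escape_html('<b>"hi"</b>');

var_dump($okName, $badName, $newline, $safeText);
```

Validation rejects unexpected input early, while escaping protects the page even if some unexpected text does get through; the two are meant to be used together.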