Using PHP and Regular Expressions to Tidy Up Variables

I recently had to develop a very simple email manager for a client. It needed to extract some text from a database and insert that text into an email message, which was then sent to a mailing list.

The problem was that the text contained HTML tags as well as HTML character entities (for example, nbsp;), and I only wanted plain text in the email. (Please note that for display reasons in this article, I've omitted the leading ampersand from the HTML character.) I was able to use the PHP strip_tags function to remove the HTML tags (see below), but this still left several HTML characters in the text. A regular expression solved the problem.

Here is the bit of code I used to clean up the contents of the variable:

    // Get rid of HTML tags
    $contents = strip_tags($contents);

    // Get rid of non-breaking spaces
    $pattern = '/nbsp;/';
    $replacement = ' ';
    $contents = preg_replace($pattern, $replacement, $contents);

When I extracted the piece of text from the database, I placed it in a variable called $contents. I then ran the PHP strip_tags function on the variable to get rid of the HTML tags.

Next we have the bit of code that includes the regular expression. $pattern contains the HTML character we want to search for, wrapped in the forward-slash delimiters that preg_replace expects. Here, $pattern contains nbsp;, which is the HTML character for a non-breaking space. I needed to replace it with a normal space because it looked a bit strange in the email message. For example, I needed to change:

    'thisnbsp;week'snbsp;specialnbsp;offernbsp;is...'

to:

    'this week's special offer is...'

$replacement contains a blank space, which is what I want to replace nbsp; with. The last line in the bit of code calls preg_replace, which applies the regular expression: it searches $contents for every match of $pattern, replaces each one with $replacement, and stores the result back in $contents.
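The cleanup described above targets a single HTML character. As a rough sketch of how the same idea might be extended to other HTML characters, the example below decodes them with PHP's html_entity_decode function before replacing the non-breaking space. This is an alternative to the approach in the article, not the code that was used for the client; the function name html_to_plain_text is hypothetical, and the sketch assumes the text is UTF-8 encoded.

```php
<?php
// Hypothetical helper (not from the original article): turns a fragment of
// HTML into plain text. Assumes the input is UTF-8 encoded.
function html_to_plain_text(string $html): string
{
    // Get rid of HTML tags, as in the article.
    $text = strip_tags($html);

    // Decode HTML characters (including the non-breaking space and quotes)
    // into the actual characters they stand for.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // A decoded non-breaking space is the character U+00A0.
    // Replace it with a normal space; the /u modifier enables UTF-8 matching.
    $cleaned = preg_replace('/\x{00A0}/u', ' ', $text);

    // preg_replace returns null on failure (for example, invalid UTF-8),
    // so fall back to the decoded text in that case.
    return $cleaned ?? $text;
}

// Example usage with a made-up snippet of HTML.
echo html_to_plain_text('<p>this&nbsp;week&#039;s&nbsp;special&nbsp;offer&nbsp;is...</p>');
```

Decoding first means that other HTML characters, such as an encoded ampersand or quotation mark, are also turned into plain text rather than each needing its own pattern.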