Archive for the Web Development Category

Using Regular Expression In PHP – The Basics

Regular expressions are a powerful tool for finding, examining and/or modifying text. With their general pattern notation, they work almost like a mini programming language that lets you describe and parse text. They enable you to search for patterns within a string and extract matches flexibly and precisely. However, note that because regular expressions are more powerful, they also carry added overhead and are slower than the more basic string functions. Consider carefully, and use regular expressions only when you have a particular need.

PHP supports two different types of regular expressions: POSIX-extended and Perl-Compatible Regular Expressions (PCRE). The PCRE functions are more commonly used, more powerful than the POSIX ones, and faster as well.

In a regular expression, most characters match only themselves. For instance, if you search for the regular expression “foo” in the string “only a fool does not use regular expressions” you get a match because “foo” occurs in that string. Some characters have a special meaning. For instance, the dollar sign ($) is used to match strings that end with the given pattern. Similarly, a caret (^) character at the beginning of a regular expression indicates that it must match the beginning of the string. Characters that match themselves are called literals while characters that have special meanings are called metacharacters. (more…)
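The difference between literals and the two anchor metacharacters can be tried out in a few lines. The following is a minimal sketch that uses Python's re module rather than PHP's PCRE functions (the pattern syntax for these basic cases is the same); the variable names are made up for illustration.

```python
import re

text = "only a fool does not use regular expressions"

# Literals match themselves anywhere in the string.
literal_match = re.search(r"foo", text) is not None

# The caret anchors the pattern at the beginning of the string.
starts_with_only = re.search(r"^only", text) is not None
starts_with_foo = re.search(r"^foo", text) is not None

# The dollar sign anchors the pattern at the end of the string.
ends_with_expressions = re.search(r"expressions$", text) is not None

print(literal_match, starts_with_only, starts_with_foo, ends_with_expressions)
```

Here "foo" is found inside "fool", but "^foo" fails because the string does not begin with "foo".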

Install And Use CouchDB With JSON And Map-Reduce

CouchDB is another offspring of the open-source, NoSQL, non-relational database family and is maintained under the Apache Foundation. It differs from the likes of MongoDB or Cassandra in that CouchDB stores data in so-called “documents” in JSON format, which can hold hashes, lists, nested arrays and of course scalar values. This added complexity results in more powerful features, mainly a db that is more than just single key/value pairs, but it comes at the price of reduced speed.

CouchDB can be a bit of a pain to install, because it needs a few prerequisites and they in turn have a few quirks of their own. This outline should help you get CouchDB installed with all its necessities. We’ve used MacOS, but you can substitute your OS where applicable.

The CouchDB source code and installer packages are downloadable here. As of this writing, version 0.11.2 is the latest stable version, with 1.0.1 just around the corner. You will also need SpiderMonkey, Mozilla’s C implementation of JavaScript.

Installing SpiderMonkey

Once downloaded, extract the tarball and move into the sources folder:

tar -xzvf js-1.8.0-rc1.tar.gz

cd js/src

(more…)

Enable Gzip Compression In Apache And Optimize Page Load

HTML, CSS and JavaScript compression is a simple and effective way to save bandwidth and speed up page load on your site. It’s often overlooked and yet simple to implement: just enable Gzip compression for the right document types and enhance your site’s user experience.

In Apache, we achieve this by enabling content encoding. When a user requests a file like http://www.msn.com/index.html, the browser talks to a web server, and the conversation goes something like this:

1. Browser: GET me /index.html
2. Server: Found index.html, it’s 187KB! Response code is 200 (200 OK), requested file is coming
3. Browser: 187KB, loading

The actual headers and protocols sent between the two are much more formal (monitor them with HTTPFox, Live HTTP Headers or others if you like).

Now, everything worked just fine in our little scenario: the browser got its requested file (index.html) of 187KB in size and probably also loaded include files such as JavaScript, CSS and a bunch of images. Especially on lower-speed Internet connections, it took a few seconds for the page to load all files.

That’s where compression comes in. HTML, JavaScript and CSS are plain text and are ideal candidates for compression, while images and videos have usually already gone through a compression algorithm when saved as jpg, png, avi or flv. Compressing those files would provide little to no improvement in file size and add processing overhead to the web server. So we want to ensure we exclude those. (more…)
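To get a feel for why text is worth compressing and already-compressed media is not, here is a small sketch using Python's standard gzip module. The HTML snippet is made up, and random bytes are used as a stand-in for an already-compressed image or video, which is an assumption made purely for illustration.

```python
import gzip
import os


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


# Repetitive markup, similar in spirit to a real HTML page (hypothetical content).
html = b"<div class='item'><p>Hello, world!</p></div>\n" * 500

# Random bytes behave much like data that has already been compressed.
already_compressed = os.urandom(len(html))

text_ratio = compression_ratio(html)
binary_ratio = compression_ratio(already_compressed)

print(f"text:   {text_ratio:.3f} of original size")
print(f"binary: {binary_ratio:.3f} of original size")
```

The text shrinks to a small fraction of its size, while the random data stays about the same size or even grows slightly, so the CPU time spent on it is wasted.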

Redis High Speed Storage Or Cache System

NoSQL databases are all the hype, with MongoDB and CouchDB at the forefront, while Memcache has found a place in many high-load web applications during the past few years. Each of these applications has its own, very specific characteristics. MongoDB finds its use where single key-value pairs are not sufficient, but adds slight overhead and complexity with its hash-table-like multi-field storage architecture. Memcache is an ideal candidate where a single key-value pair storage engine is sufficient.

And then there is Redis, the new kid on the block. Redis is a high speed storage or cache system, much like Memcache on steroids. Redis writes data into memory, which makes it really fast. And in contrast to Memcache, it periodically writes data to disk, depending on the amount of data that has changed. Redis is said to be able to handle in excess of 10,000 reads/writes per second! (more…)

7 Tips To Improve Web Page Load Time

With the increasing focus on Google’s Site Speed Algorithm, the following are 7 tips to improve web page load time: proven techniques that well-known websites use to boost their site speed.

1. Enable Gzip Compression

While compressing pages adds just a tad to your web server’s overhead, it will reduce bandwidth and transmission time and make pages appear to load faster for your users. Gzip is an open source compression algorithm that can be used to compress the content of your website before your web server sends data to a client browser. You can learn how to enable Gzip in Apache here.

2. Minify Javascript/CSS

Minify is the process (and software) of removing unnecessary formatting characters and white space from JavaScript and CSS code. The result is smaller files, faster transmission and quicker page loads. You can learn all about minifying JavaScript here. (more…)
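To illustrate the idea, here is a deliberately naive CSS minifier sketched in Python. It is not how the Minify software works internally; the function name is hypothetical, and the approach would break on edge cases such as string values containing braces or selectors that rely on a space before a colon. For real projects, use a dedicated minifier.

```python
import re


def minify_css(css: str) -> str:
    """Naively strip comments and unnecessary whitespace from CSS."""
    # Remove /* ... */ comments, including multi-line ones.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    # Collapse every run of whitespace into a single space.
    css = re.sub(r"\s+", " ", css)
    # Drop spaces around punctuation where CSS usually does not need them.
    css = re.sub(r"\s*([{}:;,])\s*", r"\1", css)
    return css.strip()


source = """
body {
    color: red;   /* main text color */
    margin: 0;
}
"""

minified = minify_css(source)
print(minified)
print(len(source), "->", len(minified), "characters")
```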

Install PHP 5.2 on Ubuntu 10.04 Lucid Lynx

There are many reasons to install PHP 5.2 on Ubuntu 10.04 Lucid Lynx; the most prominent is that many web packages are not yet compatible with PHP 5.3, Drupal 6 being a prime example.

But there is no automated method out of the box, and there are now several scripts floating around the Internet that may work or just partially work. A major concern is the ability to update, easily switch to PHP 5.3 when the application is ready and also easily add / remove extensions.

With these considerations in mind, this is the best way to install PHP 5.2 on Ubuntu 10.04 Lucid Lynx:

We will install PHP 5.2 using apt-get, pulling the packages from Ralph Janke’s repository. (more…)

Best New Chrome Extensions For Developers

As the popularity of Google’s Chrome browser increases, so does the demand for extensions. This is especially true for developers, as we are used to such a nice variety of high quality tools in Firefox. And the more tools are available for a browser, the more that browser is used in developing web sites, resulting in a higher quality web experience for end users on that platform. By the way, did you know that Google Chrome surpasses Safari in the US browser market (macstories.net)? The following are arguably the best new Chrome extensions for developers.

6. BuiltWith Technology Profiler

BuiltWith is a web site profiler tool that returns all the technologies it can find on a particular page. The BuiltWith extension helps developers, researchers and designers find out what technologies web pages are using, and in turn helps them decide what technologies to implement. BuiltWith currently tracks widgets (snap preview), analytics (Google, Nielsen), frameworks (.NET, Java), publishing (WordPress, Blogger), advertising (DoubleClick, AdSense), CDNs (Amazon S3, Limelight), standards (XHTML, RSS), hosting software (Apache, IIS, CentOS, Debian).

5. Session Manager

With Session Manager you can quickly save your current browser state and reload it whenever necessary. You can manage multiple sessions, and rename or remove them from the session library. Each session remembers the state of the browser at its creation time, i.e. the opened tabs and windows. Once a session is opened, the browser is restored to that state.

(more…)

Testing Memcached Using Telnet Commands

Troubleshooting memcached is not as transparent as with some other technologies, but testing memcached using telnet commands can give us quite some insight into what’s happening under the hood.

Following is a short list of useful commands to inspect a running memcached instance.

How to find the IP address and port to connect:

ps aux | grep memcached will give us the process running memcached, with its listening IP address and port. If this command does not yield any results, you are likely not running the daemon and need to start it up first.

We can now connect using this info:

telnet 127.0.0.1 11211 (replace your IP address and port)

Supported Commands:

The following is a list of the most important memcached commands. For a more complete list of supported commands, check out the memcached wiki document. (more…)
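The same text protocol you type into telnet can also be spoken from a script. The sketch below, in Python, sends memcached's stats command over a plain socket and parses the "STAT name value" lines that come back. The function names are hypothetical, the host and port are the defaults used above, and the sample response at the bottom is made-up data so the parser can be tried without a running daemon.

```python
import socket


def parse_stats(raw: str) -> dict:
    """Turn 'STAT name value' lines into a dict, ignoring everything else."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host: str = "127.0.0.1", port: int = 11211, timeout: float = 2.0) -> dict:
    """Connect to a running memcached instance and return its stats."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        buffer = b""
        # The server terminates the stats listing with a line containing END.
        while not buffer.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            buffer += chunk
    return parse_stats(buffer.decode("ascii", errors="replace"))


# Made-up sample response, in the format memcached uses for stats output.
sample = "STAT pid 4242\r\nSTAT curr_items 3\r\nEND\r\n"
parsed = parse_stats(sample)
print(parsed)
```

With a daemon running locally, calling fetch_stats() should return the same kind of dictionary with live values.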

How to find duplicate rows in a MySQL database table

I’ve been asked the question “How can I return duplicate rows only from a MySQL db table” so many times already, that I’ve decided to post it here in a short article.

It is not something intuitive or readily available (at least it seems), but the solution is short and very simple.

While this query:

SELECT DISTINCT column1
FROM table1

gives us all records without the duplicates, this one returns only the duplicate ones:

SELECT DISTINCT column1
FROM table1
GROUP BY column1
HAVING COUNT(column1) > 1

And by increasing the number in the HAVING clause, you can retrieve only the records that occur more often, e.g. more than twice.
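If you want to try the query without a MySQL server at hand, the following sketch runs it against an in-memory SQLite database from Python. SQLite is only a stand-in here, the sample values are made up, and the helper function name is hypothetical; the GROUP BY / HAVING part is the same as in the MySQL query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT)")
conn.executemany(
    "INSERT INTO table1 (column1) VALUES (?)",
    [("a",), ("b",), ("b",), ("c",), ("c",), ("c",)],
)


def values_seen_more_than(connection, threshold):
    """Return the column1 values that occur more than `threshold` times."""
    rows = connection.execute(
        "SELECT column1 FROM table1 "
        "GROUP BY column1 "
        "HAVING COUNT(column1) > ? "
        "ORDER BY column1",
        (threshold,),
    )
    return [row[0] for row in rows]


duplicates = values_seen_more_than(conn, 1)  # values stored at least twice
triples = values_seen_more_than(conn, 2)     # values stored at least three times
print(duplicates, triples)
```

Because GROUP BY already returns one row per value, the DISTINCT keyword is not needed in this variant.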




Track And Parse Twitter Messages Stream

Ever wanted to listen to, track and parse tweets from an individual user’s Twitter stream? It’s quite an easy task if you know the right URLs to parse. Curl is a great little tool for fetching sources from almost any web resource, and it has roots in the Linux command line and PHP, just to name a couple.

Let’s look at how this would work by pulling CNN’s breaking news feed as JSON:

curl http://twitter.com/status/user_timeline/cnnbrk.json

or XML:

curl http://twitter.com/status/user_timeline/cnnbrk.xml

That will give you the last 20 tweets in either JSON or XML format.

An even more interesting option is to get messages where you have been mentioned. It’s a bit more complicated, as we need to supply login credentials, but no rocket science either:

curl -u "username:password" http://www.twitter.com/statuses/mentions.json (or .xml if you prefer)

That’ll give you the last few tweets mentioning the user supplied with the curl command. Now all you have to figure out is how to parse the messages and plug them into your application.
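As a starting point for the parsing step, here is a minimal Python sketch that reads a JSON timeline and pulls out the author and text of each message. The sample payload is invented for illustration, and the field names it relies on (text, user, screen_name) are an assumption about the shape of the response; check the output of the curl commands above before depending on them.

```python
import json

# Invented sample data shaped like a list of tweets; not real API output.
sample_response = """
[
  {"text": "Example headline one", "user": {"screen_name": "cnnbrk"}},
  {"text": "Example headline two", "user": {"screen_name": "cnnbrk"}}
]
"""


def extract_tweets(payload: str) -> list:
    """Return (screen_name, text) pairs from a JSON array of tweets."""
    tweets = json.loads(payload)
    return [(tweet["user"]["screen_name"], tweet["text"]) for tweet in tweets]


messages = extract_tweets(sample_response)
for author, text in messages:
    print(f"@{author}: {text}")
```

In a real application, the payload would be the body returned by curl (or an HTTP library) instead of the hard-coded string.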

How To Install Memcache And PHP Client On Mac Snow Leopard

I recently installed the memcached daemon on my MacBook Pro, including the necessary PHP client for development purposes. I just prefer to work locally instead of using a VM running Linux. And the process is actually quite simple and straightforward. Please note, I have included both clients, the old standard one and the newer PECL extension, because I deal with different applications, and also because lots of people seem to get confused when they install one version and their memcache classes cannot be instantiated and throw errors. So, if in doubt, just install both.

These are the five (four if you know which extension you want) components needed:

- libevent (required library for memcached)

- memcached daemon

- libmemcached (required library for the php client)

- php extension (standard)

- php extension (PECL)

Now open your terminal and off we go: (more…)

What’s new in Google’s Search update – code named Caffeine

Google’s search algorithm update, codenamed “Google Caffeine”, is set to launch in the coming days. Google Caffeine is said to stir up the Webmaster / SEO world with a slew of algorithm updates. Most notably, it will be much quicker than the current Google search and rely more on keyword strings in the page content, and it will make real-time search much more important.

The new focus on real-time search results means that sites that consistently ranked high will take a hit. This will give smart webmasters who understand the dynamics of real-time social media an excellent opportunity to markedly boost their Google rankings.

Here is a summary of what we know about Google Caffeine and its effect on SEO:

Page load speed will be important

The time it takes for a webpage to load is now important to rank high on Google. Studies have shown that improving page load speed results in improved user retention and increased conversions.

Check out Google’s Site Performance diagnostic tool in their Google Webmaster Accounts. Basically, it boils down to:

  1. Writing lean HTML and CSS code
  2. More intelligent use of JavaScript
  3. Compressing HTML, CSS and JavaScript
  4. Improving browser page caching

(more…)

Moonlight 2 implements Silverlight 2 on Linux

One of the intricate aspects of open source software is in implementing support — where it is even possible to do so — for the wide realm of codecs, formats, and a plethora of proprietary technologies that users have come to rely on. One such technology is Microsoft’s Silverlight framework, which until early this year was not available to Linux users.

This changed in January, when the first version of the Moonlight Project was released, providing Linux users with Open Source Silverlight support. Also included, provided that Moonlight has been obtained via Novell and meets certain other conditions, is a license to Microsoft’s free but closed-source Media Pack, containing codecs needed to decode audio and video streams. (more…)

How To Add PHP Mcrypt Module On Snow Leopard 10.6

These instructions assume that you have a working Apache 2.2 / PHP 5.3 in place and want to add the php mcrypt module. It will work as a fresh install, but keep in mind that additional configuration steps after these instructions are necessary to get your webserver working properly. Those additional steps are omitted here, as there are countless resources available on the Internet.

1. If you don’t already have the mcrypt module (in /usr/lib), download the libmcrypt source code from sourceforge here. Then extract the downloaded file in a Terminal and move inside the created directory. (cd libmcrypt-2.5.8 in my case)

2. Execute the following lines in one command in your Terminal, if you have a 64bit version of Apache/PHP:

CFLAGS="-arch x86_64" \
CXXFLAGS="-arch x86_64" \
./configure --disable-posix-threads

(for 32 bit versions: ./configure --disable-posix-threads)

(more…)

MySQL – How To Analyze, Repair and Optimize all Tables

Ever come across a situation, where you’d like to check all tables in a database and have them all repaired and optimized? My guess is yes.

In case you didn’t know, there is a helpful MySQL utility called mysqlcheck, available as of version 3.23.38. It does exactly what we need.

To check all tables in all databases for corruption and errors and also fix them in one go, this is your command:

mysqlcheck -u username -p --check --optimize --auto-repair --all-databases

mysqlcheck executes statements like CHECK TABLE, REPAIR TABLE, ANALYZE TABLE, and OPTIMIZE TABLE and chooses the best statements for any given operation and storage engine.

Note that the operations complete a lot faster if you can afford to disable any external services, especially if your database is large.
