John Esposito edits Refcardz at DZone, while writing a dissertation on ancient Greek philosophy and raising two cats. In a previous life he was a database developer and network administrator. John is a DZone Zone Leader and has posted 320 posts at DZone.

Writing a Compiler -- in PHP?

11.23.2011

Coders don't always write compilers. But even when they don't, they do often like to learn about compilers, and maybe even play with bits of one.

One of the standard introductions is the 'dragon book' -- very comprehensive, but very theoretical, and not entirely up to date. (Yes, I own it, and I will read it... someday.) It's a serious textbook -- but textbooks can be tedious and inefficient, especially apart from a full course of study (though Stanford does offer extensive materials from a number of complete courses that use the dragon book).

In any case, because most developers won't be writing a compiler, the majority of developer interest in compilers is probably more playful than the dragon will permit. You don't normally worry about lexing and parsing and so forth. Normally you just write fairly high-level code, and fiddle until the compiler is cool with it. (You can tell that I don't write much assembly.)

But what if you do want to play with some of the parts of a compiler? Or maybe even create a domain-specific language to simplify common programming tasks for your particular projects?

Then, first, you'll want to use a very familiar language -- otherwise you're playing two games at once, neither really helping the other very much. And, second, you'll want to work with some of the very basic elements of compiling before anything else -- recognizing things (tokens) and groups of things (phrases built from tokens). Or lexing and parsing, in formal programming terms.

Terence Parr takes this easy, stepwise approach in his book Language Implementation Patterns. As a result, Parr's book reads to me a bit like Kernighan and Ritchie's classic The C Programming Language: everything makes sense as you do it, and you never stop making incremental progress.

Sameer Borate recently worked through the first chapter and created a simple lexer and parser in PHP. This seems like an ideal first step into compiler design: plenty of people know PHP, and lexers and parsers are fundamental -- plus PHP syntax is pretty clear and intuitive, so the implementation language features don't get in the way. 

You should read the full article, which isn't long (and is mostly code snippets), but here's a teaser: the script that drives Sameer's lexer for a programming language consisting of a simple list:

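<?php

// ListLexer.php and Token.php come from Sameer's article.
require_once('ListLexer.php');
require_once('Token.php');

// The text to tokenize is the first command-line argument.
$lexer = new ListLexer($argv[1]);
$token = $lexer->nextToken();

// Print each token; type 1 is presumably the end-of-input token type.
while ($token->type != 1) {
    echo $token . "\n";
    $token = $lexer->nextToken();
}

?>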

Sweet, I understood that. Maybe I can write a compiler now too.

Well, maybe not now, but at least read through the rest of the code...
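
For a rough idea of what sits behind nextToken(), here is a minimal sketch of a list lexer. To be clear, this is not Sameer's code: the class names SketchToken and SketchListLexer, the helper sketch_tokenize(), and the token type numbers are all my own assumptions, chosen so that type 1 can stand for end of input, as the loop above seems to expect. It only knows about letters, commas, square brackets, and whitespace.

```php
<?php
// Minimal sketch, not Sameer's code. The class names, the helper function,
// and the token type numbers are assumptions made for illustration.

final class SketchToken
{
    const EOF    = 1; // assumed to match the "type != 1" check in the driver
    const NAME   = 2;
    const COMMA  = 3;
    const LBRACK = 4;
    const RBRACK = 5;

    public $type;
    public $text;

    public function __construct($type, $text)
    {
        $this->type = $type;
        $this->text = $text;
    }

    // Lets a token be echoed directly, e.g. <'a',2>
    public function __toString()
    {
        return "<'" . $this->text . "'," . $this->type . ">";
    }
}

final class SketchListLexer
{
    private $input;
    private $pos = 0;

    public function __construct($input)
    {
        $this->input = $input;
    }

    // Return the next token, or an EOF token once the input is used up.
    public function nextToken()
    {
        $len = strlen($this->input);

        // Skip whitespace between tokens.
        while ($this->pos < $len && ctype_space($this->input[$this->pos])) {
            $this->pos++;
        }
        if ($this->pos >= $len) {
            return new SketchToken(SketchToken::EOF, '<EOF>');
        }

        // Single-character tokens.
        $c = $this->input[$this->pos];
        $single = array(
            ',' => SketchToken::COMMA,
            '[' => SketchToken::LBRACK,
            ']' => SketchToken::RBRACK,
        );
        if (isset($single[$c])) {
            $this->pos++;
            return new SketchToken($single[$c], $c);
        }

        // A name is a run of one or more letters.
        if (ctype_alpha($c)) {
            $start = $this->pos;
            while ($this->pos < $len && ctype_alpha($this->input[$this->pos])) {
                $this->pos++;
            }
            $text = substr($this->input, $start, $this->pos - $start);
            return new SketchToken(SketchToken::NAME, $text);
        }

        throw new Exception("invalid character: " . $c);
    }
}

// Collect the printed form of every token up to (not including) EOF.
function sketch_tokenize($input)
{
    $lexer = new SketchListLexer($input);
    $out = array();
    for ($t = $lexer->nextToken(); $t->type != SketchToken::EOF; $t = $lexer->nextToken()) {
        $out[] = (string) $t;
    }
    return $out;
}

echo implode("\n", sketch_tokenize('[a, b]')), "\n";
```

Run on the input [a, b], this prints one token per line: a left bracket (type 4), the name a (type 2), a comma (type 3), the name b (type 2), and a right bracket (type 5). Note that the lexer happily accepts nonsense like "]a[" too; deciding whether the tokens form a valid list is the parser's job.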


