Thursday, November 20, 2008
Okay, now we're getting somewhere! Still dissatisfied with my attempt to reduce the glut in the parent PHP category page, I took yet another look. Wouldn't you know it? There was a definite thread of resources in there relating to improving the performance of applications written in PHP. Not surprising given the growing complexity of the language itself, the applications written in it, and the myriad frameworks available these days. The latter is especially critical, because with abstraction, underlying complexity, and growing feature sets, your code base is large before you even start a project. Very large. I'm not going to name names; you know who you are.
Frameworks
I prefer the simple over the complex, the modular over the monolithic. Given the opportunity, I would go with something like CodeIgniter, which allows you to use the pieces you want and eliminate the rest. The same is true for templating engines. I've used Smarty of course, but it puzzles me why the developers would reinvent the wheel, so to speak, by adding an entirely new dynamic language syntax for templates, when PHP already is a templating language (and a whole lot more). I'm a Savant man, call me an idiot.
Web Publishing
Now we're getting into that gray area I've mentioned before. Is Drupal a framework, a CMS or an application? In my mind, some of each. Wikis and blogware packages I consider applications. It certainly helps to be a developer to get the most out of them, but it isn't strictly necessary.
User or Developer?
And this is also the point where I change my stance on complexity and features. There are certainly faster and easier-to-use wiki packages than MediaWiki, but do they have all the power? As always there is a trade-off. I get frustrated with Firefox for instance, because it has become rather bloated and I have bloated it even more with lots of extensions. But as a developer, and a bit of a designer, I could not live without Firebug, Web Developer, ColorZilla, and countless other tools. Christ, I'm not even sure how the hell I managed back in the bad old days of using telnet to test HTTP requests.
Simple vs. Complex
My own philosophy on programming goes like this: start with primitives and build complexity as you move up towards the more abstract and powerful, without going too far. We can see this in the OSI layer model and, for an even better example, in the early days of Unix. When Dennis Ritchie got tired of working on the kernel in PDP assembler, he took the time to build the more abstract language C. Funny how all the scripting languages, tools, hell, pretty much everything, are written in C, but nowadays how many of us code in C anymore? Even many of the extensions for languages like Perl, Python, and of course PHP are written in C. And there is a damn good reason for this: performance.
Extensibility
This concept, to me, is absolutely key to good software. Use a high-performance compiled language to build the tools and you're left with solutions that are both easy to apply and responsive. The best of both worlds.
Results
Okay, and now for the final tally. After fragmenting the PHP category into "general" (now around 50 resources) there are six sub-categories (for a total of around 50 also). Divide and conquer my friends.
Specific Navigation
Sunday, November 9, 2008
Valgrind is an entire suite of open-source tools, including basic debugging, profiling, and more advanced techniques such as threading, memory management, and leak detection. For the purposes of this article, I will focus on Cachegrind, and in particular within the domain of Web applications. Although there are a number of developers contributing to Valgrind, Julian Seward is the original designer and author.
Of the three server-side languages I am most familiar with, PHP seems to be the best represented here, with some coverage of Python, but I found very little if any information on Perl.
What is Cachegrind?
Cachegrind is an Intel CPU emulator and cache profiler that performs detailed simulations of the onboard I1, D1 and L2 caches and can accurately pinpoint the sources of cache misses in your code. It identifies the number of cache misses, memory references, and instructions executed for each line of source code. (paraphrased)
Xdebug
Xdebug is the tool of choice for PHP developers when it comes to standard debugging, and the built-in profiler with Cachegrind output is also maturing. Typically the output is in a file named cachegrind.out.pid (where pid is the process ID), which is plain text, but be careful with large, complex applications as it can grow to on the order of 500MB. The raw data is almost useless on its own; to really analyze and visualize the results you need a parsing/graphing tool. There are several available depending on your platform and needs.
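To give a feel for what such a tool does with the raw file, here is a minimal Python sketch that totals the self cost per function. It assumes the callgrind-style text layout described in the Valgrind documentation: fn= lines name a function, the position/cost lines beneath it carry its own cost, and a calls= line is followed by one line holding the callee's cost. The self_costs and resolve_name helpers and the sample data are made up for illustration, and real Xdebug output may contain records this sketch ignores.

```python
import re
from collections import defaultdict

# Hypothetical sample in callgrind-style layout (not real profiler output).
SAMPLE = """\
events: Time
fl=(1) /app/index.php
fn=(1) helper
3 40
fn=(2) {main}
10 15
cfn=(1)
calls=1 3
11 40
12 5
"""

def resolve_name(spec, names):
    """Handle name compression: '(id) name' defines an id, '(id)' reuses it."""
    match = re.match(r"\((\d+)\)\s*(.*)", spec)
    if not match:
        return spec
    key, name = match.groups()
    if name:
        names[key] = name
    return names.get(key, spec)

def self_costs(lines):
    """Sum the first cost column per function, skipping callee cost lines."""
    names = {}
    costs = defaultdict(int)
    current = None
    skip_next = False  # the cost line right after calls= belongs to the callee
    for raw in lines:
        line = raw.strip()
        if line.startswith("fn="):
            current = resolve_name(line[3:], names)
        elif line.startswith("cfn="):
            resolve_name(line[4:], names)  # only record the name mapping
        elif line.startswith("calls="):
            skip_next = True
        elif line and line[0].isdigit() and current is not None:
            if skip_next:
                skip_next = False
                continue
            costs[current] += int(line.split()[1])
    return dict(costs)

for name, cost in sorted(self_costs(SAMPLE.splitlines()).items(),
                         key=lambda item: -item[1]):
    print(name, cost)
```

For the made-up sample, helper comes out at 40 and {main} at 20; the 40 listed under calls= is not counted again for {main}. A real viewer also tracks inclusive cost and the call graph, which is where the graphical tools earn their keep.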
Viewers
Once you have your output you need an application to make sense of it. Listed below are solutions for Linux (or any Unix-like OS running KDE), Mac OS X, Windows, and even a browser-based solution from Google Code.
Related Reading
Wednesday, December 13, 2006
I was way overdue and finally made it down to the National Press Building (map) for my first DC PHP Developers group meeting, and I was pleasantly surprised by the turnout. The conference room (#495) was close to standing room only and additional chairs had to be brought down so everyone had a seat. I'm not sure if this was atypical; maybe it had something to do with the presentation. This is in contrast to the Washington Web Standards group meetings, which are very informal and where only a handful of people (at best) regularly show up.
I suspect the reason for the strong showing was Keith Casey's presentation on the Smarty templating engine. He was obviously enthusiastic about his recent experiences working with it, and went beyond simple include/if/foreach constructs in your markup, or the "view" side of separating your application into data/code and presentation. Smarty also allows you to create custom filters, or plugins, which boil down to PHP functions that can be used in your templates to deal with special situations.
As ever, I will not mince words. I still cannot see how a "designer" would be any less intimidated by editing templates than by editing markup embedded in a PHP module using here document syntax. I came away with the same impression I've had for years: this separation of the "programmer" camp from the frontend designer. To me, Web development is the whole ball of wax. You should understand everything from Internet protocols and server configuration to valid and semantic markup, CSS/presentation, database design, efficiency, security, and so on, or you pigeonhole yourself into one role.
And I didn't even mention client-side stuff like JavaScript/DOM scripting, or XHR.
Another thing that puzzles me is the constant references to the MVC paradigm. "This is not a new idea"; it predates the Mac and DOS/Windows, not to mention Linux, by a long shot. MVC is rooted in the research done on GUI design at Xerox PARC. For a generic overview, MVC is composed of:
- Model — Data and the logic to access it (the backend).
- View — Presentation or user interface (the frontend).
- Controller — Event handlers (most of which is already done by the browser unless you are using scripting).
I admit the latter isn't quite that simple. Server-side code (other than preemptive client-side validation) would include such things as processing form inputs and the security baggage that goes along with it.
My design and development philosophy has always been deeply rooted in the original principles of Unix and KISS. How many layers of abstraction are necessary? Or worse, layer after layer after layer...when does the developer lose all sense of the underlying system? I'm not advocating coding server-side HTTP responses in assembler or C/C++ (which I have done), but when I see programmers writing classes using methods that are nothing but wrappers around PHP functions, which are themselves wrappers around library calls that then make system calls...I hope you see the point of this rant.
However, it was a good presentation and Keith's strongest assertion, one I happen to agree with, is this: If you are going to use a templating engine Smarty is worth considering. It is mature, has a strong and loyal userbase, and is well maintained and documented.
Related reading: PHP Templating Engines.
Sunday, September 24, 2006
Following a tip from Russ I was pleased to find an interesting post on the Official Google Webmaster Central Blog titled How to verify Googlebot. In a nutshell, it explains how to use the Unix shell program host to authenticate that an IP address copied from your Web server log file really is a Googlebot and not some email harvester (or whatever).
I decided to take this a step further and demonstrate how you can automate this procedure using a scripting language. For these examples I chose PHP and Perl, although you could certainly use Python or Ruby or whatever your preferred language is, as long as it has an interface to the gethostbyname and gethostbyaddr library calls.
Using these calls under PHP is the simpler of the two approaches, as the interface to these routines is written at a more abstract level than the Perl Socket module. Below is an example googlebot() function in PHP that returns true if the IP address parameter authenticates, although there is no 100% guarantee of preventing a spoof getting through (but it will catch the vast majority of them). A bit of test code is included.
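<?php
function googlebot($ip) {
    // check to see if this IP really is a Googlebot
    // reverse lookup: returns the IP unchanged on failure, false on bad input
    $name = gethostbyaddr($ip);
    if ($name === false || $name == $ip) return false;
    // the name must END in .googlebot.com; a plain substring test
    // would also accept something like googlebot.com.example.org
    if (!preg_match('/\.googlebot\.com$/i', $name)) return false;
    // forward lookup: the name must resolve back to the same IP
    return gethostbyname($name) === $ip;
}
// test it
$ip = '66.249.66.1';
echo $ip . ' is ';
if (!googlebot($ip)) echo 'not ';
echo 'a Google bot' . "\n";
?>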
The Perl version is at a much lower level, very similar to the corresponding C library calls. In fact, the module is derived directly from the sys/socket.h header file and the functions are just wrappers around these C library calls. See Berkeley Sockets for more information. If you have a copy of Programming Perl, the chapter 16 Interprocess Communications section on socket programming will help, and if you are lucky enough to have a copy of the Perl Cookbook, chapter 18 Internet Services has some great recipes for DNS lookups. For really gory details, refer to chapter 14 DNS: The Domain Name System of TCP/IP Illustrated, Volume 1: The Protocols.
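#!/usr/bin/perl
use strict;
use warnings;
use Socket;

sub googlebot {
    # check to see if this IP really is a Googlebot
    my $ip     = shift;
    my $packed = inet_aton($ip) or return 0;
    # reverse lookup: IP address to host name
    my $name = gethostbyaddr($packed, AF_INET) or return 0;
    # the name must end in .googlebot.com, not merely contain it
    return 0 unless $name =~ m/\.googlebot\.com$/i;
    # forward lookup: the name must resolve back to the same IP
    my @addr = gethostbyname($name) or return 0;
    return (grep { inet_ntoa($_) eq $ip } @addr[4 .. $#addr]) ? 1 : 0;
}

# test it
my $ip = '66.249.66.1';
print $ip . ' is ';
unless (googlebot($ip)) { print 'not '; }
print 'a Google bot' . "\n";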
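Since Python was mentioned as an alternative, here is a rough sketch of the same reverse-then-forward lookup using its standard socket module. The function name is_googlebot and the reverse/forward parameters are my own invention for this example; they exist only so the DNS calls can be swapped out when testing without a network. Like the versions above, it assumes Googlebot host names end in .googlebot.com.

```python
import socket

def is_googlebot(ip, reverse=socket.gethostbyaddr,
                 forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to *.googlebot.com and that
    name resolves forward to the same ip again."""
    try:
        # reverse lookup: (hostname, aliases, addresses)
        name = reverse(ip)[0]
    except OSError:
        return False
    # suffix match, so googlebot.com.example.org does not pass
    if not name.lower().endswith(".googlebot.com"):
        return False
    try:
        # forward lookup: the third item is the list of IPv4 addresses
        addresses = forward(name)[2]
    except OSError:
        return False
    return ip in addresses
```

Calling is_googlebot('66.249.66.1') with the default arguments performs real DNS queries, so the answer depends on what the resolver returns at the time. As with the PHP and Perl versions, this catches casual spoofing but is no 100% guarantee.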
Finally, in case anyone is interested why it's been so long since I posted anything, much of the summer I was sick as a dog and since recovering, busy as a bee. It's nice to be feeling better and back to work!
Wednesday, May 3, 2006
People love to argue. A recent thread on the DC Perl Mongers mailing list opened yet another discussion of which CPAN module is the best for sending email.
Most Unix systems, at least the open-source flavors I use most frequently (FreeBSD and Linux), have sendmail installed. So when I need to send an email from a Perl script I just roll my own code to do it. You could argue that it's too simplistic because it's procedural or it isn't portable, but it works, it's fast and I have control over the code.
The problem with using someone else's module is you're tied into the way they think, and it can be difficult or time consuming to modify the code to add a feature or fix a bug.
Who made this rule that you have to use a module for everything or that object-oriented methods are somehow intrinsically superior to procedural programming? Don't get the wrong idea, I have nothing against modules or OOP. It would have taken me a hell of a lot longer to write many of the Perl scripts I've written without wonderful tools like CGI.pm and DBI. Not to mention about a hundred others.
#!/usr/bin/perl
use strict;
use warnings;

my $to      = 'you@somehost.com';
my $from    = 'me@somehost.com';
my $subject = 'test sendmail';
my $body    = <<EOB;
Hello from Perl/sendmail
EOB

if (my $error = sendmail($to, $from, $subject, $body)) {
    die "Can't sendmail: $error\n";
}
print "mail sent\n";

# returns an error string on failure, nothing on success
sub sendmail {
    my %arg;
    foreach my $name (qw(to from subject body)) {
        $arg{$name} = shift
            or return (caller(0))[3] . ': missing $' . $name . ' parameter';
    }
    my $sendmail = '/usr/sbin/sendmail';
    my $switch   = '-t';    # take the recipients from the message headers
    open my $mail, '|-', "$sendmail $switch" or return $!;
    # a blank line must separate the headers from the body
    print $mail <<EOM;
To: $arg{to}
From: $arg{from}
Subject: $arg{subject}

$arg{body}
EOM
    close $mail or return "sendmail exited with status $?";
    return;
} # sendmail()
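The same roll-your-own idea carries over to other languages. Here is a minimal Python sketch that pipes a message to the local sendmail binary with the -t switch, as the Perl code does. The helper names build_message and send_mail are made up for this example, and the /usr/sbin/sendmail path is the same assumption about the host as in the script above.

```python
import subprocess
from email.message import EmailMessage

def build_message(to, sender, subject, body):
    """Assemble the headers and body; raise ValueError if a part is missing."""
    for name, value in (("to", to), ("sender", sender),
                        ("subject", subject), ("body", body)):
        if not value:
            raise ValueError("missing " + name + " parameter")
    message = EmailMessage()
    message["To"] = to
    message["From"] = sender
    message["Subject"] = subject
    message.set_content(body)  # adds the blank line between headers and body
    return message

def send_mail(message, sendmail_path="/usr/sbin/sendmail"):
    """Pipe the message to sendmail; -t reads the recipients from the headers."""
    subprocess.run([sendmail_path, "-t"], input=message.as_bytes(), check=True)
```

A call would look like send_mail(build_message('you@somehost.com', 'me@somehost.com', 'test sendmail', 'Hello from Python/sendmail')). With check=True a non-zero exit status from sendmail raises subprocess.CalledProcessError instead of being silently ignored.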
Friday, April 28, 2006
Most crontab entries are very simple. For instance, "run this script once every day exactly at midnight." To do this you would add the following line to your crontab file:
0 0 * * * /path/to/script
Depending on your version of cron, you could also use the shorthand string @midnight or @daily. But what if you need to run a script between the hours of 8am and 6pm every two hours, but only on Mondays through Saturdays? That's exactly the problem I had to solve earlier today. Although I use cron quite a bit, I needed a quick syntax refresher course covering some of the more advanced features.
So naturally (like we all do) I reached for my browser and fired up Google. After 20 wasted minutes trying various keyword search combinations, wading several result pages deep and following links, it hit me. D'oh! All the information I needed was right on my FreeBSD server. Man pages contain a wealth of information about your system and have been around for many years. Decades actually. Many people forget that Unix was originally designed as a text processing and technical manual printing system. The entire system is documented using these manual pages and they are designed for viewing, searching and printing. I once kept large chunks of the standard C library in a 3-ring binder.
Since crontab is a file format, I went back to my shell and entered:
$ man 5 crontab
And within a matter of minutes I had the solution. The point of all of this is that sometimes we are overly reliant on the Web and search. Often the information we are seeking has been there all along. Call me old school, I still like books too.
By the way, the crontab entry looks like this:
0 8-18/2 * * 1-6 /path/to/script
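If you want to double-check which hours and days an entry like this matches, the range/step syntax is easy to expand by hand or with a few lines of code. The Python sketch below is only an illustration: expand_field is a made-up helper, not part of cron, and it handles just numbers, ranges, comma lists, * and /step (no names such as mon or jan).

```python
def expand_field(field, low, high):
    """Expand one crontab field into the sorted list of values it matches.

    low and high are the limits used when the field contains '*'.
    """
    values = set()
    for part in field.split(","):
        span, _, step = part.partition("/")
        step = int(step) if step else 1
        if span == "*":
            start, end = low, high
        elif "-" in span:
            first, last = span.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(span)
        # the step counts from the start of the range, end inclusive
        values.update(range(start, end + 1, step))
    return sorted(values)

print(expand_field("8-18/2", 0, 23))  # hour field of the entry above
print(expand_field("1-6", 0, 7))      # day-of-week field of the entry above
```

The hour field expands to 8, 10, 12, 14, 16 and 18, and the day-of-week field to 1 through 6, which is every two hours from 8am to 6pm, Monday through Saturday.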
If you really have to use a browser, try this "advanced crontab tutorial."
How ironic, I posted this exactly at midnight.
Thursday, August 18, 2005
When Bell Labs finally closed its doors on Department 1127 this month, it didn't signal the end of Unix. Perhaps the end of an operating system research environment at Lucent, as most of the original team are long gone to other ventures (Google, Pixar, NASA/JPL, Princeton, Dartmouth...) or retired anyway.
Hardly. Unix was first developed around 1970 (aka the "epoch"), and the original concepts and many of the technologies are stronger than ever. The vast majority of the servers that drive the Internet and the Web are some form of Unix operating system. Hell, we wouldn't have an Internet if it weren't for the role that Unix played in it.
I use Linux, FreeBSD and OS X every day, both as servers and as desktop environments. There are countless other Unix variants, both open-source (OpenBSD, NetBSD, DragonFly BSD, Darwin...) as well as commercial ones (Sun, HP...).
And all of this, thanks to a couple of guys who were allowed to tinker with an old PDP-11 in exchange for developing an electronic typesetting system so AT&T could publish their own technical manuals. Oh, and while they were at it they threw in a little thing called the C programming language. But that's another story.
The King is dead, long live the King!