<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
<channel>
    <title>blogZero - FreeBSD</title>
    <link>http://loadaveragezero.com/app/s9y/</link>
    <description>Web Development Community News, Culture and Commentary, Tools and Techniques</description>
    <dc:language>en</dc:language>
    <admin:errorReportsTo rdf:resource="mailto:" />
    <generator>Serendipity 0.8.3 - http://www.s9y.org/</generator>
    
    <image>
        <url>http://loadaveragezero.com/img/blogzero.gif</url>
        <title>RSS: blogZero - FreeBSD - Web Development Community News, Culture and Commentary, Tools and Techniques</title>
        <link>http://loadaveragezero.com/app/s9y/</link>
        <width>96</width>
        <height>52</height>
    </image>
<item>
    <title>Authenticating a Googlebot in PHP and Perl</title>
    <link>http://loadaveragezero.com/app/s9y/index.php?/archives/135-Authenticating-a-Googlebot-in-PHP-and-Perl.html</link>
<category>PHP</category><category>FreeBSD</category><category>Linux</category><category>Perl</category>    <comments>http://loadaveragezero.com/app/s9y/index.php?/archives/135-Authenticating-a-Googlebot-in-PHP-and-Perl.html#comments</comments>
    <wfw:comment>http://loadaveragezero.com/app/s9y/wfwcomment.php?cid=135</wfw:comment>
    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://loadaveragezero.com/app/s9y/rss.php?version=2.0&amp;type=comments&amp;cid=135</wfw:commentRss>
    <author>dwclifton@gmail.com (Douglas Clifton)</author>
    <content:encoded>
&lt;p&gt;&lt;img src=&quot;http://loadaveragezero.com/img/fav/drx/pencil.gif&quot; class=&quot;icon&quot; alt=&quot;code&quot; title=&quot; Sample Code &quot; /&gt; Following a &lt;a href=&quot;http://www.maxdesign.com.au/2006/09/23/some-links-97/&quot; title=&quot; Some links for light reading &quot;&gt;tip&lt;/a&gt; from &lt;a href=&quot;http://loadaveragezero.com/drx/author/R#a21&quot; title=&quot; Russ Weakley &quot;&gt;Russ&lt;/a&gt; I was pleased to find an interesting post on the Official Google Webmaster Central Blog titled
&lt;a href=&quot;http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html&quot;&gt;&lt;em&gt;How to verify Googlebot&lt;/em&gt;&lt;/a&gt;. In a nutshell, it explains how to use the &lt;a href=&quot;http://loadaveragezero.com/app/drx/Software/Operating_Systems/Unix&quot;&gt;Unix&lt;/a&gt; shell program &lt;a href=&quot;http://www.freebsd.org/cgi/man.cgi?query=host&quot;&gt;host&lt;/a&gt; to authenticate that an IP address copied from your Web &lt;a href=&quot;http://loadaveragezero.com/app/drx/Internet/WWW/Servers&quot;&gt;server&lt;/a&gt; log file really is a Googlebot and not some email harvester (or whatever).&lt;/p&gt;

&lt;p&gt;I decided to take this a step further and demonstrate how you can automate this procedure using a scripting language. For these examples I chose &lt;a href=&quot;http://loadaveragezero.com/app/drx/Programming/Languages/PHP&quot;&gt;PHP&lt;/a&gt;
and &lt;a href=&quot;http://loadaveragezero.com/app/drx/Programming/Languages/Perl&quot;&gt;Perl&lt;/a&gt;, although you could certainly use &lt;a href=&quot;http://loadaveragezero.com/app/drx/Programming/Languages/Python&quot;&gt;Python&lt;/a&gt;, Ruby or whatever language you prefer, as long as it has an interface to the &lt;a href=&quot;http://www.freebsd.org/cgi/man.cgi?query=gethostbyaddr&amp;amp;sektion=3&quot;&gt;gethostbyname and gethostbyaddr&lt;/a&gt; library functions.&lt;/p&gt;

&lt;p&gt;Using these calls under PHP is the simpler of the two approaches, as its interface to these routines is written at a more abstract level than the Perl &lt;a href=&quot;http://search.cpan.org/~nwclark/perl-5.8.8/ext/Socket/Socket.pm&quot;&gt;Socket module&lt;/a&gt;. Below is an example googlebot() function in PHP that returns true if the IP address parameter authenticates. There is no 100% guarantee that a spoof will not get through, but it will catch the vast majority of them. A bit of test code is included.&lt;/p&gt;

&lt;pre&gt;
&amp;lt;?php

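function googlebot($ip)  {

    // check to see if this IP really is a Googlebot

    $bot = '.googlebot.com';

    // reverse lookup: returns the IP unchanged (or false) on failure
    $name = gethostbyaddr($ip);
    if ($name === false || $name === $ip) return false;

    // the host name must end in .googlebot.com, not merely contain it
    if (substr($name, -strlen($bot)) !== $bot) return false;

    // forward lookup must lead back to the original IP
    return gethostbyname($name) === $ip;
}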

// test it

$ip = '66.249.66.1';

echo $ip . ' is ';
if (!googlebot($ip)) echo 'not ';
echo 'a Google bot' . &quot;\n&quot;;
?&amp;gt;
&lt;/pre&gt;

&lt;p&gt;The Perl version is at a much lower level, very similar to the corresponding &lt;a href=&quot;http://loadaveragezero.com/app/drx/Programming/Languages/C&quot;&gt;C&lt;/a&gt; library calls. In fact, the module is derived directly from the sys/socket.h header file and the functions are just wrappers around these Standard C library calls. See &lt;a href=&quot;http://en.wikipedia.org/wiki/Berkeley_sockets&quot;&gt;Berkeley Sockets&lt;/a&gt; for more information. If you have a copy of &lt;a href=&quot;http://www.amazon.com/o/tg/detail/-/0596000278/loadaverageze-20&quot;&gt;Programming Perl&lt;/a&gt;, the chapter 16 &lt;em&gt;Interprocess Communications&lt;/em&gt; section on socket programming will help, and if you are lucky enough to have a copy of the &lt;a href=&quot;http://www.amazon.com/o/tg/detail/-/0596003137/loadaverageze-20&quot;&gt;Perl Cookbook&lt;/a&gt;, chapter 18 &lt;em&gt;Internet Services&lt;/em&gt; has some great recipes for &lt;acronym title=&quot; Domain Name System &quot;&gt;DNS&lt;/acronym&gt; lookups. For &lt;em&gt;really&lt;/em&gt; gory details, refer to chapter 14 &lt;em&gt;DNS: The Domain Name System&lt;/em&gt; of &lt;a href=&quot;http://www.amazon.com/o/tg/detail/-/0201633469/loadaverageze-20&quot;&gt;TCP/IP Illustrated, Volume I&amp;#8212;The Protocols&lt;/a&gt;.&lt;/p&gt;

&lt;pre&gt;
#!/usr/bin/perl

use Socket;

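sub googlebot($)  {

    # check to see if this IP really is a Googlebot

    my $ip = shift;
    my $bot = '\.googlebot\.com$';    # anchored: name must end in .googlebot.com

    # reverse lookup; bail out on a malformed IP or a missing PTR record
    my $packed = inet_aton($ip) or return 0;
    my $name = gethostbyaddr($packed, AF_INET) or return 0;

    # forward lookup; an empty list means the name did not resolve
    my @addr = gethostbyname($name) or return 0;
    my $addr = inet_ntoa($addr[4]);

    return ($name =~ m/$bot/ and $ip eq $addr) ? 1 : 0;
}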

# test it

$ip = '66.249.66.1';

print $ip . ' is ';
unless (googlebot($ip)) { print 'not '; }
print 'a Google bot' . &quot;\n&quot;;
&lt;/pre&gt;
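
&lt;p&gt;Since Python was mentioned above as an alternative, here is a minimal sketch of the same reverse-then-forward lookup in that language. Treat it as an illustration under stated assumptions rather than a drop-in implementation: the name is_googlebot and the two resolver parameters are hypothetical and exist only so the function can be exercised without touching the network, while socket.gethostbyaddr and socket.gethostbyname_ex are the standard library calls they default to. It also requires the host name to end in .googlebot.com, which is stricter than a plain substring match.&lt;/p&gt;

&lt;pre&gt;
```python
import socket


def is_googlebot(ip, resolve_ptr=socket.gethostbyaddr,
                 resolve_a=socket.gethostbyname_ex):
    """Return True if ip maps to a *.googlebot.com name that maps back to ip.

    resolve_ptr and resolve_a are hypothetical injection points; both are
    expected to return a (hostname, aliases, addresses) tuple like the
    socket functions they default to.
    """
    # reverse lookup: IP address to host name
    try:
        name = resolve_ptr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return False

    # the name must end in .googlebot.com, not merely contain it
    if not name.lower().rstrip(".").endswith(".googlebot.com"):
        return False

    # forward lookup: host name back to its list of IPv4 addresses
    try:
        addresses = resolve_a(name)[2]
    except OSError:
        return False

    return ip in addresses
```
&lt;/pre&gt;

&lt;p&gt;Calling is_googlebot('66.249.66.1') with the default resolvers performs live DNS lookups, so the answer depends on the network and on whatever DNS returns at the time.&lt;/p&gt;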

&lt;p&gt;Finally, in case anyone is interested why it's been so long since I posted anything, much of the summer I was sick as a dog and since recovering, busy as a bee. It's nice to be feeling better and back to work!&lt;/p&gt;
    </content:encoded>
    <pubDate>Sun, 24 Sep 2006 18:08:26 -0400</pubDate>
    <guid isPermaLink="false">http://loadaveragezero.com/app/s9y/index.php?/archives/135-guid.html</guid>
    <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/2.0/</creativeCommons:license><category>api</category>
<category>cpan</category>
<category>dns</category>
<category>freebsd</category>
<category>google</category>
<category>howto</category>
<category>internet</category>
<category>languages</category>
<category>module</category>
<category>perl</category>
<category>php</category>
<category>python</category>
<category>ruby</category>
<category>tutorial</category>
<category>unix</category>
<category>web server</category>
</item>
<item>
    <title>Advanced crontab Tutorial</title>
    <link>http://loadaveragezero.com/app/s9y/index.php?/archives/131-Advanced-crontab-Tutorial.html</link>
<category>FreeBSD</category><category>Linux</category>    <comments>http://loadaveragezero.com/app/s9y/index.php?/archives/131-Advanced-crontab-Tutorial.html#comments</comments>
    <wfw:comment>http://loadaveragezero.com/app/s9y/wfwcomment.php?cid=131</wfw:comment>
    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://loadaveragezero.com/app/s9y/rss.php?version=2.0&amp;type=comments&amp;cid=131</wfw:commentRss>
    <author>dwclifton@gmail.com (Douglas Clifton)</author>
    <content:encoded>
&lt;p&gt;&lt;img src=&quot;http://loadaveragezero.com/img/fav/drx/pencil.gif&quot; class=&quot;icon&quot; alt=&quot;note&quot; title=&quot; Noteworthy &quot; /&gt; Most crontab entries are very simple. For instance, &quot;run this script once every day exactly at midnight.&quot; To do this you would add the following line to your crontab file:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;0 0 * * * /path/to/script&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Depending on your version of cron, you could also use the shorthand string @midnight or @daily. But what if you need to run a script between the hours of 8am and 6pm every two hours, but only on Mondays through Saturdays? That's exactly the problem I had to solve earlier today. Although I use cron quite a bit, I needed a quick syntax refresher course covering some of the more advanced features.&lt;/p&gt;

&lt;p&gt;So naturally (like we all do) I reached for my browser and fired up Google. After 20 wasted minutes trying various keyword search combinations, wading several result pages deep and following links, it hit me. D'oh! All the information I needed was right on my &lt;a href=&quot;http://loadaveragezero.com/app/drx/Software/Operating_Systems/Unix/FreeBSD&quot;&gt;FreeBSD&lt;/a&gt; server. &lt;a href=&quot;http://loadaveragezero.com/app/drx/Software/Operating_Systems/Unix/FreeBSD#freebsd:man&quot;&gt;Man pages&lt;/a&gt; contain a wealth of information about your system and have been around for many years. Decades, actually. Many people forget that Unix was originally designed as a text processing and technical manual printing system. The entire system is documented using these manual pages and they are designed for viewing, searching and printing. I once kept large chunks of the standard &lt;a href=&quot;http://loadaveragezero.com/app/drx/Programming/Languages/C&quot;&gt;C&lt;/a&gt; library in a 3-ring binder.&lt;/p&gt;

&lt;p&gt;Since crontab is a file format, I went back to my shell and entered:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;$ man 5 crontab&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;And within a matter of minutes I had the solution. The point of all this is that sometimes we are overreliant on the Web and search. Often the information we are seeking has been there all along. Call me old school, but I still like books too.&lt;/p&gt;

&lt;p&gt;By the way, the crontab entry looks like this:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;0 8-18/2 * * 1-6 /path/to/script&lt;/code&gt;&lt;/p&gt;
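
&lt;p&gt;To see what the hour and day-of-week fields in that entry select, the range/step syntax can be expanded by hand. The Python sketch below is only an illustration: expand_field is a hypothetical helper, not part of cron, and it handles only numbers, ranges, lists, steps and the asterisk, not month or day names.&lt;/p&gt;

&lt;pre&gt;
```python
def expand_field(field, low, high):
    """Expand one numeric crontab field into a sorted list of values.

    low and high are the bounds an asterisk stands for in that field,
    for example 0 and 23 for the hour field.
    """
    values = set()
    for part in field.split(","):
        # an optional "/step" suffix follows the range or asterisk
        spec, _, step = part.partition("/")
        step = int(step) if step else 1
        if spec == "*":
            start, end = low, high
        elif "-" in spec:
            first, last = spec.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(spec)
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("8-18/2", 0, 23))  # hour field
print(expand_field("1-6", 0, 7))      # day-of-week field
```
&lt;/pre&gt;

&lt;p&gt;This prints [8, 10, 12, 14, 16, 18] for the hours and [1, 2, 3, 4, 5, 6] for the days, that is, every two hours from 8am to 6pm, Monday through Saturday.&lt;/p&gt;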

&lt;p&gt;If you really &lt;em&gt;have&lt;/em&gt; to use a browser, try this &quot;&lt;a href=&quot;http://www.freebsd.org/cgi/man.cgi?query=crontab&amp;amp;apropos=0&amp;amp;sektion=5&amp;amp;manpath=FreeBSD+7.0-RELEASE+and+Ports&amp;amp;format=html&quot;&gt;advanced crontab tutorial&lt;/a&gt;.&quot;&lt;/p&gt;

&lt;p&gt;How ironic, I posted this exactly at midnight.&lt;/p&gt;
    </content:encoded>
    <pubDate>Fri, 28 Apr 2006 00:00:00 -0400</pubDate>
    <guid isPermaLink="false">http://loadaveragezero.com/app/s9y/index.php?/archives/131-guid.html</guid>
    <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/2.0/</creativeCommons:license><category>cron</category>
<category>crontab</category>
<category>freebsd</category>
<category>linux</category>
<category>sysadmin</category>
<category>unix</category>
</item>
<item>
    <title>Long Live the King</title>
    <link>http://loadaveragezero.com/app/s9y/index.php?/archives/20-Long-Live-the-King.html</link>
<category>Mac</category><category>FreeBSD</category><category>Linux</category>    <comments>http://loadaveragezero.com/app/s9y/index.php?/archives/20-Long-Live-the-King.html#comments</comments>
    <wfw:comment>http://loadaveragezero.com/app/s9y/wfwcomment.php?cid=20</wfw:comment>
    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://loadaveragezero.com/app/s9y/rss.php?version=2.0&amp;type=comments&amp;cid=20</wfw:commentRss>
    <author>dwclifton@gmail.com (Douglas Clifton)</author>
    <content:encoded>
&lt;p&gt;&lt;img src=&quot;http://loadaveragezero.com/img/fav/drx/lucent.gif&quot; class=&quot;icon&quot; alt=&quot;lucent&quot; title=&quot; Bell Labs &quot; /&gt; When &lt;a href=&quot;http://www.lucent.com/&quot; title=&quot; Lucent Technologies/Bell Labs &quot;&gt;Bell Labs&lt;/a&gt; finally closed its doors on &lt;a href=&quot;http://www.unixreview.com/documents/s=9846/ur0508l/ur0508l.html&quot; title=&quot; Department 1127: going, Going, GONE! &quot;&gt;Department 1127&lt;/a&gt; this month, it didn't signal the end of &lt;a href=&quot;http://loadaveragezero.com/app/drx/Software/Operating_Systems/Unix&quot; title=&quot; Unix &quot;&gt;Unix&lt;/a&gt;. Perhaps the end of an operating system research environment at Lucent, as most of the original team are long gone to other ventures (Google, Pixar, NASA/&lt;acronym title=&quot; Jet Propulsion Laboratory &quot;&gt;JPL&lt;/acronym&gt;, Princeton, Dartmouth...) or retired anyway.&lt;/p&gt;

&lt;p&gt;Hardly. Unix was first developed around 1970 (aka the &quot;epoch&quot;), and the original concepts and many of the technologies are stronger than ever. The vast majority of the servers that drive the Internet and the Web are some form of Unix operating system. Hell, we wouldn't &lt;em&gt;&lt;strong&gt;have&lt;/strong&gt;&lt;/em&gt; an Internet if it wasn't for the role that Unix played in it.&lt;/p&gt;

&lt;p&gt;I use &lt;a href=&quot;http://loadaveragezero.com/app/drx/Software/Operating_Systems/Unix/Linux&quot; title=&quot; Linux &quot;&gt;Linux&lt;/a&gt;, &lt;a href=&quot;http://loadaveragezero.com/app/drx/Software/Operating_Systems/Unix/FreeBSD&quot; title=&quot; FreeBSD &quot;&gt;FreeBSD&lt;/a&gt; and &lt;a href=&quot;http://loadaveragezero.com/app/drx/Software/Operating_Systems/Unix/Mac_OS_X&quot; title=&quot; Mac OS X &quot;&gt;OS X&lt;/a&gt; every day. Both as servers and as desktop environments. There are countless other Unix variants, both open-source (OpenBSD, NetBSD, DragonFly BSD, Darwin...) and commercial (Sun, HP...).&lt;/p&gt;

&lt;p&gt;And all of this, thanks to a couple of guys who were allowed to tinker with an old &lt;a href=&quot;http://loadaveragezero.com/img/pdp11.jpg&quot;&gt;PDP-11&lt;/a&gt; in exchange for developing an electronic typesetting system so AT&amp;amp;T could publish their own technical manuals. Oh, and while they were at it they threw in a little thing called the &lt;a href=&quot;http://loadaveragezero.com/app/drx/Programming/Languages/C&quot; title=&quot; C Programming Language &quot;&gt;C&lt;/a&gt; programming language. But that's another story.&lt;/p&gt;

&lt;p&gt;The King is dead, long live the King!&lt;/p&gt;
    </content:encoded>
    <pubDate>Thu, 18 Aug 2005 00:15:33 -0400</pubDate>
    <guid isPermaLink="false">http://loadaveragezero.com/app/s9y/index.php?/archives/20-guid.html</guid>
    <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/2.0/</creativeCommons:license><category>bell labs</category>
<category>freebsd</category>
<category>linux</category>
<category>lucent</category>
<category>operating system</category>
<category>osx</category>
<category>unix</category>
</item>
</channel>
</rss>
