How come my converted awk/sed/sh script runs more slowly in Perl?


    The natural way to program in those languages may not make for the fastest
    Perl code.  Notably, the awk-to-perl translator produces sub-optimal code;
    see the a2p man page for tweaks you can make.

    Two of Perl's strongest points are its associative arrays and its regular
    expressions.  They can dramatically speed up your code when applied
    properly.  Recasting your code to use them can help a lot.
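
    As a rough illustration of that kind of recasting, the sketch below
    tallies a value per name, the sort of job an awk script does with
    $1 and $2.  The sample lines and the variable names are made up for
    the example; a real script would read its input from <> instead.

```perl
use strict;
use warnings;

# Hypothetical sample input; a real script would read from <> instead.
my @lines = (
    "alice 10",
    "bob 5",
    "alice 7",
    "# a comment line",
    "bob 1",
);

my %total;    # associative array: name => running sum
for my $line (@lines) {
    # One regexp both validates the line and pulls out the two fields,
    # so no separate split is needed.
    next unless $line =~ /^(\w+)\s+(\d+)$/;
    $total{$1} += $2;
}

for my $name (sort keys %total) {
    print "$name $total{$name}\n";
}
```

    The parentheses are needed here because the fields are captured;
    the advice further on about avoiding ()'s applies when they only
    group.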

    How complex are your regexps?  Deeply nested sub-expressions with {n,m} or
    * operators can take a very long time to compute.  Don't use ()'s unless
    you really need them.  Anchor your pattern to the front of the string if
    you can.

    Something like this:
        next unless /^.*%.*$/; 
    runs more slowly than the equivalent:
        next unless /%/;

    Note that this:
        next if /Mon/;
        next if /Tue/;
        next if /Wed/;
        next if /Thu/;
        next if /Fri/;
    runs faster than this:
        next if /Mon/ || /Tue/ || /Wed/ || /Thu/ || /Fri/;
    which in turn runs faster than this:
        next if /Mon|Tue|Wed|Thu|Fri/;
    which runs *much* faster than:
        next if /(Mon|Tue|Wed|Thu|Fri)/;

    There's no need to use /^.*foo.*$/ when /foo/ will do.
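
    The relative speeds of the four day-of-week forms depend on your
    version of Perl and on your data, so it is worth measuring rather
    than assuming.  Here is one possible way to do that with the
    standard Benchmark module.  The test lines, the variant names, and
    the iteration count are all invented for the example.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 1000 short lines, about five in seven of
# which name a weekday from Mon to Fri.
my @days  = qw(Mon Tue Wed Thu Fri Sat Sun);
my @lines = map { "entry $_ logged on $days[$_ % 7] evening" } 1 .. 1000;

# Each variant counts the lines that the corresponding snippet would skip.
my %variant = (
    separate => sub {
        my $n = 0;
        for (@lines) {
            if (/Mon/) { $n++; next }
            if (/Tue/) { $n++; next }
            if (/Wed/) { $n++; next }
            if (/Thu/) { $n++; next }
            if (/Fri/) { $n++; next }
        }
        return $n;
    },
    or_chain => sub {
        my $n = 0;
        for (@lines) { $n++ if /Mon/ || /Tue/ || /Wed/ || /Thu/ || /Fri/ }
        return $n;
    },
    alternation => sub {
        my $n = 0;
        for (@lines) { $n++ if /Mon|Tue|Wed|Thu|Fri/ }
        return $n;
    },
    grouped => sub {
        my $n = 0;
        for (@lines) { $n++ if /(Mon|Tue|Wed|Thu|Fri)/ }
        return $n;
    },
);

# Sanity check: the variants must agree before timing them means anything.
my %count    = map { $_ => $variant{$_}->() } keys %variant;
my %distinct = map { $_ => 1 } values %count;
die "variants disagree\n" if keys(%distinct) != 1;

# Run each variant 500 times and print a comparison table.
cmpthese(500, \%variant);
```

    Treat the table as a guide for your own Perl and your own input
    only; a different data mix can change the ordering.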

    Remember that a printf costs more than a simple print.

    Don't split() every line if you don't have to.
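
    One way to follow that advice, sketched here with made-up
    passwd-style records: reject uninteresting lines with a cheap match
    first, and give split a LIMIT so it stops as soon as you have the
    fields you want.

```perl
use strict;
use warnings;

# Hypothetical colon-separated records, in the style of /etc/passwd.
my @lines = (
    "root:x:0:0:root:/root:/bin/sh",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin",
    "alice:x:1000:1000:Alice:/home/alice:/bin/sh",
);

my @shell_users;
for (@lines) {
    # Cheap test first: lines that cannot qualify are never split.
    next unless m{/bin/sh$};

    # Only the first field is wanted, so a LIMIT of 2 makes split stop
    # after the first separator instead of cutting up the whole line.
    my ($user) = split /:/, $_, 2;
    push @shell_users, $user;
}

print "@shell_users\n";
```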

    Another thing to look at is your loops.  Are you iterating through 
    indexed arrays rather than just putting everything into a hashed 
    array?  For example,

        @list = ('abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'stv');

        for $i (0 .. $#list) {
            if ($pattern eq $list[$i]) { $found++; } 
        } 

    First of all, it would be faster to use Perl's foreach mechanism
    instead of using subscripts:

        foreach $elt (@list) {
            if ($pattern eq $elt) { $found++; } 
        } 

    Better yet, this could be sped up dramatically by placing the whole
    thing in an associative array like this:

        %list = ('abc', 1, 'def', 1, 'ghi', 1, 'jkl', 1, 
                 'mno', 1, 'pqr', 1, 'stv', 1 );
        $found += $list{$pattern};
    
    (But put the %list assignment outside of your input loop.)
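
    Put together as a complete script, the hash approach might look
    something like the sketch below.  The @input list stands in for
    your real input loop and is invented for the example.

```perl
use strict;
use warnings;

# Built once, before the loop; the value 1 just marks membership.
my %list = map { $_ => 1 } qw(abc def ghi jkl mno pqr stv);

# Hypothetical input; a real script would loop over <> instead.
my @input = qw(def xyz stv def);

my $found = 0;
for my $pattern (@input) {
    # exists() is a single hash probe, and it avoids adding undef
    # to $found (with a warning) when the key is missing.
    $found++ if exists $list{$pattern};
}

print "$found\n";
```

    Each lookup costs about the same however many keys %list holds,
    whereas the array versions compare against every element.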

    You should also look at variables in regular expressions, since
    interpolating them is expensive.  If the variable to be interpolated
    doesn't change over the life of the process, use the /o modifier to
    tell Perl to compile the regexp only once, like this:

        for $i (1..100) {
            if (/$foo/o) {
                &some_func($i);
            } 
        } 
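
    If your Perl is 5.005 or later, the qr// operator is another way to
    get a pattern compiled once, and unlike /o it lets you recompile
    on purpose when the variable does change.  A minimal sketch, with
    an invented pattern and input:

```perl
use strict;
use warnings;

my $foo = 'ba[rz]';      # hypothetical pattern text, known only at run time
my $re  = qr/$foo/;      # compiled once, here

my @lines = ("foo bar", "nothing", "baz quux");    # hypothetical input
my $hits  = 0;
for (@lines) {
    $hits++ if /$re/;    # reuses the already compiled pattern
}

print "$hits\n";
```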

    Finally, if you have a bunch of patterns in a list that you'd like to
    compare against, don't interpolate each one into a match inside a
    loop like this:

        @pats = ('_get.*', 'bogus', '_read', '.*exit', '_write');
        foreach $pat (@pats) {
            if ( $name =~ /^$pat$/ ) {
                &some_func();
                last;
            }
        }

    Instead, build your code as a string and then eval it; that will be
    much faster.  For example:

        @pats = ('_get.*', 'bogus', '_read', '.*exit', '_write');
        $code = <<EOS;
                while (<>) { 
                    study;
EOS
        foreach $pat (@pats) {
            $code .= <<EOS;
                if ( /^$pat\$/ ) {
                    &some_func();
                    next;
                }
EOS
        }
        $code .= "}\n";
        print $code if $debugging;
        eval $code;
        die $@ if $@;    # report syntax errors in the generated code
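
    The same idea can be tried out without any input files.  The
    sketch below generates an anonymous subroutine instead of a
    while (<>) loop, so the compiled matcher can be called on any
    string.  The names $matches and @names are invented for the
    example, and the patterns are assumed to be trusted and free of
    "/" characters, since they are pasted into the generated source.

```perl
use strict;
use warnings;

my @pats = ('_get.*', 'bogus', '_read', '.*exit', '_write');

# Build the source of an anonymous sub with one literal regexp test
# per pattern, so nothing is interpolated when the sub runs.
my $code = "sub {\n    my (\$name) = \@_;\n";
for my $pat (@pats) {
    $code .= "    return 1 if \$name =~ /^$pat\$/;\n";
}
$code .= "    return 0;\n}\n";

# Compile the generated text once; $@ holds the error if that fails.
my $matches = eval $code
    or die "generated code failed to compile: $@";

# Hypothetical names to check.
my @names = ('_get_line', 'bogus', 'do_exit', '_reader', 'main');
for my $name (@names) {
    print "$name\n" if $matches->($name);
}
```

    Printing $code before the eval, as the example above does with
    $debugging, is a handy way to check what was generated.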