In the initial revision of the WideFinder, I couldn't get away from the thought that I was handling the data one too many times. If you look at WideFinder.run (WideFinder.java:57),
byte seq = new byte[ seek - _mbb.position() ];We're grabbing bytes out of the buffer, sticking them into a fresh byte array, and creating a new String object based on those bytes, using a specific charset. This is because the character set used by CharBuffer treats each byte as half of a 16-byte char, which means that the bytes are interpreted as non-English characters.
_mbb.get( seq );
nextLine = new String( seq, LOG_CHARSET );
Creating new bytes and new Strings like that, in a pretty tight loop, is just asking for performance trouble. I wanted to get to the point where the regex could be applied straight to the ByteBuffer. Looking at the implementation of Charset.encode, it seemed like it suffered from the same sort of space-inefficiency as my original approach.
The solution was to wrap the ByteBuffer in a trivial implementation of CharSequence that indexed directly into the current line of the buffer. The only gotcha was to use the mark() and reset() methods of the buffer in the implementation of subSequence.
Unfortunately, my efforts produced only a minimal increase in performance (~2%) on either my single-core Athlon or the T2K. So much for "low-hanging fruit".
I created a new revision of the Java 6 WideFinder with some tweaks to the ByteBuffer implementation. It's available here: wf_02.tar.gz. It turned out to be a few hours' effort for a very minimal performance gain.
With that little misadventure under my belt I decided to run the WideFinder through DTrace to see where the bulk of time was being spent. DTrace is Sun's dynamic tracing tool, it allows you to inspect a bunch of different aspects of a program's performance as it is running. If you've never used it, you should really give it a try. It's pretty amazing, the stuff you get visibility into. I gather it's also available on OS X as the "Instruments" app.
This is the output of running the method-timings.d script against the WideFinder on a 100-line log file (produced by the DataGenerator). Times given are in microseconds and represent the total time spent in methods in classes of the given package:
The absolute values are not as meaningful as the relative values. The act of observing the program's performance has skewed the measurements (where's that damned cat!), but we hope that it's skewed the measurements equally for each package. From these data, we can see that the "big spenders" are java.util.regex (~39% of total time), java.nio (~25% of total time, with subpackages), and java.lang (~15% of total time).
Here are the 10 most expensive methods in java.nio (not including subpackages), also measured in microseconds:
CharBuffer.arrayOffset 43204And here are the 10 most expensive methods in java.util.regex (although there seem to be a lot of very expensive methods in this package):
Pattern.escape 165501Looks like, if there's low-hanging fruit, it's in the regex processing. A little regex-optimization may go a long way.