Archive for the ‘System Administration’ Category

Showing progress using dd

Released September 16th, 2009

One of the frustrating behaviors of dd is that it provides no feedback about what it is doing. It does however provide a signal (USR1) that you can send to the process that will dump the current progress. Open a new terminal (I use screen) and type:

while true; do kill -USR1 `pidof dd`; sleep 2; done

(NOTE: if `pidof dd` doesn’t work for you, just use the process id directly)

Switch back to the terminal where dd is running and you should see:

9902751744 bytes (9.9 GB) copied, 732.883 s, 13.5 MB/s
9469+0 records in
9468+0 records out
9927917568 bytes (9.9 GB) copied, 734.914 s, 13.5 MB/s
9496+0 records in
9495+0 records out
9956229120 bytes (10 GB) copied, 736.941 s, 13.5 MB/s

updating every couple of seconds.

Deploying a Subversion branch with Capistrano

Released January 30th, 2008

Capistrano is a tool for automating tasks via SSH on remote servers. It has many uses, I (and many others) use it to deploy their (rails) applications. The best I can tell there is no built in way to deploy a branch from your source code control system. There are a couple ways of accomplishing this. I chose passing in the branch as a parameter to the Capistrano command:

cap --set-before branch=testbranch  deploy

(more…)

‘cvs update: move away foo.bar; it is in the way’

Released April 26th, 2007

A client site still uses cvs, the ever trusty version control system. After what seemed a run of the mill merge I noticed this:

C lib/jt400_3_3.jar
cvs update: move away lib/jt400_3_3.jar; it is in the way

Very peculiar. That file hadn’t changed on the branch. I googled around and an explanation started to come together.

First, from the always excellent Roedy Green’s Java & Internet Glossary on Mindprod:

CVS is disturbed by the appearance of repository files it did not put there. Your best bet is simply to delete the entire directory containing the offending file by hand, and re checkout or reupdate the directory to build the necessary entries. Then you can add the files safely.

Then, I found this mailing list post:

This means the file that CVS wants to checkout exists on the local machine but CVS never created the file in the past. This it isn’t CVS’s file and it complains rather than blindly overwriting.

The solution everywhere was the same: just blow away the directory and check it out again. Worked for me. Now I need to puzzle out how it happened in the first place…

Mongrel vs. WEBrick

Released April 3rd, 2007

We had the need to hook up a simple ruby web process (non-rails) to our production apache server. As Leslie Hensley pointed out, fcgi and scgi are on the way out, and mongrel + mod_proxy_balancer is the new way to go. We use mongrel extensively in our rails deploys, but always just used WEBrick to do simple serving. To make a long story short, we had to ask ourselves….is mongrel that much better (and in this case better == faster) than WEBrick?

(more…)

iSeries SQL Performance

Released March 11th, 2007

Our project has entered the stage where we start tuning the data access to improve performance. This project uses Hibernate to access a AS400/iSeries system via the JT400 JDBC driver. My prior experience using the SQL Server
database introduced me to Microsoft’s Query Analyzer, which is a nice tool for identifying where you need to index the database. A bit of searching turned up an IBM tool called Visual Explain that fits the same need. The sysadmin started the profiling tool against my connection, and I ran through a standard usage scenario in our application. Visual Explain showed each query, how long it took, the access strategy, and recommended indexes. This kind of tool is absolutely essential for database performance tuning.

The second important item we learned from this process is the performance difference between a true SQL index on the iSeries and a keyed logical index. From iSeries V4R2 on, SQL indexes have page sizes of 64k. By comparison, a DDS logical will max out at 32k, and will typically be much smaller. The performance is so much better that our customer is converting all all their DDS logical indexes to SQL, even if only their legacy RPG applications use it. This IBM document has a good discussion of this issue.

Scary apache errors moving from prefork to worker

Released March 7th, 2007

We were helping a client push more data through their apache/tomcat web application running on a Solaris box and decided to switch from the process (prefork) to the threaded(worker) processing model. When we fired up the new apache we started getting errors like these:

[emerg] (45)Deadlock situation detected/avoided: apr_proc_mutex_lock failed. Attempting to shutdown process gracefully.

Not good. After a little digging we found that the default AcceptMutex switched from pthread to fcntl between apache 2.0.49 and 2.0.52.

Adding:

AcceptMutex pthread

to httpd.conf cleared up the errors and we’ve had smooth sailing ever since.

How to build the PHP rrdtool extension by hand

Released May 9th, 2006

I think by now most sysadmin types know about rrdtool and the nice graphs it makes. I recently wanted to create some graphs by hand using PHP so I turned to the php-rrdtool extension. I found that it takes a little work to get it to compile but that could be because I’m not constantly recompiling PHP and just don’t know better. You can get this module as an rpm for fedora (php-rrdtool) but I like to compile php by hand so I couldn’t use it. I’m going to assume that you know how to compile PHP normally with whatever other items you want to include and that you have the rrdtool development libraries installed or have compiled and installed rrdtool from source.

Step 1. Get the PHP rrdtool source

Go to the contrib directory on the rrdtool distribution site:
http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/pub/contrib/

There are a number of files in this directory that mention rrd. You want the one named: php_rrdtool.tgz

(more…)

managing disk space with logrotate

Released April 26th, 2006

At a customer site, the test and production linux servers for some intranet applications were slowly running out of disk space. The apps themselves were running fine, it was the logfiles generated by the apps that were the issue; logging that came in useful just in case something went wrong, but quickly aged. In one particular case, a catalina.out file was several years old and was 11 gigabytes; 90% of it wasn’t really relevant any longer.

The solution to the problem of wrangling logfiles is probably already installed on your server: it’s logrotate. Chances are logrotate is already configured in your system crontab as a daily task, and chances are it is configured to obey any configuration files found in the /etc/logrotate.d directory. If you find that directory, you are probably good to go.

(more…)

accessing as400 databases with Ruby, Java, and the RubyJavaBridge

Released April 24th, 2006

iSeries systems and Ruby are separate universes right now; I know of no native Ruby way to connect to an AS400 system. However, with the help of Java we can bridge the gap.

Probably the simplest example is connecting to an iSeries system using JDBC to select from some tables, and with the right libraries it is straightforward. It’s also fun to be doing some nifty quick Ruby but with some preexisting and proven Java libraries.

Also, if you use Java and Ruby, definitely put arton’s RubyJavaBridge in your pocket; you’ll end up using it more than once.

What you’ll need:

  • Ruby (well yea)
  • Java (ok)
  • access to an AS400 system (that’s the whole point here…)
    For testing, I’ve been using a free account available from Holger Scherer Software und Beratung.
  • jt400.jar jdbc drivers from JTOpen, a great open-source Java library for AS400 access.
  • RubyJavaBridge is the keystone for this.

Installing the software is the hardest part of this exercise, but it’s more tedium than anything. Once you’ve got it all downloaded and running the fun can begin.

(more…)

Long arguments and getops

Released March 7th, 2006

I recently had a need to adapt a script that recrawls a site with nutch. One of my design goals was to use the same command line options as the Fetchtool (one of the steps I had to take to recrawl a site).

It became apparent fairly quickly that bash’s built-in ‘getopts’ didn’t support long command line arguments, so I had to fall back on getopt.

Here is the portion of the script that parses the command line arguments:

set -- `getopt -n$0 -u -a --longoptions="depth: adddays: topN:" "h" "$@"` || usage
[ $# -eq 0 ] && usage

while [ $# -gt 0 ]
do
    case "$1" in
       --depth)   depth=$2;shift;;
       --adddays) adddays=$2;shift;;
       --topN)    topN=$2;shift;;
       -h)        usage;;
       --)        shift;break;;
       -*)        usage;;
       *)         break;;            #better be the crawl directory
    esac
    shift
done

Deconstructing this bit by bit:

(more…)