GNU Parallel

Date and Time: 
7 November, 2010 - 11:30 - 12:00

If you have used xargs, foreach loops, or while-read loops in the shell,
GNU Parallel is for you.

GNU Parallel is a tool that makes it easy to run jobs in parallel. It is also useful for writing small scripts, especially the use-once scripts that most sysadmins write all the time.

GNU Parallel is option-compatible with xargs, but is much more powerful. It deals nicely with file names containing spaces and quotes; you only have to take extra precautions if the file names contain newlines.
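As a minimal sketch of the point about awkward file names: the scratch directory and file names below are made up for illustration. The input is passed NUL-separated (find -print0 with parallel -0), which is the usual extra precaution that also covers names containing newlines.

```shell
#!/bin/sh
# Skip quietly if GNU Parallel is not installed.
parallel --version 2>/dev/null | grep -qi 'gnu parallel' || exit 0

# Hypothetical scratch directory with awkward file names.
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/with space.txt" "$workdir/it's quoted.txt"

# NUL-separated names (-print0 / -0) stay intact even if a name
# contains spaces, quotes, or a newline.
find "$workdir" -type f -print0 | parallel -0 gzip {}

# Count the compressed files.
compressed=$(find "$workdir" -type f -name '*.gz' | wc -l | tr -d ' ')
echo "$compressed"
```

Each file is handed to gzip as a single argument, so no manual quoting is needed in the command.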

Small scripts can be written with {} as a placeholder for the file name. GNU Parallel runs the script once per file name, and the runs execute in parallel.
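A small script using {} might look like the following sketch. The files a.txt and b.txt and the .upper suffix are invented for the example; the quoted string is the script, and GNU Parallel substitutes each file name for {}.

```shell
#!/bin/sh
# Skip quietly if GNU Parallel is not installed.
parallel --version 2>/dev/null | grep -qi 'gnu parallel' || exit 0

# Hypothetical input files.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/a.txt"
printf 'world\n' > "$workdir/b.txt"

# The quoted string is the script. {} is replaced by each file name
# given after :::, and one copy of the script runs per file.
parallel 'tr a-z A-Z < {} > {}.upper' ::: "$workdir"/*.txt

cat "$workdir/a.txt.upper" "$workdir/b.txt.upper"
```

The same script would work with file names read from standard input instead of being listed after :::.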

GNU Parallel automatically detects the number of CPU cores and can schedule jobs based on this number; for example, -j+0 schedules one job per core.
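The -j option can be tried out with a sketch like this one. The echo command and the arguments 1 to 4 are placeholders; -k is added only so that the output order is predictable.

```shell
#!/bin/sh
# Skip quietly if GNU Parallel is not installed.
parallel --version 2>/dev/null | grep -qi 'gnu parallel' || exit 0

# -j+0: one job per detected CPU core.
per_core=$(parallel -j+0 -k echo processed {} ::: 1 2 3 4)

# -j2: at most two jobs at a time, regardless of the core count.
two_at_once=$(parallel -j2 -k echo processed {} ::: 1 2 3 4)

echo "$per_core"
```

Both runs produce the same output; only the number of jobs running at the same time differs.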

GNU Parallel can distribute jobs to remote computers using SSH, collect the results, and present them as if the jobs had been executed serially. Because the number of CPU cores is detected automatically, fast and slow computers can be mixed: GNU Parallel will schedule more jobs on the fast computers than on the slow ones.
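The serial-looking output can be sketched locally with -k (--keep-order). The remote variant is shown only as a comment, because server1 and server2 are hypothetical host names that are assumed to accept SSH logins without a password prompt.

```shell
#!/bin/sh
# Skip quietly if GNU Parallel is not installed.
parallel --version 2>/dev/null | grep -qi 'gnu parallel' || exit 0

# The first job sleeps longest and so finishes last, but -k prints
# the output in input order, as a serial run would.
ordered=$(parallel -k 'sleep {}; echo slept {}' ::: 1 0)
echo "$ordered"

# Assumed remote form (server1 and server2 are hypothetical hosts):
#   parallel -k -S server1,server2 'sleep {}; echo slept {}' ::: 1 0
```

Without -k, the output of the job that finishes first would typically be printed first.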

The video describes the most basic use of GNU Parallel. The talk will cover both basic use and more advanced examples.

Room 3
Free Software