HomepageTeaching PageResearch PageResources Page

Chapel Tutorial for Programmers


This tutorial is a quick introduction to Chapel for programmers. (It expects you to already be comfortable programming in another language.) This code is currently compatible with the version 1.8.0 Chapel compiler. (Last Updated Nov. 2013.) If you are waiting for it to work for a new version, it is okay to let me know I'm being slow!


More information about Chapel is available here. I'm assuming you're on a Linux/Unix system with it already installed or you will follow the directions there to install it yourself.

Let's get started! First, create a file, test.chpl in your favorite text editor. Then, let's declare our first variable. Add the following line to your code:

var maximum: real = 0.0;

This creates a new real (float) variable with value 0. Chapel derives the type of a variable itself in initialization statements, so you avoid declaring the type and instead write:

var maximum = 0.0;

Print statements are important! Let's add one after the line we just added:

writeln("maximum: ", maximum);

We can initialize an array in the following way:

var n: int = 10;
var realArray: [0..n-1] real;
realArray[0] = 1.0;
realArray[1] = 2.0;
realArray[2] = 3.0;
realArray[3] = 4.0;
realArray[4] = 5.0;
realArray[5] = 4.0;
realArray[6] = 3.0;
realArray[7] = 2.0;
realArray[8] = 1.0;
realArray[9] = 0.0;

Cool! We have an array! We can print it out with just one line:

writeln("realArray: ", realArray);

Okay, now open a terminal, make sure the Chapel command is loaded into the terminal window (see the Chapel installation instructions), and navigate to the directory where your file is located. Then type:

$ chpl test.chpl

To compile and create the executable file. After that's finished, type:

$ ./a.out

to run the executable file. The output should look something like:

maximum: 0.0
realArray: 1.0 2.0 3.0 4.0 5.0 4.0 3.0 2.0 1.0 0.0

If an error occurs, go back to your code and try to fix it. At this point, not all of the error messages are super helpful. If you get a weird message that doesn't tell you anything, you might need to ask for more help. I recommend checking out the archives of the Chapel mailing list.

Next, let's write a loop to find the maximum element in an array. Add the following to the end of your file:

for realNumber in realArray {
    if (realNumber > maximum) {
        maximum = realNumber;
    }
}
writeln("Max: ", maximum);

Compile and run this and make sure it reports that the max is 5.0. Next, let's encapsulate that loop into a function. Replace it with the following:

proc findMax(array) {
    var arrayMax = array[0];
    for number in array {
        if (number > arrayMax) {
            arrayMax = number;
        }
    }
    return arrayMax;
}

Now call the function after the array has been initialized:

writeln("Max of realArray is: ", findMax(realArray));

Get rid of the old maximum variable, compile, and run the code to make sure it's still working.

One nice part of Chapel is that you can easily measure the running time of your code using the Time package. Let's do that next! Add this to the top of your code (first line):

use Time;

Now we can create Timer objects inside the code. Add lines before and after the printing line like so:

var timer: Timer;
timer.start();
writeln("Max of realArray is: ", findMax(realArray));
timer.stop();
writeln("That took ", timer.elapsed(), " seconds.");

You may not want the printing to be timed. We can change that by splitting the print statement into two lines:

var timer: Timer;
timer.start();
maximum = findMax(realArray);
timer.stop();
writeln("Max of realArray is: ", maximum);
writeln("That took ", timer.elapsed(), " seconds.");

Compile and run your code to see how long it takes to find the maximum element. We would like to see how this changes with different values of n. The main problem is that we don't want to have to explicitly initialize each element of realArray in the code. So, next, let's learn how to create arrays of random values. We will use the Random package. Just like before, add a use statement to the top of your code:

use Time;
use Random;

Now change the section initializing realArray to the following:

var n = 10;
var realArray : [0..n-1] real;
var randomStream = new RandomStream(SeedGenerator.currentTime);
randomStream.fillRandom(realArray);
writeln("realArray: ", realArray);

Compile and run your code to make sure this new code is working. You'll see that the array is now populated by random values. Now go back and change n to be larger:

var n = 500;

Compile and run again. Wouldn't it be great to not have to recompile just because you change n? Absolutely! Let's make that happen using configurable constants. Change the initialization of n to the following:

config const n = 500;

Now if you want to run the code with a value of n different from the code, you declare the new value like so:

./a.out --n=5000

Great! Now let's try harnessing the power of extra processors! Chapel does this by creating new tasks, which are then run on available processors. For example, if we had two processors available to find the maximum value of an array, we could have two tasks: one to search each half of the array. Let's work towards that by first adding a function that looks in a range of the array. We'll use a while loop to iterate here instead of a for loop (for example's sake). Add this function to your code:

proc findMaxInRange(array, lower, upper) {
    var arrayMax = array[lower];
    while (lower <= upper) {
        if (array[lower] > arrayMax) {
            arrayMax = array[lower];
        }
        lower += 1;
    }
    return arrayMax;
}

If you try to use this function right away, Chapel complains about the line where lower is incremented, saying: "error: non-lvalue actual passed to ref argument". This happens because Chapel treats passed integers as constants; you can't modify them. You can fix this by changing the function signature:

proc findMaxInRange(array, in lower, upper) {

Here, in tells Chapel not to treat that variable as a constant. Another option to fix this problem is to create a separate index variable. Instead of using in, I will continue from the following example, using index i, which might make the code easier to read:

proc findMaxInRange(array, lower, upper) {
    var i = lower;
    var arrayMax = array[i];
    while (i <= upper) {
        if (array[i] > arrayMax) {
            arrayMax = array[i];
        }
        i += 1;
    }
    return arrayMax;
}

Compile to check that you don't have any syntax errors. Now let's change findMax to use that new function:

proc findMax(array) {
    var middle = array.numElements/2;
    var lowerMax, upperMax: real;
    lowerMax = findMaxInRange(array, 0, middle);
    upperMax = findMaxInRange(array, middle+1, array.numElements - 1);
    if (lowerMax > upperMax) { return lowerMax; }
    return upperMax;
}

The comparison at the end can be replaced with a call to the built-in max function:

proc findMax(array) {
    var middle = array.numElements/2;
    var lowerMax, upperMax: real;
    lowerMax = findMaxInRange(array, 0, middle);
    upperMax = findMaxInRange(array, middle+1, array.numElements - 1);
    return max(lowerMax, upperMax);
}

If we have two processors, we can have separate tasks find the values of the two partial-maximums in parallel. That part then may take only half as long! Chapel let's us do this really easily: just wrap them in a cobegin:

proc findMax(array) {
    var middle = array.numElements/2;
    var lowerMax, upperMax: real;
    cobegin ref(lowerMax, upperMax) {
        lowerMax = findMaxInRange(array, 0, middle);
        upperMax = findMaxInRange(array, middle+1, array.numElements - 1);
    }
    return max(lowerMax, upperMax);
}

The cobegin creates a new task for each line (with the hope they will get assigned to their own hardware thread) then waits for all of them to finish before continuing with the rest of the code. The ref(lowerMax, upperMax) tells Chapel it can modify those variables that were declared outside the cobegin. (Without this statement, it treats most of those "outside" variables as constants.) Awesome! Compile and run your code again.

Challenge: write fastFindMaxInRange, a recursive version of findMaxInRange. Important: Chapel doesn't automatically determine the return type of recursive functions, so your signature needs to declare this:

proc fastFindMaxInRange(array, lower, upper) : real {

Try to get this to work before continuing. Hint: try getting it to work serially before adding a cobegin. If you get stuck, my solution is below.

proc fastFindMaxInRange(array, lower, upper) : real {
    if (lower == upper) { return array[lower]; }
    var lowerMax, upperMax: real;
    var middle = ((upper - lower) / 2) + lower;
    cobegin ref(lowerMax, upperMax) {
        lowerMax = fastFindMaxInRange(array, lower, middle);
        upperMax = fastFindMaxInRange(array, middle + 1, upper);
    }
    return max(lowerMax, upperMax);
}

Okay, let's see what kind of speedup we attained! Add two wrapper functions to test the parallel and serial versions.

proc serialFindMax(array) {
    return findMaxInRange(array, 0, array.numElements - 1);
}

proc parallelFindMax(array) {
    return fastFindMaxInRange(array, 0, array.numElements - 1);
}

... then write some code to time them both:

timer.clear();
timer.start();
var serialMax = serialFindMax(realArray);
timer.stop();
var serialSeconds = timer.elapsed();
timer.clear();
timer.start();
var parallelMax = parallelFindMax(realArray);
timer.stop();
var parallelSeconds = timer.elapsed();
timer.clear();
writeln("SERIAL test: found the max to be ", serialMax, " in ", serialSeconds, " seconds.");
writeln("PARALLEL test: found the max to be ", parallelMax, " in ", parallelSeconds, " seconds.");
var speedup = serialSeconds / parallelSeconds;
if (speedup > 1) {
    writeln(" The parallel trial achieved a ", speedup, "x speedup over the serial!");
} else {
    writeln(" No speedup attained!");
}

Running this with n = 5000 on my 8-CPU machine gives me the following output:

$ ./a.out --n=5000
...
Serial test: found the max to be 0.999846 in 0.0008 seconds.
Parallel test: found the max to be 0.999846 in 0.55384 seconds.
Achieved a 0.00144446x speedup!

Yuck! One problem here is that there is too much overhead creating a bunch of different tasks that aren't able to run simultaneously. It's not advantageous to create too many tasks. (If you have more processors, you're probably getting better results.) To fix this problem, let's cap the number of tasks we create. The idea is that we'll specify the number of tasks we are allowed to use. To do this, create a new function, hybridFindMaxInRange, similar to fastFindMaxInRange, except with an extra parameter, numTasks, which specifies the maximum number of tasks to use in this function call. Try to write this yourself; my solution is below if you get stuck.

proc hybridFindMaxInRange(array, lower, upper, numTasks) : real {
    if (lower == upper) { return array[lower]; }
    if (numTasks == 1) {
        return findMaxInRange(array, lower, upper);
    }
    var lowerMax, upperMax: real;
    var middle = ((upper - lower) / 2) + lower;
    var lowerTasks = numTasks / 2;
    var upperTasks = numTasks - lowerTasks;
    cobegin ref(lowerMax, upperMax) {
        lowerMax = hybridFindMaxInRange(array, lower, middle, lowerTasks);
        upperMax = hybridFindMaxInRange(array, middle+1, upper, upperTasks);
    }
    return max(lowerMax, upperMax);
}

proc hybridFindMax(array, numTasks) {
    return hybridFindMaxInRange(array, 0, array.numElements-1, numTasks);
}

After this, add some code to test this just like the parallel version, including reporting the speedup over the serial. I added another configurable constant for the number of threads I wanted to use. I actually ran a number of tests, slowly increasing n. The parallel test started taking too much time, so I had to comment it out. Here are the results of one of my tests:
$ ./a.out --n=100000000 --threads=8
...
SERIAL test: found the max to be 1.0 in 6.5324 seconds.
HYBRID test: found the max to be 1.0 in 0.816864 seconds.
The hybrid achieved a 7.99692x speedup over the serial trial!

That's much better! This gave us some actual speedup! We've done a lot here. There is, however, one way to do this entire thing in just one line of Chapel code. This uses a parallel algorithm style called a reduction, which is a single pairwise operation applied to an array using a parallel divide-and-conquer as appropriate. Here's how to write that single line:

maximum = max reduce realArray;

For comparison, here is my code for this tutorial. Inside I commented out sections instead of deleted them so you might be able to find whatever section you were working on.

Acknowledgements: Thanks to Ernie Heyder for doing ALL the proofreading for early versions of this. Thanks to Brad Chamberlain both for pointing out errors and all his work on the Chapel project. Thanks to David Bunde for his own excellent tutorial and for working with me on our workshops at SC.

I'd love to add more things to this page; please let me know of anything you could suggest!