
9. Functions

A function may be thought of as a group of statements that work together to perform a computation. While there are no imposed limits upon the number of statements that may occur within a function, it is considered poor programming practice if a function contains many statements. This notion stems from the belief that a function should have a simple, well-defined purpose.

9.1 Declaring Functions

Like variables, functions must be declared before they can be used. The define keyword is used for this purpose. For example,

      define factorial ();
is sufficient to declare a function named factorial. Unlike the variable keyword used for declaring variables, the define keyword does not accept a list of names.

Usually, the above form is used only for recursive functions. In most cases, the function name is followed by a parameter list and the body of the function:

define function-name (parameter-list) { statement-list }
The function-name is an identifier and must conform to the naming scheme for identifiers discussed in chapter ???. The parameter-list is a comma-separated list of variable names that represent parameters passed to the function, and may be empty if no parameters are to be passed. The body of the function is enclosed in braces and consists of zero or more statements (statement-list).

The variables in the parameter-list are implicitly declared; thus, there is no need to declare them via a variable declaration statement. In fact, any attempt to do so will result in a syntax error.

9.2 Parameter Passing Mechanism

Parameters to a function are always passed by value and never by reference. To see what this means, consider

     define add_10 (a) 
     {
        a = a + 10;
     }
     variable b = 0;
     add_10 (b);
Here a function add_10 has been defined which, when executed, adds 10 to its parameter. A variable b has also been declared and initialized to zero before it is passed to add_10. What will be the value of b after the call to add_10? If S-Lang were a language that passed parameters by reference, the value of b would be changed to 10. However, S-Lang always passes by value, which means that b retains its value of zero after the function call.

S-Lang does provide a mechanism for simulating pass by reference via the reference operator. See the next section for more details.
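When only one value needs to be updated, the caller can also simply assign the function's return value back to the variable. The following is a minimal sketch of that idea; the name plus_10 is hypothetical and is used here only to avoid clashing with add_10 above.

```slang
% plus_10 is a hypothetical variant of add_10 that returns its result
% instead of trying to modify its parameter.
define plus_10 (a)
{
   return a + 10;    % a is a local copy; the caller's variable is untouched
}

variable b = 0;
b = plus_10 (b);     % the caller stores the returned value, so b is now 10
```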

If a function is called with a parameter in the parameter list omitted, the corresponding variable in the function will be set to NULL. To make this clear, consider the function

     define add_two_numbers (a, b)
     {
        if (a == NULL) a = 0;
        if (b == NULL) b = 0;
        return a + b;
     }
This function must be called with two parameters. However, we can omit one or both of the parameters by calling it in one of the following ways:
     variable s = add_two_numbers (2,3);
     variable s = add_two_numbers (2,);
     variable s = add_two_numbers (,3);
     variable s = add_two_numbers (,);
The first example calls the function using both parameters; however, at least one of the parameters was omitted in the other examples. The interpreter will implicitly convert the last three examples to
     variable s = add_two_numbers (2, NULL);
     variable s = add_two_numbers (NULL, 3);
     variable s = add_two_numbers (NULL, NULL);
It is important to note that this mechanism is available only for function calls that specify more than one parameter. That is,
     variable s = add_10 ();
is not equivalent to add_10(NULL). The reason for this is simple: the parser can only tell whether or not NULL should be substituted by looking at the position of the comma character in the parameter list, and only function calls that indicate more than one parameter will use a comma. A mechanism for handling single parameter function calls is described in the section on functions with a variable number of arguments (section 9.4).

9.3 Referencing Variables

One can achieve the effect of passing by reference by using the reference (&) and dereference (@) operators. Consider again the add_10 function presented in the previous section. This time we write it as

     define add_10 (a)
     {  
        @a = @a + 10;
     }
     variable b = 0;
     add_10 (&b);
The expression &b creates a reference to the variable b, and it is the reference that gets passed to add_10. When the function add_10 is called, the value of a will be a reference to b. It is only by dereferencing this value that b can be accessed and changed. So, the statement @a=@a+10; should be read as `add 10 to the value of the object that a references and assign the result to the object that a references'.

The reader familiar with C will note the similarity between references in S-Lang and pointers in C.

One of the main purposes of references is that they allow references to functions to be passed to other functions. As a simple example from elementary calculus, consider the following function, which returns an approximation to the derivative of another function at a specified point:

     define derivative (f, x)
     {
        variable h = 1e-6;
        return (@f(x+h) - @f(x)) / h;
     }
It can be used to differentiate the function
     define x_squared (x)
     {
        return x^2;
     }
at the point x = 3 via the expression derivative(&x_squared,3).
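Putting the two definitions together gives the following sketch. It repeats derivative and x_squared from above so that it stands alone. The second call assumes that a reference to the intrinsic function sin can be taken with & in the same way as for a user-defined function. Because derivative uses a forward difference with a small step, the results are only approximations to the exact derivatives.

```slang
define x_squared (x)
{
   return x^2;
}

define derivative (f, x)
{
   variable h = 1e-6;
   return (@f(x+h) - @f(x)) / h;    % forward-difference approximation
}

% The exact derivative of x^2 at x = 3 is 6; d is close to that value.
variable d = derivative (&x_squared, 3);

% Assumption: &sin yields a reference to the intrinsic sin function.
% The exact derivative of sin at 0 is cos(0) = 1; c is close to that value.
variable c = derivative (&sin, 0.0);
```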

9.4 Functions with a Variable Number of Arguments

S-Lang functions may be defined to take a variable number of arguments. The reason for this is that the calling routine pushes the arguments onto the stack before making a function call, and it is up to the called function to pop the values off the stack and make assignments to the variables in the parameter list. These details are, for the most part, hidden from the programmer. However, they are important when a variable number of arguments is passed.

Consider the add_10 example presented earlier. This time it is written

     define add_10 ()
     {
        variable x;
        x = ();
        return x + 10;
     }
     variable s = add_10 (12);  % ==> s = 22;
For the uninitiated, this example looks as if it is destined for disaster. The add_10 function looks like it accepts zero arguments, yet it was called with a single argument. On top of that, the assignment to x looks strange. The truth is, the code presented in this example makes perfect sense, once you realize what is happening.

First, consider what happens when add_10 is called with the parameter 12. Internally, 12 is pushed onto the stack and then the function is called. Now, consider the function itself. x is a variable local to the function. The strange looking assignment `x=()' simply takes whatever is on the stack and assigns it to x. In other words, after this statement, the value of x will be 12, since 12 will be at the top of the stack.

A generic function of the form

    define function_name (x, y, ..., z)
    {
       .
       .
    }
is internally transformed by the interpreter to
    define function_name ()
    {
       variable x, y, ..., z;
       z = ();
       .
       .
       y = ();
       x = ();
       .
       .
    }
before further parsing. (The add_10 function, as defined above, is already in this form.) With this knowledge in hand, one can write a function that accepts a variable number of arguments. Consider the function:
    define average_n (n)
    {
       variable x, y;
       variable sum;
       
       if (n == 1) 
         {
            x = ();
            sum = x;
         }
       else if (n == 2)
         {
            y = ();
            x = ();
            sum = x + y;
         }
       else error ("average_n: only one or two values supported");
       
       return sum / n;
    }
    variable ave1 = average_n (3.0, 1);        % ==> 3.0
    variable ave2 = average_n (3.0, 5.0, 2);   % ==> 4.0
Here, the last argument passed to average_n is an integer reflecting the number of quantities to be averaged. Although this example works fine, its principal limitation is obvious: it only supports one or two values. Extending it to three or more values by adding more else if constructs is rather straightforward but hardly worth the effort. There must be a better way, and there is:
   define average_n (n)
   {
      variable sum, x;
      sum = 0;
      loop (n) 
        {
           x = ();    % get next value from stack
           sum += x;
        }
      return sum / n;
   }
The principal limitation of this approach is that one must still pass an integer that specifies how many values are to be averaged.

Fortunately, a special variable exists that is local to every function and contains the number of values that were passed to the function. That variable has the name _NARGS and may be used as follows:

   define average_n ()
   {
      variable x, sum = 0;
      
      if (_NARGS == 0) error ("Usage: ave = average_n (x, ...);");

      loop (_NARGS)
        {
           x = ();
           sum += x;
        }
      return sum / _NARGS;
   }
Here, if no arguments are passed to the function, a simple message that indicates how it is to be used is printed out.
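The sketch below shows the _NARGS version of average_n being called with different numbers of values. It also shows how _NARGS can give a single-parameter function a default value, which the comma-based NULL substitution of section 9.2 cannot do. The name add_10_default and the choice of 0 as the default are hypothetical and serve only as an illustration.

```slang
define average_n ()
{
   variable x, sum = 0;

   if (_NARGS == 0) error ("Usage: ave = average_n (x, ...);");

   loop (_NARGS)
     {
        x = ();          % pop the next value from the stack
        sum += x;
     }
   return sum / _NARGS;
}

variable ave1 = average_n (3.0);             % one value:    3.0 / 1
variable ave3 = average_n (1.0, 2.0, 6.0);   % three values: 9.0 / 3

% add_10_default is a hypothetical variant of add_10 that falls back to
% x = 0 when it is called without an argument.
define add_10_default ()
{
   variable x = 0;
   if (_NARGS == 1) x = ();   % pop the argument only if one was passed
   return x + 10;
}

variable s0 = add_10_default ();     % no argument:  0 + 10
variable s1 = add_10_default (12);   % one argument: 12 + 10
```

Note that add_10_default pops at most one value. If it were called with more arguments, the extra values would be left on the stack.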

9.5 Returning Values

As stated earlier, the usual way to return values from a function is via the return statement. This statement has the simple syntax

return expression-list ;
where expression-list is a comma-separated list of expressions. If the function does not return any values, the expression list will be empty. As an example of a function that can return multiple values, consider
        define sum_and_diff (x, y)
        {
            variable sum, diff;

            sum = x + y;  diff = x - y;
            return sum, diff;
        }
which is a function returning two values.

It is extremely important to note that the calling routine must explicitly handle all values returned by a function. Although some languages such as C do not have this restriction, S-Lang does, and it is a direct result of an S-Lang function's ability to return many values and accept a variable number of parameters. Examples of properly handling the above function include

       variable sum, diff;
       (sum, diff) = sum_and_diff (5, 4);  % ignore neither
       (sum, ) = sum_and_diff (5, 4);      % ignore diff
       (,) = sum_and_diff (5, 4);          % ignore both sum and diff
See the section below on assignment statements for more information about this important point.

9.6 Multiple Assignment Statement

S-Lang functions can return more than one value, e.g.,

       define sum_and_diff (x, y)
       {
          return x + y, x - y;
       }
returns two values. It accomplishes this by placing both values on the stack before returning. If you understand how S-Lang functions handle a variable number of parameters (section ???), then it should be rather obvious how one assigns such values to variables. One way is to use, e.g.,
      sum_and_diff (9, 4);
      d = ();
      s = ();

However, the most convenient way to accomplish this is to use a multiple assignment statement such as

       (s, d) = sum_and_diff (9, 4);
The most general form of the multiple assignment statement is
     ( var_1, var_2, ..., var_n ) = expression;
In fact, internally the interpreter transforms this statement into the form
     expression; var_n = (); ... var_2 = (); var_1 = ();
for further processing.

If you do not care about one of the return values, simply omit the variable name from the list. For example,

        (s, ) = sum_and_diff (9, 4);
assigns the sum of 9 and 4 to s and the difference (9-4) will be removed from the stack.

As another example, the jed editor provides a function called down that takes an integer argument and returns an integer. It is used to move the current editing position down the number of lines specified by the argument passed to it. It returns the number of lines it successfully moved the editing position. Often one does not care about the return value from this function. Although it is always possible to handle the return value via

       variable dummy = down (10);
it is more convenient to use a multiple assignment expression and omit the variable name, e.g.,
       () = down (10);

Some functions return a variable number of values instead of a fixed number. Usually, the value at the top of the stack will indicate the actual number of return values. For such functions, the multiple assignment statement cannot directly be used. To see how such functions can be dealt with, consider the following function:

     define read_line (fp)
     {
        variable line;
        if (-1 == fgets (&line, fp))
          return -1;
        return (line, 0);
     }
This function returns either one or two values, depending upon the return value of fgets. Such a function may be handled as in the following example:
      status = read_line (fp);
      if (status != -1)
        {
           s = ();
           .
           .
        }
In this example, the last value returned by read_line is assigned to status and then tested. If it is not -1, the second return value is assigned to s. In particular, note the empty set of parentheses in the assignment to s. This simply indicates that whatever is on the top of the stack when the statement is executed will be assigned to s.

Before leaving this section, it is important to reiterate the fact that if a function returns a value, the caller must deal with that return value. Otherwise, the value will continue to live on the stack and may eventually lead to a stack overflow error. Failing to handle the return value of a function is the most common mistake that inexperienced S-Lang programmers make. For example, the fflush function returns a value that many C programmers never check. Instead of writing

      fflush (fp);
as one could in C, an S-Lang programmer should write
      () = fflush (fp);
in S-Lang. (Many good C programmers write (void)fflush(fp) to indicate that the return value is being ignored.)
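The effect of an unhandled return value can be observed by comparing the stack depth before and after a call. The following sketch assumes the intrinsic functions _stkdepth (the number of items currently on the stack) and _pop_n (remove a given number of items), neither of which is described in this chapter. The name count_leaked is hypothetical.

```slang
define sum_and_diff (x, y)
{
   return x + y, x - y;
}

% count_leaked is a hypothetical helper that calls sum_and_diff without
% handling its return values and reports how many were left behind.
define count_leaked ()
{
   variable before, leaked;

   before = _stkdepth ();
   sum_and_diff (9, 4);               % both return values stay on the stack
   leaked = _stkdepth () - before;    % number of unhandled values
   _pop_n (leaked);                   % discard them explicitly
   return leaked;
}

variable n = count_leaked ();         % n is 2: the sum and the difference
```

In ordinary code, the multiple assignment form (,) = sum_and_diff (9, 4); is the preferred way to discard both values.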

9.7 Exit-Blocks

An exit-block is a set of statements that gets executed when a function returns. Exit-blocks are very useful for cleaning up when a function returns via an explicit call to return from deep within a function.

An exit-block is created by using the EXIT_BLOCK keyword according to the syntax

EXIT_BLOCK { statement-list }
where statement-list represents the list of statements that comprise the exit-block. The following example illustrates the use of an exit-block:
      define simple_demo ()
      {
         variable n = 0;

         EXIT_BLOCK { message ("Exit block called."); }

         forever
          {
            if (n == 10) return;
            n++;
          }
      }
Here, the function contains an exit-block and a forever loop. The loop will terminate via the return statement when n is 10. Before it returns, the exit-block will get executed.

A function can contain multiple exit-blocks, but only the last one encountered during execution will actually get executed. For example,

      define simple_demo (n)
      {
         EXIT_BLOCK { return 1; }
         
         if (n != 1)
           {
              EXIT_BLOCK { return 2; }
           }
         return;
      }
If 1 is passed to this function, the first exit-block will get executed because the second one would not have been encountered during the execution. However, if some other value is passed, the second exit-block would get executed. This example also illustrates that it is possible to explicitly return from an exit-block, although nested exit-blocks are illegal.
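The behavior just described can be checked with a short sketch. It repeats the definition of simple_demo from above and records the value returned for two different arguments. The variable names r1 and r2 are hypothetical.

```slang
define simple_demo (n)
{
   EXIT_BLOCK { return 1; }

   if (n != 1)
     {
        EXIT_BLOCK { return 2; }   % replaces the first exit-block when reached
     }
   return;
}

variable r1 = simple_demo (1);   % only the first exit-block was encountered: 1
variable r2 = simple_demo (5);   % the second exit-block was encountered last: 2
```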

