Generators in C

A “generator” is like a function, except that it produces a series of values rather than a single return value.  Generators are a nice way to produce the values of a collection one by one, and are often used with iterator constructs.  Generators have also found use as an event-driven scheduling construct and as the basis for light-weight thread packages.  I used them to write a temporal expression package in Oroboro.

An example generator in Python is defined as shown below.  Each execution of the “yield” statement suspends execution of the function and produces a value.

def series_of_ints(n):
  i = 0
  while i < n:
    yield i
    i = i + 1
  return

In Python, it would be used like this.

g = series_of_ints(10)
for i in g:
  print "The next value is", i

Calling the generator function “series_of_ints” acts as a factory: it produces a generator object.  The generator’s “next” method produces the next value in the sequence, or raises StopIteration when the sequence is exhausted (which is what ends the for loop above).

C has no such notion of generators.  Part of the reason is that a generator must save its state in the heap rather than on the stack, since it may be resumed after control has returned to its caller.

I wanted to do an experiment and see what it would be like to implement something “generator-like” in C.  I do not particularly like what I came up with, but here is a very short explanation of how it works.  In my experiment, a generator is defined to be a function with a single argument of a special type.  All such generators have the same “stack profile” while active.  Since generators have the same stack profile, it is possible to jump between the instructions of various generators using a technique similar to what has been called a trampoline.  I take advantage of the fixed stack profile by writing a “generator helper macro” that sets up the stack and jumps to the next instruction of the generator using setjmp/longjmp.

Because all generators in my system must have the same stack profile, no generator function may use any local variables.  (Local variables are allocated on the stack.) Generators have to arrange to store their variables in the heap.  I defined a macro “V()” to help reference variables in the heap in a small effort to make the code prettier.
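
To make this concrete, here is a rough sketch of what the heap record and the V() macro might look like.  (The real definitions live in slam_gen.c; the field names, the jmp_buf bookkeeping and the “vars” layout below are illustrative assumptions, not the actual code.)

#include <setjmp.h>

/* Per-generator "locals", kept in the heap record instead of on the stack. */
struct slam_gen_vars {
  int i;
};

typedef struct slam_gen_s {
  const char *name;            /* generator name, used in messages            */
  jmp_buf resume_ctx;          /* saved resume point ("instruction pointer")  */
  jmp_buf caller_ctx;          /* where SLAM_GEN_YIELD jumps back to          */
  int value;                   /* last value yielded                          */
  struct slam_gen_vars vars;   /* heap-allocated "local" variables            */
} slam_gen_t, *slam_gen_p;

/* V(i) reads or writes a heap slot instead of a stack local; "me" is the
 * generator's single argument. */
#define V(x) ((me)->vars.x)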

Here is an example of a counter generator that yields a series of ints.  Between calls to the generator, its instruction pointer and the value of its local variable (“i”) are saved on the heap.

int counter1(slam_gen_p me)
{
  SLAM_GEN_STARTUP;

  for (V(i) = 0; V(i) < 10; V(i)++) {
    fprintf(stderr, "+++ gen:%s i:%d\n", me->name, V(i));
    SLAM_GEN_YIELD(V(i));
  }
  return 0;
}

The generator yields with a value, or terminates execution by returning 0.  Yielding and returning are two different things.

Use of the generator looks like this:

slam_gen_p gen1;

int i1;
gen1 = slam_gen_init(counter1);
do {
  i1 = slam_gen_next(gen1);
} while (i1 != 0);

The call to the “next” function resumes the generator.  The value i1 “returned” is 1 if the generator yielded and 0 if it returned.  The last value that the generator yielded with is available in its heap record through a macro.
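
The accessor macro itself is defined in slam_gen.c; purely as a hypothetical sketch consistent with the record shown earlier, reading the yielded value inside the loop might look like this (the “value” field name is an assumption):

do {
  i1 = slam_gen_next(gen1);                    /* 1 = yielded, 0 = returned     */
  if (i1 != 0)
    fprintf(stderr, "got %d\n", gen1->value);  /* "value" field is illustrative */
} while (i1 != 0);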

You can view an example program here: slam_gen.c

This little experiment explored the following ideas:

  • restricting the stack-profile of all generators to be the same (a single arg) so that a helper could set up that profile
  • using setjmp/longjmp to jump from the helper into the generator and back out through YIELD
  • using return values to signify whether YIELD or EXIT was intended, not what the yielded value was
  • moving “local” variables into the heap and accessing them through a macro

In the end, I think this is just too ugly to actually use without real compiler or interpreter support, and it may not be very portable.  C is simply not the right language for this: stick with an interpreter.

Managing Expectations

It is rare that a “soft” product is delivered and deemed complete and correct on the first try.  This was something I needed to learn after leaving a more academic world.  In contrast to a theory or paper, which may be right or not, deliverable soft products continually evolve, and the incremental releases produced either meet the expectations of the consumer or do not.  Managing those expectations is as important a piece of the process as producing the release itself.

This is where incremental releases can help both producers and consumers of soft products.  A series of incremental releases helps both parties align goals and expectations.  Each release may introduce only one or a few features, so the customer is not overwhelmed with information at each release.  If the project has more of the flavor of a joint-engineering effort, objections to the way the project is proceeding are more easily resolved when they are raised early rather than late.

It is also important to communicate with your customer along the way and let them know what you are trying to do and how you intend to get there, especially if your choices affect the way their work will be done.  Finding a way to let your customer feel that they participated in the process gives them a sense of ownership that will not exist if the release/deliverable process is unidirectional.

One of the best pieces of advice I received about managing expectations concerned announcing schedule “adjustments” (usually a schedule slip).  The advice was that if the slip is one week, it should be communicated a week before it happens; if it is a two-month slip, a two-month lead should be given, and so on.  Schedule adjustments are a reality of engineering projects.  The way they are managed can make all the difference in keeping relationships healthy and communication open.

Guidelines for Team Projects

A few years ago, I had the opportunity to design and guide the architecture of a fairly large software system developed by a team of engineers.  The code implemented simulation models and test generators for a family of logic components, and included software modules written in C, Verilog, Specman and a few other things.  The code base was used internally and was also released to customers.  As a moderately large software project developed under enormous schedule pressure, I think we succeeded in producing a well-organized, clean product.

Organization up front was the key to our productivity.  We had guidelines in place for where modules lived and for their naming conventions.  We employed standardized names for common operations.  The result was that those of us on the team could step into someone else’s role when necessary and understand what they were working on, because a common style had been observed.  Of course, there was still plenty of room for individuality and ingenuity in solving problems, but little energy was wasted on inventing things like naming schemes or file organization.

One of the things that was interesting about the C-based portion of this project was that we needed it to be widely portable across a variety of operating systems and simulation environments.  In the end it ran on AIX, Solaris, HPUX and Linux on 32- and 64-bit machines.  It operated as a library under Verilog, MESA, VHDL and Specman, and also operated standalone.  We isolated it from the peculiarities of memory management and file I/O on each of these systems so that porting could be simplified.  Up-front planning of the interfaces that isolate this system from its environment made porting a relatively small effort, localized to a few files.

In the project mentioned above, we established project-wide conventions as well as language-specific ones.  Recently, in the development of a new project, I laid out some guidelines for the organization of its C-based portion.  Most of these “best practices” are things I observed from others in my years developing software systems of various types.  Some come from specific “C” experts, and a few are simply idiosyncratic.  ALL are debatable!  Many of them anticipate porting the code collection to new environments, and all of them promote team-based productivity through regularity.

Here is what I wrote down.

Programming Guidelines for SLAM code

PREFIX AND CAPITALIZATION

- All file names and externally visible symbols begin with "slam_".

- External function names use underscores and lower case.  (CamelCase
  is to be avoided.)

  Example: slam_buf_new(n)

PACKAGES

- The header file for a package is named "slam_pkg.h"

- If a package has state and requires initialization, it has a "slam_pkg_init()" function.
- A package should be able to be initialized multiple times with no
  ill effects.
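
  For example, a minimal sketch of an idempotent init function (the
  package name and guard variable below are placeholders):

    static int slam_foo_initialized = 0;

    int slam_foo_init(void)
    {
      if (slam_foo_initialized)     /* second and later calls: no ill effects */
        return 0;
      /* ... allocate and set up package state here ... */
      slam_foo_initialized = 1;
      return 0;
    }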

COLLECTIONS

- Collections like hash tables, buffers and tuples are supported.
  Where possible, these use similar names and argument lists for
  consistency (a usage sketch appears at the end of this section).

- New:

  Collections are allocated with a "_new" function, which may require
  arguments describing the collection.

  c = slam_col_new(...)

- Free:

  A collection is freed with the "_free" function.  This function will
  free the storage associated with the collection but not the items.

  slam_col_free(c)

  A deep free may be supplied and if so, will be named

  slam_col_free_deep(c)

- Copy:

  A copy of a collection is created with the "_copy" function.  This
  function will copy the collection container and the elements that
  are held by value.  It is not a "deep copy."

  If a deep copy function is provided, it is named "_copy_deep".

- Length:

  The length or size of a collection is returned by the "_len" function.

  len = slam_col_len(c)

- Indexed access: if the collection supports indexing with integers, "0"
  refers to the first element and "len-1" to the last.  Also, in the
  style of Python, "-1" refers to the last element and "-len" refers
  to the first.

  Indexed access functions are "_get" and "_put".  The order of the
  arguments is:

  x = slam_col_get(col, index, data)
  x = slam_col_put(col, index, &data)

- Insert/Append:

  The function "_ins" implies making space for a new value so that it
  may be put at the "index" or "key" specified.  [The insert/append
  functions modify mutable collections (buf, hash), and return new
  copies for immutable collections (tuple).]

  Examples:

  INSERT

    x = slam_hash_ins(hash, key, value) - make space for key and link to  value
    x = slam_buf_ins(buf, 11, value) - expand the buffer and put value at index 11

  APPEND

    x = slam_buf_app(buf, -1, value) - put new value after end of
    current buffer

- Delete

  The function "_del" removes an item from a collection and possibly
  reduces the length of the collection.

  An item is identified for deleting by an integer index that is
  returned by some function that placed the item there, or announced
  its location.

- Return Values: collection functions (except "_free") should return a
  success code.  "-1" is failure.  Non-negative values are success,
  and may have a meaning.

  x = slam_hash_get: returns the index where the key was found

  x = slam_buf_ins: returns the index where the value was inserted
  x = slam_buf_app: returns the index where the value was appended

  (The index may be used with the appropriate "_del" function.)

- Build/Parse (comprehensions)

  The "_build" function may use varargs to take a format string to
  build an entire collection from a list of arguments.

  The "_parse" function may be used to extract elements from a
  collection.
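
  As an illustrative sketch of the conventions above (the exact prototypes
  and element types are defined by each package; the buffer calls below are
  assumptions consistent with the naming rules):

    slam_buf_p buf;                      /* buffer collection handle           */
    void      *item = NULL;              /* element type is an assumption      */
    int        x, n;

    buf = slam_buf_new(16);              /* "_new": allocate the collection    */
    x   = slam_buf_app(buf, -1, item);   /* "_app": append after the end       */
    x   = slam_buf_ins(buf, 0, item);    /* "_ins": make room at index 0       */
    n   = slam_buf_len(buf);             /* "_len": number of elements         */
    x   = slam_buf_del(buf, x);          /* "_del": remove by returned index   */
    if (x == -1)
      slam_printf("*** SLAM_BUF: delete failed\n");
    slam_buf_free(buf);                  /* frees the container, not the items */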

VARIABLES AND THROWAWAY VARIABLES

- Use short names like "i" and "s" for things like integers and
  strings that are examined in the next line or so and forgotten.

- Use short name "x" for variable whose value is simply thrown away or
  overwritten.

- In the future: use "xfoo" to name throwaway variables.

ERROR/WARNING MESSAGES

- Use prefix "+++ SLAM_PKG" for messages that are meaningful mostly to
  the SLAM programmer.

  It is appropriate to use fprintf(stderr) for these messages as they
  may be necessary for low-level debugging.

  It is also acceptable to use "slam_printf()"

- Use prefix "*** SLAM_PKG" for messsages that are intended for users.

  These messages should almost always be printed using "slam_printf()"
  so that they may be redirected to logs or output media.

- Normal user-level messages should NOT use printf/fprintf, but should
  use "slam_printf()" at all times.

- Use prefix "@@@" for messages from Verilog code examples.

PROGRESS MESSAGES

- Messages that are intended for debugging are selectively controlled
  on a package or feature basis with a "_verbosity" flag.  The
  "_verbosity" flag's default value may be overridden with an
  associated "_VERBOSITY" environment variable whose presence
  indicates that its values should override the default.

- Progress messages MUST be disable-able through a "_VERBOSITY" flag.

- The "_VERBOSITY" environment variable should be examined no more
  than ONCE during program execution.
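
  A minimal sketch of this pattern (the package name, default value and
  helper layout below are assumptions):

    #include <stdlib.h>

    static int slam_foo_verbosity = 0;        /* default: quiet */

    /* Examine SLAM_FOO_VERBOSITY exactly once per program execution. */
    void slam_foo_verbosity_init(void)
    {
      static int examined = 0;
      char *s;

      if (examined)
        return;
      examined = 1;
      s = getenv("SLAM_FOO_VERBOSITY");
      if (s != NULL)
        slam_foo_verbosity = atoi(s);         /* presence overrides the default */
    }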

MEMORY ALLOCATION

- Avoid calling malloc/realloc/free and other functions that allocate
  memory directly.  Instead use the functions in "slam_malloc.h".

  slam_malloc()
  slam_free()
  slam_strdup() - because this allocates memory

  This makes it easier to port the SLAM collection to other memory
  management systems.
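
  slam_malloc.h itself is not reproduced here.  As a sketch of the intent
  (by default the wrappers might simply forward to the C library, and a
  port can re-point them at the host environment's allocator):

    #include <stdlib.h>
    #include <string.h>

    void *slam_malloc(size_t n) { return malloc(n); }
    void  slam_free(void *p)    { free(p); }

    char *slam_strdup(const char *s)     /* allocates memory, so wrapped too */
    {
      char *copy = slam_malloc(strlen(s) + 1);
      if (copy != NULL)
        strcpy(copy, s);
      return copy;
    }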

PARAMETER PASSING

- ANSI C: it is ok to assume that structs may be passed by value and
  that the compiler knows how to copy the chunk.

TYPE NAMES AND STRUCTS AND POINTERS

- Use typedef and the following conventions:

  struct slam_foo_s - is the struct name
  typedef struct slam_foo_s *slam_foo_p - is the pointer to a struct
  typedef struct slam_foo_s slam_foo_t - is the typedef'd name of the
  struct

- Keep structure members simple and use normal-sized types (int,
  double) instead of space-optimal types (short int, float).  This
  will make it easier to port the code and to inspect data structures
  from other language systems.

- Avoid bit-aligned structures.

- Use unions appropriately to handle types whose sizes may be different

- VOID*:  It is ok to assume that an "int", "char" or "char*" fits
  into a "void*".  It is ok to assume that any pointer fits into a
  "void*".  It is NOT ok to put a "double" or "float" into a "void*".

- Use the "typemarker" technique to label structs with what they are
  for debugging purposes and to assert that they are what they are.
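
  A sketch of these conventions together (the typemarker value, field name
  and check macro below are illustrative assumptions):

    #include <assert.h>

    #define SLAM_FOO_TYPEMARKER 0x464f4f21      /* "FOO!" marks a slam_foo_s */

    struct slam_foo_s {
      int    typemarker;       /* typemarker: says what this struct is  */
      int    count;            /* normal-sized members: int, double     */
      double scale;
    };

    typedef struct slam_foo_s  slam_foo_t;      /* typedef'd struct name  */
    typedef struct slam_foo_s *slam_foo_p;      /* pointer to the struct  */

    /* assert that a pointer really refers to a slam_foo_s */
    #define SLAM_FOO_CHECK(p) \
      assert((p) != NULL && (p)->typemarker == SLAM_FOO_TYPEMARKER)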

UNIT-TEST

- A package should include a short self-test of its functionality.
  This is called its "unit test."

- The unit test is enabled through the flag "#ifdef UNITTEST"

- For a package named "slam_foo" its unittest executable is named
  "slam_foo_unittest"

- If a package supports multiple unit tests, they are controlled with
  flags UNITTEST, UNITTEST0, UNITTEST1, etc. with associated
  executables "slam_foo_unittest", "slam_foo_unittest0",
  "slam_foo_unittest1", etc.

DEBUG AND ASSERT

- Use the "assert()" macro liberally and check consistencies
  everywhere.

- Use the "#ifdef DEBUG" flag liberally, but assume that it is either
  ON for all SLAM code, or OFF for all SLAM code.  I.e., that it is
  not controlled on a module-by-module basis.

- Use a module-level "#define SLAM_MODULE_DEBUG" flag to control
  module-level debugging.

ABORT

- Use the "abort()" function to terminate execution immediately.

COMPILATION

- All code must compile cleanly and silently (no warnings).

RE-ENTRANT CODE

- avoid using static variables to allow a group of functions to work
  together.  Instead, require the programmer to allocate and pass
  state variables explicitly.
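
  A sketch of the difference (the names below are illustrative):

    /* Avoid: hidden static state shared by a group of functions. */
    static int slam_scan_pos;                         /* not re-entrant */
    int slam_scan_next_static(const char *s) { return s[slam_scan_pos++]; }

    /* Prefer: the caller allocates the state and passes it explicitly. */
    typedef struct slam_scan_s { int pos; } slam_scan_t, *slam_scan_p;

    int slam_scan_next(slam_scan_p sc, const char *s)
    {
      return s[sc->pos++];               /* state travels with the call */
    }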

SUPPORT FOR 64-BIT

- To do.

Event-driven Analog Nonlinear Feedback

To date, the examples I have shown use switch-level analog (SLA) models of systems with sources, switches and resistors.  The SLA package also provides op-amps and dependent sources.  These can be used to model systems that exhibit some interesting feedback behavior.

Flashback

Why use event-driven analog models?  The need stems from the type of response checkers that are easy to write.  A successful test of a system that yields useful information requires three things:

  • a correctness criterion (a response checker, when automated)
  • an appropriate model (the simulation model)
  • a procedure (how do I use the two above things?)

For some types of circuits, it is possible to write a response checker (correctness criterion) that checks a particular value at a particular time.  In clocked digital systems, a logic value may remain constant for an entire cycle, which is why cycle-based checkers are sufficient.  When an analog correctness criterion is specified only to the cycle level, it is best that the models also elide sub-cycle non-idealities, in the interest of simplicity and run time.  Hence the applicability of event-driven analog models.

Back to the example

This example illustrates a feedback technique used to regulate a supply voltage.  A reference voltage is compared to a voltage drop produced across a resistor by a voltage-controlled current source.  In practice, one might see such structures replicated across a chip, providing locally generated bias voltages.

[Figure: ex-nonlinfdbk11, schematic of the feedback example]

This particular subcircuit has a sample-and-hold device that is triggered by the digital logic periodically to re-calibrate the circuit.  When it is not being calibrated it is not tracking voltage changes.  (The selective use of calibration might be desired so that the subcircuit is not tracking changes when an important operation is taking place.)  When it is tracking, the circuit converges on the new value.
The Verilog code for the sample-and-hold circuit is shown below.

/*
 * Sample/hold
 *  Pass the new value through when on.
 *  Hold the old value while off.
 */

module sample_hold(input sh, inout wire sample, inout wire hold);

   real		sampleval;
   real		holdval;
   real		holdcurrent;

   /* Instantiate components */
   initial begin
      $slam_sla_vprobe("p1", sample, sampleval);
      holdval = 0.0;
      $slam_sla_vsrc("v1", hold, top.GND, holdval, holdcurrent);
   end

   /* Behavior */
   always @(sampleval, sh) begin
      if (sh == 1) begin
	 holdval <= #10 sampleval;
      end
   end

endmodule // sample_hold

Our sample-and-hold model represents a circuit with a non-zero delay value; we used a delay of #10.  The example stimulus was set up to calibrate after a reset occurs and again after a period of time.  A waveform dump appears below.

[Figure: ex_nonlinfdbk1, waveform dump]

What we observe is that the event-driven model of this circuit converges to a steady-state value after a few cycles.  After convergence is reached, no more events are triggered and all simulation components related to the circuit stop being evaluated.  Thus, each evaluation of the feedback circuit is fairly cheap (in simulation terms) and the correctness criterion is simple (the voltage reached is the same as the reference).

What have we checked here?