This is the README file for the new MCell 3.1 testsuite.

Sections:
  1. Running the test suite
  2. Analyzing errors during a run of the test suite
  3. Extending the test suite

=========================
1. Running the test suite
=========================

The test suite is launched using the top-level test script in this
subdirectory, 'main.py'.  Run without any arguments, it will attempt to run
the entire test suite.  The script depends upon some Python modules which are
checked into the BZR repository under testsuite/system_tests, but it will
automatically find those modules if the main.py script is in its customary
location.

When you run the script, it will look for a configuration file called
'test.cfg' in the current directory.  A typical test.cfg will only need two
lines in it:

    [DEFAULT]
    mcellpath = /home/jed/src/mcell/3.1-pristine/build/debug/mcell

This specifies which mcell executable to test.  If there are other relevant
testing settings, they can be placed into this file and accessed by the test
suite.  The default configuration file may be overridden using the -c
command-line argument to the script.

The script can take a handful of command-line arguments, which are summarized
briefly in the help message provided when you run:

    ./main.py -h

The script can also take a '-T' argument to specify the location of the test
cases.  If you are running the script from this directory, you do not need to
specify the -T flag, as the tests will be found automatically with the default
layout.  For normal usage, the argument to -T should be the path to the
mdl/testsuite subdirectory of the source tree (or a copy thereof).

By default, all test results will be deposited under a subdirectory
'test_results' created below the current directory.  You may override this
using '-r' and another directory name.  BE CAREFUL!  Presently, whatever
directory is being used for test results will be entirely cleaned out before
the test suite begins to run.  XXX: Might it not be safer to have it move the
old directory to a new name?  Perhaps worth investigating.  On the other hand,
this will, by default, result in the rapid accumulation of results
directories.  Still, it is probably better to fail to delete a thousand
unneeded files than to delete one unintended one...

main.py also takes '-v' to increase verbosity.  Notable levels of verbosity:

    0: (default) Very little feedback as tests are running
    1: Brief feedback as tests are running (. for successful tests, F for
       failures, E for errors)
    2: Long feedback as tests are running (single lines indicating success or
       failure of each test, color coded if the output is a tty)
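For example, to test a different build with its own configuration file, a
copied set of test cases, a separate results directory, and more verbose
feedback, an invocation might look like this (the file and directory names
here are purely illustrative):

    ./main.py -c nightly.cfg -T /path/to/mdl/testsuite -r nightly_results -v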
Use '-l' to display a list of all of the tests and test collections that
main.py "knows" about.  Any of these may be included or excluded.  For
instance, right now, main.py -l shows:

    Found tests:
     - reactions : Reaction tests
        - numericsuite : Numeric validation for reactions
        - tempsuite : Shortcut to currently developing test
     - macromols : Macromolecule tests
        - numericsuite : Numeric validation for Macromolecules
        - errorsuite : Test error handling for invalid macromolecule constructs in MDL files
     - parser : Parser torture tests
        - vizsuite : VIZ output tests for DREAMM V3 modes
        - oldvizsuite : VIZ output tests for ASCII/RK/DX modes
        - fasttests : All quick running tests (valid+invalid MDL)
           - (quicksuite)
           - (errorsuite)
        - errorsuite : Test error handling for invalid MDL files
        - rtcheckpointsuite : Basic test of timed checkpoint functionality
        - quicksuite : A few quick running tests which cover most valid MDL options
        - allvizsuite : VIZ output tests for all modes (old+new)
           - (vizsuite)
           - (oldvizsuite)
        - kitchensinksuite : Kitchen Sink Test: (very nearly) every parser option
     - regression : Regression tests
        - suite : Regression test suite

The indentation is significant, as it indicates subgroupings within the test
collections.  Note that some of the test collection names are parenthesized.
These are collections which are redundant with the other collections in the
suite and will not be included a second time, but they were added to simplify
running simple subsets of the entire test suite.

By default, all tests will be run.  To select just a subset of the tests, use:

    ./main.py -i <name>

where <name> is a path-like identifier telling which test collection to run.
Given the above output from "main.py -l", valid options include:

    reactions
    reactions/numericsuite
    parser
    parser/fasttests
    parser/fasttests/quicksuite
    parser/allvizsuite
    regression

To exclude a subset of the tests, use:

    ./main.py -e <name>

Note that the -i and -e arguments are processed from left to right on the
command line, so you may do something like:

    ./main.py -i parser -e parser/allvizsuite

to run the "parser" tests, skipping over the "allvizsuite" subgroup.  Unless
some -i arguments are specified, the initial set of included tests is the
complete set, so you may also do:

    ./main.py -e macromols/numericsuite

to run all tests except for the numeric validation tests in the macromolecules
test suite.

Finally, the list of test suites to include may be configured by adding a
'run_tests' directive to the test.cfg file, consisting of a comma-separated
list:

    test.cfg:

        [DEFAULT]
        run_tests=parser/errorsuite,regression
        mcellpath=/path/to/mcell

This way, the exact set of tests to run can be tailored to the particular
configuration file.

=========================
2. Analyzing errors during a run of the test suite
=========================

If errors are reported during a run of the test suite, you should get an
informative message from the test suite.  In many cases, these messages will
be related to the exit code from mcell.  For instance, here is an example run,
edited for brevity, of the regression test suite on an old version of mcell:

  Running tests:
   - regression/suite
  .. ..
  ======================================================================
  FAIL: test_010 (test_regression.TestRegressions)
  ----------------------------------------------------------------------
  Traceback (most recent call last):
    File "../mdl/testsuite/regression/test_regression.py", line 128, in test_010
      mt.invoke(get_output_dir())
    File "./system_tests/testutils.py", line 332, in invoke
      self.__check_results()
    File "./system_tests/testutils.py", line 350, in __check_results
      assert os.WEXITSTATUS(self.got_exitcode) == self.expect_exitcode, "Expected exit code %d, got exit code %d" % (self.expect_exitcode, os.WEXITSTATUS(self.got_exitcode))
  AssertionError: ./test_results/test-0010: Expected exit code 0, got exit code 139

  ----------------------------------------------------------------------
  Ran 10 tests in 49.084s

  FAILED (failures=5)

The significant line to look for is the "AssertionError" line, which tells us
two things:

  AssertionError: ./test_results/test-0010: Expected exit code 0, got exit code 139

First, it tells us which subdirectory to look in for the exact details of the
run which caused the failure; second, it gives us a message which hints at the
problem.  In this case, the run exited with code 139, which corresponds to
signal 11 (SIGSEGV).  On a UNIX or Linux machine, the exit codes generally
follow this convention:

    0:        normal exit
    1-126:    miscellaneous errors (MCell always uses 1)
    127:      can't find executable file
    129-255:  execution terminated due to signal (exit_code - 128)

The signals which may kill an execution are listed below.  (The numbering is
taken from a Linux machine, and some of the signals may be numbered or named
differently on other systems, though many of these signal numbers are
standard, such as 11 for SIGSEGV.  Type 'kill -l' or see the signals man page
on the system in question for more details on the name <-> number mappings.)

    sig  exit code  name/desc
     1     129      SIGHUP  - unlikely in common use
     2     130      SIGINT  - user hit Ctrl-C
     3     131      SIGQUIT - user hit Ctrl-\
     4     132      SIGILL  - illegal instruction -- exe may be for a different CPU
     5     133      SIGTRAP - unlikely in common use
     6     134      SIGABRT - abort - often caused by an assertion failure
     7     135      SIGBUS  - bus error -- less likely than SIGSEGV, but similar meaning
     8     136      SIGFPE  - floating point exception
     9     137      SIGKILL - killed by user/sysadmin
    10     138      SIGUSR1 - should not happen
    11     139      SIGSEGV - accessed a bad pointer
    12     140      SIGUSR2 - should not happen
    13     141      SIGPIPE - unlikely in context of test suite
    14     142      SIGALRM - should not happen
    15     143      SIGTERM - manually killed by user/sysadmin

Higher-numbered signals do exist, though the numbering becomes less consistent
above 15, and the likelihood of occurrence is also much lower.  In practice,
the only signals likely to be encountered, barring manual user intervention,
are SIGABRT, SIGFPE, and SIGSEGV, with SIGILL and SIGBUS thrown in very
rarely.
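If you prefer to decode exit codes programmatically rather than from the
table above, a small helper along the following lines will do it.  This is
only an illustrative sketch, not part of the test suite; it relies solely on
Python's standard 'signal' module:

    import signal

    # Build a number -> name map from the standard 'signal' module, so the
    # mapping matches the machine this is run on.
    _SIGNAMES = dict((getattr(signal, name), name)
                     for name in dir(signal)
                     if name.startswith("SIG") and not name.startswith("SIG_"))

    def describe_exit_code(code):
        """Translate a shell-style exit code into a readable description."""
        if code == 0:
            return "normal exit"
        elif code == 127:
            return "can't find executable file"
        elif code > 128:
            signum = code - 128
            return "terminated by signal %d (%s)" % (signum,
                                                     _SIGNAMES.get(signum, "unknown"))
        else:
            return "error exit (code %d)" % code

    print(describe_exit_code(139))   # -> terminated by signal 11 (SIGSEGV)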
Returning to the example above, let's look at the files produced by the run.
Looking at the contents of the test_results/test-0010 directory, I see:

    total 8
    -rw-r--r-- 1 jed cnl 293 2009-03-13 17:27 cmdline.txt
    -rw-r--r-- 1 jed cnl   0 2009-03-13 17:27 poly_w_cracker.txt
    -rw-r--r-- 1 jed cnl 163 2009-03-13 17:27 realerr
    -rw-r--r-- 1 jed cnl   0 2009-03-13 17:27 realout
    -rw-r--r-- 1 jed cnl   0 2009-03-13 17:27 stderr
    -rw-r--r-- 1 jed cnl   0 2009-03-13 17:27 stdout

The important files here are:

    cmdline.txt: the exact command line required to reproduce this bug
    realout:     what mcell printed to out_file
    realerr:     what mcell printed to err_file
    stdout:      what mcell sent to stdout (usually should be empty)
    stderr:      what mcell sent to stderr (usually should be empty)

The contents of cmdline.txt from this run are:

    executable: /home/jed/src/mcell/3.1-pristine/build/debug/mcell
    full cmdline: /home/jed/src/mcell/3.1-pristine/build/debug/mcell -seed 13059 -logfile realout -errfile realerr -quiet /netapp/cnl/home/jed/src/mcell/3.2-pristine/mdl/testsuite/regression/10-counting_crashes_on_coincident_wall.mdl

The full command line should use absolute paths, so you should be able to use
it to start a gdb session which replicates this exact problem, if it is
repeatable.  Likewise, any files which the test case should have produced will
appear under this directory.  This should allow you to examine the reaction
and viz output files for any necessary clues.

=========================
3. Extending the test suite
=========================

Generally, adding tests to the test suite requires three steps:

  1. write the MDL files
  2. write the Python code to validate the resultant output from the MDL files
  3. hook the test case into the system

-----------
1. writing the MDL files
-----------

Writing the MDL files should be largely self-explanatory, so I will focus on
the other two pieces.  A few brief notes are worth including here, though.

First, the MDL should be written to produce all output relative to the current
directory.  This makes it easy for the test suite scripts to manage the output
test results.  If additional command-line arguments are needed, they may be
specified in the Python portion of the test case.

Generally, I've started each test case in the test suite with a block comment
which explains the purpose of the test along with an English-language
description of the success and failure criteria.  For instance:

    /****************************************************************************
     * Regression test 01: Memory corruption when attempting to remove a
     *    per-species list from the hash table.
     *
     *    This is a bug encountered by Shirley Pepke (2008-04-24).  When a
     *    per-species list is removed from the hash table, if the hash table
     *    has a collision for the element being removed, and the element being
     *    removed was not the first element (i.e. was the element which
     *    originally experienced the collision), memory could be corrupted due
     *    to a bug in the hash table removal code.
     *
     *    Failure: MCell crashes
     *    Success: MCell does not crash (eventually all molecules should be
     *             consumed, and the sim should run very fast)
     *
     *    Author: Jed Wing
     *    Date:   2008-04-24
     ****************************************************************************/

This convention seems like a good way to keep documentation on the individual
tests.  Some of the subdirectories also have README files which contain brief
summaries of the purpose of the tests and some details about the individual
tests, but I consider those to be secondary.
Still, for any commentary which does not pertain to a single specific test, or
which is too long or complex to include in a block comment in the MDL file,
creating or adding to a README file is probably a good way to capture the
relevant information.

-----------
2. writing the Python code
-----------

I've written utilities to help with validating most aspects of MCell output;
the utilities are not comprehensive, but they allow many types of validation
of reaction data outputs, and make a valiant effort at validating viz output
types.  I will produce a reference document for the various utilities, but
will discuss a few of them briefly here.

a. Python unittest quick introduction

The MCell test suite is based on Python's unittest system, of which I'll give
only a quick summary here.  More documentation is available in the Python
documentation.

A Python unittest test case is a class, usually with a name which starts with
"Test", which subclasses unittest.TestCase:

    class TestCase1(unittest.TestCase):
        pass

Inside the test case will be one or more methods with names that start with
"test".  This is not optional (at least for the simple usage I'm describing
here); Python unittest uses the method names to automatically pick out all of
the individual tests.  Test cases may also include a setUp method and a
tearDown method.  Each test method is run bracketed by calls to setUp and
tearDown, if they are defined.

Inside the test methods must be a little bit of Python code which tests some
aspect of whatever you are testing.  Conditions are checked using either
Python 'assert' statements or various methods inherited from
unittest.TestCase whose names begin with 'fail', such as 'failIfEqual',
'failUnlessEqual', or simply 'fail':

    failUnlessAlmostEqual  # assert approximate equality
    failIfAlmostEqual      # assert approximate inequality
    failUnlessEqual        # assert exact equality
    failIfEqual            # assert exact inequality
    failIf                 # assert an arbitrary boolean is false
    failUnless             # assert an arbitrary boolean is true
    failUnlessRaises       # check for an expected exception
    fail                   # fail unilaterally

The existing tests are (for some reason?) written using assert rather than the
'fail' methods.  I'll have to ask myself why I did it that way next time I'm
talking to myself.  This means that if you run the test suite under Python's
-O flag, the assert-based checks will not work.  At some point soon, I may
convert it over to use the 'fail' methods, which are not disabled by the -O
flag.
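To make the -O caveat concrete, here is a small illustration (not taken from
the test suite itself).  Under 'python -O', the assert statement in the first
method is compiled away entirely, so that check silently disappears; the
'fail*' method in the second is an ordinary method call and keeps checking:

    import unittest

    class TestOptimizeFlag(unittest.TestCase):
        def test_with_assert(self):
            # Stripped out under 'python -O'; this test can no longer fail.
            assert 2 + 2 == 4, "arithmetic is broken"

        def test_with_fail_method(self):
            # Still executed under 'python -O'.
            self.failUnlessEqual(2 + 2, 4, "arithmetic is broken")

    if __name__ == "__main__":
        unittest.main()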
[Another performance tweak that might help the test suite to run a little
faster would be to enable psyco in main.py.  psyco generally improves
performance anywhere from a factor of 2 to a factor of a hundred.  Obviously,
it won't make the mcell runs themselves any faster...]

So, an example test case might be:

    class TestReality(unittest.TestCase):
        def setUp(self):
            self.a = 1234

        def tearDown(self):
            del self.a

        def test_a(self):
            self.failUnlessEqual(self.a, 1234)

        def test_b(self):
            if 3*3 != 9:
                self.fail("3*3 should be 9.")

        def test_c(self):
            # Check equality to 7 decimal places
            self.failUnlessAlmostEqual(1./3., 0.333333333, places=7)

Generally, we don't know the order in which the tests are run, but one
possible order for the method calls above is:

    setUp
    test_a
    tearDown
    setUp
    test_b
    tearDown
    setUp
    test_c
    tearDown

Traditionally, the bottom of a file containing one or more TestCase subclasses
will have the following two lines:

    if __name__ == "__main__":
        unittest.main()

This way, if the file is invoked as a script from the command line, it will
automatically run all of the tests found in the module.

Now, these automatically built test suites are not the only way to aggregate
tests.  unittest includes utilities for automatically scanning a test case for
all tests which match a particular pattern.  To do this, create a top-level
function (i.e. not a method on your test case) which looks like:

    def vizsuite():
        return unittest.makeSuite(TestParseVizDreamm, "test")

In the above example, TestParseVizDreamm is the name of a TestCase subclass,
and we've asked unittest to round up all of the methods from that test case
whose names start with 'test' and treat them as a test suite.

You may also create test suites which aggregate individual tests or other test
suites.  To do this, again create a top-level function which looks like:

    def fullsuite():
        # Create a new suite
        suite = unittest.TestSuite()

        # add test 'test_viz_dreamm1' from the class TestParseVizDreamm
        suite.addTest(TestParseVizDreamm("test_viz_dreamm1"))

        # add the test suite 'vizsuite' defined above in a function
        suite.addTest(vizsuite())

        # return the newly constructed suite
        return suite

These aggregated test suites may be exposed in various ways via the top-level
generic test runner I've written, which I'll explore a little bit in the next
section, and in more detail when I explain how to hook the new tests into the
system.

We will deviate from the established formulas in a few minor ways.  The most
important is that:

    if __name__ == "__main__":
        unittest.main()

becomes:

    if __name__ == "__main__":
        cleandir(get_output_dir())
        unittest.main()

b. MCell tests introduction

First, let's take a quick look at one of the regression tests.  We'll start
with the simplest possible test -- one which merely checks that the run didn't
crash.  To add this, all I did was to add the MDL files under
mdl/testsuite/regression, and then add a brief function to test_regression.py.
The test cases are generally named XX-whatever.mdl, though that may need to
change as the test suite grows.

The general structure of an MCell test case is:

  1. construct an McellTest object (or subclass thereof)
  2. populate the object with parameters detailing what we consider to be a
     successful run
  3. call obj.invoke(get_output_dir()) on the object to actually run the test

For many purposes, the McellTest class itself will suffice, but in some cases
you may find yourself including the same bits of setup in a number of
different tests, in which case it's probably worth moving that shared setup
into a custom subclass of McellTest (see the sketch below).
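For instance, a shared-setup subclass might look something like the following.
The class name and defaults here are purely illustrative -- check
system_tests/testutils.py for the real constructor signature before copying
this pattern:

    class QuietRegressionTest(McellTest):
        """Hypothetical helper: a regression-suite run that is always invoked
        with -quiet and always checks that stdout/stderr stay clean."""

        def __init__(self, mdlfile, extra_args=()):
            McellTest.__init__(self, "regression", mdlfile,
                               ["-quiet"] + list(extra_args))
            self.set_check_std_handles(1, 1, 1)

With something like this in place, an individual test method reduces to
constructing the subclass and calling invoke() on it.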
Here is an example test drawn from the regression test suite:

    def test_010(self):
        mt = McellTest("regression",
                       "10-counting_crashes_on_coincident_wall.mdl",
                       ["-quiet"])
        mt.set_check_std_handles(1, 1, 1)
        mt.invoke(get_output_dir())

As in normal Python unit tests, the exact name of the method doesn't matter,
but it must be a method on a class which subclasses unittest.TestCase, and the
name should start with 'test'.  The arguments you see above in the
construction of McellTest are:

  1. "regression": an indication of which section of the test suite the run is
     part of (not really important, but it allows overriding configuration
     options for specific subsections of the test suite)

  2. "10-counting_crashes_on_coincident_wall.mdl": the name of the top-level
     MDL file to run (the path is relative to the location of the script
     containing the Python code, and is generally in the same directory at
     present)

  3. ["-quiet"]: a list of additional arguments to give to the run.
     Generally, the run will also receive a random seed, and will have its log
     and error outputs redirected to a file.

I provide '-quiet' to most of the runs I produce so that only the
notifications I explicitly request will be turned on.

The next line:

    mt.set_check_std_handles(1, 1, 1)

tells the test to close stdin when it starts and to verify that nothing was
written to stdout or stderr.  This is almost always a good idea -- it lets us
be sure that all output is properly redirected either to the logfile or the
errfile, rather than being written directly to stdout/stderr using
printf/fprintf.

And finally:

    mt.invoke(get_output_dir())

runs the test.  Unless otherwise specified (see the reference document for
details), McellTest expects mcell to exit with an exit code of 0, so we don't
need to add any additional checks.

c. Brief introduction to test utilities

In most cases, our job isn't quite that simple, and there are various ways to
add additional success criteria.  For instance:

    mt.add_extra_check(RequireFileMatches("realout",
                       '\s*Probability.*set for a\{0\} \+ b\{0\} -> c\{0\}',
                       expectMaxMatches=1))

This statement says that in order for the run to be considered successful, the
file 'realout' (i.e. the logfile output from mcell) must contain a line which
matches the regular expression:

    '\s*Probability.*set for a\{0\} \+ b\{0\} -> c\{0\}'

and we've further specified that it must match at most once.  (By default, it
must match at least once, so this means it must match exactly once.)  Again,
see the reference document for details on RequireFileMatches and other similar
utilities.

There are also similar utilities for checking various aspects of reaction data
output and similarly formatted files.  For instance, consider:

    mt.add_extra_check(RequireCountConstraints("cannonballs.txt",
                       [(1, 1, -1, 0, 0,  0),   # 0
                        (0, 0,  0, 1, 1, -1),   # 0
                        (0, 0,  1, 0, 0,  0),   # 500
                        (0, 0,  0, 0, 0,  1)],  # 500
                       [0, 0, 500, 500],
                       header=True))

This represents a set of exact constraints on the output file
'cannonballs.txt'.  The matrix:

    [(1, 1, -1, 0, 0,  0),   # 0
     (0, 0,  0, 1, 1, -1),   # 0
     (0, 0,  1, 0, 0,  0),   # 500
     (0, 0,  0, 0, 0,  1)]   # 500

will be multiplied by each row in the output file (after removing the header
line and the "time" column), and each resulting vector must exactly match the
vector:

    [0, 0, 500, 500]

The file is assumed to have a header line, though no specific check is made of
the header line -- because of the 'header=True' directive, the first line is
simply not subjected to the test.

This type of constraint can be used to verify various kinds of behavioral
constraints on the counting.  For instance, it can verify that the total
number of bound and unbound molecules of a given type is constant.  Any
constraint which can be formalized in this way may be added just by adding
another row to the matrix and another item to the vector.
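To make the arithmetic concrete: each row of the matrix is a coefficient
vector, and its dot product with a data row (time column removed) must equal
the corresponding entry of the target vector.  Illustratively, using a made-up
data row (this is only a sketch of the arithmetic, not the actual
implementation in testutils.py):

    constraints = [(1, 1, -1, 0, 0,  0),
                   (0, 0,  0, 1, 1, -1),
                   (0, 0,  1, 0, 0,  0),
                   (0, 0,  0, 0, 0,  1)]
    targets = [0, 0, 500, 500]

    # A hypothetical data row from cannonballs.txt, time column already removed.
    counts = [123, 377, 500, 56, 444, 500]

    for row, expected in zip(constraints, targets):
        value = sum(c * x for c, x in zip(row, counts))
        assert value == expected, \
            "constraint %r gave %d, expected %d" % (row, value, expected)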
Similarly, equilibrium may be verified for count files using something like:

    mt.add_extra_check(RequireCountEquilibrium("dat/01-volume_highconc/V_out.dat",
                       [500] * 26,
                       # [25] * 26,
                       ([25] * 15) + ([500] * 3) + ([25] * 7) + [500],
                       header=True))

The first argument is the path to the output file, relative to the result
directory.  The second is the expected equilibrium for each of the columns
(again, excluding the time column).  The third is the allowable tolerance for
each column.  If, after the run finishes, the mean value of a column differs
from the desired equilibrium by more than the tolerance, the test is counted
as a failure.

In the above case, you can see that 4 of the 26 columns have been temporarily
set to a tolerance of 500 to prevent the test from failing on 4 cases which
fail due to known MCell issues whose fixes will be somewhat involved.

For both this and the previous check type, you may specify min_time and
max_time arguments to limit the checks to particular segments of time.  For
the equilibrium check, this restricts the rows over which we average to find
the mean value.

For more details on the specific test utilities I've provided, see the test
utilities reference document.

-----------
3. hooking the test case into the system
-----------

The test runner looks in the top-level directory of the test suite for a
test_info.py file.  test_info.py may define a few different variables which
determine what tests are included when you run main.py, and what descriptions
are shown when you run './main.py -l'.

The first such variable is:

    subdirs = {
      "macromols"  : "Macromolecule tests",
      "parser"     : "Parser torture tests",
      "reactions"  : "Reaction tests",
      "regression" : "Regression tests"
    }

This specifies that the test suite runner should look in 4 different
subdirectories of the directory where test_info.py was found: macromols,
parser, reactions, and regression.  It also provides descriptions for each of
these subdirectories, which are displayed in the '-l' output.  Each of these
subdirectories should have its own test_info.py file describing the tests (or
further subdirectories) it contains.

The second such variable is:

    tests = {
      "oldvizsuite"       : "VIZ output tests for ASCII/RK/DX modes",
      "vizsuite"          : "VIZ output tests for DREAMM V3 modes",
      "errorsuite"        : "Test error handling for invalid MDL files",
      "quicksuite"        : "A few quick running tests which cover most valid MDL options",
      "kitchensinksuite"  : "Kitchen Sink Test: (very nearly) every parser option",
      "rtcheckpointsuite" : "Basic test of timed checkpoint functionality"
    }

This gives several named test suites, along with descriptions to be displayed
in the '-l' output.  These test suites must be imported into the test_info.py
file.  The parser tests are defined in test_parser.py, so I included the
following import statements at the top of the file:

    from test_parser import oldvizsuite, vizsuite, errorsuite, quicksuite
    from test_parser import kitchensinksuite, rtcheckpointsuite

Any test suites included in the 'tests' map will be included in the full test
suite.  It may be desirable in some cases to define test suites which run a
subset of the functionality, and this brings us to the third variable in a
test_info.py file:

    collections = {
      "allvizsuite" : ("VIZ output tests for all modes (old+new)",
                       ["oldvizsuite", "vizsuite"]),
      "fasttests"   : ("All quick running tests (valid+invalid MDL)",
                       ["errorsuite", "quicksuite"]),
    }

This defines two collections of tests.  The suites collected here are already
included via the 'tests' variable above, but we may want other aggregations
that we can quickly and easily run.  Each entry in the collections map has the
syntax:

    name : (desc, [suite1, suite2, ...])

These collections will NOT be included in the top-level test suite by default,
as the individual component tests are already included; the collections would,
thus, be redundant.

Note that the 'tests' and 'collections' variables shown above are from the
parser subdirectory's test_info.py.
For that directory, the test suite system will create several collections of
tests which may be included or excluded from the run:

    parser                      # all suites in the 'tests' map from parser/test_info.py
    parser/oldvizsuite
    parser/vizsuite
    parser/errorsuite
    parser/quicksuite
    parser/kitchensinksuite
    parser/rtcheckpointsuite
    parser/allvizsuite
    parser/fasttests

This means that you do not need to explicitly create a separate collection to
include all of the suites in a directory.  Note that more levels of hierarchy
are possible -- parser could define 'subdirs' if we wanted to break the parser
tests into several different subdirectories, for instance.  But each level of
the directory hierarchy must have its own test_info.py file.
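Putting the pieces together, a minimal test_info.py for a hypothetical new
'mynewtests' subdirectory might look like the sketch below.  The directory
name, module name, and suite names are all invented for illustration, and the
parent directory's test_info.py would also need a "mynewtests" entry added to
its subdirs map so that the runner descends into the new directory.  (Whether
unused variables may simply be omitted depends on the runner, so the empty
maps are the conservative choice.)

    # test_info.py for a hypothetical 'mynewtests' subdirectory.

    # Suite functions defined in test_mynew.py, which lives alongside this file.
    from test_mynew import quicksuite, slowsuite

    # No further subdirectories below this one.
    subdirs = {}

    # Named suites that become part of the full test run, with '-l' descriptions.
    tests = {
      "quicksuite" : "Quick sanity checks for the new feature",
      "slowsuite"  : "Long-running numeric validation for the new feature"
    }

    # Redundant convenience grouping; not added to the full run a second time.
    collections = {
      "everything" : ("All tests for the new feature",
                      ["quicksuite", "slowsuite"])
    }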