README

This is the README file for the new MCell 3.1 testsuite.

Sections:
1. Running the test suite
2. Analyzing errors during a run of the test suite
3. Extending the test suite

=========================
1. Running the test suite
=========================

The test suite is launched using the top-level test script in this
subdirectory, 'main.py'. Run without any arguments, it will attempt to run
the entire test suite. The script depends upon some Python modules which
are checked into the BZR repository under testsuite/system_tests, but it
will automatically find those modules if the main.py script is in its
customary location.

When you run the script, it will look for a configuration file called
'test.cfg' in the current directory. A typical test.cfg will only need two
lines in it:

[DEFAULT]
mcellpath = /home/jed/src/mcell/3.1-pristine/build/debug/mcell

This specifies which mcell executable to test. If there are other relevant
testing settings, they can be placed into this file and accessed by the
test suite.
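
For illustration only (the test suite's own loading code may differ),
test.cfg uses standard INI syntax, so its settings can be read with
Python's stock config parser:

  # a minimal sketch, assuming Python 3; on Python 2 the module is ConfigParser
  from configparser import ConfigParser

  cfg = ConfigParser()
  cfg.read("test.cfg")
  print(cfg.defaults()["mcellpath"])   # settings in [DEFAULT] show up here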

The default configuration file may be overridden using the -c command-line
argument to the script.

The script can take a handful of command-line
arguments, which are summarized briefly in the help message provided when
you run:

./main.py -h

The script can also take a '-T' argument to specify the location of the
test cases. If you are running the script from this directory, you do not
need to specify the -T flag, as they will be found automatically with the
default layout. The argument to -T should be the path to the mdl/testsuite
subdirectory of the source tree for normal usage (or a copy thereof).

By default, all test results will be deposited under a subdirectory
'test_results' created below the current directory. You may override this
using '-r' and another directory name. BE CAREFUL! Presently, whatever
directory is being used for test results will be entirely cleaned out
before the test suite begins to run.

XXX: Might it not be safer to have it move the old directory to a new name?
     Perhaps worth investigating.  On the other hand, this will, by default,
     result in the rapid accumulation of results directories.  Still,
     probably better to fail to delete a thousand unneeded files than to
     delete one unintended...

main.py also takes '-v' to increase verbosity. Notable levels of verbosity:

  0: (default) Very little feedback as tests are running
  1: Brief feedback as tests are running (. for successful tests, F for
     failures, E for errors)
  2: Long feedback as tests are running (single lines indicating success
     or failure of each test, color coded if the output is a tty).


Use '-l' to display a list of all of the tests and test collections that
main.py "knows" about. Any of these may be included or excluded. For
instance, right now, main.py -l shows:

Found tests:
  - reactions : Reaction tests
    - numericsuite : Numeric validation for reactions
    - tempsuite : Shortcut to currently developing test
  - macromols : Macromolecule tests
    - numericsuite : Numeric validation for Macromolecules
    - errorsuite : Test error handling for invalid macromolecule constructs in MDL files
  - parser : Parser torture tests
    - vizsuite : VIZ output tests for DREAMM V3 modes
    - oldvizsuite : VIZ output tests for ASCII/RK/DX modes
    - fasttests : All quick running tests (valid+invalid MDL)
      - (quicksuite)
      - (errorsuite)
    - errorsuite : Test error handling for invalid MDL files
    - rtcheckpointsuite : Basic test of timed checkpoint functionality
    - quicksuite : A few quick running tests which cover most valid MDL options
    - allvizsuite : VIZ output tests for all modes (old+new)
      - (vizsuite)
      - (oldvizsuite)
    - kitchensinksuite : Kitchen Sink Test: (very nearly) every parser option
  - regression : Regression tests
    - suite : Regression test suite

The indentation is significant, as it indicates subgroupings within the
test collections.  Note that some of the test collection names are
parenthesized.  These are collections which are redundant with other
collections in the suite and will not be included a second time; they
were added to make it easy to run common subsets of the entire test
suite.

By default, all tests will be run. To select just a subset of the tests,
use:

./main.py -i <test_name>

where <test_name> is a path-like identifier telling which test or test
collection to run.  Given the above output from "main.py -l", valid
options include:

reactions
reactions/numericsuite
parser
parser/fasttests
parser/fasttests/quicksuite
parser/allvizsuite
regression

To exclude a subset of the tests, use:

./main.py -e <test_name>

Note that the -i and -e arguments are processed from left to right on the
command-line, so you may do something like:

./main.py -i parser -e parser/allvizsuite

to run the "parser" tests, skipping over the "allvizsuite" subgroup.
Unless some -i arguments are specified, the initial set of included tests
is the complete set, so you may also do:

./main.py -e macromols/numericsuite

to run all tests except for the numeric validation tests in the
macromolecules test suite.

Finally, the list of test suites to include may be configured in the
test.cfg file by adding a 'run_tests' directive to the test.cfg file,
consisting of a comma-separated list:

test.cfg:

[DEFAULT]
run_tests=parser/errorsuite,regression
mcellpath=/path/to/mcell

This way, the exact set of tests to run can be tailored to the particular
configuration file.

==================================================
2. Analyzing errors during a run of the test suite
==================================================

If errors are reported during a run of the test suite, you should get an
informative message from the test suite. In many cases, these messages
will be related to the exit code from mcell. For instance, here is an
example run, edited for brevity, of the regression test suite on an old
version of mcell:

Running tests:
- regression/suite
..
..
======================================================================
FAIL: test_010 (test_regression.TestRegressions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "../mdl/testsuite/regression/test_regression.py", line 128, in test_010
    mt.invoke(get_output_dir())
  File "./system_tests/testutils.py", line 332, in invoke
    self.__check_results()
  File "./system_tests/testutils.py", line 350, in __check_results
    assert os.WEXITSTATUS(self.got_exitcode) == self.expect_exitcode, "Expected exit code %d, got exit code %d" % (self.expect_exitcode, os.WEXITSTATUS(self.got_exitcode))
AssertionError: ./test_results/test-0010: Expected exit code 0, got exit code 139

----------------------------------------------------------------------
Ran 10 tests in 49.084s

FAILED (failures=5)

The significant line to look for is the "AssertionError" line, which tells
us two things:

AssertionError: ./test_results/test-0010: Expected exit code 0, got exit code 139

First, it tells us which subdirectory to look in for exact details of the
run which caused the failure, and it will give us a message which hints at
the problem. In this case, the run exited with code 139, which is signal
11 (SIGSEGV).

On a UNIX or Linux machine, the exit codes generally follow the following
convention:

0: normal exit
1-126: miscellaneous errors (MCell always uses 1)
127: can't find executable file
129-255: execution terminated due to signal (exit_code - 128)

The signals which may kill a run are listed below.  Note that the
numbering is taken from a Linux machine; some signals may be numbered or
named differently on other systems, though many of these signal numbers
are standard, such as 11 for SIGSEGV.  Type 'kill -l' or see the signals
man page on the system in question for more details on the name <->
number mappings.

sig   exit code   name/desc
  1         129   SIGHUP  - unlikely in common use
  2         130   SIGINT  - user hit Ctrl-C
  3         131   SIGQUIT - user hit Ctrl-\
  4         132   SIGILL  - illegal instruction -- exe may be for a different CPU
  5         133   SIGTRAP - unlikely in common use
  6         134   SIGABRT - abort - often caused by an assertion failure
  7         135   SIGBUS  - bus error -- less likely than SIGSEGV, but similar meaning
  8         136   SIGFPE  - floating point exception
  9         137   SIGKILL - killed by user/sysadmin
 10         138   SIGUSR1 - should not happen
 11         139   SIGSEGV - accessed a bad pointer
 12         140   SIGUSR2 - should not happen
 13         141   SIGPIPE - unlikely in context of test suite
 14         142   SIGALRM - should not happen
 15         143   SIGTERM - manually killed by user/sysadmin

Higher-numbered signals do exist, though the numbering becomes less
consistent above 15, and the likelihood of occurrence is also much lower.
In practice, the only signals likely to be encountered, barring manual user
intervention, are SIGABRT, SIGFPE, and SIGSEGV, with SIGILL and SIGBUS
thrown in very rarely.
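
If you want to decode an exit status programmatically rather than by
consulting the table above, the Python standard library can do the
arithmetic for you.  This is just a convenience sketch (it assumes
Python 3.5 or later for signal.Signals) and is not part of the test
suite itself:

  import signal

  def describe_exit(exit_code):
      # exit codes above 128 mean 'terminated by signal (exit_code - 128)'
      if exit_code > 128:
          sig = signal.Signals(exit_code - 128)
          return "killed by signal %d (%s)" % (sig.value, sig.name)
      return "exited normally with code %d" % exit_code

  print(describe_exit(139))   # killed by signal 11 (SIGSEGV)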

Returning to the example above, let's look at the files produced by the
run. Looking at the contents of the test_results/test-0010 directory, I
see:

total 8
-rw-r--r-- 1 jed cnl 293 2009-03-13 17:27 cmdline.txt
-rw-r--r-- 1 jed cnl 0 2009-03-13 17:27 poly_w_cracker.txt
-rw-r--r-- 1 jed cnl 163 2009-03-13 17:27 realerr
-rw-r--r-- 1 jed cnl 0 2009-03-13 17:27 realout
-rw-r--r-- 1 jed cnl 0 2009-03-13 17:27 stderr
-rw-r--r-- 1 jed cnl 0 2009-03-13 17:27 stdout

The important files here are:

cmdline.txt: the exact command line required to reproduce this bug
realout: what mcell printed to out_file
realerr: what mcell printed to err_file
stdout: what mcell sent to stdout (usually should be empty)
stderr: what mcell sent to stderr (usually should be empty)

The contents of cmdline.txt from this run are:

executable: /home/jed/src/mcell/3.1-pristine/build/debug/mcell
full cmdline: /home/jed/src/mcell/3.1-pristine/build/debug/mcell -seed 13059 -logfile realout -errfile realerr -quiet /netapp/cnl/home/jed/src/mcell/3.2-pristine/mdl/testsuite/regression/10-counting_crashes_on_coincident_wall.mdl

The full command line uses absolute paths, so you should be able to use
it to start a gdb session and replicate this exact problem, assuming it
is repeatable.

Likewise, any files which the test case should have produced will appear
under this directory. This should allow you to examine the reaction and
viz output files for necessary clues.

===========================
3. Extending the test suite
===========================

Generally, adding tests to the test suite requires three steps:

  1. write the mdl files
  2. write the Python code to validate the resultant output from the
     mdl files
  3. hook the test case into the system

-----------
1. writing the MDL files
-----------
Writing the MDL files should be self-explanatory, so I will focus on the
other two pieces.  A few brief notes are worth including here, though.
First, the MDL should be written to produce all output relative to the
current directory. This makes it easy for the test suite scripts to
manage the output test results. If additional command-line arguments are
needed, they may be specified in the Python portion of the test case.
Generally, I've started each test case in the test suite with a block
comment which explains the purpose of the test along with an
English-language description of the success and failure criteria. For
instance:

/****************************************************************************
* Regression test 01: Memory corruption when attempting to remove a
* per-species list from the hash table.
*
* This is a bug encountered by Shirley Pepke (2008-04-24). When a
* per-species list is removed from the hash table, if the hash table has a
* collision for the element being removed, and the element being removed
* was not the first element (i.e. was the element which originally
* experienced the collision), memory could be corrupted due to a bug in the
* hash table removal code.
*
* Failure: MCell crashes
* Success: MCell does not crash (eventually all molecules should be
* consumed, and the sim should run very fast)
*
* Author: Jed Wing
* Date: 2008-04-24
****************************************************************************/

This convention seems like a good way to keep documentation on the
individual tests. Some of the subdirectories also have README files
which contain brief summaries of the purpose of the tests and some
details about the individual tests, but I consider those to be secondary.
Still, for any commentary which does not pertain to a single specific
test, or which is too long or complex to include in a block comment in
the MDL file, creating or adding to a README file is probably a good way
to capture the relevant information.

-----------
2. writing the Python code
-----------

I've written utilities to help with validating most aspects of MCell
output; the utilities are not comprehensive, but they allow many types
of validation of reaction data outputs, and make a valiant effort at
validating viz output types. I will produce a reference document for
the various utilities, but will discuss a few of them briefly here.

a. Python unittest quick introduction

The MCell test suite is based on Python's unittest system, for which
I'll give only a quick summary here; more details can be found in the
official Python documentation.

A Python unittest test case is a class, usually with a name which
starts with "Test", and which subclasses unittest.TestCase:

class TestCase1(unittest.TestCase):
    pass

Inside the test case will be one or more methods with names that start
with "test". This is not optional (at least for the simple usage I'm
describing here). Python unittest uses the method names to
automatically pick out all of the individual tests.  Test cases may
also include a setUp method and a tearDown method; each test method is
run bracketed by calls to setUp and tearDown, if they are defined.
Inside the test methods must be a little bit of Python code which checks
some aspect of whatever you are testing.  These checks are made using
either plain Python 'assert' statements or various methods inherited
from unittest.TestCase whose names begin with 'fail', such as
'failIfEqual', 'failUnlessEqual', or simply 'fail':

failUnlessAlmostEqual   # approximate equality
failIfAlmostEqual       # approximate equality
failUnlessEqual         # exact equality
failIfEqual             # exact equality
failIf                  # arbitrary boolean
failUnless              # arbitrary boolean
failUnlessRaises        # check for expected exception
fail                    # fail unilaterally

The existing tests are (for some reason?) written using assert rather
than the 'fail' methods.  I'll have to ask myself why I did it that way
next time I'm talking to myself.  This means that running the test suite
under 'python -O' will not work, since -O strips assert statements.  At
some point soon, I may convert it over to use the 'fail' methods, which
are not disabled by the -O flag.

[Another performance tweak that might help the test suite to run a
little faster would be to enable psyco in main.py. psyco generally
improves performance anywhere from a factor of 2 to a factor of a
hundred. Obviously, it won't make the mcell runs themselves any
faster...]
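
For reference, the psyco tweak mentioned above would only be a few lines
in main.py.  This is a sketch, assuming psyco is installed; it falls
back silently when it is not:

  try:
      import psyco
      psyco.full()      # JIT-compile as much Python code as possible
  except ImportError:
      pass              # psyco not available; run at normal speed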

So, an example test case might be:

class TestReality(unittest.TestCase):

    def setUp(self):
        self.a = 1234

    def tearDown(self):
        del self.a

    def test_a(self):
        self.failUnlessEqual(self.a, 1234)

    def test_b(self):
        if 3*3 != 9:
            self.fail("3*3 should be 9.")

    def test_c(self):
        # Check equality to 7 decimal places
        self.failUnlessAlmostEqual(1./3., 0.333333333, places=7)

Generally, we don't know the order in which the tests are run, but one
possible order for the method calls above is:

setUp
test_a
tearDown
setUp
test_b
tearDown
setUp
test_c
tearDown

Traditionally, the bottom of a file containing one or more TestCase
subclasses will have the following two lines:

if __name__ == "__main__":
    unittest.main()

This way, if the file is invoked as a script from the command-line, it
will automatically run all of the tests found in this module.

Now, these automatically built test suites are not the only way to
aggregate tests. unittest includes utilities for automatically
scanning a test case for all tests which match a particular pattern.
To do this, create a top-level function (i.e. not a method on your
test case) which looks like:

def vizsuite():
    return unittest.makeSuite(TestParseVizDreamm, "test")

In the above example, TestParseVizDreamm is the name of a TestCase
subclass, and we've asked unittest to round up all of the methods from
that test case whose names start with 'test' and treat them as a test
suite. You may also create test suites which aggregate individual
tests or other test suites. To do this, again create a top-level
function which looks like:

def fullsuite():
    # Create a new suite
    suite = unittest.TestSuite()

    # add test 'test_viz_dreamm1' from the class TestParseVizDreamm
    suite.addTest(TestParseVizDreamm("test_viz_dreamm1"))

    # add the test suite 'vizsuite' defined above in a function
    suite.addTest(vizsuite())

    # return the newly constructed suite
    return suite

These aggregated test suites may be exposed in various ways via the
top-level generic test runner I've written; I'll explore that a little
in the next section, and in more detail when I explain how to hook new
tests into the system.

We will deviate from the established formulas in a few minor ways.
The most important is that:

if __name__ == "__main__":
    unittest.main()

becomes:

if __name__ == "__main__":
    cleandir(get_output_dir())
    unittest.main()
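
Putting these pieces together, a minimal test module might look like the
sketch below.  The exact import path for cleandir and get_output_dir is
an assumption here (they live somewhere under system_tests/testutils.py);
check an existing test file for the real incantation:

  import unittest
  from testutils import cleandir, get_output_dir   # assumed import path

  class TestExample(unittest.TestCase):
      def test_truth(self):
          self.failUnless(True)

  def examplesuite():
      # round up every 'test*' method on TestExample as a suite
      return unittest.makeSuite(TestExample, "test")

  if __name__ == "__main__":
      cleandir(get_output_dir())
      unittest.main()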

b. MCell tests introduction

First, let's take a quick look at one of the regression tests.
We'll start with the simplest possible test -- one which merely
checks that the run didn't crash. To add this, all I did was to add
the mdl files under mdl/testsuite/regression, and then add a brief
function to test_regression.py. The test cases are generally named
XX-whatever.mdl, though that may need to change as the test suite
grows.

The general structure of an MCell testcase is:

  1. construct an McellTest object (or subclass thereof)
  2. populate the object with parameters detailing what we consider to
     be a successful run
  3. call obj.invoke(get_output_dir()) on the object to actually run
     the test.

For many purposes, the McellTest class itself will suffice, but if you
find yourself repeating the same bits of setup in a number of different
tests, it's probably worth moving that shared setup into a custom
subclass of McellTest, as sketched below.
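
Such a subclass might look roughly like the following.  This is a sketch
only; it assumes nothing beyond the McellTest constructor arguments and
methods shown in this README, and the real class may offer better hooks:

  class QuietRegressionTest(McellTest):
      # Shared setup for regression tests: run quietly and require
      # clean stdout/stderr.  (Illustrative sketch only.)
      def __init__(self, mdlfile):
          McellTest.__init__(self, "regression", mdlfile, ["-quiet"])
          self.set_check_std_handles(1, 1, 1)

The example below sticks with McellTest directly, which is the more
common pattern.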

Here is an example test drawn from the regression test:

def test_010(self):
    mt = McellTest("regression",
                   "10-counting_crashes_on_coincident_wall.mdl",
                   ["-quiet"])
    mt.set_check_std_handles(1, 1, 1)
    mt.invoke(get_output_dir())

As in normal Python unit tests, the name of the method doesn't
matter, but it must be a method on a class which subclasses
unittest.TestCase, and the name should start with 'test'.

The arguments you see above in the construction of McellTest are:

1. "regression": an indication of which section of the test suite
the run is part of (not really important, but it allows
overriding configuration options for specific subsections
of the test suite)

2. "10-counting_crashes_on_coincident_wall.mdl": the name of the
top-level MDL file to run (path is relative to the location
of the script containing the Python code, and is generally
in the same directory at present)

3. ["-quiet"]: a list of additional arguments to give to the run.
Generally, the run will also receive a random seed, and
will have its log and error outputs redirected to a file.
I provide '-quiet' to most of the runs I produce so that
only the notifications I explicitly request will be turned
on.

The next line:

mt.set_check_std_handles(1, 1, 1)

tells the test to close stdin when it starts and to verify that
nothing was written to stdout or stderr. This is almost always a
good idea -- we can be sure that all output is properly redirected
either to logfile or errfile, rather than being written directly to
stdout/stderr using printf/fprintf.

And finally:

mt.invoke(get_output_dir())

runs the test. Unless otherwise specified (see the reference
document for details), McellTest expects that mcell should exit with
an exit code of 0, so we don't need to add any additional tests.
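
If a test should instead expect a failing exit code (for instance, an
MDL file that MCell must reject with exit code 1), the traceback in
section 2 suggests the expected code is stored in an 'expect_exitcode'
attribute on the test object.  A hedged sketch, using a hypothetical MDL
file name:

  def test_bad_mdl(self):
      mt = McellTest("regression", "99-hypothetical_invalid.mdl", ["-quiet"])
      mt.expect_exitcode = 1            # expect a clean error exit, not success
      mt.set_check_std_handles(1, 1, 1)
      mt.invoke(get_output_dir())

See the reference document for the officially supported way to express
this.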

c. Brief introduction to test utilities

In most cases, our job isn't quite that simple, and in these cases,
there are various ways to add additional success criteria. For
instance:

mt.add_extra_check(RequireFileMatches("realout",
                       '\s*Probability.*set for a\{0\} \+ b\{0\} -> c\{0\}',
                       expectMaxMatches=1))

This statement says that in order for the run to be considered
successful, the file 'realout' (i.e. the logfile output from mcell)
must contain a line which matches the regular expression:

'\s*Probability.*set for a\{0\} \+ b\{0\} -> c\{0\}',

and we've further specified that it must match it at most once. (By
default, it must match at least once, so this means it must match
exactly once.) Again, see the reference document for details on
RequireFileMatches and other similar utilities.
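
To make the semantics concrete, the check above behaves roughly like the
following hand-rolled version (illustration only, not the actual
testutils implementation):

  import re

  pattern = re.compile(r'\s*Probability.*set for a\{0\} \+ b\{0\} -> c\{0\}')
  with open("realout") as f:
      matches = sum(1 for line in f if pattern.search(line))
  # at least one match (the default minimum) and at most one (expectMaxMatches=1)
  assert matches == 1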

There are also similar utilities for checking various aspects of
reaction data output and similarly formatted files. For instance,
consider:

mt.add_extra_check(RequireCountConstraints("cannonballs.txt",
                       [(1, 1, -1, 0, 0, 0),   # 0
                        (0, 0, 0, 1, 1, -1),   # 0
                        (0, 0, 1, 0, 0, 0),    # 500
                        (0, 0, 0, 0, 0, 1)],   # 500
                       [0, 0, 500, 500],
                       header=True))

This represents a set of exact constraints on the output file
'cannonballs.txt'.

This matrix:

[(1, 1, -1, 0, 0, 0),   # 0
 (0, 0, 0, 1, 1, -1),   # 0
 (0, 0, 1, 0, 0, 0),    # 500
 (0, 0, 0, 0, 0, 1)],   # 500

will be multiplied by each row in the output file (after removing
the header line and the "time" column), and each result vector must
exactly match the vector:

[0, 0, 500, 500],

The file is assumed to have a header line, though no specific check
is made of the header line -- the first line is just not subjected
to the test (because of the 'header=True' directive.)

This type of constraint can be used to verify various kinds of
behavioral constraints on the counting. For instance, it can verify
that the total number of bound and unbound molecules of a given type
is constant. Any constraint which can be formalized in this way may
be added just by adding another row to the matrix and another item
to the vector.
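
To make the arithmetic concrete, here is a small illustration of the
constraint semantics (not the actual testutils code): each data row,
with the time column removed, is multiplied by the matrix, and the
result must equal the target vector.

  M = [(1, 1, -1, 0, 0, 0),     # col1 + col2 - col3 == 0
       (0, 0, 0, 1, 1, -1),     # col4 + col5 - col6 == 0
       (0, 0, 1, 0, 0, 0),      # col3 == 500
       (0, 0, 0, 0, 0, 1)]      # col6 == 500
  target = [0, 0, 500, 500]

  def row_ok(row):
      return all(sum(m * c for m, c in zip(coeffs, row)) == t
                 for coeffs, t in zip(M, target))

  assert row_ok([200, 300, 500, 100, 400, 500])   # satisfies all four constraints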

Similarly, equilibrium may be verified for count files using
something like:

t.add_extra_check(RequireCountEquilibrium("dat/01-volume_highconc/V_out.dat",
                      [500] * 26,
                      # [25] * 26,
                      ([25] * 15) + ([500] * 3) + ([25] * 7) + [500],
                      header=True))

The first argument is the output file path relative to the result
directory. The second is the expected equilibrium for each of the
columns (again, excluding the time column). The third is the
allowable tolerance. If, after finishing the run, the mean value of
the column differs from the desired equilibrium by more than the
tolerance, it will be counted a failure. In the above case, you can
see that 4 of the 26 columns have been temporarily set to a
tolerance of '500' to prevent the test from failing on 4 cases which
fail due to known MCell issues whose fixes will be somewhat
involved.

For both this and the previous check type, you may specify min_time
and max_time arguments to limit the checks to particular segments of
time. For equilibrium, this will restrict the rows over which we
are averaging to find the mean value.
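
The equilibrium check itself boils down to comparing a column mean
against the expected value.  Roughly (illustration only, not the actual
testutils code):

  def column_in_equilibrium(values, expected, tolerance):
      # 'values' is one column of counts, already restricted to the
      # min_time/max_time window if one was given
      mean = sum(values) / float(len(values))
      return abs(mean - expected) <= tolerance

  assert column_in_equilibrium([480, 510, 495, 515], 500, 25)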

For more details on the specific test utilities I've provided, see
the test utilities reference document.

-----------
3. hooking the test case into the system
-----------

The test runner looks at the top-level directory of the test suite for
a test_info.py file. test_info.py may define a few different
variables which determine what tests are included when you run
main.py, and what descriptions are shown when you run './main.py -l'.
The first such variable is:

subdirs = {
    "macromols"  : "Macromolecule tests",
    "parser"     : "Parser torture tests",
    "reactions"  : "Reaction tests",
    "regression" : "Regression tests"
}

This specifies that the test suite runner should look in
4 different subdirectories of the directory where test_info.py was
found: macromols, parser, reactions, and regression. It also provides
descriptions for each of these subdirectories which are displayed in
the '-l' output.  Each of these subdirectories must, in turn, contain
its own test_info.py file describing the tests (and any further test
subdirectories) it contains.

The second such variable is:

tests = {
    "oldvizsuite"       : "VIZ output tests for ASCII/RK/DX modes",
    "vizsuite"          : "VIZ output tests for DREAMM V3 modes",
    "errorsuite"        : "Test error handling for invalid MDL files",
    "quicksuite"        : "A few quick running tests which cover most valid MDL options",
    "kitchensinksuite"  : "Kitchen Sink Test: (very nearly) every parser option",
    "rtcheckpointsuite" : "Basic test of timed checkpoint functionality"
}

This gives several named test suites, and descriptions to be displayed
in the '-l' output. These test suites must be imported into the
test_info.py file. The parser tests are defined in test_parser.py, so
I included the following import statement at the top of the file:

from test_parser import oldvizsuite, vizsuite, errorsuite, quicksuite
from test_parser import kitchensinksuite, rtcheckpointsuite

Any test suites included in the 'tests' map will be included in the
full test suite.

It may be desirable in some cases to define test suites which run a
subset of the functionality, and this brings us to the third variable
in a test_info.py file:

collections = {
    "allvizsuite" : ("VIZ output tests for all modes (old+new)", ["oldvizsuite", "vizsuite"]),
    "fasttests"   : ("All quick running tests (valid+invalid MDL)", ["errorsuite", "quicksuite"]),
}

This defines two collections of tests.  The suites collected here are
already listed individually in the 'tests' variable above, but it is
convenient to have other aggregations that can be quickly and easily
run.  Each entry in the collections map has the syntax:

name : (desc, [suite1, suite2, ...])

These collections will NOT be included in the top-level test suite by
default, as the individual component tests are already included; the
collections would, thus, be redundant.

Now, the above 'tests' and 'collections' examples come from the parser
subdirectory's test_info.py.  The test suite system will create several
collections of tests which may be included or excluded from the run:

parser # all suites included in the 'tests' from parser/test_info.py
parser/oldvizsuite
parser/vizsuite
parser/errorsuite
parser/quicksuite
parser/kitchensinksuite
parser/rtcheckpointsuite
parser/allvizsuite
parser/fasttests

This means that you do not need to explicitly create a separate
collection just to include all of the suites in a directory.
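
Tying the three variables together, a complete test_info.py for a
hypothetical new 'mynewtests' subdirectory might look like the following
sketch (the module and suite names are made up for illustration):

  from test_mynew import basicsuite, slowsuite   # hypothetical test module

  subdirs = {}        # no further nesting below this directory

  tests = {
      "basicsuite" : "Basic checks for the new feature",
      "slowsuite"  : "Long-running numeric checks for the new feature",
  }

  collections = {
      "everything" : ("All new-feature tests", ["basicsuite", "slowsuite"]),
  }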

Note that more levels of hierarchy are possible -- parser could define
'subdirs' if we wanted to break the parser tests into several
different subdirectories, for instance. But each level of the
directory hierarchy must have its own test_info.py file.