WEBVTT Update, Google Test, debugging unit tests

With the majority of the webvtt parser now written (untested, unreviewed, but written), we've turned our attention back to testing. Previously the class wrote 300+ validation tests. These were simple VTT files that allowed us to check that a parser (or validator) correctly passed or failed a particular VTT file, each one designed to exercise a different aspect of the spec.

What I wanted to do next was convert all these files into unit tests. We decided to try node-ffi so we could write the tests in JS instead of C. This turned out to be better in theory than in practice, and left us with innumerable issues wrapping our code (especially the parse callback), build issues on Windows, etc. Undaunted, I suggested we try Python's ctypes, hoping it would be more mature. This was better, and we were able to make progress, but it added a level of complexity that I don't think was helping the majority of students. If you're struggling with C code, and your professor hands you a half-finished Python wrapper and says "write tests for the C code," it's a conceptual leap that's hard to make. So I abandoned that, too.

I tell my students to try things and not panic when they fail, so I have to do the same. At this stage I was looking for a solid solution vs. something hacky that might fail a third time. In the end I followed Ted's suggestion of using Google Test (gtest), which he's often praised from his work on Breakpad. The students are (mostly) familiar with JUnit from other courses, so the xUnit style of Google Test is perfect. Furthermore, the docs are awesome, it works cross-platform (and on TravisCI), there are plenty of examples, and lots of other projects use it, which the students can read. Win!

I added gtest to our autotools build (this template on GitHub is helpful, if you ever have to do the same), and now TravisCI runs all our unit tests and reports back to GitHub for any pull requests people make. In order to test our C code with gtest, Caitlin wrote a C++ wrapper, and Rick wrote a test fixture class, allowing our tests to become very simple. Here's an example:

/**  
* Verifies that the parser correctly parses a "vertical" key, followed by U+003A ':',  
* followed by 'rl' (indicating that the text be positioned vertically, and grows towards the left)  
*  
* This cue should have a Vertical orientation with direction RightToLeft  
*  
* From http://dev.w3.org/html5/webvtt/#webvtt-vertical-text-cue-setting (09/28/2012):  
* 1. The string "vertical".  
* 2. A U+003A COLON character (:).  
* 3. One of the following strings: "rl", "lr".  
*/  
TEST_F(CueSettingVertical,RL)  
{  
  loadVtt( "cue-settings/vertical/rl.vtt", 1 );  
  ASSERT_TRUE( getCue( 0 ).isVerticalRightToLeft() );  
}

This loads the cue-settings/vertical/rl.vtt file, parses it, and makes sure there is 1 cue returned. It then makes sure that the vertical-right-to-left attribute is true. Here's what the file looks like:

WEBVTT  
  
00:00.000 --> 00:10.000 vertical:rl  
Payload
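
To make the three spec steps quoted in the test's comment concrete, here is a minimal C++ sketch of how a single "vertical:rl" token could be checked. It is not the project's parser (which is written in C and handles far more than this); VerticalSetting, parseVerticalSetting, and exampleVerticalSetting are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for this sketch; the real parser's types differ.
enum class VerticalSetting { Invalid, RightToLeft, LeftToRight };

// Checks one cue setting token, such as "vertical:rl", against the three
// steps listed in the spec excerpt. Returns Invalid if any step fails.
VerticalSetting parseVerticalSetting(const std::string& token) {
    const std::string key = "vertical";

    // 1. The string "vertical".
    if (token.compare(0, key.size(), key) != 0) {
        return VerticalSetting::Invalid;
    }

    // 2. A U+003A COLON character (:).
    if (token.size() <= key.size() || token[key.size()] != ':') {
        return VerticalSetting::Invalid;
    }

    // 3. One of the following strings: "rl", "lr".
    const std::string value = token.substr(key.size() + 1);
    if (value == "rl") return VerticalSetting::RightToLeft;
    if (value == "lr") return VerticalSetting::LeftToRight;
    return VerticalSetting::Invalid;
}

// Usage example: the first expectation mirrors the rl.vtt test.
void exampleVerticalSetting() {
    assert(parseVerticalSetting("vertical:rl") == VerticalSetting::RightToLeft);
    assert(parseVerticalSetting("vertical:lr") == VerticalSetting::LeftToRight);
    assert(parseVerticalSetting("vertical:up") == VerticalSetting::Invalid);
    assert(parseVerticalSetting("vertical rl") == VerticalSetting::Invalid);
}
```

The gtest case checks the same rule end to end, through the real parser and a real file on disk.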

Great, right? Even better, our first dozen tests have already found a bunch of bugs in the implementation code. It's nice to be able to instantly show the students the benefits of writing lots of little tests. They also had to quickly learn how to reconcile an automated build and testing system with the desire to write and land a lot of tests. What do you do with tests that are known to cause things to fail or crash?

Google Test provides a way to disable a test without removing it or commenting it out: you simply use the DISABLED_ prefix in the test's name. When you run your tests you get a report that there are disabled tests:

[==========] Running 2 tests from 1 test case.  
[----------] Global test environment set-up.  
[----------] 2 tests from CueSettingVertical  
[ RUN      ] CueSettingVertical.RL  
[       OK ] CueSettingVertical.RL (11 ms)  
[ RUN      ] CueSettingVertical.LR  
[       OK ] CueSettingVertical.LR (0 ms)  
[----------] 2 tests from CueSettingVertical (11 ms total)  
  
[----------] Global test environment tear-down  
[==========] 2 tests from 1 test case ran. (11 ms total)  
[  PASSED  ] 2 tests.  
  
  YOU HAVE 5 DISABLED TESTS
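
The mechanics are easy to picture with a toy model. The sketch below is not Google Test's implementation: ToyTest, ToyReport, isDisabled, runAll, and exampleDisabledTests are made-up names, and the alsoRunDisabled parameter is only a stand-in for an override switch. It shows the idea that a test whose name (or whose test case name) begins with DISABLED_ is skipped and counted rather than run.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy model only: hypothetical names, not Google Test's internals.
struct ToyTest {
    std::string suite;           // e.g. "PayloadNodeTest"
    std::string name;            // e.g. "DISABLED_NodeCount"
    std::function<bool()> body;  // returns true when the test passes
};

struct ToyReport {
    std::size_t ran = 0;
    std::size_t passed = 0;
    std::size_t disabled = 0;    // skipped because of the prefix
};

// True when the suite name or the test name starts with "DISABLED_".
bool isDisabled(const ToyTest& test) {
    const std::string prefix = "DISABLED_";
    return test.suite.compare(0, prefix.size(), prefix) == 0 ||
           test.name.compare(0, prefix.size(), prefix) == 0;
}

// Runs every enabled test. Disabled tests are only counted, unless
// alsoRunDisabled asks for them to be run like any other test.
ToyReport runAll(const std::vector<ToyTest>& tests, bool alsoRunDisabled) {
    ToyReport report;
    for (const ToyTest& test : tests) {
        if (isDisabled(test) && !alsoRunDisabled) {
            ++report.disabled;
            continue;
        }
        ++report.ran;
        if (test.body()) ++report.passed;
    }
    return report;
}

// Usage example: two passing tests and one disabled, failing test.
void exampleDisabledTests() {
    const std::vector<ToyTest> tests = {
        {"CueSettingVertical", "RL", [] { return true; }},
        {"CueSettingVertical", "LR", [] { return true; }},
        {"PayloadNodeTest", "DISABLED_NodeCount", [] { return false; }},
    };

    const ToyReport normal = runAll(tests, false);
    assert(normal.ran == 2 && normal.passed == 2 && normal.disabled == 1);

    const ToyReport everything = runAll(tests, true);
    assert(everything.ran == 3 && everything.passed == 2);
    assert(everything.disabled == 0);
}
```

The useful property is that a known-bad test stays compiled and visible in the report, so it cannot quietly rot the way a commented-out test can.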

Running these disabled tests is possible with a command-line flag: --gtest_also_run_disabled_tests. Debugging a failing disabled test requires some more work. Google Test is pretty good at not blowing up when a test throws or segfaults. To catch such errors in the debugger, you also need to pass the --gtest_break_on_failure flag to your unit test executable. This lets gdb deal with the crash. For example:

$ cd webvtt  
$ ./configure --enable-debug  
$ make && make check  
$ cd test/unit  
$ gdb payloadnode_unittest  
GNU gdb 6.3.50-20050815 (Apple version gdb-1515) (Sat Jan 15 08:33:48 UTC 2011)  
Copyright 2004 Free Software Foundation, Inc.  
GDB is free software, covered by the GNU General Public License, and you are  
welcome to change it and/or distribute copies of it under certain conditions.  
Type "show copying" to see the conditions.  
There is absolutely no warranty for GDB.  Type "show warranty" for details.  
This GDB was configured as "x86_64-apple-darwin".  
(gdb) run --gtest_also_run_disabled_tests --gtest_break_on_failure  
Starting program: /Users/dave/Sites/repos/webvtt/test/unit/payloadnode_unittest --gtest_also_run_disabled_tests --gtest_break_on_failure  
Reading symbols for shared libraries ++. done  
Running main() from gtest_main.cc  
[==========] Running 2 tests from 1 test case.  
[----------] Global test environment set-up.  
[----------] 2 tests from PayloadNodeTest  
[ RUN      ] PayloadNodeTest.DISABLED_NodeCount  
  
Program received signal EXC_BAD_ACCESS, Could not access memory.  
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000008  
0x000000010001247d in WebVTT::NodeFactory::createNode (otherNode=0x0) at nodefactory.cpp:8  
8		if ( WEBVTT_IS_VALID_INTERNAL_NODE( otherNode->kind ) )  
(gdb) bt  
#0  0x000000010001247d in WebVTT::NodeFactory::createNode (otherNode=0x0) at nodefactory.cpp:8  
#1  0x000000010000b2d6 in WebVTT::Cue::nodeHead (this=0x100200ac0) at cue:99  
#2  0x000000010000b2fa in PayloadNodeTest::getHead (this=0x100200ba0) at payloadnode_testfixture:11  
#3  0x00000001000088d5 in PayloadNodeTest_DISABLED_NodeCount_Test::TestBody (this=0x100200ba0) at payloadnode_unittest.cpp:13  
#4  0x0000000100026433 in testing::internal::HandleSehExceptionsInMethodIfSupported (object=0x100200ba0, method={__pfn = 0x21, __delta = 0}, location=0x100035fe3 "the test body") at gtest-all.cc:3413  
#5  0x000000010003155f in testing::internal::HandleExceptionsInMethodIfSupported (object=0x100200ba0, method={__pfn = 0x21, __delta = 0}, location=0x100035fe3 "the test body") at gtest-all.cc:3449  
#6  0x000000010001d9b1 in testing::Test::Run (this=0x100200ba0) at gtest-all.cc:3485  
#7  0x0000000100024748 in testing::TestInfo::Run (this=0x100200090) at gtest-all.cc:3661  
#8  0x000000010002489b in testing::TestCase::Run (this=0x1002004f0) at gtest-all.cc:3768  
#9  0x0000000100024b9d in testing::internal::UnitTestImpl::RunAllTests (this=0x1002001c0) at gtest-all.cc:5591  
#10 0x00000001000268e5 in testing::internal::HandleSehExceptionsInMethodIfSupported (object=0x1002001c0, method={__pfn = 0x100024930 , __delta = 0}, location=0x100035cc8 "auxiliary test code (environments or event listeners)") at gtest-all.cc:3413  
#11 0x0000000100030fe8 in testing::internal::HandleExceptionsInMethodIfSupported (object=0x1002001c0, method={__pfn = 0x100024930 , __delta = 0}, location=0x100035cc8 "auxiliary test code (environments or event listeners)") at gtest-all.cc:3449  
#12 0x000000010001d161 in testing::UnitTest::Run (this=0x10005d280) at gtest-all.cc:5226  
#13 0x00000001000128f6 in main (argc=1, argv=0x7fff5fbff790) at gtest_main.cc:37  
(gdb)

Here I've compiled the code and tests in debug mode, which produces an executable gtest program for my unit test, named payloadnode_unittest. From within the test/unit directory, I run the program under gdb. I tell gdb to run the program with the appropriate flags for testing. It happily runs the tests, including my DISABLED_ tests, and promptly crashes, dropping me back into the gdb prompt. Here I can ask for a backtrace using the bt command, and a quick look at the stack shows me that we're calling WebVTT::NodeFactory::createNode with a null node from WebVTT::Cue::nodeHead. Now I know where to start digging in order to fix this.

As we wind down the semester, we'll churn through all these tests and fix the bugs they find. I'm glad to finally have a working testing infrastructure we can all rely on. Once we have the tests in shape, and bugs shaken out, we can move on to integration in Firefox and the track element.