My GSOC Journey #2: A lot of code, and a lot more tests
It’s just over two weeks into the official coding period now! I just finished the first half of my project, involving a few more PRs. For a summary of my first PR of GSOC, refer to my previous post.
For this PR, I worked on adding the capability to save an entire agent-based model to a file. It’s a pretty useful feature to have, since it allows persisting the state of a model even after the REPL is closed. It also enables running the model for some time, then simulating various branching scenarios from that point on. It’s as simple as saving the model, then loading it up once for each scenario from the same file. The serialization capability is enabled by JLD2.jl. In its own words, JLD2.jl “saves and loads Julia data structures in a format comprising a subset of HDF5”. Basically, it can serialize most things thrown at it. Perfect for saving the model.
Initially, the plan was to use JLD2’s custom serialization capability. What this allows me to do is specify that the ABM struct should be saved instead as SerializableABM, and define conversions between the two. This is necessary since the struct contains both non-serializable data (functions) and data that can be saved in a much nicer/more compact format (flattening dictionaries to a list of pairs). Anything that isn’t set to be converted is just saved as-is, which is perfect for not having to rewrite any code if more functionality is added.
Unfortunately, this approach turned out to have a major issue. Take the example of ABM.scheduler. It’s a function, so it can’t be serialized, so I omit it from SerializableABM. The scheduler can then just be taken as a parameter during deserialization. The issue comes up when the actual ABM struct is reconstructed from SerializableABM. Since JLD2 handles deserialization, it doesn’t allow passing extra parameters to the reconstruction function. I can’t change it after reconstruction either, since the ABM struct is immutable. In our weekly meeting, Tim, George and I decided the best solution would be to effectively re-implement the custom (de)serialization functionality and allow passing keyword arguments to the functions that reconstruct.
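To make the immutability problem concrete, here is a minimal sketch. ToyModel and its fields are made up for illustration and are much simpler than the real ABM struct; the point is only that a field of an immutable struct can't be swapped out once the instance exists.

```julia
# Hypothetical stand-in for a model type; the field names are illustrative only.
struct ToyModel
    agents::Vector{Int}
    scheduler::Function   # a function, which is left out of the saved form
end

# Pretend this placeholder is all a reconstruction hook could supply on its own.
placeholder_scheduler(model) = Int[]
model = ToyModel([1, 2, 3], placeholder_scheduler)

# Assigning to a field of an immutable struct throws, so the scheduler
# cannot be patched in after the fact.
patch_failed = try
    model.scheduler = m -> m.agents
    false
catch err
    err isa ErrorException
end

# The only way to get a different scheduler is to build a new instance,
# which is why the reconstruction step itself needs access to it.
fixed = ToyModel(model.agents, m -> reverse(m.agents))
```

Building a new instance works, but only if whoever does the building already has the scheduler in hand, and that is exactly what the reconstruction function was missing.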
This actually turns out to be a lot easier than it looks. All I needed to do was define two functions:
to_serializable(t) = t
from_serializable(t; kwargs...) = t
The to_serializable function takes something as input, and returns the serializable version of it. The from_serializable function does exactly the opposite, except it also takes keyword arguments. Since Julia uses multiple dispatch (one of my favourite features of the language), and the default implementation of these functions isn’t type-specific, they both return whatever was passed in. To define a custom serialization format for a type, I just have to create a more specific method. For example, the model struct is of type ABM, so all I have to do is create a SerializableABM struct and define methods with the following signatures:
to_serializable(t::ABM)
from_serializable(t::SerializableABM{S,A}; kwargs...) where {S,A}
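Here is a small sketch of the same pattern on a toy type, to show how the generic fallbacks and the more specific methods coexist. Positions and SerializablePositions are hypothetical names invented for this example (they are not part of Agents.jl); they just illustrate flattening a dictionary into a list of pairs and back.

```julia
# Generic fallbacks: anything without a custom format passes through unchanged.
to_serializable(t) = t
from_serializable(t; kwargs...) = t

# Hypothetical type holding a dictionary that stores more compactly as pairs.
struct Positions
    by_id::Dict{Int,Tuple{Int,Int}}
end

# Hypothetical serializable counterpart: the dictionary flattened to a vector of pairs.
struct SerializablePositions
    pairs::Vector{Pair{Int,Tuple{Int,Int}}}
end

# More specific methods take priority over the fallbacks through multiple dispatch.
to_serializable(t::Positions) = SerializablePositions(collect(t.by_id))
from_serializable(t::SerializablePositions; kwargs...) = Positions(Dict(t.pairs))

positions = Positions(Dict(1 => (2, 3), 2 => (5, 8)))
flat = to_serializable(positions)     # picks the Positions method
restored = from_serializable(flat)    # picks the SerializablePositions method
untouched = to_serializable(42)       # falls through to the generic method
```

Nothing has to be registered anywhere: adding a format for a new type is just a matter of adding two methods, and every other type keeps passing through untouched.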
Since these are more specific than the defaults, they will be used when an ABM/SerializableABM is passed in. To save a model, I pass it into to_serializable and let JLD2 serialize the result to a file. To load it back in, JLD2 gives me the SerializableABM from the file and I can just chuck it into from_serializable to get back the model. Additionally, I can pass any keyword arguments to from_serializable and use them wherever necessary. Of course, the user doesn’t have to do all this. They can just use the save_checkpoint and load_checkpoint functions, which do the heavy lifting. Pretty cool, isn’t it? Anyway, after this it was just a matter of defining serializable formats for any struct that needed it.
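The whole round trip can be sketched end to end as below. This is an illustration under loud assumptions, not the Agents.jl implementation: ToyModel, SerializableToyModel, default_scheduler, save_toy_checkpoint and load_toy_checkpoint are all hypothetical names, and Julia's standard-library Serialization module stands in for JLD2 so that the snippet runs without extra packages.

```julia
using Serialization  # standard library, used here as a stand-in for JLD2

to_serializable(t) = t
from_serializable(t; kwargs...) = t

# Hypothetical, heavily simplified model: plain data plus one function field.
struct ToyModel{P}
    properties::P
    step::Int
    scheduler::Function
end

# Hypothetical serializable form: everything except the scheduler.
struct SerializableToyModel{P}
    properties::P
    step::Int
end

# Fallback used when the caller does not supply a scheduler.
default_scheduler(model) = Int[]

to_serializable(m::ToyModel) = SerializableToyModel(m.properties, m.step)

# The scheduler arrives as a keyword argument while the model is rebuilt.
function from_serializable(s::SerializableToyModel{P}; kwargs...) where {P}
    scheduler = get(kwargs, :scheduler, default_scheduler)
    return ToyModel(s.properties, s.step, scheduler)
end

# Thin wrappers so the caller never touches the intermediate struct.
save_toy_checkpoint(path, model) = serialize(path, to_serializable(model))
load_toy_checkpoint(path; kwargs...) = from_serializable(deserialize(path); kwargs...)

path = tempname()
model = ToyModel(Dict(:min_to_be_happy => 3), 10, m -> [1, 2, 3])
save_toy_checkpoint(path, model)
loaded = load_toy_checkpoint(path; scheduler = m -> [3, 2, 1])
```

The keyword arguments are forwarded untouched from the loading wrapper down to from_serializable, which is the piece the built-in custom serialization hook couldn't give me.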
No new feature is complete without tests. In this case they were really extensive. Models are complex data structures: each model contains a single space, each space has its own data (ContinuousSpace even contains an instance of GridSpace), not to mention several other complexities. Every combination needed a testset, and each testset checked every single saved parameter. This resulted in nearly 300 new tests, which ironically is more than the lines of code they test. All these tests paid off pretty much instantly, since they revealed several bugs that took a while to fix.
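The shape of such a test can be sketched with Julia's Test standard library. Everything here is hypothetical and far smaller than the real suite: ToyGrid, ToyModel and roundtrip are invented for the example, and an in-memory Serialization round trip stands in for the real save/load path.

```julia
using Test
using Serialization  # standard-library stand-in for the real save/load path

# Hypothetical space and model types, only loosely inspired by the real ones.
struct ToyGrid
    dims::Tuple{Int,Int}
    periodic::Bool
end

struct ToyModel
    space::ToyGrid
    properties::Dict{Symbol,Float64}
end

# Round-trip a value through an in-memory buffer.
function roundtrip(x)
    io = IOBuffer()
    serialize(io, x)
    seekstart(io)
    return deserialize(io)
end

# One testset per combination of settings.
@testset "dims=$dims periodic=$periodic" for dims in [(3, 3), (5, 2)], periodic in (true, false)
    original = ToyModel(ToyGrid(dims, periodic), Dict(:rate => 0.5))
    restored = roundtrip(original)
    # Check every field individually rather than trusting a single equality call.
    for name in fieldnames(ToyGrid)
        @test getfield(restored.space, name) == getfield(original.space, name)
    end
    @test restored.properties == original.properties
end
```

Looping over fieldnames means a newly added field gets checked automatically, which is one way to keep the number of hand-written assertions down.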
Arguably one of the more underappreciated aspects of writing any open-source software is writing documentation. Writing documentation is easy. Writing good documentation takes thought. As the person who wrote the functionality, I’m well aware of exactly what it can and can’t do, and how it does whatever it does. The user isn’t aware of any of this, so conveying all the necessary information without going into too much detail, and doing so in a succinct, unambiguous manner, takes skill. My first attempt explained most of what was necessary, but it also had some unnecessary details, and some of it could have been phrased better. Several review comments on GitHub and a discussion over video call later, I managed to write good documentation.
The Agents.jl docs contain examples to illustrate how the library should be used. In particular, the Schelling’s segregation example is used to show several general-purpose features of the library applicable to all models. We decided to add a section there for saving/loading models. The example was simple enough to write, but for some reason it refused to cooperate with the docs build. While loading back the model, JLD2 was unable to find the SchellingAgent type, and reconstructed a temporary type with the same data. This issue turned out to be isolated to documentation builds, and couldn’t be replicated otherwise. After a lot of local doc builds, we decided over a video call that the PR should be merged without this section of the example. Meanwhile, the section was put in a tracking issue to resolve later and an issue was filed in JLD2.jl. The cause turned out to be how Documenter.jl builds docs, and the local modules it creates for the code blocks. The fix was a single line of code to identify the temporary module that Documenter creates, and everything worked perfectly.
There was a surprising amount to learn during all of this, from little things such as the fact that kwargs... is an iterator of key-value pairs wrapping a NamedTuple and not just a NamedTuple, to much more significant things such as how to approach designing a relatively complex system with a very simple API. I learnt how useful tests are, and also how to write and organize them efficiently. I also learnt to write good documentation, the things to avoid and what I should ask myself while writing it.
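The kwargs detail is easy to check for yourself. This is a tiny sketch; capture is a throwaway helper made up for the demonstration.

```julia
# Return the keyword arguments exactly as the function receives them.
capture(; kwargs...) = kwargs

kw = capture(scheduler = "fastest", warn = false)

is_named_tuple = kw isa NamedTuple   # false: kwargs is a wrapper type
as_pairs = collect(kw)               # iterating yields key => value pairs
underlying = values(kw)              # the NamedTuple being wrapped
rebuilt = (; kw...)                  # splatting also rebuilds a NamedTuple
```

In practice this means lookups like get(kwargs, :scheduler, default) work directly on the collected keyword arguments, while code that expects an actual NamedTuple needs values(kwargs) or a splat first.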
This concludes the first two weeks of the official coding period, and the first month of my GSOC contributions. It’s been a roller-coaster ride, and I don’t think I’d have it any other way.