# Tested and Correct

## Intro

### About Me

Hello, I'm Adam Dangoor. I'm a Software Engineer at ClusterHQ. We make tools to help developers manage their data, and we try to document those tools well.

### The Plan

I'm going to talk about how you can spend your day doing interesting new work instead of fixing newly broken but previously-working old work. In particular, I'm going to talk about how you can do that by writing guides for software which work, and which keep working even when the software changes. That matters because software always changes.

## The problem I want to solve

When your instructions and examples don't work, or don't match reality, users' trust in your product drops. It makes people think that the software itself is shoddy, and I know I often feel that way about software based on its documentation alone.

Often it turns out that the documentation is 95% correct, but maybe that isn't good enough. In many disciplines, 95% correct can still be very good. Like, you can't tell that this curtain pole is actually a couple of degrees off horizontal. And this chicken would have been a little bit better if it had been left in the oven for one more minute, but almost no one will notice.

Instructions are different. If I'm following a guide and it tells me at one stage to click a button which isn't there, or to type a command which shouldn't give an error but does, I might give up. In fact, nearly everyone will give up at that point, especially if it is a getting started guide, or something meant to demonstrate the software's potential.

I'm going to assume that you already have mechanisms for writing correct documentation. Maybe you manually follow your own instructions after writing them, or watch a user as they follow them. But most systems for ensuring that examples and instructions stay working over time are broken, or could at least be improved.

The traditional way to keep software documentation working is to rely on three things. The first is memory: software engineers changing code, and the people reviewing it, are expected to remember to check all of the documentation and update everything that their changes affect. The second is periodic review: once in a while we go through the documentation, check that it all works, and fix anything which is now out of date. Finally, there is user feedback: we rely on really nice users to report problems when they spot them, and then we fix those problems.

All of these methods are terrible.

Very few non-trivial projects exist in a vacuum. Software interacts with other software: that might just mean an application interacting with the operating system it runs on, or it might mean an explicit integration with someone else's application. Things which are out of your organisation's control will change. Perhaps you link to an external site and the third party changes their website structure, so the link in your documentation stops working. Or maybe your documentation describes an interaction with some third-party software, and that software changes.

Human memory isn't perfect either. We don't always remember all of the places in the documentation we have to change. Relying on memory also sets the bar very high for new contributors: a new contributor has to know all of the documentation inside out before they touch a line of code, or they risk making the documentation incorrect.
This means that if your company is relying on memory to keep your documentation correct, it is wasting money on salaries, paying people to try to memorise things. Software engineers, QA engineers, technical writers and other documentarians have better things to do than remember how a change to each bit of code might affect the correctness of existing documentation. We also have better things to do than periodically read through documentation checking it for correctness. We should be writing and improving documentation, not just making sure that something which used to work still works.

As for relying on user feedback: this might be OK if you have thousands of users testing your betas and pre-releases. But in production, many people will give up on your software before even one person lets you know that there is a problem.

There must be a better way. We have to abandon the myth of "write-once" documents. Documentation should be written with a realistic acknowledgement that it will have to be maintained, just like good code is written. Software can help with this in two ways:

1. We can write tools which tell us as soon as our documentation stops working, and
2. We can write tools which generate working documentation automatically.

In software engineering we have a methodology called test-driven development. It has many facets, but one of the outcomes is that after every change, automatic tests are run which make sure that existing functionality still works as intended. It allows software engineers to work with less fear of breaking something which already exists. We can also test interactions with other software, so that we know our code will continue to work in conjunction with other people's.

Most documentarians don't have the same luxury with documentation. We can do our best, though. Some tools already exist which test parts of your documentation. For example, Sphinx has a link checker. I never have to worry that one of the links in my docs has broken; I'll be told as soon as that happens and can fix it immediately. And it has a spell checker, which is especially nice in the review process because a reviewer can focus on the substance of new documentation rather than its spelling.

But sometimes more complex problems require more complex solutions. What I do is create software tools which automatically test various parts of the documentation we have. I work on a tool called Flocker, which people install via command line instructions on their laptops and servers. This is what we used to have as our Mac OS X installation instructions: the source reStructuredText, and the rendered HTML output. But what if someone changed the software so that it no longer works without an additional installation requirement? These instructions would break. That person would have to:

* Notice that their change might break something for a new install.
* Remember where the installation instructions were. They could be in many places throughout the documentation.
* Manually start a fresh machine and run the instructions. In fact this might have to be done multiple times: once for each operating system we support.
* And then, finally, change every instance of the installation instructions without making any copy-paste mistakes.

I think this is a pretty common scenario. When it is hard to test a guide, we tend not to change it. Ideally we would have guarantees that our guides work, and those guarantees would remain after any change.
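To show what I mean by tools that tell us as soon as something breaks, even the link checker I mentioned can be wired into an ordinary test suite. Here is a minimal sketch; the `docs/` source and `docs/_build/linkcheck` output paths are assumptions about a conventional Sphinx layout, not any particular project's setup.

```python
"""A minimal sketch: run Sphinx's link checker as a normal test.

Assumes the documentation source lives in ``docs/`` and that sphinx-build
is installed; both are illustrative assumptions.
"""

import subprocess


def test_documentation_links():
    # The linkcheck builder exits with a non-zero status when it finds a
    # broken link, so ``check=True`` turns that into a failing test.
    subprocess.run(
        ["sphinx-build", "-b", "linkcheck", "docs", "docs/_build/linkcheck"],
        check=True,
    )


if __name__ == "__main__":
    test_documentation_links()
    print("No broken links found.")
```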
The current situation, though, is that documentation is fragile, because we mostly rely on flawed human memory, periodic review and user feedback. Our installation docs were prone to error, and the cost of improving the instructions was so high, and it was such a mind-numbingly dull task, that we just didn't do it. We lived with worse and more complex installation instructions because simplifying them would have meant a lot of manual work.

So, we automated the problem away. It took some time, but this is what we ended up with. This task in the reStructuredText documentation source is linked to Python code which is run on every supported platform, after every change, and which reports a problem if the installation instructions don't work. This means we will know if new installation instructions don't work because a change broke them, or if existing installation instructions no longer work because something has changed in the desktop and server platforms we support. What is important to note is that the output documentation is the same as it was before, but now we are told before shipping any code which would make the documentation incorrect.

In general, command line interfaces are much easier to deal with than graphical interfaces when it comes to documentation testing; there just isn't as much nuance involved in deciding whether something is correct. Tools exist which let you specify some command input and the expected output. The tool that I use is called wordish, and I definitely recommend it if you are using reStructuredText.

### (Development -> Documentation) now Development + Documentation

There is an old but still prevalent model of software creation which involves big splash releases. Perhaps once a year a company will release a big new version, and then spend the next year writing the next one. Particularly in large organisations, we see a waterfall model: software is written, and then it is handed off to a documentation team to write the docs. When organisations separate documentation writers and software writers, time-consuming handoff processes develop.

Modern software won't be able to wait for that process much longer. We are moving towards models of continuous deployment; Netflix release new versions of their software multiple times per day. At that kind of speed it isn't realistic to hand off finished software for people to document. I think that the writers of documentation for software have to work closely with the writers of that software. The pattern of "software engineer creates software, then technical writer plays with it, then technical writer writes about it" does not create sustainable documentation. The two disciplines must work together.

If you're a software engineer, I recommend that you start writing some tests to verify your documentation's correctness, and run them alongside your unit and functional tests. If you are a documentarian working with software engineers, I recommend that you pressure them to do that, and work with them to write documentation which is testable. The only way this can be done is to tie the documentation to the code in the same software repository, and to integrate coders and technical writers at an organisational level. That means technical writers learning version control systems such as Git or SVN, and software engineers only shipping code when it comes with appropriate documentation.
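To make that recommendation a little more concrete, here is a rough sketch, in the spirit of the extract-and-run approach, of a test which pulls the shell commands out of an installation guide and fails if any of them stop working. The `docs/install.rst` path and the `$ ` prefix convention are assumptions made for the example, not the actual Flocker setup.

```python
"""A rough sketch of testing documented commands.

Pull every line that looks like a shell command (prefixed with ``$ ``) out
of an installation guide and run it, failing if any command errors. The
``docs/install.rst`` path and the ``$ `` convention are illustrative
assumptions.
"""

import subprocess
from pathlib import Path


def documented_commands(rst_path):
    """Yield the shell commands shown in the guide at ``rst_path``."""
    for line in Path(rst_path).read_text().splitlines():
        stripped = line.strip()
        if stripped.startswith("$ "):
            yield stripped[2:]


def test_installation_guide():
    for command in documented_commands("docs/install.rst"):
        # check=True raises CalledProcessError if a documented command
        # stops working, which fails the test run.
        subprocess.run(command, shell=True, check=True)
```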
### Generating documentation

If your software generates your documentation after every change, then it is guaranteed to stay up to date. You won't even need to fix it after a test tells you that it is broken. I've seen that there is a later talk on API docs generation, so I won't go too much into that, but the idea is that there are tools which allow me to change my API and have my API documentation change automatically. In this example I am using a Sphinx extension in the documentation source to create API documentation in the browser. This means that if I change my API, my documentation changes too.

API documentation isn't the only thing which can be generated after software changes. Almost anything which is usually a tedious manual translation from code to copy, or from code to screenshots, can be automated. How this is done depends on the internals of your product, and it often involves custom engineering work, but it can pay off in the long run. An example here is documentation I've written which includes a screenshot of how the results of a game solver can be displayed. But then I change the colour. Usually I would have to take new screenshots and insert them into the docs; instead I've written a tool which runs the program and takes a screenshot for the docs using Selenium. Selenium is a tool which can automate interactions with web browsers, and there are other tools which can automate interactions with desktop and mobile applications to achieve similar results. I'm working on making this a Sphinx extension so that others can do the same with web-based projects.

## Summary

So, in summary, software and its documentation are inextricably linked. To improve the quality of documentation, programmers need to spend time writing docs with their code, not after it, and working to make those docs continue to work. Problem solved: now we don't have to rely on human memory, periodic review and user feedback. So I guess we can all have lunch.

## But wait!

But wait! Unfortunately life is never easy. Ideally, every character of every word in our documentation would be checked by a robot every minute and after every change. We would have the equivalent of unit tests for our documentation: we would get the same guarantees that our docs work as we have for our software. But we're not there yet. As I've shown, there are a few tools which can help, and we can build tools specific to some cases. We can approximate the ideal scenario, but to get all the way there we probably need strong artificial intelligence. For now, you can catch me around for the rest of the conference if you'd like any tips on automating the testing of your documentation.

When I can't think of a way to write code which automatically tests documentation, I like to write a test which is paired with the documentation. That means that if I have a guide which, say, tells a user to install some software, run some commands and expect some output, I might also write a test which does the same thing and asserts that the output is as written in the guide. I then leave comments in the documentation and in the test saying that if either is changed, the other should be changed to match.
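Here is a minimal sketch of what such a paired test might look like. The guide path, the command and its expected output are hypothetical stand-ins; the important part is the cross-referencing comment that keeps the prose and the test pointing at each other.

```python
"""A minimal sketch of a test paired with a guide.

NOTE: keep in sync with docs/tutorial.rst (a hypothetical guide); if the
guide changes, change this test, and vice versa. The command and expected
output are stand-ins for whatever the guide actually tells the reader to
run and to expect.
"""

import subprocess


def test_tutorial_output():
    # Run the command exactly as the guide tells the reader to run it.
    result = subprocess.run(
        ["echo", "Hello, docs!"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assert that the output matches what the guide says the reader
    # should see.
    assert result.stdout == "Hello, docs!\n"
```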
Pairing tests with documentation like this can even show up bugs a human might not find. One interesting bug a test turned up recently was a race condition. A race condition is where software depends on one thing finishing before another, but that ordering isn't guaranteed: there is a race between the two things happening.

A guide was written to show how to use Flocker with a log analysis tool called Elasticsearch. Essentially, it told you to install Elasticsearch, insert some logs into it and then check that the logs were there. When we wrote the guide, we tested it manually and methodically, and confirmed that it all worked. But when we wrote the test, the computer ran things much faster than we had, and everything broke. It turned out that if a fast user installed and ran Elasticsearch with Flocker and then quickly checked for the logs, they wouldn't show up; the user had to wait. Sometimes automatically testing your docs can find a bug a human might never have found.

Thank you for listening.