New York City voted earlier this year to keep automated decision systems in check by establishing a task force to oversee the city’s use of algorithms. The task force quickly fell apart because city officials could not agree on a definition of the types of tools the task force would examine, or identify any specific systems it could study in detail. Had the city started with a specific automated decision system in mind, it could have built a better task force and put it to work as soon as the policy supporting its efforts was in place.
Policymaking processes should deliver policies that are relevant to real-world conditions and stand the test of time, but policies don’t always achieve this in the wild. This is particularly true in technology policymaking, where policymakers tend to have less direct experience building technology and less intuition for how technologies can and will evolve; projects have high potential to falter in their implementation, to be out of date by the time they are complete, or to bring about unintended consequences.
These problems hold true even for some of the most celebrated technology policy projects. The European Union’s General Data Protection Regulation (GDPR), for example, was a promising step to ensure the protection of personal data, but it neglected to consider the obligations that data controllers like Facebook and Google have to academics and other public interest researchers. Now researchers face serious data access limits that affect the types of projects they can take on and the quality of their analyses. Had researchers and academics been considered as a stakeholder group in the GDPR drafting process, the resulting research, and public knowledge as a whole, would be far more sophisticated.
Technology policies and the processes used by policymaking teams to create them need to be more robust. Public policy scholars define “robust policy” as policy that delivers on its intended functions and objectives over time, even under ambiguous or adversarial circumstances. Most tech policy today is not sufficiently robust.
As technologists, we see parallels between code and policy. Each introduces new rules within a complex system; each needs to handle a range of conditions that are hard to anticipate; and each needs to work every time.
In recent decades, the field of software engineering has introduced process improvements that have transformed software quality and reliability. Could the same processes that made software more robust be applied to technology policy to do the same?
Writing good software is not easy, and computer scientists didn’t always know how to do it. At the first NATO Software Engineering Conference in 1968, Turing Award winner Edsger Dijkstra was among those who declared a “software crisis,” arguing that rapid increases in computing power had allowed sloppy, overly complex computer programs to proliferate. Dijkstra pointed to software development methods as the root of the problem, and called for a revolution in which proofs of correctness and programs would be developed hand in hand. This marked a significant leap forward in computer science and the advent of the test-driven approaches to developing software that have become the industry’s standard today.
Today’s software development paradigm is an amalgamation of processes and practices, and testing is one of its cornerstones. To ensure software works as intended, engineers break down a program into small pieces of logic and write a corresponding test (itself a small piece of code) for each piece of logic. When an engineer “runs the tests” for a software program, a passing test confirms that the code works as intended. A failed test reveals errors and inadequacies that must be addressed before the new code goes out to users.
For example, a ride-hailing company like Uber or Lyft might test to ensure that the code never matches a passenger with multiple cars at once; Google or Apple might test that multiple driving routes suggested by their code never vary in duration by more than an hour; or an e-commerce company like Amazon may test to confirm that the code that calculates shipping fees works correctly in different geographies.
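In code, a test like the shipping-fee example above is simply a small function that exercises the logic and asserts on the result. The sketch below is purely illustrative: `calculate_shipping_fee`, its regions, and its rates are hypothetical stand-ins, not any real company’s pricing logic.

```python
# Minimal sketch of a unit test. All names and rates here are
# hypothetical, chosen only to illustrate the shipping-fee example.

def calculate_shipping_fee(weight_kg: float, region: str) -> float:
    """Flat per-kilogram rate that varies by (hypothetical) region."""
    rates = {"domestic": 2.0, "international": 5.0}
    if region not in rates:
        raise ValueError(f"unknown region: {region}")
    return round(weight_kg * rates[region], 2)

def test_shipping_fee_varies_by_region():
    # A passing test confirms the code works as intended in each geography.
    assert calculate_shipping_fee(3.0, "domestic") == 6.0
    assert calculate_shipping_fee(3.0, "international") == 15.0

def test_shipping_fee_rejects_unknown_region():
    # A failing test would reveal errors before the code reaches users.
    try:
        calculate_shipping_fee(3.0, "moon")
        assert False, "expected a ValueError for an unknown region"
    except ValueError:
        pass
```

“Running the tests” means executing each `test_*` function; any assertion that fails flags a piece of logic that must be fixed before release.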
One popular approach to software testing is called “test-driven development” (TDD), a methodology that emphasizes writing the tests before the corresponding code. TDD ensures that requirements for the code are clear and agreed upon up front, and that the software lives up to these requirements in the real world. Importantly, the tests are re-run each time the software is modified and improved, so engineers can be confident that previously agreed-upon requirements have not been violated and earlier functionality has not been broken.
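The test-first ordering can be sketched using the driving-route requirement from the ride-hailing examples above. All function names below are hypothetical: the point is only that the test exists, and fails, before any implementation does.

```python
# TDD sketch: the test is written first, against functions that do not
# exist yet. suggest_routes and estimate_route_minutes are hypothetical.

def test_suggested_routes_are_comparable():
    # Requirement agreed up front: suggested routes for the same trip
    # should never differ in duration by more than 60 minutes.
    durations = [estimate_route_minutes(r) for r in suggest_routes("A", "B")]
    assert max(durations) - min(durations) <= 60

# Running the test at this point fails (the names are undefined).
# Only then is the code written to make it pass:

def suggest_routes(origin: str, destination: str) -> list:
    return ["via-highway", "via-surface-streets"]

def estimate_route_minutes(route: str) -> int:
    return {"via-highway": 25, "via-surface-streets": 40}[route]
```

Whenever the routing code is later modified, re-running `test_suggested_routes_are_comparable` confirms the original requirement still holds.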
Writing software tests takes time and requires creativity. There is no one-size-fits-all formula for writing tests, though one best practice is to write tests that articulate what the code should not do in addition to what it should do. For example, tests can be written to ensure that Uber does not match you with 3 drivers, does not match you with -1 drivers, and does match you with 1 driver.
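The matching tests above could be written down roughly as follows. This is a sketch, not a real dispatch system: `match_rider` is a hypothetical stand-in, and the tests state what it should not do as well as what it should.

```python
# Sketch of tests that articulate what the code should NOT do in
# addition to what it should. match_rider is a hypothetical stand-in.

def match_rider(available_drivers: list) -> list:
    """Match a rider with exactly one driver, if any are available."""
    return available_drivers[:1]

def test_matches_exactly_one_driver():
    matched = match_rider(["ana", "bo", "cy"])
    assert len(matched) == 1   # does match you with 1 driver
    assert len(matched) != 3   # does not match you with 3 drivers
    assert len(matched) >= 0   # can never match a negative number

def test_no_match_when_no_drivers_available():
    assert match_rider([]) == []
```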
Test-Driven Development for Policymaking
TDD has enabled software engineering teams to proactively and reliably mitigate errors, and it can be adapted to technology policymaking to do the same. Instead of writing tests as code, policymakers write tests in plain English. Each test describes a situation that should or should not be covered by the policy language.
The purpose of a policy test is to anticipate the effects of draft policy language and help ensure that the policy does not deviate from its intended goals as it is being drafted. “Running the tests” involves a staffer, policymaker, or another member of the policy team writing down the tests in standard word-processing software, drafting potential policy language, and then walking through each test manually to consider what the policy language stipulates in that situation. The ultimate goal is for the policy language to carry out the requirements expressed in the tests, and for policymakers to proactively consider a variety of situations relevant to the policy.
Concretely, an individual employing TDD for policy will consider the intent of the policy, brainstorm hypothetical and real situations the policy might influence, develop test cases to evaluate the impacts of the policy language on the hypothetical and real situations, and consider appropriate remedies to mitigate negative consequences. Note that unlike TDD for software, TDD for technology policy does not require the use of code or specialized tools in any way.
Benefits of Test-Driven Development in Policymaking
Adopting a test-driven approach to policy development benefits the resulting policy because:
- Test-driven development ensures policymakers consider the concrete situations a policy will affect before crafting general language. Concrete situations are easier to debate and reason about than abstractions.
- Writing policy tests is easier than writing policy itself. Policy tests can be written by anyone, so they present new opportunities to engage the public, industry, and academic experts in policymaking.
- Tests can be used sporadically, flexibly, and repeatedly and still be effective. Tests can be written for independent phrases or passages within a larger policy document; they can be adopted by just one or two individuals on a policymaking team; and they can be reused when new policy projects come up in the same policy area.
Interested in learning more about test-driven development for policymaking, including thought-starters for writing your own policy tests, plus worksheets and templates for easy use of TDD? See our slide deck, step-by-step guide, do-it-yourself ‘TDD for Policy’ worksheet, and memo here.