Why Code Rusts

or Why Tests Spontaneously Fail

Originally posted: 2022-02-07T16:00:00 on the TDDA Blog.

You might think that if you write a program, and don’t change anything, then come back a day later (or a decade later) and run it with the same inputs, it would produce the same output. At their core, reference tests exist because this isn’t true, and it’s useful to find out if code you wrote in the past no longer does the same thing it used to. This post collects together some of the reasons the behaviour of code changes over time.¹

The Environment Has Changed

E1
E2
E3
E4
E5

machine that has been updated (e.g. a database).

E6

(e.g. calling a web service to get/do something).

E7
E8

or under a different compiler or…)

E9
E10

e.g.

deleting a file the code uses
renaming a file the code uses
editing a file the code uses
removing or renaming a directory the code uses
changing permissions on a file or directory the code uses
creating a file or directory that the code expects to create and is now unable to, e.g. because of permissions.
E11
E12

same place.

E13

of interactively, or in a scheduler).

E14
E15

memory or disk or some other resource; or has a subtle timing dependency or assumption that fails under load.

E15A

runs faster, causing a race condition to behave differently. [Added 2022-02-17]

E16
E17

allowed nice levels, a directory service, some permissions or groups…

E18
E19
E20
E21

has appeared in a site-packages or similar location, and was picked up by your code or something else your code uses.

E22

gets changed in some subtle way that matters (e.g. line endings, blank lines at the end of files, encoding, tabs vs. spaces).

E23

the machine you are running on (e.g. causing it to slow down).

E24

of processing by last update date.

E25

is revoked.

E26

slow, or unreliable at the time the test is run.

E27

code rather than your code itself, e.g. something in a data centre or library.

E28

changed, or an alias has changed so that the executable you run is different from before. [Added 2022-02-11]

E29

specify the same path, some file that you are using is different from before. [Added 2022-02-11]

E30

in a shell startup file. [Added 2022-02-17]
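Several of the environmental causes above are easier to diagnose if a record of the environment is kept alongside each reference result, so that two runs can be compared later. The following is a minimal sketch of that idea, not a complete solution; the function name `environment_snapshot` and the particular environment variables recorded are illustrative choices, not anything prescribed in this post.

```python
import json
import locale
import os
import platform
import sys


def environment_snapshot(env_vars=("PATH", "PYTHONPATH", "LANG", "TZ")):
    """Collect a few facts about the environment the code is running in.

    The selection of facts and variables is an assumption for illustration;
    record whatever your own code is actually sensitive to.
    """
    return {
        "python_version": platform.python_version(),
        "executable": sys.executable,
        "platform": platform.platform(),
        "preferred_encoding": locale.getpreferredencoding(False),
        "cwd": os.getcwd(),
        # Missing variables are recorded as None rather than omitted,
        # so that their disappearance shows up in a later comparison.
        "env": {name: os.environ.get(name) for name in env_vars},
    }


snapshot = environment_snapshot()
print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Saving this output next to the reference results means that when a test later fails, the two snapshots can be diffed before anyone starts debugging the code itself.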

Many of these are illuminated by one of my favourite quotes from Beth Andres-Beck:

Mocking in unit tests makes the tests more stable because they don’t break when your code breaks.
— @bethcodes, 2020-12-29T01:26:00Z https://twitter.com/bethcodes/status/1343730015851069440

The Code Has, in Fact, Changed

C1
C2

change the behaviour in the case you’re testing.

C3
C4
C5
C6
C7
someone else had pushed a change
you checked out a different branch
you pulled from the wrong repository.
C8
C9
C10

did change it in one of the other linked locations.

C11

change, the code (or other file or files) it symbolically linked did.

C12

does matter to your code was not detected by the diff tool (e.g. line endings or capitalization or whitespace).

C13

from the ones you ran previously, without realising it.

C14

changes to appearance.

C15

tool that had a bug in it and changed the meaning.

C16

has changed, but you are using files that aren’t tracked or are ignored.

C17

timestamp is wrong or doesn’t mean what you think it means.

C18
C19
C20
C21
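Some of the items above turn on whether a file has really changed, and on timestamps that do not mean what they appear to mean. As a hedged illustration, the sketch below changes a file's modification time without changing its contents (similar in effect to the `touch` command), and shows that a content hash is unaffected. The helper name `content_digest` and the file name are hypothetical.

```python
import hashlib
import os
import tempfile


def content_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as f:
        f.write("unchanged contents\n")

    digest_before = content_digest(path)
    mtime_before = os.stat(path).st_mtime

    # Move the access and modification times an hour forward.
    # The bytes in the file are not touched.
    os.utime(path, (mtime_before + 3600, mtime_before + 3600))

    digest_after = content_digest(path)
    mtime_after = os.stat(path).st_mtime

print("timestamp changed:", mtime_after != mtime_before)
print("contents changed: ", digest_after != digest_before)
```

Comparing digests answers the question "are these the same bytes?", which a timestamp cannot do in either direction.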

Also from Beth Andres-Beck:

If you have 100% test coverage and your tests use mocks, no you don’t.
— @bethcodes, 2020-12-29T01:51:00Z https://twitter.com/bethcodes/status/1343736477839020032

You Aren’t Running the Code You Think You Are

There is another set of problems that aren’t strictly causes of code rusting, but which help to explain a set of related situations every developer has probably experienced, which all fall under the general heading of you aren’t running the code you think you are.

M1

(e.g. you’re in the wrong directory).

M2

you are (e.g. you haven’t realised you’re ssh’d in to a different machine or editing a file over a network).

M3
M4

system or you think you are/aren’t using it when you actually aren’t/are (respectively).

M5

or an image or something).

M6

but you’re looking at the wrong output (wrong directory, wrong tab, wrong URL, wrong window, wrong machine…)

M7

not running what you think you are.

M8

think you are not (or are), respectively.

M9

it’s doing its magic, with the result that you’re not using the libraries/code you think you are.

M10

into a different Python (or whatever) from the one you think it has.³

M11

using, but in fact you did when you updated (what you thought was) a different virtual (or non-virtual) environment.

M12

a local site-packages and a system site-packages), with different versions of the same library, and aren’t importing the one you think you are.

M13

version number, but there was a code change that didn’t cause the version number to be changed, or the code has multiple version numbers, or the code is reporting its version number wrongly, or the version number actually refers to a number of slightly different builds that are supposed to have the same behaviour, but don’t.

M14

more than once in a language that doesn’t mind such things, and are looking at (and possibly editing) a copy of the relevant function/callable/object that is masked by the later definition. [Added 2022-09-14]

M15

changing or recompiling your code won’t have any effect until you restart a web server or application server. This is really a variation of M5, but is subtly different because you wouldn’t normally think of this as caching. [Added 2024-03-30]

These are the ones that make you question your sanity.

TIP If what’s happening can’t be happening, try introducing a clear syntax error or debug statement or some other change you should be able to see. Then check that it shows up as expected when you’re running your code.

Almost every time I think I’m losing my mind when coding, it’s because I’m editing and running different code (or viewing results from different code).
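A complementary check, in Python at least, is to ask the running process which interpreter it is and which file a module was actually imported from. This is a minimal sketch; the helper name `describe_runtime` is hypothetical, and `json` is used only because it is always available in the standard library. Substitute the module you are suspicious of.

```python
import importlib
import sys


def describe_runtime(module_name):
    """Report which interpreter is running and where a module was loaded from."""
    module = importlib.import_module(module_name)
    return {
        "interpreter": sys.executable,
        "version": sys.version.split()[0],
        # Built-in modules have no __file__, hence the default of None.
        "module_file": getattr(module, "__file__", None),
        # Earlier entries on sys.path win when a module exists in several places.
        "first_path_entries": sys.path[:3],
    }


info = describe_runtime("json")
for key, value in info.items():
    print(f"{key}: {value}")
```

If the interpreter path or the module file is not the one you expected, that usually explains why your edits appear to have no effect.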

Time Has Moved On

T1 Your code has a (usually implicit) date/time dependence in it, e.g.

T2 Time is ‘bigger’ in some material way that causes a problem, e.g.

T3

and a measured (local) time interval goes negative.

T4

that Excel doesn’t (or more likely does) recognize.

T5

clock was wrong when you ran it before and is now right.
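T3 mentions a measured local time interval going negative. One way this can happen (an assumption for illustration, since it is only one possible cause) is measuring elapsed time with naive local wall-clock times across the moment the clocks go back. The sketch below uses the UK change on 2021-10-31, with fixed-offset time zones defined by hand so that it runs without any time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for UK local time either side of the change.
BST = timezone(timedelta(hours=1), "BST")
GMT = timezone(timedelta(0), "GMT")

# A job starts at 01:50 local time (BST) and finishes 20 minutes later,
# by which point the clocks have gone back and local time reads 01:10 (GMT).
start = datetime(2021, 10, 31, 1, 50, tzinfo=BST)
end = datetime(2021, 10, 31, 1, 10, tzinfo=GMT)

# Subtracting timezone-aware values gives the true elapsed time.
aware_elapsed = end - start

# Discarding the offsets mimics code that only ever sees local wall-clock time.
naive_elapsed = end.replace(tzinfo=None) - start.replace(tzinfo=None)

print("aware:", aware_elapsed.total_seconds(), "seconds")
print("naive:", naive_elapsed.total_seconds(), "seconds")
```

For intervals measured within a single process, `time.monotonic()` avoids the problem altogether, because it is unaffected by changes to the wall clock.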

Resources Used by the Code Have Changed

R1

on the internet, a web service) returns different data from the data it always previously returned.

R2

e.g. a different text encoding, different precision, different line endings (Unix vs. PC vs. Mac), presence or absence of a byte-order marker (BOM) in UTF-8, presence of new characters in Unicode, different normalization of Unicode, indented or unindented JSON/XML, different sort order etc.

R3

something about the interaction is different, e.g. a different status code or some extra data you can ignore, or some redundant data you use has been removed.
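Three of the format differences listed under R2 can be shown in a few lines. This is only a sketch of how the differences appear to Python code, using made-up sample strings; how (or whether) to normalise them in a real test is a design decision for the code in question.

```python
import unicodedata

# Unicode normalization: the same visible text, two different code point sequences.
composed = "caf\u00e9"        # 'é' as a single code point
decomposed = "cafe\u0301"     # 'e' followed by a combining acute accent
same_without_normalizing = composed == decomposed
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

# Byte-order marker: plain "utf-8" keeps it as a character, "utf-8-sig" strips it.
with_bom = b"\xef\xbb\xbfid,name\n"
decoded_plain = with_bom.decode("utf-8")
decoded_sig = with_bom.decode("utf-8-sig")

# Line endings: the raw text differs, but the lines themselves do not.
windows_text = "a\r\nb\r\n"
unix_text = "a\nb\n"
same_raw_text = windows_text == unix_text
same_lines = windows_text.splitlines() == unix_text.splitlines()

print(same_without_normalizing, same_after_nfc)
print(repr(decoded_plain[0]), repr(decoded_sig[0]))
print(same_raw_text, same_lines)
```

In each case a naive equality check reports a difference that a human reader would not see, which is why these changes are easy to miss when a resource is regenerated or re-exported.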

Stochastic and Indeterminate Effects

S1
S2

but not other seeds that get used (e.g. the seed for NumPy is different from Python’s main seed).

S3
S4

in fact, always produce the same answer (order of execution).

S5

system and there is indeterminacy, a race condition, possible deadlock or livelock, or any number of other things that might cause indeterminate behaviour.

S6

behaviour that is in fact not deterministic or specified, especially if that result is the same most but not all of the time, e.g. tie-breaking in sorts, order of extraction from sets or (unordered) dictionaries, or the order in which results arrive from asynchronous calls.⁴

S7

two randomly-generated, fairly long IDs will be different from each other.

S8

the sequence of random numbers has changed. This has happened with NumPy, where they realised that one of the sampling functions was drawing unnecessary samples from the PRNG. In making the sampler more efficient, they changed the samples that were returned for the same PRNG seed.

[Contributed by Rob Moss (@rob_models@mas.to), who “had a quick search for the relevant issue/changelog item, but it was a long time ago (~NumPy 1.7, maybe).” He “couldn’t find the original NumPy issue, but here’s a similar one:
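Returning to S6 and the order of extraction from sets: in CPython, the iteration order of a set of strings depends on string hashes, which are randomised per process unless `PYTHONHASHSEED` is fixed. The sketch below runs the same expression in fresh interpreters under different hash seeds. The helper name `run_with_hash_seed` and the sample words are hypothetical, and the raw orderings may or may not differ between any two particular seeds.

```python
import os
import subprocess
import sys

WORDS = "{'apple', 'banana', 'cherry', 'damson', 'elder', 'fig'}"


def run_with_hash_seed(expression, hash_seed):
    """Evaluate and print an expression in a new interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    completed = subprocess.run(
        [sys.executable, "-c", f"print({expression})"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


seeds = (1, 2, 3)

# Raw set order: stable for a given seed, but not specified across seeds.
raw_orders = {seed: run_with_hash_seed(f"list({WORDS})", seed) for seed in seeds}
repeat_of_seed_1 = run_with_hash_seed(f"list({WORDS})", 1)

# Sorting first removes the dependence on hash order.
sorted_orders = {seed: run_with_hash_seed(f"sorted({WORDS})", seed) for seed in seeds}

for seed in seeds:
    print(seed, raw_orders[seed])
print("distinct raw orders:   ", len(set(raw_orders.values())))
print("distinct sorted orders:", len(set(sorted_orders.values())))
```

A test that compares the raw output can pass for a long time and then fail for no visible reason; sorting (or otherwise imposing an explicit order) before comparing removes that dependence.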

It Never Worked (or didn’t work when you thought it did)

[Added 2024-07-19]

I realised there’s another whole class of errors of process/errors of interpretation that could lead us to think that code has “rusted” despite not having been changed. These are all broadly the same as one of the explanations offered before, but now for the original run when you thought it worked, rather than for the current or new run, when it fails.

N1

correctly, but you are mistaken: you didn’t run it at all, or it in fact failed but you did not notice.

N2

a previous state, before you broke it, when it did work.

N3

then as now, but you used a defective procedure or tool to examine the output then, and failed to realise it was wrong/failing.

N4

wrong parameters/inputs/whatever and are now passing the correct (or different) parameters/inputs/whatever so it now fails as it would have done then if you had done the same.

¹ If you think of other reasons code rusts, do let me know and I’ll be happy to expand this list (and attribute, of course).

² Touching a file (the unix touch command) updates the last update date on a file without changing its contents.

³ For this reason, a lot of people prefer to run python -m pip rather than pip, because this way you can have greater confidence that the module is getting installed in the site-packages for the version of python you’re actually running.

⁴ Most of these kinds of indeterminacy will, in fact, usually be stable given identical inputs on the same machine running the same software, but it can take very little to change that, and that stability should not be relied upon.