I hate runtime errors. I love types. This post aims to explain why, and take a dive into my preference around approaches to navigating types. It is definitely worth noting that this is not a suggestion of a “right answer” - in fact I believe quite strongly that one does not exist. My preference is tightly coupled to the context in which I work!
I find that a decent portion of the bugs I encounter are due to “unhandled cases”. Let’s start with the following example:
<?php

function randomAnimal(): string
{
    $animals = ['dog', 'cat'];

    return $animals[array_rand($animals)];
}

function noise(string $animal): string
{
    return match($animal) {
        'dog' => 'bark',
        'cat' => 'meow',
        default => throw new Exception('Not an Animal!'),
    };
}

$animal = randomAnimal(); // 'dog'|'cat'
echo noise($animal); // 'bark'|'meow'
Great, we handle Dogs and Cats - but now we need Fish too. Let’s add them in:
<?php

function randomAnimal(): string
{
    // added here...
    $animals = ['dog', 'cat', 'fish'];

    return $animals[array_rand($animals)];
}

function noise(string $animal): string
{
    return match($animal) {
        'dog' => 'bark',
        'cat' => 'meow',
        default => throw new Exception('Not an Animal!'),
    };
}

$animal = randomAnimal(); // 'dog'|'cat'|'fish'
echo noise($animal); // 'bark'|'meow'... or an Exception!
Our code is broken - but our static analyser knows no better. This is a great sign that there is room for improvement in our typing. The problem here is that our Animals aren’t actually strings (…duh). By expressing this in our types, our static analyser can report potential issues:
<?php

enum Animal
{
    case Dog;
    case Cat;
    case Fish;
}

function randomAnimal(): Animal
{
    $animals = Animal::cases();

    return $animals[array_rand($animals)];
}

function noise(Animal $animal): string
{
    return match($animal) {
        Animal::Dog => 'bark',
        Animal::Cat => 'meow',
        // Match expression does not handle remaining value: Animal::Fish
    };
}
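To close the loop on that report, here is a minimal sketch of the fix: add the missing arm. The 'glug' noise is my assumption for what a fish says. Note there is no default arm - leaving it out is what lets the analyser tell us about the next animal we forget.

```php
<?php

enum Animal
{
    case Dog;
    case Cat;
    case Fish;
}

function noise(Animal $animal): string
{
    // Every case has its own arm, so no default branch is needed.
    return match ($animal) {
        Animal::Dog => 'bark',
        Animal::Cat => 'meow',
        Animal::Fish => 'glug', // assumed noise for a fish
    };
}

echo noise(Animal::Fish); // glug
```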
You may be wondering why I’m happy to keep the strings representing animal noises: this leads us to our next section…
I hate magic strings. A magic string is any string that actually represents something richer. In our earlier example, our Animals were magic strings - so why aren’t the animal noises? I’d argue that because they are only used for output (a context in which any string is valid), they’re not magic. Equally, they could have been:
<?php

function volume(string $noise): int
{
    // oh no - back to magic strings!
    return match($noise) {
        'bark' => 10,
        'meow' => 7,
        'glug' => 2,
    };
}
As always - it depends! Correct typing is about the context in which data is used. The previous issue was that the string type allows for more possibilities than we actually want: our type declaration was not strict enough. The stricter your types, the fewer possible states a value can be in. That means less context to hold in your head when reading the code - which is totally underrated: who wants to think? To really hammer this home - imagine we’re dealing with animal noises again.
<?php

function randomAnimalNoise(): string
{
    $animalNoises = ['meow', 'gulp', 'bark'];

    return $animalNoises[array_rand($animalNoises)];
}
You want to add growl as a new Animal Noise. We’ve already established that your type ignorance has left your static analyser without a clue here - so you’re on your own. You decide to find a single item - bark in this case - and add growl to any array containing it.
<?php

function randomAnimalNoise(): string
{
    // easy - right? too easy...
    $animalNoises = ['meow', 'gulp', 'bark', 'growl'];

    return $animalNoises[array_rand($animalNoises)];
}

function randomTreePart(): string
{
    // okay - getting boring now...
    $treeParts = ['branch', 'trunk', 'bark', 'growl'];

    return $treeParts[array_rand($treeParts)];
}
Did you spot it? Growl is now a valid Tree Part - oh no. Now imagine a large codebase, with files upon files of magic strings and crossovers. No thanks.
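For contrast, here is a rough sketch of the same two functions with each list expressed as its own backed enum. The enum names AnimalNoise and TreePart are my own invention for illustration. The two Bark cases share the same text but are different types, so a search for 'bark' can no longer trick us into adding growl to the tree.

```php
<?php

enum AnimalNoise: string
{
    case Meow = 'meow';
    case Gulp = 'gulp';
    case Bark = 'bark';
    case Growl = 'growl'; // the new noise only needs adding here
}

enum TreePart: string
{
    case Branch = 'branch';
    case Trunk = 'trunk';
    case Bark = 'bark'; // same text as AnimalNoise::Bark, different type
}

function randomAnimalNoise(): AnimalNoise
{
    $animalNoises = AnimalNoise::cases();

    return $animalNoises[array_rand($animalNoises)];
}

function randomTreePart(): TreePart
{
    $treeParts = TreePart::cases();

    return $treeParts[array_rand($treeParts)];
}
```

Returning AnimalNoise from randomTreePart() would now be a type error, rather than a quiet bug.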
Type signatures can be thought of as programming contracts. The function signature function foo(string $bar): int reads “a function (named foo) that accepts any string, and returns an integer” - easy right? Signatures concisely describe what inputs a function accepts, and what it will return. If we start doing additional checks inside our function, this contract (and our concise description) is broken. Taking us back a little bit:
<?php

// accepts any string, and returns a string
function noise(string $animal): string
{
    // but when we look deeper...
    // we only accept two very specific strings!
    return match($animal) {
        'dog' => 'bark',
        'cat' => 'meow',
    };
}
Not being able to trust type signatures can add significant mental overhead to reading code. Have you ever had to stop and think “can I pass this here?” - your static analyser should do that for you!
Another issue with breaking the type signature contract is where the resulting errors appear. Still considering the above function - imagine we pass an invalid string. The error will look something like this:
Uncaught UnhandledMatchError: Unhandled match case 'bark'
in function noise
This error is reported as a problem with the function declaration. The actual issue we’re dealing with is caused by the function call. Consider the equivalent with our Enum implementation instead:
<?php

enum Animal: string
{
    case Dog = 'dog';
    case Cat = 'cat';
}

function noise(Animal $animal): string
{
    return match($animal) {
        Animal::Dog => 'bark',
        Animal::Cat => 'meow',
    };
}

$animal = Animal::from('bark');
// any other code could be in between these two stages
noise($animal);
Uncaught ValueError: "bark" is not a valid backing value for enum Animal
in index.php line: Animal::from('bark')
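With the enum version, the failure surfaces where the raw string is converted, not deep inside noise(). If throwing isn't what you want at that boundary, backed enums also offer tryFrom(), which returns null for an unknown value. A small sketch of handling it there - describeNoise() and its messages are hypothetical, purely for illustration:

```php
<?php

enum Animal: string
{
    case Dog = 'dog';
    case Cat = 'cat';
}

function noise(Animal $animal): string
{
    return match ($animal) {
        Animal::Dog => 'bark',
        Animal::Cat => 'meow',
    };
}

// Hypothetical helper: turns raw input into a message for the user.
function describeNoise(string $input): string
{
    // tryFrom() returns null instead of throwing for an unknown backing value.
    $animal = Animal::tryFrom($input);

    if ($animal === null) {
        return "Sorry, '{$input}' is not an animal we know about.";
    }

    // From here on we hold a real Animal, so noise() cannot fail.
    return "The {$animal->value} says " . noise($animal);
}

echo describeNoise('dog'), PHP_EOL;  // The dog says bark
echo describeNoise('bark'), PHP_EOL; // Sorry, 'bark' is not an animal we know about.
```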
Let’s dive back into function signatures. The function signature function foo(X $bar): Y reads “a function (named foo) that accepts any X and returns an instance of Y”. I think it is important to understand the difference between “accepts any X” and “returns an instance of Y”.
“Accepts any X” means the function in question should be able to deal with any X (no surprise here, right?) - anything passing this type check should be a valid input for this function. “Returns an instance of Y” is a little trickier. It means that any code calling the function should be capable of dealing with any instance of Y. It may be easier to look at this from the perspective of what happens when we get it wrong. Consider we have this function:
<?php

function add(mixed $first, mixed $second): mixed
{
    return $first . $second;
}
Alarm bells, right? The concatenation operator (.) is only valid for a handful of types (which we’ll treat as string for now). Other types result in an error or a silent coercion: a boolean, for example, is quietly turned into "1" or "". Our function does not meaningfully work for booleans, so they should not be in the signature. Great. So what is wrong with returning mixed here? Let’s look at a usage of this function.
<?php

function concat(string $first, string $second): mixed
{
    return $first . $second;
}

$stringOne = ask('Please enter the first string: ');
$stringTwo = ask('Please enter the second string: ');

$result = concat($stringOne, $stringTwo); // mixed
echo $result;
Passing mixed to echo is not type safe - it only accepts strings. So we’d need to narrow it…
<?php

$result = concat($stringOne, $stringTwo); // mixed

if (!is_string($result)) {
    throw new Exception('Result is not a string');
}

// $result must be string here
echo $result;
This sucks. Having to assert that a value is not of a certain type is a big red flag that the expression producing it has too wide a return type. We know that concatenating two strings can only return a string - let’s reflect that!
<?php

function concat(string $first, string $second): string
{
    return $first . $second;
}

$stringOne = ask('Please enter the first string: ');
$stringTwo = ask('Please enter the second string: ');

$result = concat($stringOne, $stringTwo); // string
echo $result;
Much neater. Setting wide return types leads to unnecessary type detection (or lack of type safety) later on.
By forcing data into the narrowest relevant type at the earliest possible time, we limit the amount of back tracing we have to do when a related error occurs. Consider the following example, in which we’re signing up a new user - but require an email confirmation before storing their details.
<?php

function registerUser(Request $request, MailService $mailer): Response
{
    // extract user details from the request
    $data = $request->only('name', 'email');

    // send email confirmation
    $verification = $mailer->sendConfirmation($data['email']);

    $verification->onceComplete(function() use ($data) {
        // Create the user account
        $user = new User(...$data);
        $user->save();
    });

    return response(200);
}
Awesome - but our user didn’t enter their name. The application will return a 500, but no big deal right? The user will hopefully just return to the form, see they missed their name and try again. Except the error happened down here…
<?php

function registerUser(Request $request, MailService $mailer): Response
{
    // extract user details from the request
    $data = $request->only('name', 'email');

    // send email confirmation
    $verification = $mailer->sendConfirmation($data['email']);

    $verification->onceComplete(function() use ($data) {
        // Create the user account
        $user = new User(...$data); // Argument #1 ($name) not passed
        $user->save();
    });

    return response(200);
}
So our email went out, and the user clicked it - so they think they’ve got an account. Now they’re onto our support team… If we’d instead validated our data at the beginning of the process:
<?php

function registerUser(Request $request, MailService $mailer): Response
{
    // extract user details from the request
    $data = $request->only('name', 'email');

    // Argument #1 ($name) not passed
    $userData = new UserDto(...$data);

    // send email confirmation
    $verification = $mailer->sendConfirmation($userData->email);

    $verification->onceComplete(function() use ($userData) {
        // Create the user account
        $user = $userData->toUser();
        $user->save();
    });

    return response(200);
}

class UserDto
{
    public function __construct(
        // public so that registerUser() can read the email address
        public readonly string $name,
        public readonly string $email,
    ) {}

    public function toUser(): User
    {
        return new User(
            name: $this->name,
            email: $this->email,
        );
    }
}
We’d still get our error - which probably wants neatening up before being displayed to the user - but crucially it happens before they receive the confirmation email. Our support team is free to deal with someone else… This doesn’t just apply to user entered data either: narrower typing can be especially helpful for functions that transform data.
<?php

function groupByMonth(array $items): array
{
    $grouped = [];

    foreach($items as $item) {
        $month = $item['month'];
        $grouped[$month][] = $item;
    }

    return $grouped;
}
This function signature reads “a function (named groupByMonth) that accepts any array, and returns an array”. First of all, we’ve got a problem resembling the magic strings from earlier - our function accepts any array, but we can actually only handle arrays in which the items all contain a month key.
<?php

$cats = [
    [
        'name' => 'Cain',
        'age' => 4,
    ],
    [
        'name' => 'Pumpkin',
        'age' => 3,
    ],
];

groupByMonth($cats);
// Undefined array key "month"
Let’s update our signature!
<?php

/**
 * @param array<array-key, array{month: string}> $items
 */
function groupByMonth(array $items): array
{
    $grouped = [];

    foreach($items as $item) {
        $month = $item['month'];
        $grouped[$month][] = $item;
    }

    return $grouped;
}
Awesome, now our static analyser will catch this:
<?php

$cats = [
    [
        'name' => 'Cain',
        'age' => 4,
    ],
    [
        'name' => 'Pumpkin',
        'age' => 3,
    ],
];

groupByMonth($cats);
// Parameter #1 $items of function groupByMonth expects
// array<array{month: string}>,
// array{array{name: 'Cain', age: 4}, array{name: 'Pumpkin', age: 3}}
// given.
For our return type, we’re taking a quick peek back to the contract. Currently, code calling this function should be able to deal with any array - that represents a real variety of data, quite the responsibility! Much like our mixed type earlier, we can narrow this to reduce the responsibility.
<?php

/**
 * @param array<array-key, array{month: string}> $items
 * @return array<string, list<array{month: string}>>
 */
function groupByMonth(array $items): array
{
    $grouped = [];

    foreach($items as $item) {
        $month = $item['month'];
        $grouped[$month][] = $item;
    }

    return $grouped;
}
Now our calling code only has to deal with the specific type of array structure we provide - a much lighter load. You may be concerned at this scary looking type signature - “it’s so complicated” I hear you say. Maybe... until you’ve used them a bit more. But we haven’t actually changed the code, we’ve just added extra information for readers. We’ve highlighted complexity - not added it. Better the devil you know, right?
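To see the contract from the caller's side, here is a quick usage sketch with data that does carry a month key. The month values and the third cat are made up for this example, and whether your analyser accepts the extra name key inside the array shape depends on the tool and its settings.

```php
<?php

/**
 * @param array<array-key, array{month: string}> $items
 * @return array<string, list<array{month: string}>>
 */
function groupByMonth(array $items): array
{
    $grouped = [];

    foreach ($items as $item) {
        $grouped[$item['month']][] = $item;
    }

    return $grouped;
}

// Hypothetical data: the 'month' values are invented for illustration.
$cats = [
    ['name' => 'Cain', 'month' => 'March'],
    ['name' => 'Pumpkin', 'month' => 'July'],
    ['name' => 'Biscuit', 'month' => 'March'],
];

$grouped = groupByMonth($cats);

// The caller can rely on string keys, each holding a list of items.
echo implode(', ', array_keys($grouped)), PHP_EOL; // March, July
echo count($grouped['March']), PHP_EOL;            // 2
```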
Thanks for reading. As previously mentioned, these are my preferences - not the answer. I’d be quite surprised if they don’t change over time, as my experience & knowledge grow. As always, any feedback is welcome.