That stops working quickly - namely as soon as you want to test a function A tha...

jdlshore · on Jan 8, 2023

There's nothing difficult about the scenario you're describing at all. I don't have example code for that specific scenario, but I do have an example of the following scenario:

A calls B, which calls an external service. B returns function D, which can be used to cancel the request. When B fails to return within five seconds, A calls D to cancel the request, then calls E to write an error message to stdout.

The test for this scenario checks that the request was made, the request was cancelled, and the error was written to stdout. You can see that test here:

https://github.com/jamesshore/livestream/blob/2020-09-22-end...

valenterry · on Jan 8, 2023

Unfortunately your example situation is not comparable. Try to come up with a test for my example that does not pass any arguments during test that would never be passed during a production run. I guarantee you, that is not possible without mocking. And I'm saying that as someone who really doesn't like mocking.

jdlshore · on Jan 8, 2023

Okay, I have nothing better to do this Sunday morning. Let's play with your example. We have a function A that uses B to send an email and C to store a notification in a database. We want to test that, when A fails, it calls B a few times, then calls C a few times, then fails.

I'm not going to write a full working program, but I'll flesh out your example a bit and explain how it works. I'll use JavaScript and the patterns described in the article.

I'm going to say "A" in your example is the VerificationEmailController class. It has a postAsync() method that handles POST requests. When it receives a POST request, it sends an "verify your email" email, then writes the result to a database.

"B" in your example is SendGridClient. It has a sendEmail() method that uses SendGrid to send email. It does it by making an HTTP call to the SendGrid service.

"C" in your example is a EmailVerificationAuditTable. It has a insertEmailSent() method that inserts a "success" or "fail" record into a database table.

"Failing" in your example involves writing an alert to the application log file. It uses ApplicationLog, which has a logEmergency() method that writes a structured log with the "FATAL" log level.

To summarize, we are writing and testing VerificationEmailController. It depends on SendGridClient, EmailVerificationAuditTable, and ApplicationLog.

SendGridClient, EmailVerificationAuditTable, and ApplicationLog use the patterns in the article. Specifically, they're Nullable, they're Infrastructure Wrappers, they have Configurable Responses, and they use Output Tracking.

Got it? Okay, let's write the test. This test is really doing too much, and should be broken out into multiple separate tests, but I'm going to follow the example you provided.

  it("fails cleanly by retrying email service and database service, then logging an alert", async () => {
    // First, we set up the dependencies. This is the Nullables and Configurable Responses patterns.
    const sendGrid = SendGridClient.createNull({ error: "my email error" });
    const auditTable = EmailVerificationAuditTable.createNull({ error: "my database error" });
    const log = ApplicationLog.createNull();

    // Then we track their output. This is the OutputTracker pattern.
    const sendGridTracker = sendGrid.trackSends();
    const auditTableTracker = auditTable.trackInserts();
    const logTracker = log.trackOutput();

    // Then we instantiate the code under test. This uses normal dependency injection.
    const controller = new VerificationEmailController(sendGrid, auditTable, log);

    // Then we call postAsync(). I'm going to provide realistic code, but not explain it, 
    // because it's not relevant to this example. Normally this would be hidden behind a
    // helper function. (See the "Signature Shielding" pattern.)
    const request = HttpRequest.createNull({ body: JSON.stringify({ email: "my_email" }) });
    await controller.postAsync(request);

    // Now we assert that the controller did what it was supposed to.

    // First, we'll assert that we tried to send two emails.
    assert.deepEqual(sendGridTracker.data, [{
      to: "my_email",
      subject: EMAIL_SUBJECT,
      body: EMAIL_BODY,
    }, {
      to: "my_email",
      subject: EMAIL_SUBJECT,
      body: EMAIL_BODY,
    }]);

    // Then we'll assert that we tried to insert two audit log entries.
    assert.deepEqual(auditTableTracker.data, [{
      recipient: "my_email",
      result: EmailVerificationAuditTable.STATUS.EMAIL_FAILED,
      emailError: "my email error",
    }, {
      recipient: "my_email",
      result: EmailVerificationAuditTable.STATUS.EMAIL_FAILED,
      emailError: "my email error",
    }]);

    // And finally, we'll assert that we logged an alert.
    assert.deepEqual(logTracker.data, [{
      alert: "FATAL",
      code: "L668",
      message: "Email verification failure",
      recipient: "my_email",
      sendGridError: "my email error",
      auditLogError: "my database error",
    }]);
  });

There ya go. Entirely possible, not difficult, and (if I do say so myself), quite a clean and readable test.

valenterry · on Jan 9, 2023

Thank you for taking the timing and writing this up! I appreciate it a lot and that's why I come back to hackernews! :)

Now, your test works and I think I have to apologize in that I should have understood your approach better and write my answer accordingly. The relevant part of my previous answer:

> You WILL end up using a form of "mocking", for example passing the functions B and C as arguments to A and then, under test, don't really pass B and C but different functions that allow you to make assertions in test. That is still mocking.

So my point here is that, yes, you are passing functions into the new VerificationEmailController and the ones you pass in are not the same that are being run in production. This is what I call a mock: you replace a dependency that runs in production with one that runs only in the test.

That's not to say that your way of testing doesn't work. It's just that it comes with the same conceptual issues (but also benefits) that mocks come with.

In particular, 1) if we "misconfigure" the function in our actual production code (i.e. pass the wrong arguments) this won't be covered by the test.

Also, 2) we will reimplement certain logic in tests that are necessary to check the actions. Because different actions might still be valid, such as [add5, add5] or [add10] - they come to the same result, but in your assertions you'll need to handle that knowledge, without checking the state, because the state might live in an external system.

And 3) Forcing dependencies to be explicit (i.e. function parameters) is neither good nor bad per se, but sometimes it's nicer to have them encapsulated and in this case both classical mocking and your approach stop working.

Therefore when it comes to me, I see both classical mocks and your approach as conceptually equal and therefore would call your appraoch mocking too. That's what I wanted to say. I hope that gives you some insight - or maybe you disagree with my 3 points above, then I would be curious why.

djur · on Jan 8, 2023

That `run` function looks to me like setting up test doubles. What makes `stderr` in this code different from a spy?

jdlshore · on Jan 8, 2023

Not being a spy. :-) It's an array that's populated by an event listener.

CommandLine is the actual production code that writes to stdout and stderr (and reads command-line arguments). CommandLine.createNull() creates an instance of CommandLine that's "turned off" and doesn't actually write to stdout or stderr. CommandLine.trackStderr() returns a reference to an array that is updated whenever something is written to stderr (or not, in the case of a nulled CommandLine).

I'm off to bed, but I'm happy to answer further questions in the morning. For free, even.

jdlshore · on Jan 8, 2023

Looks like we've reached max depth, but one last response for @ithkuil:

> Another case where having real production code have parts of it that can be turned off is trunk based development leveraging feature flags.

I've used Nullables to implement "dry run" capability in a command-line tool that did git stuff. Super clean—when I got the --dry-run flag, I just called Repo.createNull() rather than Repo.create().

ithkuil · on Jan 8, 2023

Another case where having real production code have parts of it that can be turned off is trunk based development leveraging feature flags.