How to Manage Drupal Entities Through Webhook Notifications

Tags:

June 8, 2020 | Chris Greatens

How to Manage Drupal Entities Through Webhook Notifications blog image

Many of the enterprise-grade Drupal sites we build at Bounteous rely on lots of data—much of which is often managed and stored in a third-party provider’s system. While conventional APIs—like those that rely on RESTful services—are common sources for pulling external data into a website, you may encounter some third-party providers who dispatch updates via webhooks. Here's how to work with those notifications in Drupal.

While Drupal 8/9 core provides all of the necessary tools for receiving and processing webhook notifications, the lack of an established API, dedicated plugins, or generic contrib modules can make building a custom solution a bit daunting. In this blog post, we’ll walk through a complete top to bottom implementation that’s able to create, update, and delete (CRUD) entities whenever webhook notifications are received.

Note: To try out our example code, you can skip over the lengthier explanations and simply follow the instructions in the blue boxes. You’ll want to begin by downloading the sample Drupal module we’ve assembled from the Bounteous GitHub account. Clone that repository to your Custom modules folder and enable it to follow along with our example.

What Are Webhooks, Anyway?

If you’ve worked with APIs in the past, you’re probably familiar with the general process: make a request to a third-party service to ask for some data and it responds to let you know what—if anything—is new.

In Drupal, we most commonly rely on a cron task to make periodic requests, tailoring the frequency of those calls to the timeliness of the data that’s being retrieved, the ebb and flow of traffic to the site, or both. Think of that process as being akin to calling a friend every evening to find out what’s happened over the past 24 hours. Some days will be slow and they won’t have any updates to share, while others are full of news; either way, you get a complete rundown of their day in one fell swoop.

Webhook notifications are more like that friend who texts throughout the day whenever something happens. While the frequency, urgency, and length of their messages may vary, you always receive their updates on a rolling basis—and only when there’s something that (they feel) you need to know. Webhooks provide a similarly timely heads-up whenever data is modified in an external system, saving you that daily (API) call.

Use Cases for Webhooks

All that’s required in Drupal 8/9 is core—there are no contrib modules required! But, there are a few additional prerequisites to cover before we get started.

A Data Provider That Dispatches Webhook Notifications

This piece of the puzzle will be specific to your particular use case. Learning that your third-party service is capable of delivering webhook notifications likely led you to this post; if you’d like to work with that provider’s actual data, you’ll need to review their API documentation and adapt the code in our example module to accommodate the specific data structure in their notifications.

In order to bootstrap a working example, we’ll be using the Postman app to post webhook notifications to our Drupal site in this tutorial. Whether you plan to follow along with our example or do local development against actual data you’ll want to download and install Postman on the computer you typically use to write code.

Sample (or Actual) Notification Data

In order to develop a custom solution that can act on a provider’s data, you’ll need at a minimum a sample notification that represents what will ultimately be posted to your Drupal site. Example data is useful for any project since it—in combination with Postman—allows you to trigger notifications without logging into your provider’s system and/or modifying any actual data.

Fire up the Postman app on your computer, then follow their guide on importing Postman data to pull in the sample collection (Webhook Entities.postman_collection.json) found in the root directory your downloaded copy of our example module.

If you’re already up and running with a particular provider and would like to use their data but can’t find an example in their documentation, several online tools may provide some help. One, webhook.site, is a particularly indispensable resource. Simply pull up that site and copy the temporary URL it generates, then log into your provider’s system and paste the temporary address into their webhook notification field. At that point, any valid events in the provider’s system should result in a notification being sent to the temporary webhook.site URL you’d copied—and that will allow you to see all of the data received from each new notification that’s generated.

A (Publicly) Accessible Drupal Site

Longer-term, your site will ultimately need to be publicly accessible via the Internet in order to actually listen for any real notifications. While that’s a given for hosted environments (and you can skip the rest of this section if that’s you), most development with modern tools (Acquia Dev Desktop, a Docker container running the Lando D8 recipe, or Drupal VM, etc.) is done locally—and therefore effectively offline. While offering solutions for all possible approaches to local development is beyond the scope of this post, two common approaches have proven to be the most reliable and quickest to get up and running for us at Bounteous:

Exposing a Local Environment on the Internet via ngrok

ngrok is a tool that allows you to create a secure tunnel to a locally hosted site so that it’s accessible via the web. If your local development workflow requires working with actual notifications dispatched directly from your provider, then this tool might be the way to go.

Let Postman Stand in for Your Webhook Provider

If you can access your local or hosted dev environment from a browser on your computer, Postman can post sample notifications to it. We’ll be relying on this approach below since it’s much more tooling-agnostic and the Postman app is freely available for Windows, Mac, and Linux.

Universally Unique Identifiers

The last prerequisite is a Universally Unique Identifier (UUID) that will be used to permanently associate an individual data point in your third-party provider with a corresponding Drupal entity. This value will be distinct from Drupal’s internal entity IDs and is required in order to look up previously imported records whenever future updates are made. Consequently, every Drupal entity type that will be storing webhook data needs a custom field to store the identifier that accompanies each notification.

Log in to your site and navigate to /admin/structure/types, then Manage Fields for the Basic Page content type and add a new plain text field named Webhook UUID. Ensure the generated machine name is field_webhook_uuid) before saving.

While many providers automatically include a unique string that represents a record in their system, others may rely on a specific field. In the rare instance that your notifications don’t contain a dedicated UUID that’s present across all events, you may need to do some additional legwork to concatenate one or more static values into a usable identifier. Check your specific provider’s documentation or use webhook.site to examine notifications and determine which value(s) might be good candidates.

Building the Webhook Entities Module

The sample code you’ve already downloaded has all of the necessary components that are required in order to listen for, receive, and process webhook notifications. It was built to serve as a reusable springboard that can get your own project up and running quickly (in other words, feel free to use our code!). Here’s the overall file and folder structure:

Webhook Entities file and folder structure

We’ll briefly review the key components below.

The Listener Endpoint

webhook-entities.routing.yml

All webhook dispatchers require an endpoint that can receive notifications, so the first step is to define a new route in Drupal.

webhook_entities.listener:
  path: '/webhook-entities/listener'
  defaults:
    _controller: '\Drupal\webhook_entities\Controller\WebhookEntitiesController::listener'
    _title: 'Webhook notification listener'
  requirements:
    _custom_access: '\Drupal\webhook_entities\Controller\WebhookEntitiesController::access'

You might have noticed that the last line in the code from our routing.yml file above looks a bit different—that’s because it enforces custom access checking on the listener endpoint.

Access Tokens

/src/Form/WebhookSettingsForm.php

Security is a critical consideration whenever data makes its way from any external source into Drupal; since webhooks fit that bill, our custom access check validates each incoming notification to ensure it was legitimately dispatched from the actual provider.

In order to facilitate that handshake, our custom module includes a simple form that allows you to specify a secret key that can be used to allow or deny access. The most common security mechanism implemented by webhooks is an Authorization header that’s included in each notification and corresponds to a secret value that only you and your provider know (like an API key).

Log in to your Drupal site and navigate to /admin/config/webhook_entities/settings. Enter the authorization key used by our sample Postman collection: 123456. Then save the form.

Authorizing Notifications

/src/Controller/WebhookEntitiesController.php

In this simple example, we retrieve the config value saved via the form and compare it to the notification header to validate that the notification is legitimate and should be captured in the Drupal database.

/**
   * Checks access for incoming webhook notifications.
   *
   * @return \Drupal\Core\Access\AccessResultInterface
   *   The access result.
   */
  public function access() {
    // Get the access token from the headers.
    $incoming_token = $this->request->headers->get('Authorization');

    // Retrieve the token value stored in config.
    $stored_token = \Drupal::config('webhook_entities.settings')->get('token');

    // Compare the stored token value to the token in each notification.
    // If they match, allow access to the route.
    return AccessResult::allowedIf($incoming_token === $stored_token);
  }

Be sure to check your specific provider’s documentation to confirm this is the correct authorization method, since some services implement more robust security measures. For example, one CDN that we worked with required combining a notification-specific signature with a timestamp, hashing that value, and then comparing it to another header value. Clearly they’re a bit more serious about not letting anyone spoof their notifications!

Updating the Notification URL

Now that we have a dedicated path that can be used to listen for incoming notifications, we’ll need to instruct our provider to direct your implementation of their webhook API to that URL.

Your mileage will vary here, since the actual method for accomplishing this varies from provider to provider—however, in most cases, it’s a simple change that can be accomplished by logging into the control panel associated with your account. To facilitate local development we’ll make this change in Postman.

With Postman running and our sample collection imported as described above, expand the collection folder named “Webhook Entities” to find three sample requests. One at a time, you’ll need to click on each one and update the POST value found at the top of the Params tab to point to your development environment.

For example, if you access your local development site via your browser at http://mysite.local, you’ll need to update the POST URL in all three of the requests to http://mysite.local/webhook-entities/listener.

Additional Data Concerns

Before completely moving away from the topic of security, it’s worth discussing some additional measures that are often overlooked when processing notifications. While it’s probably unlikely that your provider will intentionally deliver malicious code, it’s possible that a bad actor could gain access to their system and inject something nasty or get ahold of your authorization token and spoof legitimate notifications.

In order to safeguard against those risks, we’ll follow two golden rules of working with someone else’s data:

Only keep what you’re actually going to use;
Sanitize everything before using it.

Since Drupal typically sees us capturing all user input as it’s entered and sanitizing on output (and Twig’s autoescaping facilitates that to a large extent), we’ll focus primarily on working with only a limited subset of incoming data in the queue worker (below). However, the extra-cautious among us might also consider the addition of a generic service capable of sanitizing individual data points in each webhook notification or escaping HTML entities on markup-rich fields like body text.

Handling Notifications as They Arrive

/src/Controller/WebhookEntitiesController.php

The controller referenced in our routing.yml file (above) primarily serves as a gatekeeper that receives incoming notifications, determines whether or not to act on them (via the authorize method), and then shuttles them along to their final destination.

/**
   * Listens for webhook notifications and queues them for processing.
   *
   * @return Symfony\Component\HttpFoundation\Response
   *   Webhook providers typically expect an HTTP 200 (OK) response.
   */
  public function listener() {
    // Prepare the response.
    $response = new Response();
    $response->setContent('Notification received');

    // Capture the contents of the notification (payload).
    $payload = $this->request->getContent();

    // Get the queue implementation.
    $queue = $this->queueFactory->get('webhook_entities_processor');

    // Add the $payload to the queue.
    $queue->createItem($payload);

    // Respond with the success message.
    return $response;
  }

For maximum efficiency, we’re not doing anything with the data as it rolls in—but instead handing everything off to Drupal’s queue API for actual processing.

Processing Notification Data

/src/Plugin/QueueWorker/WebhookEntitiesQueue.php

Relying on the queue to process notifications in batches helps prevent your site from becoming overloaded in the event that it’s inundated with an influx of webhook notifications (for example, a bulk update that’s triggered when you upload a CSV file to your third-party provider).

Our custom module tells Drupal to queue notification data for processing later alongside any number of other notifications that might have come before or after it; the queued notifications (or a portion thereof, depending on how full the queue is) are processed during each cron run.

While authorization has already occurred by the time a notification reaches the controller, we perform several additional verifications to ensure the data we’ve received is usable and speed up processing time. We start by checking to ensure the notification body actually contains data and isn’t empty, then further validate that it contains the necessary UUID identified during our preparatory steps above (for simplicity we assume the UUID is a simple value contained within the headers for each notification).

Assuming both of those checks pass, we then implement the previously mentioned security tactic of stripping out anything we won’t be using. This step has the added benefit of simplifying the data we’ll be working with later as well as potentially gaining some efficiency by not passing along unused values that might end up being processed unnecessarily.

Remember that all-important UUID you’d identified in your incoming notifications? Here’s where it finally comes into play. Since your third-party provider probably doesn’t know anything about Drupal (most webhook notifications are purposely written to be generic), we’ll need a way to cross-reference the incoming data with any entities that Drupal already knows.

Since two of our CRUD actions (updating and deleting) will require database queries to find existing nodes—and considering there’s a good chance some of your other custom code will also need to identify those entities—we’ve abstracted this functionality out into a service (/src/WebhookUuidLookup.php) that other components of our Drupal site can leverage in order to more easily work with the entities managed via webhooks.

public function findEntity($uuid) {
    $nodes = $this->entityTypeManager
      ->getStorage('node')
      ->loadByProperties(['field_webhook_uuid' => $uuid]);

    if ($node = reset($nodes)) {
      return $node;
    }

    return FALSE;
  }

The last step is to shuttle each notification on to its final destination according to the action it represents. We’re managing create events a bit differently from the others since they’re the only occasion where we specifically don’t want to have an existing record in the Drupal database.

// Handle create events.
if ($entity_data->event == 'create') {
  // Create a new entity if one doesn't already exist.
  if (!$existing_entity) {
    $this->entityCrud->createEntity($entity_data);
  }
  // Otherwise log a warning.
  else {
    $this->logger->warning('Webhook create notification received for UUID @uuid but corresponding entity @nid already exists', [
      '@uuid' => $entity_data->uuid,
      '@nid' => $existing_entity->id()
    ]);
  }
}
// Handle other modification events.
else {
  // Ensure a Drupal entity to modify exists.
  if ($existing_entity) {
    switch($entity_data->event) {
      case 'update' :
        // Update an entity by passing it and the changed values to our CRUD worker.
        $this->entityCrud->updateEntity($existing_entity, $entity_data);
        break;

      case 'delete' :
        // Call the delete method in our CRUD worker on the entity.
        $this->entityCrud->deleteEntity($existing_entity);
        break;
    }
  }
  // Throw a warning when there is no existing entity to modify.
  else {
    $this->logger->warning('Webhook notification received for UUID @uuid but no corresponding Drupal entity exists', [
      '@uuid' => $entity_data->uuid
    ]);
  }
}
}
// Throw a warning if the payload doesn't contain a UUID.
else {
$this->logger->warning('Webhook notification received but not processed because UUID was missing');
}

Ultimately this is yet another component that will be specific to your provider and data model; the sample notifications in our Postman collection contain an event key, the corresponding value of which indicates which action should be taken when that particular notification is posted to your Drupal site.

Managing Drupal Entities

/src/WebhookCrudManager.php

Now that we have a tool for recalling data that’s already been sent to Drupal, we can build out the logic required to handle each type of event that can be triggered by one of our notifications.

Since our sample postman collection contains short and simple notification data, all of our example CRUD components have been defined as separate methods within a single service class—however you might want to consider breaking yours out into separate services, since operations on actual data will almost certainly be more complex.

Rather than diving into the specifics of the CRUD manager service in our sample module, we’ll wrap up our code explanations by pointing out some general observations for best practices worth considering when you modify the examples to your own needs.

Our create() method offloads the handling of incoming notification data to a separate mapFieldData() function, which in turn constructs an array of values corresponding to Drupal field data that are required for creating a node. We’ve taken the approach of only mapping those values that might also be included in other events (such as updates) in order to prime the pump for future code reuse. We also ensure the notification payload contains a title value before creating a new node—since that’s the one value required for the basic page content type.

The update() method implements a series of simple checks to determine which values exist in the notification data—since unlike API calls that often return complete records, webhook notifications typically only contain modified values. This allows us to only act on those fields that have actually changed, rather than updating every value for a given node.

And finally, the delete() method does simply that. Like the update() method it’s receiving the complete node entity as an argument—so we’re able to call that entity’s built-in method in order to remove it from Drupal.

Seeing it All in Action

Go back to Postman and post the sample notifications in the same order as the actions listed above, returning to Drupal in-between each post:

Create: after posting the create notification and running Drupal cron, you should find a new node listed on your content overview page. View that node and you’ll see that all of its values correspond to those in the notification data (excluding the one we removed before handing the create notification off to the CRUD worker).

Update: post this update notification, run cron, and reload the Drupal node and you’ll find the title and body fields have been updated.

Delete: Finally, post the delete notification and run cron a third time to remove the sample node from Drupal.

Webhook Processing Done Simply

And there you have it—a simple yet functional example of processing webhook notifications. While this tutorial has touched on all of the key pieces that are required to manage one type of core-provided entity (nodes), you’ll find that your own specific application might warrant additional considerations such as:

Locking down (or hiding) any Drupal fields populated from webhook data. This helps to preemptively stave off frustration for content admins since they won’t be able to edit any values that might be programmatically updated via future notifications.
Creating additional CRUD managers: Each distinct entity type—particularly any custom ones you might create to store webhook data—will require its own set of field mappings. This is especially true if you aim to manage Media entities as we did for a recent project, since that task also requires parallel management of File entities. Be sure to leverage that UUID!
Handling duplicate entries: While our example module simply throws an error—and ignores create operations—whenever a representative node already exists, your use case might warrant a different approach to safeguard against data loss. For example, you might want to instead hand off the incoming notification data to your update method.

Finally, despite how powerful webhooks can be it’s important to give some consideration to what they can’t do:

Perhaps most critically, Drupal won’t receive any notices when your provider’s system goes offline and stops dispatch notifications—since your site is passively listening, the running assumption is that no news means nothing has changed. Unless the third-party service that’s broadcasting your notifications is capable of queuing and re-sending notifications, that gap will translate to missed updates that won’t be made to your Drupal entities.
In a similar vein, webhook notifications are a one-and-done setup—so if your custom code contains a bug that prevents the changes specified in a payload from being saved, that update is lost forever once the queue believes it’s been processed. Be sure to test your code thoroughly with sample data that are highly representative of the actual notifications you’ll be receiving!
Additionally, there’s always the possibility that your provider might modify the data structure of their notifications. Hopefully, they’ll be considerate enough to give you a heads-up if they do so, however it’s not a bad idea to wrap any functions that parse that data in try/catch statements so you’ll see some indication that things aren’t being processed in your Drupal logs.

Suggested Search Terms