2 February 2023 · 4 min. read

URL Tracker DevLog #2: Designing the backend

Dennis Heutinck
Back-end developer | Umbraco master

The backend has had a fair amount of rework already, but I'm not done yet. The new design also comes with new requirements, not only visually, but also internally. I'm aiming for a system that does more than redirect URLs and aggregate 404s: I want to build an engaging URL management platform, one in which any developer can improve any aspect when they need to.

The data model

Let's compare the model of the current URL Tracker to the model that I aim to use.

The current URL Tracker data model

If we look at this model, we can make the following observations:

  • We can only display recommendations for client errors.
  • Redirects have a field for both a source URL and a source regex. They also have a field for both a target URL and a target node. It's not possible to match or redirect any other way.

The data model that I aim to introduce looks like this:

Comparing that to the current model, we can make these observations:

  • Recommendations are separated from the client errors.
  • Recommendations have no information about what they recommend, apart from the "strategy" field.
  • Recommendations are accompanied by different "scores". More on that in a future blog post.
  • Redirects now have a source and target strategy. They are no longer coupled with any particular implementation.

There are several benefits to this new approach. First of all, recommendations are no longer limited to client errors. This allows us to provide a wider range of recommendations and allows developers to create custom ones. The same goes for redirects: since redirects are no longer coupled with specific implementations, it's easier to offer more diversity and extensibility.
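To make the idea of source and target strategies a bit more concrete, here is a minimal sketch in Python. It is not the URL Tracker's actual code: the field names, the strategy names ("url", "regex", "content", "prefix") and the two registries are assumptions made up for illustration. The point is only that a redirect stores which strategy to use plus a value, and the behaviour is looked up at runtime.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Redirect:
    # All field and strategy names here are hypothetical.
    source_strategy: str
    source_value: str
    target_strategy: str
    target_value: str


# Hypothetical registry: source strategy name -> predicate(value, url)
SOURCE_MATCHERS: Dict[str, Callable[[str, str], bool]] = {
    "url": lambda value, url: value.rstrip("/") == url.rstrip("/"),
    "regex": lambda value, url: re.fullmatch(value, url) is not None,
}

# Hypothetical registry: target strategy name -> resolver(value) -> target URL
TARGET_RESOLVERS: Dict[str, Callable[[str], str]] = {
    "url": lambda value: value,
    # Stand-in for looking up the URL of a content node by its id.
    "content": lambda value: f"/content/{value}",
}


def resolve(redirect: Redirect, url: str) -> Optional[str]:
    """Return the target URL if the redirect applies to `url`, else None."""
    matcher = SOURCE_MATCHERS.get(redirect.source_strategy)
    resolver = TARGET_RESOLVERS.get(redirect.target_strategy)
    if matcher is None or resolver is None:
        return None  # unknown strategy: nothing to do
    if not matcher(redirect.source_value, url):
        return None
    return resolver(redirect.target_value)
```

In this sketch, supporting a new way of matching means registering one more entry, for example `SOURCE_MATCHERS["prefix"] = lambda value, url: url.startswith(value)`, without touching the `Redirect` model itself.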

This amount of variability also poses some challenges, though.

The code architecture

The URL Tracker consists of a bunch of packages, roughly separated into "library packages" and "functional packages". A library package offers a service, but does not integrate into the system. For example: the core library provides services to manage redirects and client errors, but it does not redirect anything or register client errors itself. A functional package adds a function to Umbraco. For example: the backoffice notifications package adds content event tracking and automatic redirect creation.
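The difference between the two kinds of packages can be sketched roughly as follows. This is an illustration in Python, not the real packages: `RedirectService` and `ContentMovedHandler` are hypothetical names, and the "event" is just a method call. The library part only manages data, while the functional part is what actually hooks into something that happens in the CMS.

```python
from typing import Dict, Optional


class RedirectService:
    """'Library' side: manages redirects, but never hooks into requests or events."""

    def __init__(self) -> None:
        self._redirects: Dict[str, str] = {}

    def add(self, source_url: str, target_url: str) -> None:
        self._redirects[source_url] = target_url

    def find(self, source_url: str) -> Optional[str]:
        return self._redirects.get(source_url)


class ContentMovedHandler:
    """'Functional' side: reacts to a (hypothetical) content event
    and uses the library service to create a redirect automatically."""

    def __init__(self, service: RedirectService) -> None:
        self._service = service

    def handle(self, old_url: str, new_url: str) -> None:
        if old_url != new_url:
            self._service.add(old_url, new_url)


service = RedirectService()
handler = ContentMovedHandler(service)
handler.handle("/about-us", "/company/about")  # simulates a content move
```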

The following diagram shows roughly how the internals of the URL Tracker are designed. For simplicity, I use 'Library' for all library packages and 'Functional' for all functional packages.

The code architecture model of the URL Tracker

Diving into the redirect flow of the URL Tracker, I identified the following extension points:

  1. Filters for incoming URLs to bypass the redirect middleware
  2. Preprocessors for early analysis on incoming URLs
  3. Matchers to match incoming requests
  4. Mappers to map cacheable versions of matched models to rich equivalents
  5. Redirect mappers for mapping the source strategy and target strategy to rich equivalents
  6. Handlers for applying the rich model to the incoming request and potentially generating a custom response.
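The flow through these extension points could be sketched like this. It is a simplified assumption of how the pieces might fit together, not the actual implementation: all type names and signatures are hypothetical, preprocessors are left out, and the two mapping steps (4 and 5) are collapsed into a single list of mappers for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Response:
    status: int
    location: Optional[str] = None


# Hypothetical signatures for the extension points.
Filter = Callable[[str], bool]             # True means: bypass the middleware
Matcher = Callable[[str], Optional[Dict]]  # returns a cacheable match model
Mapper = Callable[[Dict], Dict]            # cacheable model -> rich model
Handler = Callable[[Dict, str], Optional[Response]]


@dataclass
class RedirectPipeline:
    filters: List[Filter] = field(default_factory=list)
    matchers: List[Matcher] = field(default_factory=list)
    mappers: List[Mapper] = field(default_factory=list)
    handlers: List[Handler] = field(default_factory=list)

    def run(self, url: str) -> Optional[Response]:
        # 1. Filters: skip the whole flow for URLs we should not touch.
        if any(f(url) for f in self.filters):
            return None
        # 3. Matchers: the first matcher that returns a model wins.
        match = None
        for matcher in self.matchers:
            match = matcher(url)
            if match is not None:
                break
        if match is None:
            return None
        # 4/5. Mappers: enrich the cacheable model step by step.
        for mapper in self.mappers:
            match = mapper(match)
        # 6. Handlers: the first handler that produces a response wins.
        for handler in self.handlers:
            response = handler(match, url)
            if response is not None:
                return response
        return None


pipeline = RedirectPipeline(
    filters=[lambda url: url.startswith("/umbraco")],
    matchers=[lambda url: {"redirect_id": 1} if url == "/old" else None],
    mappers=[lambda match: {**match, "target": "/new"}],
    handlers=[lambda match, url: Response(301, match["target"])],
)
```

Here `pipeline.run("/old")` yields a 301 response pointing at `/new`, while a backoffice URL such as `/umbraco/login` is filtered out before any matcher runs.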

Though it's cool that all these parts are extendable, they add a lot of complexity to something that is supposed to be rather simple. I expect this flow to be the main focus for extension developers, so extending this flow must be straightforward. I'm still thinking of ways to reduce the complexity. These are some things that I'm considering:

I don't need the preprocessors. Especially in the new redirect model, preprocessing has become unnecessary. On top of that, preprocessing actually hurts performance, because it does work up front that may never be needed. Values should instead be calculated lazily by matchers and cached in some store.
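As a rough sketch of what "calculated lazily and cached in some store" could mean: the store below lives for the duration of one request, and a value is only computed the first time a matcher asks for it. `RequestValueStore`, `get_or_add` and `normalised_path` are hypothetical names for this illustration, and the normalisation rule is just an example.

```python
from typing import Any, Callable, Dict, List


class RequestValueStore:
    """Hypothetical per-request cache for values that matchers compute on demand."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}

    def get_or_add(self, key: str, factory: Callable[[], Any]) -> Any:
        # Only run the factory the first time the key is requested.
        if key not in self._values:
            self._values[key] = factory()
        return self._values[key]


computations: List[str] = []  # records how often the value is really computed


def normalised_path(store: RequestValueStore, url: str) -> str:
    def compute() -> str:
        computations.append(url)
        # Example rule: drop the query string, the trailing slash and casing.
        return url.split("?", 1)[0].rstrip("/").lower() or "/"

    return store.get_or_add("normalised_path", compute)


store = RequestValueStore()
first = normalised_path(store, "/Old-Page/?utm=1")   # computed here
second = normalised_path(store, "/Old-Page/?utm=1")  # served from the store
```

If no matcher ever asks for the normalised path, it is never computed at all, which is the advantage over a preprocessor that always runs.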

Not everything needs to be implemented. For a simple extension, it should be enough to implement a matcher and a handler. The URL Tracker should enable you to do just that and provide meaningful defaults for the rest.
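One way to picture this, again as a hypothetical Python sketch rather than the real API: an extension supplies a matcher and a handler, and everything else falls back to a default that does nothing special. The function name `make_extension` and its parameters are assumptions for the example.

```python
from typing import Callable, Dict, Optional, Tuple

Result = Tuple[int, str]  # (status code, target URL)


def make_extension(
    matcher: Callable[[str], Optional[Dict]],
    handler: Callable[[Dict, str], Optional[Result]],
    bypass: Optional[Callable[[str], bool]] = None,
    mapper: Optional[Callable[[Dict], Dict]] = None,
) -> Callable[[str], Optional[Result]]:
    # Defaults: never bypass, and pass the match model through unchanged.
    bypass = bypass or (lambda url: False)
    mapper = mapper or (lambda match: match)

    def process(url: str) -> Optional[Result]:
        if bypass(url):
            return None
        match = matcher(url)
        if match is None:
            return None
        return handler(mapper(match), url)

    return process


# A simple extension only provides the two required parts.
process = make_extension(
    matcher=lambda url: {"target": "/new"} if url == "/old" else None,
    handler=lambda match, url: (301, match["target"]),
)
```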

I feel like there is something to gain in the mapping phase. Besides redirects, the URL Tracker also provides matching for no-longer-exists responses. For that reason, I made a distinction between 'mapping cacheable match models' and 'mapping redirect strategies'. I worry that a developer might find this confusing. I'm wondering if perhaps I'm trying to do too much at once and whether I should focus on handling redirects, rather than 'everything'. Handling no-longer-exists responses could also stand on its own, after all.

Closing thoughts

I feel like the logic could still use some improvements, but it's a good start and definitely an improvement over previous versions. I'm very happy with the proposed data model and I see a lot of potential in it. Thank you for reading!