Question:

In DDD, where should validation logic for pagination queries reside?

For example, if the service layer receives a query for a collection with parameters that look like this (in Go, though feel free to answer in any language):

// in one file
package repositories

type Page struct {
	Limit  int
	Offset int
}

// Should Page, which is part of the repository
// layer, have validation behaviour?
func (p *Page) Validate() error {
	if p.Limit > 100 {
		// ...
	}
	return nil
}

// Collection stands for whatever the query returns; its definition is
// not shown in the question.
type Repository interface {
	GetCollection(p *Page) (Collection, error)
}

// in another file
package service

import "repositories"

type Service struct {
	repository repositories.Repository
}

// service layer
func (s *Service) GetCollection(p *repositories.Page) (repositories.Collection, error) {
	// delegate validation to the repository layer?
	// i.e - p.Validate()

	// or call some kind of validation logic in the service layer
	// i.e - validatePagination(p)

	return s.repository.GetCollection(p)
}

func validatePagination(p *repositories.Page) error {
	if p.Limit > 100 {
		// ...
	}
	return nil
}

and I want to enforce a "no Limit greater than 100" rule, does this rule belong in the Service layer or is it more of a Repository concern?

At first glance it seems like it should be enforced at the Repository layer, but on second thought, it's not necessarily an actual limitation of the repository itself. It's more of a rule driven by business constraints that belongs on the entity model. However, Page isn't really a domain entity either; it's more a property of the Repository layer.

To me, this kind of validation logic seems stuck somewhere between being a business rule and a repository concern. Where should the validation logic go?

Answer:

The red flag for me is the same one identified by @plalx. Specifically:

It's more of a rule driven by business constraints that belongs on the entity model

In all likelihood, one of two things is happening. The less likely of the two is that the business users are trying to define the technical application of the domain model. Every once in a while, you have a business user who knows enough about technology to try to interject these things, and they should be listened to, but as a concern and not as a requirement. Use cases should not define performance attributes, as those are acceptance criteria of the application itself.

That leads into the more likely scenario, in that the business user is describing pagination in terms of the user interface. Again, this is something that should be talked about. However, this is not a use case, as it applies to the domain. There is absolutely nothing wrong with limiting dataset sizes. What is important is how you limit those sizes. There is an obvious concern that too much data could be pulled back. For example, if your domain contains tens of thousands of products, you likely do not want all of those products returned.

To address this, you should also look at why you have a scenario that can return too much data in the first place. When looking at it purely from a repository's perspective, the repository is used simply as a CRUD factory. If your concern is what a developer could do with a repository, there are other ways to paginate large datasets without bleeding either a technological or application concern into the domain. If you can safely deduce that the aspect of pagination is something owned by the implementation of the application, there is absolutely nothing wrong with having the pagination code outside of the domain completely, in an application service. Let the application service perform the heavy lifting of understanding the application's requirement of pagination, interpreting those requirements, and then very specifically telling the domain what it wants.

Instead of having some sort of GetAll() method, consider having a GetById() method that takes an array of identifiers. Your application service performs a dedicated task of "searching" and determining what the application is expecting to see. The benefit may not be immediately apparent, but what do you do when you are searching through millions of records? If you want to consider using something like Lucene, Endeca, FAST, or similar, do you really need to crack open the domain for that? When, or if, you get to the point where you want to change out a technical detail and you find yourself having to actually touch your domain, to me, that is a rather large problem. When your domain starts to serve multiple applications, will all of those applications share the same application requirements?
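To make the "GetById() with an array of identifiers" idea concrete, here is a minimal Go sketch. It is not the answerer's code: `ProductRepository`, `ProductSearch`, `CatalogService`, and `inMemoryCatalog` are hypothetical names invented for illustration. The point it tries to show is that paging lives only in the application-level search, while the domain repository is asked for specific products.

```go
package main

import "strings"

// ProductID and Product are hypothetical domain types for this sketch.
type ProductID string

type Product struct {
	ID   ProductID
	Name string
}

// ProductRepository is the domain-facing port. It answers one specific
// question ("give me these products") and knows nothing about pages.
type ProductRepository interface {
	GetByIDs(ids []ProductID) ([]Product, error)
}

// ProductSearch is an application-level port. It could be backed by SQL,
// Lucene, or anything else without touching the domain.
type ProductSearch interface {
	FindIDs(query string, offset, limit int) ([]ProductID, error)
}

// CatalogService is the application service that owns searching and paging.
type CatalogService struct {
	search ProductSearch
	repo   ProductRepository
}

// ListProducts decides which products the application wants to see, then
// asks the domain for exactly those.
func (s *CatalogService) ListProducts(query string, offset, limit int) ([]Product, error) {
	ids, err := s.search.FindIDs(query, offset, limit)
	if err != nil {
		return nil, err
	}
	return s.repo.GetByIDs(ids)
}

// inMemoryCatalog is a toy implementation of both ports, for illustration.
type inMemoryCatalog struct{ products []Product }

func (c inMemoryCatalog) FindIDs(query string, offset, limit int) ([]ProductID, error) {
	var ids []ProductID
	for _, p := range c.products {
		if strings.Contains(p.Name, query) {
			ids = append(ids, p.ID)
		}
	}
	// Clamp the window so out-of-range paging yields an empty page.
	if offset < 0 {
		offset = 0
	}
	if offset > len(ids) {
		offset = len(ids)
	}
	if limit < 0 {
		limit = 0
	}
	if limit > len(ids)-offset {
		limit = len(ids) - offset
	}
	return ids[offset : offset+limit], nil
}

func (c inMemoryCatalog) GetByIDs(ids []ProductID) ([]Product, error) {
	byID := make(map[ProductID]Product, len(c.products))
	for _, p := range c.products {
		byID[p.ID] = p
	}
	out := make([]Product, 0, len(ids))
	for _, id := range ids {
		if p, ok := byID[id]; ok {
			out = append(out, p)
		}
	}
	return out, nil
}
```

Swapping the search for a dedicated search engine would then mean replacing only the `ProductSearch` implementation; the repository interface would stay as it is.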

The last point is the one that I find hits home the most. Several years back, I was in the same situation. Our domain had pagination inside of the repositories, because we had a business user who had enough sway and just enough technical knowledge to be dangerous. Despite the objections of the team, we were overruled (which is a discussion unto itself). Ultimately, we were forced to put pagination inside of the domain. The following year, we started to use the domain within the context of other applications inside the business. The actual business rules never changed, but the way that we searched did, depending on the application. That left us having to bring up another set of methods to accommodate, with the promise of reconciliation in the future.

That reconciliation came with the fourth application to use the domain, which was for an external third party to consume, when we finally conveyed the message that these continual changes in the domain could have been avoided by allowing the application to own its own requirements and providing a means to facilitate a specific question, such as "give me these specific products". The previous approach of "give me twenty products, sorted in this fashion, with a specific offset" in no way described the domain. Each application determined what "pagination" ultimately meant to itself and how it wanted to load those results: showing the top result, reversing the order in the middle of a paged set, and so on. Those concerns were all eliminated from the domain because they were moved nearer to their actual responsibilities, and we empowered the application while still protecting the domain. We used the service layer as a delineation for what is considered "safe". Since the service layer acted as a go-between for the domain and the application, we could reject a request at the service level if, for example, the application requested more than one hundred results. This way, the application could not just do whatever it pleased, and the domain was left gleefully oblivious to the technical limitation being applied to the call being made.
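Rejecting a request at the service level could look roughly like the Go sketch below, which reuses the `Page` shape from the question. `MaxLimit`, `ErrInvalidPage`, and `countingRepo` are assumptions made for this example, and the collection is simplified to `[]string`.

```go
package main

import (
	"errors"
	"fmt"
)

// Page mirrors the struct from the question.
type Page struct {
	Limit  int
	Offset int
}

// MaxLimit is the service-level cap (100, as in the question).
const MaxLimit = 100

// ErrInvalidPage is a hypothetical sentinel error for rejected requests.
var ErrInvalidPage = errors.New("invalid pagination")

// validatePagination lives in the service layer, not in the repository.
func validatePagination(p Page) error {
	switch {
	case p.Limit < 1:
		return fmt.Errorf("%w: limit must be at least 1, got %d", ErrInvalidPage, p.Limit)
	case p.Limit > MaxLimit:
		return fmt.Errorf("%w: limit must not exceed %d, got %d", ErrInvalidPage, MaxLimit, p.Limit)
	case p.Offset < 0:
		return fmt.Errorf("%w: offset must not be negative, got %d", ErrInvalidPage, p.Offset)
	}
	return nil
}

// Repository stays oblivious to the limit.
type Repository interface {
	GetCollection(p Page) ([]string, error)
}

type Service struct {
	repository Repository
}

// GetCollection rejects unsafe requests before they reach the repository.
func (s *Service) GetCollection(p Page) ([]string, error) {
	if err := validatePagination(p); err != nil {
		return nil, err
	}
	return s.repository.GetCollection(p)
}

// countingRepo is a toy repository that records how often it is called.
type countingRepo struct{ calls int }

func (r *countingRepo) GetCollection(p Page) ([]string, error) {
	r.calls++
	return make([]string, p.Limit), nil
}
```

With this shape, a request for 101 items fails in `Service.GetCollection` and the repository is never called, while a script that uses the repository directly is not subject to the cap.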

Answer:

"It's more of a rule driven by business constraints that belongs on the entity model"

These kinds of rules generally aren't business rules; they are simply put in place (most likely by developers, without the involvement of business experts) because of technical system limitations (e.g. to guarantee the system's stability). They usually find their natural home in the Application layer, but could be placed elsewhere if it's more practical to do so.

On the other hand, if business experts are interested in the resource/cost factor and decide to market this so that customers may pay more to view more at a time, then that would become a business rule: something the business really cares about.

In that case the rule checking would certainly not go in the Repository because the business rules would get buried in the infrastructure layer. Not only that but the Repository is very low-level and may be used in automated scripts or other processes where you would not want these limitations to apply.
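If the limit did become a business rule of that kind, it could be expressed in the domain model rather than in the repository. The Go sketch below is only an illustration under assumed names and figures: `Plan`, `Customer`, and the 100/500 page sizes are invented here and do not come from the answer.

```go
package main

import "fmt"

// Plan is a hypothetical subscription tier that the business sells.
type Plan int

const (
	Free Plan = iota
	Premium
)

// MaxPageSize is the business rule: paying customers may view more items
// at a time. The figures are made up for this sketch.
func (p Plan) MaxPageSize() int {
	if p == Premium {
		return 500
	}
	return 100
}

// Customer is a hypothetical domain entity that carries the plan.
type Customer struct {
	Name string
	Plan Plan
}

// CheckPageSize reports whether the customer may request this many items.
func (c Customer) CheckPageSize(requested int) error {
	if requested < 1 {
		return fmt.Errorf("page size must be at least 1, got %d", requested)
	}
	if limit := c.Plan.MaxPageSize(); requested > limit {
		return fmt.Errorf("%s's plan allows at most %d items per page, requested %d",
			c.Name, limit, requested)
	}
	return nil
}
```

Because the check hangs off the customer's plan, automated scripts that go straight to the repository are not affected by it.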

Actually, I usually apply some CQRS principles and avoid going through repositories entirely for queries, but that's another story.

Answer:

At first glance it seems like it should be enforced at the Repository layer, but on second thought, it's not necessarily an actual limitation of the repository itself. It's more of a rule driven by business constraints that belongs on the entity model.

Actually, repositories are still domain. They're mediators between the domain and the data mapping layer, so you should still consider them part of the domain.

Therefore, a repository interface implementation should enforce domain rules.

In summary, I would ask myself: do I want to allow non-paginated access to the data abstracted by the repository from any domain operation? The answer should probably be no, because such a domain might own thousands of domain objects, and trying to get too many domain objects at once would be a suboptimal retrieval, wouldn't it?

Suggestion

* Since I don't know which language the OP is using, and I find that the programming language doesn't matter for this Q&A, I'll explain a possible approach using C# and the OP can translate it to any programming language.

For me, enforcing a "no more than 100 results per query" rule should be a cross-cutting concern. Contrary to what @plalx said in his answer, I really believe that something that can be expressed in code is the way to go, and that it's not only an optimization concern but a rule enforced across the entire solution.

Based on what I've said above, I would design a Repository abstract class to provide some common behaviors and rules across the entire solution:

public interface IRepository<T>
{
      IList<T> List(int skip = 0, int take = 0);
      // Other method definitions like Add, Remove, GetById...
}

public abstract class Repository<T> : IRepository<T>
{
      // Shared guard: rejects requests for more than 100 objects.
      protected virtual void EnsureValidPagination(int skip = 0, int take = 0)
      {
           if(take > 100)
           {
                 // ArgumentOutOfRangeException takes (paramName, message) in this order
                 throw new ArgumentOutOfRangeException("take", "Cannot take more than 100 objects at once");
           }
      }

      public IList<T> List(int skip = 0, int take = 0)
      {
           EnsureValidPagination(skip, take);

           // DoList is not generic itself; T comes from the class
           return DoList(skip, take);
      }

      protected abstract IList<T> DoList(int skip = 0, int take = 0);

      // Other methods like Add, Remove, GetById...
}

Now you would be able to call EnsureValidPagination in any implementation of IRepository<T> that also inherits from Repository<T>, whenever you implement an operation which involves returning object collections.

If you need to enforce such a rule on some specific domain only, you could just design another abstract class deriving from one like I've described above, and introduce the whole rule there.

In my case, I always implement a solution-wide repository base class and I specialize it on each domain if needed, and I use it as base class to specific domain repository implementations.
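Since the question is written in Go, which has no abstract base classes, one way to get a comparable solution-wide guard is a generic wrapper around any listing repository. This is only a sketch of that translation, not part of the original answer; `Lister`, `WithPaginationGuard`, `SolutionMaxTake`, and `sliceLister` are hypothetical names, and generics require Go 1.18 or later.

```go
package main

import "fmt"

// SolutionMaxTake is the solution-wide cap (100, as in the C# example).
const SolutionMaxTake = 100

// Lister is the part of a repository that returns collections.
type Lister[T any] interface {
	List(skip, take int) ([]T, error)
}

// guardedLister plays the role of the abstract base class: it validates
// pagination and then delegates to the wrapped implementation.
type guardedLister[T any] struct {
	inner Lister[T]
}

// WithPaginationGuard wraps any Lister with the shared pagination check.
func WithPaginationGuard[T any](inner Lister[T]) Lister[T] {
	return guardedLister[T]{inner: inner}
}

func (g guardedLister[T]) List(skip, take int) ([]T, error) {
	if skip < 0 || take < 1 {
		return nil, fmt.Errorf("invalid pagination: skip=%d take=%d", skip, take)
	}
	if take > SolutionMaxTake {
		return nil, fmt.Errorf("cannot take more than %d objects at once, got %d",
			SolutionMaxTake, take)
	}
	return g.inner.List(skip, take)
}

// sliceLister is a toy in-memory repository used for illustration.
type sliceLister[T any] struct{ items []T }

func (s sliceLister[T]) List(skip, take int) ([]T, error) {
	// Clamp the requested window to the available items.
	if skip < 0 {
		skip = 0
	}
	if skip > len(s.items) {
		skip = len(s.items)
	}
	if take < 0 {
		take = 0
	}
	if take > len(s.items)-skip {
		take = len(s.items) - skip
	}
	return s.items[skip : skip+take], nil
}
```

A domain-specific variant could wrap the guarded lister again with a stricter check, which mirrors the idea of specializing the base class per domain.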

Responding to @guillaume31's comment/concern on his answer

I agree that it isn't a domain-specific rule. But Application and Presentation aren't domain either. Repository is probably a bit too sweeping and low-level for me -- what if a command line data utility wants to fetch a vast number of items and still use the same domain and persistence layers as the other applications?

Imagine you've defined a repository interface as follows:

public interface IProductRepository
{
      IList<Product> List(int skip = 0, int take = 0);
}

An interface wouldn't define a limitation on how many products I can take at once, but see the following implementation of IProductRepository:

public class ProductRepository : IProductRepository
{
     public ProductRepository(int defaultMaxListingResults = -1)
     {
          DefaultMaxListingResults = defaultMaxListingResults;
     }

     // A negative value (the default) is assumed to mean "no limit".
     private int DefaultMaxListingResults { get; }

     private void EnsureListingArguments(int skip = 0, int take = 0)
     {
          if(DefaultMaxListingResults >= 0 && take > DefaultMaxListingResults)
          {
               throw new InvalidOperationException($"This repository can't take more results than {DefaultMaxListingResults} at once");
          }
     }

     public IList<Product> List(int skip = 0, int take = 0)
     {
            EnsureListingArguments(skip, take);

            // The actual data access is not shown in the original answer.
            throw new NotImplementedException();
     }
}

Who said we need to hardcode the maximum number of results that can be taken at once? If the same domain is consumed by different application layers, I don't see why you wouldn't be able to inject different constructor parameters depending on the particular requirements of these application layers.

I can see the same service layer injecting exactly the same repository implementation with different configurations, depending on the consumer of the whole domain.
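Translated to Go, the same repository type configured differently per consumer might look like the sketch below. The names (`NoLimit`, `NewProductRepository`) and the in-memory storage are assumptions made for illustration; the C# answer above does not show its data access.

```go
package main

import "fmt"

// NoLimit disables the cap, e.g. for a command-line data utility.
const NoLimit = -1

// Product is a hypothetical domain type.
type Product struct{ Name string }

// ProductRepository keeps its products in memory for this sketch.
type ProductRepository struct {
	products []Product
	maxTake  int
}

// NewProductRepository receives the cap from whoever composes the
// application, so each consumer can choose its own maximum.
func NewProductRepository(products []Product, maxTake int) *ProductRepository {
	return &ProductRepository{products: products, maxTake: maxTake}
}

func (r *ProductRepository) List(skip, take int) ([]Product, error) {
	if skip < 0 || take < 0 {
		return nil, fmt.Errorf("invalid pagination: skip=%d take=%d", skip, take)
	}
	if r.maxTake != NoLimit && take > r.maxTake {
		return nil, fmt.Errorf("this repository can't take more than %d results at once", r.maxTake)
	}
	// Clamp the window to the available products.
	if skip > len(r.products) {
		skip = len(r.products)
	}
	if take > len(r.products)-skip {
		take = len(r.products) - skip
	}
	return r.products[skip : skip+take], nil
}
```

A web application could then be wired with `NewProductRepository(products, 100)` and a batch tool with `NewProductRepository(products, NoLimit)`, both sharing the same domain and persistence code.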

Not a technical requirement at all

I want to add my two cents on the consensus reached by other answerers, who I believe are only partially right.

The consensus is that a limitation like the one required by the OP is a technical requirement rather than a business requirement.

BTW, it seems like no one has focused on the fact that domains can talk to each other. That is, you don't design your domain and other layers to support only the more traditional execution flow: data <-> data mapping <-> repository <-> service layer <-> application service <-> presentation (this is just a sample flow; there might be variants of it).

The domain should be bulletproof in all possible scenarios or use cases in which it'll be consumed or interacted with. Hence, you should consider the following scenario: domain interactions.

We should be less philosophical and more ready to look at real-world scenarios, where the rule can apply in two ways:

  1. The entire project isn't allowed to take more than 100 domain objects at once.
  2. One or more domains aren't allowed to take more than 100 domain objects at once.

Some argue that we're talking about a technical requirement, but for me it is a domain rule because it also enforces good coding practices. Why? Because I really think that, at the end of the day, there's no case where you would want to get an entire domain object collection: pagination has many flavors, and one of them is infinite-scroll pagination, which can also be applied to command-line interfaces to simulate the feel of a get-all operation. So, force your entire solution to do things right and avoid get-all operations; the domain itself will probably be implemented differently than it would be with no pagination limitation.

BTW, you should consider the following strategy: the domain enforces that you can't retrieve more than 100 domain objects, but any other layer on top of it can also define a limit lower than 100, for example: you can't get more than 50 domain objects at once, otherwise the UI would suffer performance issues. This won't break the domain rule, because the domain won't complain if you artificially limit what you can get within the range of its rule.
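The layering described above (a hard domain ceiling, with higher layers free to be stricter but never looser) can be sketched in Go as follows. `DomainMaxTake`, `LayerLimit`, and its methods are hypothetical names for this illustration.

```go
package main

import "fmt"

// DomainMaxTake is the ceiling enforced by the domain (100, as above).
const DomainMaxTake = 100

// LayerLimit is a limit chosen by a layer on top of the domain. It can be
// lower than the domain ceiling but never higher.
type LayerLimit struct {
	limit int
}

// NewLayerLimit rejects limits that would break the domain rule.
func NewLayerLimit(limit int) (LayerLimit, error) {
	if limit < 1 || limit > DomainMaxTake {
		return LayerLimit{}, fmt.Errorf("layer limit must be between 1 and %d, got %d",
			DomainMaxTake, limit)
	}
	return LayerLimit{limit: limit}, nil
}

// Check reports whether a request for take objects is allowed in this layer.
func (l LayerLimit) Check(take int) error {
	if take < 1 || take > l.limit {
		return fmt.Errorf("this layer allows between 1 and %d objects at once, got %d",
			l.limit, take)
	}
	return nil
}
```

For example, a UI layer could build its limit with `NewLayerLimit(50)`, while an attempt to configure 150 would fail at construction time because it exceeds the domain ceiling.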

Answer:

Probably in the Application layer, or even Presentation.

Choose Application if you want that rule to hold true for all front ends (web, mobile app, etc.), Presentation if the limit has to do with how much a specific device is able to display on screen at a time.

[Edit for clarification]

Judging by the other answers and comments, we're really talking about defensive programming to protect performance.

It cannot be in the Domain layer IMO because it's a programmer-to-programmer thing, not a domain requirement. When you talk to your railway domain expert, do they bring up or care about a maximum number of trains that can be taken out of any set of trains at a time? Probably not. It's not in the Ubiquitous Language.

The Infrastructure layer (Repository implementation) is an option but, as I said, I find it inconvenient and overly restrictive to control things at such a low level. Matías's proposed implementation of a parameterized Repository is admittedly an elegant solution though, because each application can specify its own maximum, so why not - if you really want to apply a broad sweeping limit on XRepository.GetAll() to a whole application space.
