
First of all, we were talking about validation of data in the database, specifically.

> 'validation' generally implies aspects which are inherently application specific

Not at all. Taking this at face value implies that some app can write data to the database that is valid according to that app, and then another app can read data that is invalid from its perspective, and have to deal with it. That doesn't make sense - data is data, it's either valid, or it's not. That's why the schema is about the data, not about the app.

> Validation in almost every case must be done on the app layer

For UX reasons, mostly, yes. But it's usually much more basic than what e.g. triggers would do in the DB itself.

I'm not saying that there's nothing to validate outside of the DB, either. But for the data that is in the DB, the DB itself can usually do a better job.



The general idea I've seen (for both SQL and NoSQL databases) is that two apps should never write to the same database, to ensure separation of concerns. Some API layer handles all write operations instead.

Disclaimer: MongoDB employee. All opinions are my own.


That's generally a good practice (though not always, many people do blue-green deployments, for instance), but it's rarely an assumption you want to make. Lots of deployment blunders can happen that render assumptions like that incorrect.

Even if your team executes perfectly and never runs into this, the biggest problem IMO is that you can't really enforce most of your guarantees w/ any degree of confidence w/o a typed schema. Even if you work within a typed language that perfectly validates all the invariants of your application before storing anything, the second you need to perform work that does not strictly funnel data through your application (e.g. an update query), you are effectively gambling on whether or not those invariants will hold. Funneling everything through the application as a "read-modify-write" flow doesn't perform well (or even stay valid) for a lot of common use cases, so in reality you need your database to ensure these things for you.
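As a rough sketch of the update-query case: the table, the `balance >= 0` invariant and the `charge_everyone` helper below are all made up for illustration, and SQLite (via Python's `sqlite3`) stands in for whatever database you actually run. The point is only that a bulk UPDATE never passes through per-row application code, so the CHECK constraint is the thing that catches it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- hypothetical invariant
    )
    """
)
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 30)")
conn.commit()


def charge_everyone(conn, fee):
    """Bulk UPDATE that bypasses any per-row validation in application code."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ?", (fee,))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the statement; no row was changed.
        return False


ok_small = charge_everyone(conn, 10)  # both rows stay non-negative
ok_large = charge_everyone(conn, 50)  # row 2 would go negative, so it is rejected
balances = [row[0] for row in conn.execute("SELECT balance FROM accounts ORDER BY id")]
```

Without the CHECK, the second call would silently leave one account negative, and nothing in the application would have had a chance to object.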

Also the two deployed apps problem is just a special case of two people interacting w/ a database who aren't working under the same assumptions as to what invariants should hold. That can happen in single code bases, even with a lot of care taken.


You only need a separate API layer if the database can't enforce constraints properly.

The database schema (along with stored procedures, views etc.) is an API and database engines are designed to have multiple concurrent writers. Multiple applications and users needing access to the same data is largely why databases exist in the first place.


It doesn't matter in this case - all that matters is that they share the same data store, regardless of how concurrent access to it is organized. The problem here isn't concurrency, but the semantics of data stored - if one app can change it such that the other app can later retrieve data that it considers invalid according to its business constraints, what is the other app supposed to do?


"data is data, it's either valid, or it's not. That's why the schema is about the data, not about the app."

This is not true.

The objective of the overall app/system (i.e. front/back/middle/DB/storage/services etc.) is to carry out some kind of business logic. A DB schema cannot fully validate stored data against the logic.

Otherwise we wouldn't write backend code, we'd just write a bunch of schemas and be done with it.

Let's use a crude example: a password. (Of course, we would never in reality store a password as a string in the clear, but just as an example ...). When a user sets a new password, we have to validate that it meets specific requirements in terms of format, and then some other rules which are more complicated, such as: "can't be the same password as the last 5".

Those 'password rules', for example, cannot be encapsulated in the schema of the DB and yet must be applied in order for the data to be 'valid' from the perspective of the app, or 'overall system'.

The DB may only care that it's UTF and max 20 chars. But the system requires more validation than that.
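To make that concrete, here is a minimal sketch of what such app-layer checks might look like. Everything in it is an assumption for illustration: the function names, the minimum length of 8 and the "needs a digit" rule are invented, and the unsalted SHA-256 digest is only a stand-in (a real system would use a salted, slow hash and compare against history differently).

```python
import hashlib

MAX_LEN = 20        # the length limit the DB column might also enforce
MIN_LEN = 8         # hypothetical format rule
HISTORY_DEPTH = 5   # "can't be the same password as the last 5"


def digest(password):
    # Stand-in only: unsalted SHA-256 is not suitable for real password storage.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def password_problems(candidate, previous_digests):
    """Return a list of rule violations; an empty list means the password is accepted.

    previous_digests is ordered oldest to newest.
    """
    problems = []
    if len(candidate) > MAX_LEN:
        problems.append("too long")
    if len(candidate) < MIN_LEN:
        problems.append("too short")
    if not any(c.isdigit() for c in candidate):
        problems.append("needs a digit")
    if digest(candidate) in previous_digests[-HISTORY_DEPTH:]:
        problems.append("matches one of the last 5 passwords")
    return problems


# Seven earlier passwords, oldest first.
history = [digest("old-pass-%d" % i) for i in range(7)]
recent_reuse = password_problems("old-pass-6", history)
old_reuse = password_problems("old-pass-0", history)
weak = password_problems("short", history)
```

Only the first check maps onto a plain column definition; the history rule needs data from other rows, which is the part the column type alone does not cover.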

Re: Your statement about 'one app writing data, and the other app not knowing what to do with it'. This is not true, because all apps operating on such data must understand the data in the business-logic context in which it was designed. Even 3rd party users of such data, via APIs, must understand this data at the level of business logic - not merely 'schema validation'.

When you query data from Google Geolocation, the 'city' field may be a valid string of a certain length, but that's not very useful: it must actually be the name of a city! Any 'app' using this data must operate with the explicit understanding that this is in fact the name of a city - and not just a string that met a DB schema validation requirement.


>A DB schema cannot fully validate stored data against the logic.

Postgres actually lets you run triggers and similar that can validate data arbitrarily. You can even do web requests with the right extension.
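A hedged sketch of the trigger idea, applied to the "last 5 passwords" rule from upthread. Note the assumptions: this uses SQLite through Python's `sqlite3` as a stand-in so it runs anywhere (in Postgres you would write a trigger function in PL/pgSQL instead), and the `password_history` table, the `no_recent_reuse` trigger and the `record_password` helper are invented names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE password_history (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL,
        digest  TEXT    NOT NULL
    );

    -- Reject an insert whose digest matches one of the user's 5 newest rows.
    CREATE TRIGGER no_recent_reuse
    BEFORE INSERT ON password_history
    FOR EACH ROW
    BEGIN
        SELECT RAISE(ABORT, 'password reused')
        WHERE NEW.digest IN (
            SELECT digest FROM password_history
            WHERE user_id = NEW.user_id
            ORDER BY id DESC
            LIMIT 5
        );
    END;
    """
)


def record_password(conn, user_id, digest):
    """Try to store a new password digest; the trigger decides whether it is allowed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO password_history (user_id, digest) VALUES (?, ?)",
                (user_id, digest),
            )
        return True
    except sqlite3.DatabaseError:
        return False


first_six = [record_password(conn, 1, "digest-%d" % i) for i in range(6)]
reuse_recent = record_password(conn, 1, "digest-5")  # among the last 5: rejected
reuse_old = record_password(conn, 1, "digest-0")     # older than the last 5: accepted
other_user = record_password(conn, 2, "digest-5")    # different user: accepted
```

The rule now holds for every writer, including ad hoc SQL sessions, which is the property the trigger approach is being argued for here.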

If that is not enough, you can run Python code in your database instead and do the same thing with a slightly more powerful language for general purpose computation.

You could write the entire logic of any business app in a PG database and only use the app as a shiny view layer.


> The DB may only care that it's UTF and max 20 chars. But the system requires more validation than that.

Are you familiar with SQL constraints, triggers, user-defined functions, stored procedures?..


Yes, and Postgres has domains, which are very nice, especially if you use only functions for data inserts (which I do). This gives you more granularity than a domain on a column. Domains are like dependent types, offering very fine-grained control, enforced by regexes, functions, enums; even lookup functions are OK so long as the lookup tables are stable.


lmao, “my schema is my app layer”.



