
There is nothing wrong with using sequential ids in and of themselves.

The typical web app has the concept of a validated user session per request. How hard is it really to

  SELECT ... FROM Documents WHERE documentid = ? AND userid = ?

So even if the user does a

  GET /Document/{id+1}

no documents would be returned.

Every web framework that I am aware of lets you add one piece of middleware that validates a user session and won't even route the request if the user isn't validated.
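For what it's worth, here is a minimal sketch of that scoped lookup using Python's built-in sqlite3 module. The table layout, the `body` column, and the `get_document` helper are assumptions made up for illustration, not anything from a real app.

```python
import sqlite3

def get_document(conn, document_id, user_id):
    """Return the document body only if it belongs to user_id, else None."""
    row = conn.execute(
        "SELECT body FROM documents WHERE documentid = ? AND userid = ?",
        (document_id, user_id),
    ).fetchone()
    return row[0] if row else None

# Hypothetical schema and rows, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (documentid INTEGER PRIMARY KEY, userid INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(1, 10, "alice's doc"), (2, 20, "bob's doc")],
)

print(get_document(conn, 1, 10))  # user 10 owns document 1
print(get_document(conn, 2, 10))  # user 10 asks for id+1, which is not theirs
```

The second call returns nothing because the user id comes from the validated session, not from the URL.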



No, nothing wrong with it intrinsically. But if UUIDs were used instead, the lack of authentication or authorization checks wouldn't be as catastrophic. That would be somewhat comparable to having a reset password token which doesn't expire. Still bad, but not as bad.

The other commenter's point about leaking information is also correct. In the finance industry one of the basic tricks to obtaining alternative data is to scrape it from private APIs which expose sequential IDs corresponding to a source of revenue. For example, a publicly traded car company might have its revenue extrapolated from an open API which sequentially increments an ID every time a vehicle is sold. Research groups will reverse engineer mobile apps from companies with only one or two dimensions of revenue, find the private API endpoints (reversing request signing as needed), and then look for object IDs which can be thrown into a timeseries on a quarterly basis.
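As a rough sketch of that last step: if you record the highest ID you can observe at the end of each quarter, the differences approximate units per quarter. The function name and the sample numbers are invented for illustration, and it assumes IDs increase by exactly one per sale.

```python
def units_per_period(samples):
    """samples: chronological list of (label, highest_id_seen).

    Returns (label, ids_issued_since_previous_sample) for each later sample.
    """
    return [
        (later_label, later_id - earlier_id)
        for (_, earlier_id), (later_label, later_id) in zip(samples, samples[1:])
    ]

# Made-up observations, not real data.
observed = [("Q1", 1000), ("Q2", 1450), ("Q3", 2100)]
print(units_per_period(observed))
```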

Generally speaking the risk and compliance department of a hedge fund disallows this kind of data if it's gathered from an actual security vulnerability (e.g. leaks PII). It needs to be "only" a neutral information side channel without sensitive data, so that doesn't really apply in this specific scenario. But it does apply for people considering using integer IDs for user-facing APIs.


Having done a few assessments in the last year where I was forced to downgrade sev:hi findings because nobody is realistically going to guess a 128 bit random number, I have to grudgingly acknowledge that UUID object keys are a meaningful security improvement. Which I hate to admit, because I'm generally of the opinion that "defense in depth" is a design cop-out, and here's a pretty potent counterexample.


I agree with you. Let me emphasize this explicitly: the real failure here is the utter lack of authn and authz. But it is meaningful that the integer IDs are being used.


One reason I <3 HN is that complex scenarios like this get described so clearly and succinctly.

I couldn't say it better myself when I'm speaking to management that makes these kinds of decisions. Now I can quote throwawaymath verbatim to drive the detailed point home.

Thanks!


> I agree with you. Let me emphasize this explicitly: the real failure here is the utter lack of authn and authz.

Bingo.


Nice. This reminds me of the German tank problem in WWII, where the Allies used serial numbers sampled from captured Nazi tanks to estimate how many had been built. The tanks and their parts used sequential serial numbers. I guess the same data could also be used to estimate production rates.

The idea predates web APIs by many decades :-)

See https://en.m.wikipedia.org/wiki/German_tank_problem
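For the curious, the usual frequentist estimate is N ≈ m + m/k - 1, where m is the largest serial seen and k is the number of samples. A small sketch, assuming serials start at 1 and the samples are drawn uniformly without replacement (the function name is just for illustration):

```python
def estimate_population(serials):
    """Estimate the total count from a sample of sequential serial numbers.

    Uses m + m/k - 1, with m the largest observed serial and k the sample size.
    """
    if not serials:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    k = len(serials)
    return m + m / k - 1

# Four observed serials, largest is 60: 60 + 60/4 - 1 = 74
print(estimate_population([19, 40, 42, 60]))
```

The same arithmetic works on a handful of object IDs pulled from an API.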


That's how you can get a self-referencing tweet as well. https://twitter.com/spoonhenge/status/2878871344


Maybe not "wrong", but there are some very obvious downsides to exposing sequential IDs vs a randomized token:

- It exposes the count you have of a particular item

- It exposes your growth rate of those items

- If a developer accidentally breaks your authentication (or somebody hacks it), it becomes trivially easy to download all your items very quickly

And it isn't like using a randomized token is hard. In the most common implementation, it is just one additional column that gets filled with a random string and an index on the column.
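A minimal sketch of that extra column, using Python's sqlite3 and secrets modules. The `items` table, the `public_token` column, and the helper names are assumptions for illustration; the integer primary key stays internal and only the token is exposed.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "  # internal only, never exposed
    "public_token TEXT NOT NULL, "
    "name TEXT)"
)
# The index that makes lookups by token fast and enforces uniqueness.
conn.execute("CREATE UNIQUE INDEX idx_items_public_token ON items (public_token)")

def create_item(conn, name):
    """Insert an item and return the random token to use in URLs."""
    token = secrets.token_urlsafe(16)  # 16 random bytes = 128 bits
    conn.execute(
        "INSERT INTO items (public_token, name) VALUES (?, ?)", (token, name)
    )
    return token

def find_item(conn, token):
    """Look an item up by its public token; returns the name or None."""
    row = conn.execute(
        "SELECT name FROM items WHERE public_token = ?", (token,)
    ).fetchone()
    return row[0] if row else None

token = create_item(conn, "widget")
print(find_item(conn, token))  # found via the token
print(find_item(conn, "1"))    # guessing the sequential id gets nothing
```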


In that simple scenario, what are some ways that a hacker could break your front end API to allow it to serve requests for multiple users without having access to multiple account logins? I understand that they could possibly get access to your database but that's a different threat.

If they could somehow change your code, all hope is already lost.

But I do agree that it does allow someone to determine rate of growth, which would be valuable more from a business intelligence side than as a privacy violation.

The larger issue is that a developer forgets to add the "and userid = ?"

I guess the workaround for that is to have a database that ties user authentication to records in the table/object store directly, like DynamoDB or S3.


In my experience, many tables don't have a userid on the table that would be associated with the user. It would be a table join or two or three away.

So the developer may think it is safe to say

  select value from stock_position
    left join account on account.id = stock_position.accountid
    left join user_accounts on user_accounts.accountid = account.id
    left join users on user_accounts.userid = users.id
    where users.id = session.userid

Safe right? We checked userid. But then clicking on the position to drill in on the position data, they just select * from stock_position where stock_position.id = params.stock_id... there's no "and stock_position.userid" on that table, and the developer might be too lazy to spin up the entire join again especially if you don't need account data for this view. Whoops, suddenly a vulnerable page query.
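One hedged way to avoid that: keep the ownership join in the drill-in query too, so the position id alone is never enough. This sketch uses Python's sqlite3; the `accountid` and `symbol` columns on stock_position and the `get_position` helper are assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: positions belong to accounts, accounts belong to users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_accounts (userid INTEGER, accountid INTEGER);
    CREATE TABLE stock_position (id INTEGER PRIMARY KEY, accountid INTEGER, symbol TEXT);
    INSERT INTO user_accounts VALUES (1, 100), (2, 200);
    INSERT INTO stock_position VALUES (7, 100, 'AAA'), (8, 200, 'BBB');
""")

def get_position(conn, position_id, session_user_id):
    """Return (id, symbol) only if the position sits in one of the user's accounts."""
    return conn.execute(
        """
        SELECT sp.id, sp.symbol
        FROM stock_position AS sp
        JOIN user_accounts AS ua ON ua.accountid = sp.accountid
        WHERE sp.id = ? AND ua.userid = ?
        """,
        (position_id, session_user_id),
    ).fetchone()

print(get_position(conn, 7, 1))  # user 1's own position
print(get_position(conn, 8, 1))  # user 2's position, requested by user 1
```

Only one join is needed here, not the whole chain, because the check only has to connect the position to the session user.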

I imagine there are other ways to screw up. Like insecure cookies, and just checking cookie.userid, ah yes, you're the right user. Whoops, didn't realize cookies could be spoofed.


If the cookie is spoofed and someone got another client's authorization token, then they would get any documents that user was authorized to see anyway.

But you don’t do cookie.userid.

You send the username and password to an authentication service which generates a token with a checksum. The token, along with the username and permissions, is cached in something like Redis.

On each request, middleware gets the user information back using the token.
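Roughly, and only as a sketch: the dict below stands in for Redis, the token is an opaque random string (the checksum is left out for brevity), and the request is just a dict. All names here are made up for illustration and don't correspond to any particular framework.

```python
import secrets

SESSIONS = {}  # stand-in for a Redis cache: token -> user info

def login(username, password, check_password):
    """Issue a random session token if the credentials check out, else None."""
    if not check_password(username, password):
        return None
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = {"username": username, "permissions": ["read"]}
    return token

def require_session(handler):
    """Middleware: resolve the token to a user before the handler ever runs."""
    def wrapped(request):
        session = SESSIONS.get(request.get("token"))
        if session is None:
            return {"status": 401}  # never reaches the handler
        return handler(request, session)
    return wrapped

@require_session
def show_profile(request, session):
    # The user identity comes from the server-side cache, not from the client.
    return {"status": 200, "user": session["username"]}

token = login("alice", "pw", lambda u, p: p == "pw")
print(show_profile({"token": token}))
print(show_profile({"token": "forged"}))
```

The client only ever holds the token, so there is no cookie.userid to tamper with.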


I'm familiar with that process. I was trying to paint a picture of how a poor developer might stumble their way into this situation. It's technically possible to store the userid in the cookie rather than using JWTs, but obviously it's not secure in the slightest.



