Hacker News

It always amazes me how people are so free to judge others' situations.

It's great that your web app only costs $10/month, but others may have web apps that are more computationally intensive, e.g. video processing or ML inference, or that simply can't join everything they need at runtime.

And it's great that you're willing to deny those 50k users a day access to your service when that cheap VPS inevitably falls over. But others may be monetising that traffic and will want a HA solution so their revenue isn't impacted.

All of those add complexity and cost to an architecture.



TekMoi is right. I have an Alexa top-6k site which is vastly more complicated (media hosting, load balancing, multiple VMs, DDoS protection, transactional emails, automated backups) and which costs $200/month on AWS.

The fact that this person is spending nearly as much to support 50k users a day as I do to support more than 4 million cannot be hand-waved away by "people are so free to judge other's[sic] situations". The matter is worsened by the fact that the application is so simple that it doesn't even support user accounts. There is room for discussion here about efficiency in application architecture. More importantly, an article billing itself as "Costs of running a Python webapp for 55k monthly users" is silly because there is no way this is representative of anything. I'm afraid new hackers will be scared by the high costs listed here and be discouraged in their own efforts.


If you support 4 million monthly visitors on a media site, and have multiple EC2 instances running, I'd love to see a cost breakdown structure, because in my (obviously incomplete and possibly naive) calculations, the bandwidth alone would cost more than $200/mo.


CloudFlare covers the media bandwidth costs for a mere $20 / month. The uploading and media conversion is the difficult and costly (in terms of CPU) portion.


>CloudFlare covers the media bandwidth costs for a mere $20 / month.

Your media files must be extremely small.

I'm guessing they're less than 20MB on average, because a) CF hasn't shown you the door yet b) they don't even cache anything bigger than half a gigabyte.


Also to be clear I'm talking about image uploads, not video. Still more complicated than this app.


Similar situation (sub $1k/month) to the setup I'm doing for a startup in Indonesia (top 7k in Alexa, top 150 in Indonesia), except we have to pay extra for media because we need to process the logs (long-tail pdf/data/docx hits), and we need to control the DNS and how other domains route to us, so we can't use Cloudflare. And there's still plenty of room to cost-optimize.

Another consultant who came in and tried to do this made our costs go up by 10x per month… so I'm not surprised when I see stuff like this here…

Knowledge of one's tools, available at one's fingertips, and of their relative costs seems to make the difference for these things.


If $171/month is discouraging, let me reassure them.

There’s statistically a 95-99%+ chance you’ll never get to 55k monthly users with your app so don’t worry!


Let's leave this kind of snark out of this community. It's comments like this that slippery slope a community from helpful to harmful. You see it each time a reddit community gets too large.


I don't think it's snark. It's: don't worry about optimization too soon.


No, what's harmful is "oh, just spend $50,000 on managed Kubernetes to run a Django web app". That costs real time and real money and makes young engineers think that a phpBB forum is impossible without a five-digit AWS bill.


Both you and TekMoi should probably value your own time higher. The cost savings are great, but once you spend 4 hours on configuring a database server, that'll probably be $200+ worth of your time and, thus, wipe out most of the savings.


It doesn’t cost anything to spend 4 hours of your time doing anything, so it doesn’t really wipe out any savings. Reducing a bill from $400 month to $200 is real money, not some theoretical time/value judgement.

And, as others have mentioned, there is enormous value in knowing how to operate on a fairly lean tech stack. It makes it so much simpler to scale effectively while keeping costs down.


This is only true if you do not value your free time. In your example, you've spent 4 hours to save $200, so your work was worth $50/hour. A freelancer with a $100/hour rate might do 2 hours of work instead, spend the money, and gain 2 hours of free time.
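To make that arithmetic concrete, here's a rough break-even sketch (the rates are illustrative, not anyone's actual numbers; the point is that a one-time cost competes with a recurring saving):

```python
def breakeven_months(hours_spent, hourly_rate, monthly_savings):
    """Months until a recurring monthly saving repays a one-time time investment."""
    return (hours_spent * hourly_rate) / monthly_savings

# 4 hours valued at $100/hour, saving $200 every month thereafter:
print(breakeven_months(4, 100, 200))  # 2.0 -> repaid after two months
```

Because the saving recurs and the time cost is paid once, even a high hourly rate is usually repaid within a few months.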

There are other factors of course, but in general, many people come up with a rate for their own free time (which is often higher than the actual rate they charge clients)


It’s not only true if you don’t value your free time. It’s only true if you’re a freelancer who is turning down hours at a higher rate than what you’re saving.

Many people working on products, both on their own and within a company, aren’t turning down other profitable work to optimize existing solutions.

If I watch a two hour movie instead of spending two hours saving $100/month, it doesn’t matter how much I value my time, no one is paying me $100/hour to watch a movie.


There's a difference between the initial example and watching a movie. That's what I summed up with "there are other factors (than money)". If I'm only interested in the _outcome_ and I treat the way to achieve it as work, I definitely weight the time cost vs benefit (and I am not a freelancer). If it is something I enjoy doing (which may or may not be true for the initial example), I'll take this into account as well. Time is a limited resource, and I treat it as such.


Except that now you know how to configure a database server, and know how your database server is configured. So if you ever get a problem with the database in the future, you'll know how to solve it faster. Which will save you time and money in the future.

And you're less likely to make mistakes like upgrading your server instance to try and solve a problem that can't be solved like that.

And the more you do it, the cheaper and faster it gets. That knowledge and skill has value.


I value four hours of my time at considerably more than $200 and the reason I'm able to do that is because I know things like how to configure a database server.

But your argument is nonsensical because a one-time investment of time (much less than four hours for me, but I've been doing this for 20 years) can save you several hundred dollars a month. AWS AuroraDB, for instance, (which I also know how to configure, by the way) has much higher latency than a hand-rolled instance and will cause bottlenecks throughout your code in a DB-driven application. If I hadn't experienced the difference firsthand, or had failed to profile my app's performance adequately, I might assume I need to solve the problem by spinning up more ec2 instances to distribute the load. I've had the misfortune of working with a company that had exactly that problem and knowing how to spin up a new DB server saved the company thousands of dollars a month and took considerably less than four hours. Transferring a 3TB database to a new server without downtime did take considerably longer, however, but I was being paid hourly anyway, and it was still a worthwhile investment for the company which saved considerably more than my fee.

Any tradesperson should know their tools. A programmer is no different, and if you don't know how to use your tools because you "value your own time higher", then thank you: you're the guy who ends up getting me called in to fix things at a much higher hourly fee.


the main issue I have with the DIY mentality is that, for me, it's endless, and even more so, it's an unpaved path, ad-hoc, with few integrated well defined paths. we've got lots of open source software, but the ops of being online is hard fought experience.

* `apt-get install postgresql` would have worked fine for my needs on my VPS. oh but i need roles, so let's start an ansible playbook. maybe let's tweak some settings. fine, still all short, easy to do.

* ahhh i should probably have backups. how am i going to manage that storage, where is that going to live?

* then i introduce a new feature & my database is running slow. explain query helps, but i also could use some metrics for these boxes, so probably need to start thinking about prometheus & node-exporter, &c.

i am radically in favor of a) personally facing these challenges and b) open-sourcing the operational knowledge & tools for setting up AND OPERATING systems.

yet at the same time i also think spending $171/mo for a year is an exceedingly wonderful option to have on the table. running my own servers is, to me, a lifelong project, something i want to deeply invest in. there's plenty of ways to go about it that aren't so arduous (k8s+postgres-operator+rook+tbd monitoring+tbd directory-services), but that willingness to keep engaging, supporting, maintaining, scaling things can be a very serious concern that extends well past the time it takes to set a database up: it's an ongoing "giving a shit" burden even when (seemingly) working fine.

being willing and able to hack through is great, and i am all for the coalition of the willing who elect to march through, hopefully not getting bogged down along the way. but wow if you are trying to start a business, it sure is nice being able to pay someone to spin up, back up, monitor, scale some services for you.

i hope some day "we" are better at such things, systematically, i hope open source ops helps give us better paths to doing these kind of things easily, safely, observably, resiliently. we're not there yet. but wow, this challenge to me- how we move open source from an older "software" model to an online service model, that empowers people to set up online systems as easily as opening an editor, that's the challenge at the heart of open source today. it's one that needs a lot more effort, a lot more work, such that we have good ways to stand up & keep up a database server.


Where and how did you acquire this knowledge?


>$200+ worth of your time and

Yeah, from your perspective. For others, a monthly saving of $10 is a lot, and not everyone earns $200 for 4 hours of work.


For OP it might be a good idea though, given that the infrastructure is their biggest cost and they have no revenue.


Being well versed in setting up your own database server, bereft of cloud-provider hand-holding, easily pays for the initial time investment over time. I don't think most understand just how much value a mastery of the basics is capable of generating when you're essentially vendor-lock-in proof. There is so much blindness to the voluntary hanging of one's arse out the window that vendor overreliance creates.


Yes, without clear leadership, it is easy for a dev team to flounder around in a cloud provider’s offerings.


It's 50k MONTHLY users. It's just 3,000 a day.


It says in the article that the author gets 34k daily users, and 50k unique users/month. It would have been clearer if the author had talked about sessions (which are therefore > 1M/mo) for sure, but you're still making a very big (and invalid) assumption.

EDIT: Please disregard the above. I need an eye test, or maybe just to put my glasses on! Daily users are 3.4k (3400), not 34k. My apologies, I take it all back!


It's 3.4k (3400) daily users, not 34k (34000). Don't worry, I nearly missed the dot at first as well :)


I'm such a klutz - sorry. This is what I get for reading HN when I'm still in bed and without putting my glasses on first. This is a terrible habit that I need to break.


Username now in question


Tough to argue against that under the circumstances.


And 2.5 per minute


Or 0 per minute and 25,000 per hour for two days a month. Traffic can be bursty; don't assume that X/month means they're getting exactly X/30 per day.
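A quick sanity check on why averages mislead here (numbers illustrative, based on the 50k/month figure from upthread, not the article's exact traffic):

```python
MONTHLY_USERS = 50_000

# Smooth assumption: traffic spread evenly across a 30-day month
per_day = MONTHLY_USERS / 30              # ~1,667 users/day
per_minute = per_day / (24 * 60)          # ~1.2 users/minute

# Bursty assumption: the same traffic packed into just two days
burst_per_hour = MONTHLY_USERS / (2 * 24) # ~1,042 users/hour

print(round(per_day), round(per_minute, 1), round(burst_per_hour))
```

The same monthly total can mean a trickle or a spike three orders of magnitude larger, and you have to provision for the spike.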


What are the right resources you would suggest to someone who has to set up their servers properly? I'd really appreciate it if you could refer some books/videos/articles. Thanks.


If you're talking about serving many requests cost-effectively, then really the problem is not the servers (except over-provisioning, which is rampant; learn to use tools like AWS's auto-scaling system instead). It's the code.

If you understand the basics of algorithmic time complexity (that's your Big-O notation) and profiling your code then you're ahead of 98% of other developers in practice. I'm constantly amazed at how many developers think adding more libraries, newer frameworks, or more layers of tooling will magically speed up their code because "it's so fast". If you actually time things you'll find out doing it the "slow" way is frequently an order of magnitude faster.
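As a minimal illustration of the point about timing things rather than adding tooling, here's a classic case where a one-line data-structure change dwarfs any framework swap (sizes are arbitrary):

```python
import timeit

haystack_list = list(range(100_000))
haystack_set = set(haystack_list)

# Identical membership test: O(n) linear scan vs O(1) hash lookup
slow = timeit.timeit(lambda: 99_999 in haystack_list, number=200)
fast = timeit.timeit(lambda: 99_999 in haystack_set, number=200)

print(f"list: {slow:.4f}s  set: {fast:.4f}s")
```

Profiling first tells you whether you're in this situation at all; no amount of extra layers fixes an accidentally quadratic hot path.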


Wow! Congrats, and what is your site? Would love to read more about it.


To be fair, the post shows that about two-thirds of that spend is from the decision to run twice the needed capacity to be able to do blue/green deployments, and to cloud-host their Metabase analytics.

And it explains there's currently zero revenue.

So it seems fair to judge the situation there based off those pieces of information we've been given.

As I posted elsewhere, the OP's choice to run dual redundant blue/green-capable instances and cloud-hosted Metabase might have good reasons, but right now those reasons are not "wanting a HA solution so revenue isn't impacted"...


That's not how blue/green deployments work. You don't keep both colors up unless you have completely failed to understand the concept.

Blue/green is all about saving resources and costs, not keeping both environments running. You misread the cause here; it has nothing to do with deployment strategies.


That's not how I read what the OP's doing in the article. Sure, maybe he's not doing "proper blue/green", but that is what he uses to explain running a duplicated pair of web/app servers full time...


I have a web app that's struggling if there are more than 2 concurrent users per CPU core. It's displaying incompressible large-resolution images with <100ms latency.

EDIT: not sure why I'm downvoted, I'm just presenting my use case. They are multigigabyte images encoded with custom wavelet compression that are cut into tiles (think google earth), each user needs 5-10 tiles every second


My guess would be that your webapp is serving images in a blocking fashion, meaning every time a user requests an image it will fetch the image for the user AND block the HTTP serving thread until the image has been sent. Can you provide more context (e.g. tech stack)?
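As a sketch of the non-blocking alternative (stdlib only; `load_tile` is a hypothetical stand-in for whatever the actual image work is, and the 0.1s sleep simulates it):

```python
import asyncio
import time

def load_tile(tile_id):
    """Stand-in for blocking work: disk read, decode, crop, etc."""
    time.sleep(0.1)
    return f"tile-{tile_id}"

async def handle_request(tile_ids):
    # Offload each blocking call to a worker thread so the event loop
    # stays free to serve other users while tiles are being produced.
    return await asyncio.gather(
        *(asyncio.to_thread(load_tile, t) for t in tile_ids)
    )

print(asyncio.run(handle_request(range(3))))  # ['tile-0', 'tile-1', 'tile-2']
```

The same idea applies whatever the framework: keep the request-handling thread or event loop free, and push the per-tile work onto a pool.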


What's blocking is the number of HTTP connections between browser and server. Most browsers only allow 6, and each tile takes about 100ms, so getting 10 in under a second doesn't always happen.
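Back-of-envelope math for that bottleneck (assuming the classic 6-connections-per-origin limit and ~100ms per tile, and ignoring connection setup):

```python
import math

def fetch_time_ms(tiles, latency_ms=100, connections=6):
    """Wall time if tiles are fetched in waves over a fixed connection pool."""
    return math.ceil(tiles / connections) * latency_ms

print(fetch_time_ms(10))                   # 200 ms: two waves over 6 connections
print(fetch_time_ms(10, connections=100))  # 100 ms: one wave with multiplexing
```

This is why raising the effective connection count (sharding, or HTTP/2 multiplexing as mentioned downthread) roughly halves the fetch time for a 10-tile viewport.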


But this itself wouldn't cause the "struggling" you mention with more than 2 concurrent users per core


Do you know if the old OpenStreetMap trick of having multiple tile servers (that are each just aliases of the same server) still works? I think this was how they tried to circumvent the 6-connection limit.


That's no longer necessary thanks to HTTP/2.

Your webserver still has to spin off a thread for each request if you want to do substantial CPU work for each request, but rest assured you'll get all the requests at once from the browser. Not 6 at a time like in the dark ages

The RFC recommends at least 100 streams. See SETTINGS_MAX_CONCURRENT_STREAMS https://tools.ietf.org/html/rfc7540#section-6.5.2


Have you tried combining the tiles on-the-fly as image sprites?

https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Images/...


That's a lot of CPU usage per user! Is it doing some super-intensive process to generate the images on the fly?


Is it doing anything weird or fancy with those images?

If you're just serving them (eg no image manipulation), that sounds like there's a problem somewhere.


What http server do you use ? Are they static images ?


The article does not mention video processing or ML. It describes their setup for what they call a generic web app.


If you want to provide uninterrupted service to your clients, you’ll have to spend some $. You want redundancy, machines hosted in different locations, backup prod servers, monitoring, analysis tools. Even if it is for 1k monthly users, if you want reliability, it will increase the costs.


I beg to differ.

In my experience, the complicated setups that are justified by the argument of "reliability" have more downtime than a single VPS. The reason is probably that there are more moving parts and more has to be maintained / can go wrong.

These days, a single VPS in the right datacenter has excellent uptime.


Agreed. I'll probably be downvoted, but these setups strike me as people who prefer to drink the Kool-Aid rather than be pragmatic and only use what they need.

I've also had very high reliability rates with a single VPS. They've actually given me less downtime than AWS services at times.


At work, I aim for four nines. (We put three nines in the legal paperwork).

I can't hit four nines reliably with single-VPS platforms on my typical workloads; I need load balancers and redundant app servers. I could quite likely hit three nines using single VPSes. But if a client wants 99.9% SLAs, they'll be paying for HA and I'll deploy redundant EC2 instances, multi-region RDS, and an ELB. And charge them 3 or 4 times what the OP is spending for it. (And I'll almost always deliver 99.99% availability.)

For my stuff or friends or people I'm doing cost-saving favours for, I'll explain how much extra it costs to guarantee less than an hour of downtime a month, the realistic expectations and historical experience of how much downtime a non-HA platform might have in their use case, and often choose along with them a single VPS (or even dirt-cheap cPanel hosting) while understanding and accepting the risks associated with saving upwards of a couple of hundred bucks per month.
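For reference, the downtime budgets those nines translate to are straight arithmetic:

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month

def downtime_budget(availability):
    """Allowed minutes of downtime per 30-day month for an availability target."""
    return (1 - availability) * MINUTES_PER_MONTH

print(round(downtime_budget(0.999), 1))   # three nines: ~43.2 min/month
print(round(downtime_budget(0.9999), 2))  # four nines:  ~4.32 min/month
```

Four nines leaves so little slack that a single unattended reboot can blow the month's budget, which is why it effectively forces redundant instances behind a load balancer.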


I think ec2 gives 99.99% availability in their SLA, no need to scale across regions or even AZs. Multi AZ RDS is 99.95%. We have a simple ELB/EC2/RDS/S3 stack on us-east-1 and need high availability for a very small amount of users and run very cheap.


A single-VPS setup might be OK for serving content over the web, but in my experience the pain begins when your software starts doing async processing: long-running cron jobs, queue processing. If you're doing it on your web server machine, there will be downtime.

I know this, because I have gone through these issues with each of my projects. Just recently an infinite loop bug in a cron job ground my "single VPS" setup to a halt (and took the web server with it).


I have a VPS that has not gone down in 5 years.

It still has a redundant slave because I'm not going to bet my reputation on everything going right.


Me too, with the cheapest Kimsufi server from OVH (something like 3€ a month).

To be fair I can't be sure because a less than 5 minutes downtime would probably go unnoticed, but the fact is I never hear about this server.


> I beg to differ.

> In my experience, the complicated setups that are justified by the argument of "reliability" have more downtime than a single VPS. The reason is probably that there are more moving parts and more has to be maintained / can go wrong.

> These days, a single VPS in the right datacenter has excellent uptime.

Again, maybe in your experience but that's not universal. There's literally no redundancy with running everything off a single VPS and if that datacenter has network or hardware problems, then your service is down.

Is redundancy necessary for the scale of OP's app considering it provides 0 income? Most likely not, but that's a decision they've decided on and there's nothing wrong with that.

What does excellent uptime mean in your book? With Digital Ocean's AMS2 region I had regular downtime every few weeks, and while I'm alright with it, if I had another VPS in another datacenter it would've had next to no effect on the customer experience. But an hour or more of downtime every two weeks isn't excellent.


If Digital Ocean is giving you an hour of downtime every two weeks (two 9s and a 7) then they're breaking their SLA (four 9s).


> What does excellent uptime mean in your book?

Something like less than an hour of downtime per year.

Over the last years, across multiple datacenters, I have seen maybe one or two short downtimes per year, none ever lasting more than 5 minutes.


https://aws.amazon.com/message/41926/ : this lasted hours and affected almost everyone using us-east-1; a large portion of the internet was unavailable because they had no multi-region setups.


That was a "S3 Service Disruption". A perfect example of a problem you do not have with a single VPS setup.


Two Hetzner CPX31 boxes sound like they'd do just fine here too, providing the redundancy you mention for a fraction of the cost. Or get the boxes from different companies, for the same sort of overall price.

Yes, some of the other tools could arguably be worth paying for, but if the author's concern is that he's short on money and $140 is a lot, why didn't they KISS and only use what they need? Then scale as and when needed in the future.

Y'know, do what HN regularly preaches?


My guess is he's doing resume engineering.

And $140/month is pretty good value there probably... Even if that's just being able to point potential employers/recruiters at this blog post as evidence of experience building and running an HA website with more-advanced-than-free-Google-Analytics user behaviour tracking.


I was just thinking along similar lines.

The article mentions in a few places that money could be saved, but lowering the cost doesn't seem to be the driver.

Makes sense to me: if you're building the site as a hobby/practice/demo, it's reasonable to "build it properly", and it's fun.


If your 55k MAU want uninterrupted service, they need to be paying for it (in dollars or monetisable attention and/or privacy).

On a site currently generating zero revenue, I hope the OP is happily enough paying most of that $145/month as a learning experience or for resume bullet points (which are perfectly valid ways to spend your money). They've admitted elsewhere in the comments that the two $40/month droplets are way oversized (from an attempt to solve a problem that turned out not to be droplet size/resource related), so without redundancy and without AWS-hosted Metabase, this would be about $100/month less expensive to run.

I still think that's over-provisioned or under-engineered. Like others have commented, I'd be surprised if the features you can see on the site require any more than the $15/month the FAQ claims it costs to run, plus perhaps the $10/month Disqus expenses. That seems about where a hobby/side-gig project should sit for a lot of devs before you start thinking about how to make it pay for itself... YMMV, especially if you're not comfortable earning at least a junior dev salary already in some reasonably well-paying part of the world.



