Hacker News | noahnoahnoah's comments

This already exists and is used in much of the US and extensively in Europe for airlines. Look up Controller Pilot Data Link Communications (CPDLC).


I wasn't quite a software engineer, but a data analyst/scientist/engineer/term-du-jour at a brand-ish name software company for ~8 years, so pretty close in terms of the day-to-day work and culture.

1) I'm a professional cartographer, sort of. I make wooden topographic maps.

2) A bunch of reasons. I was never "supposed" to work in software -- I went to school in mechanical engineering, and wanted to get closer to something like that. My side biz was becoming viable, I wanted to do something entrepreneurial, and even though I had a pretty good gig, no company is perfect if you're there long enough.

I don't know if I'll go back to data or software some day. Things were great in the map business before the pandemic, they're ok now, and hopefully they'll be great again in the future. I still do a lot of data analysis and write a lot of software for my business, it's just interspersed with a lot more sweeping, sanding, etc.



big fan of what you're doing @ elevated woodworking! gorgeous stuff


I actually sort of did this (but I cut them w/ a CNC router rather than 3D printing them).

https://imgur.com/a/fMy19 is the finished product; http://www.thingiverse.com/thing:1524543 are the files I generated and used. They aren't truly accurate maps -- the elevation is exaggerated to be more visually interesting, but it's consistently scaled. Some people have gone on to print them w/ 3D printers (rumor is they're on the wall at Makerbot now).


Basecamp | REMOTE | INTERN

Basecamp is hiring programming, design, marketing, operations, and data interns for summer 2016.

Interns at Basecamp work on real projects and are mentored one-on-one by a member of our team who will guide you throughout your time at Basecamp. The projects you'll work on as an intern at Basecamp are all derived from real problems we face as a business, and we expect you'll have a meaningful impact during your time here. You'll leave Basecamp with new technical, creative, and business skills and having accomplished something significant.

Internships at Basecamp are remote -- you can work from anywhere you want, provided there's some overlap in time zones with your assigned mentor. We'll fly you to Chicago once or twice during the summer to get together with your mentor and the rest of the intern class, and you'll talk regularly with your mentor via phone, Skype, or Google Hangouts. You'll also participate in some of our dozens of Campfire chat rooms every day.

All internships are paid and require a commitment of 8-12 weeks of full time work between May and August 2016 (we're flexible on start/end dates, planned vacations, etc.).

Learn more and apply via https://basecamp.com/internships. Apply by Wednesday, February 24th.

If you have any questions, you can email me directly at noah@basecamp.com.


The next two parts will have more detail, but the short answer is "yes" to both a raw word list and a Bayesian filter in terms of techniques we've tried here. One of the simplifying things that makes the problem a little easier is that we don't try to classify beyond "does this need an immediate reply or not."


We don't use the same backend as Etsy's original implementation, but you should be able to take any script, etc. that emits statsd measurements for use with the original implementation and point it at Batsd. This means that the 50+ statsd clients that are out there, as well as lots of custom instrumentation, should "just work", even though it's stored and accessed in different ways after it's received.
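For anyone unfamiliar with what "emits statsd measurements" looks like on the wire, here is a minimal sketch in Python. It assumes the common statsd text format (`name:value|type`, with an optional `|@rate` suffix) sent as a single UDP datagram. The function names, the metric names, and the host and port are illustrative assumptions (8125 is the usual statsd default, not something taken from the Batsd docs).

```python
import socket


def format_metric(name: str, value, metric_type: str, sample_rate: float = 1.0) -> bytes:
    """Encode one measurement as statsd text: name:value|type[|@rate]."""
    line = f"{name}:{value}|{metric_type}"
    if sample_rate < 1.0:
        # Sampled metrics carry the rate so the receiver can scale them back up.
        line += f"|@{sample_rate}"
    return line.encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send: no reply is expected, and a lost packet goes unnoticed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling `send_metric(format_metric("signups", 1, "c"))` would increment a hypothetical `signups` counter. Because the sender only needs a host and port, pointing an existing emitter at a different daemon is just a configuration change.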


(I work for 37signals and wrote batsd)

This is really just one piece in a bigger set of things to track performance, usage, etc.

You can think of it as: Emitters --> Statsd (or in this case, Batsd) --> dashboards, alerts, etc.

We have emitters coming from Nginx, HAproxy, bluepill, postfix, etc. log files, a gem within all of our Rails apps, and a variety of other scripts that gather data. Those all point to batsd, which aggregates and stores them. We then extract the data into graphs on our dashboard, and use it extensively for Nagios alerting as well. There's a basic sample client included in this repository that we use for those purposes, though you're right, it just gets you raw numbers out of the box.
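To make the "aggregates" step concrete, here is a rough sketch in Python of what a statsd-style daemon does between receiving measurements and handing numbers to a dashboard. This is not Batsd's code: the `Aggregator` class, its method names, and the particular summary statistics are assumptions chosen for illustration.

```python
from collections import defaultdict


class Aggregator:
    """Collects statsd-style lines in memory and summarizes them on flush."""

    def __init__(self):
        self.counters = defaultdict(float)
        self.timers = defaultdict(list)
        self.gauges = {}

    def receive(self, line: str) -> None:
        name, rest = line.split(":", 1)
        fields = rest.split("|")
        value, metric_type = float(fields[0]), fields[1]
        if metric_type == "c":
            # A sampled counter ("|@0.1") is scaled back up by 1/rate.
            rate = float(fields[2][1:]) if len(fields) > 2 else 1.0
            self.counters[name] += value / rate
        elif metric_type == "ms":
            self.timers[name].append(value)
        elif metric_type == "g":
            self.gauges[name] = value  # gauges keep only the latest value

    def flush(self) -> dict:
        """Return a summary for the interval, then reset counters and timers."""
        snapshot = {
            "counters": dict(self.counters),
            "timers": {
                name: {"count": len(v), "mean": sum(v) / len(v), "upper": max(v)}
                for name, v in self.timers.items()
            },
            "gauges": dict(self.gauges),
        }
        self.counters.clear()
        self.timers.clear()
        return snapshot
```

A real daemon would call `flush()` on a timer and write the snapshot to storage; the dashboard and alerting checks then read from that storage rather than from the emitters.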

We're planning on releasing more of both the "emitters" that gather data, as well as a major part of our graphing/dashboard interface "soon".

And point well taken about making it more obvious how to get started and what you can use it for. I'll work on improving the documentation.


Could you explain briefly why you chose to write a replacement for statsd, rather than improve on it? What aspects of statsd were you not happy with?

(I don't have a horse in this race, I haven't used statsd before -- but I am planning to deploy some sort of statistics gathering soon and I wonder why I would choose your implementation over Etsy's, apart from the obvious appeal of the 37signals brand.)


Briefly, probably not (everyone here at 37signals got treated to a 3000 word treatise on our statsd journey a few weeks ago). I did write up a few reasons at https://github.com/noahhl/batsd/blob/master/doc/why-not.md.

In short: we as a company have a ton of Ruby experience and comparatively little Python/Node.js experience (both in terms of understanding the tools that we use, which we like to do, and simply just in being able to confidently manage dependencies, etc.), and we knew we were going to want to build our own UI eventually anyway, which limited the utility of Graphite itself.

Edited to add: I can't say it enough, Etsy's statsd and Graphite are both fantastic pieces of engineering, with fantastic communities and support behind them (there's a fascinating writeup about Graphite in particular at http://www.aosabook.org/en/graphite.html).


I briefly read the chapter on persistence -- basically you're doing what RRD originally did (one file per metric), except without actual round-robin storage, before RRDcache was born. The long-term performance implications could be worrisome. Unless you're backing this with solid-state storage, if you have many thousands of metrics, the seek capability of the disk may not be able to keep up with the I/O flush rate.
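The concern is easier to see with a sketch. The Python below is a hypothetical illustration of a one-file-per-metric layout, assuming newline-delimited text records of the form `timestamp value` (the thread mentions newline-delimited text; the exact record layout, the `.txt` suffix, and the function name are assumptions, not Batsd's actual code).

```python
import os


def flush_to_disk(root: str, values: dict, timestamp: int) -> int:
    """Append one 'timestamp value' line per metric; return the number of files touched."""
    touched = 0
    for name, value in values.items():
        # Assumes metric names are already safe to use as file names.
        path = os.path.join(root, name + ".txt")
        with open(path, "a", encoding="ascii") as f:
            f.write(f"{timestamp} {value}\n")
        touched += 1
    return touched
```

Every flush touches as many files as there are active metrics. With, say, 10,000 metrics and an assumed 10-second flush interval, that averages 1,000 small appends per second spread across 10,000 files, which is where seek time on spinning disks starts to matter.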


You're witnessing second degree dilettantism at work.

Remember, we started out with a rock-solid reference impl called RRDTool. RRDTool is 13 years old and about as mature as it gets. It's also surprisingly usable and relatively wart-free.

However, its documentation is not written as a narrative "guide", so inevitably some kid eventually found it too complicated and decided to reinvent it, without realizing how far out of his depth he went. That's how graphite happened.

Now 37signals sees graphite, and goes full Dunning-Kruger with yet another knock-off, this time leaving out everything that would acknowledge the slightest understanding of the problem domain. While graphite at least tried to mimic the RRDTool file-format 37signals just skips over that whole "complicated binary-stuff" and writes the data as newline-delimited ascii-text...


I believe Graphite/Whisper were created to address some inabilities in RRDTool: http://readthedocs.org/docs/graphite/en/latest/whisper.html#...

Are you saying that graphite is somehow deficient? How is/was the author "out of his depth"?


While graphite at least tried to mimic the RRDTool file-format 37signals just skips over that whole "complicated binary-stuff" and writes the data as newline-delimited ascii-text...

What benefit lies in trying to mimic RRDTool's file format?


Scalability.
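One way to read that answer: a round-robin file has a fixed size and a predictable slot for every timestamp, so it never grows and a write is a single seek plus a small overwrite. The Python sketch below shows a generic fixed-size ring file. It is not RRDTool's or Whisper's actual on-disk format, and the slot layout and function names are assumptions for illustration only.

```python
import struct

# One slot = unsigned 32-bit timestamp + 64-bit float value (12 bytes, no padding).
SLOT = struct.Struct("<Id")


def create(path: str, slots: int) -> None:
    """Preallocate a file with room for exactly `slots` data points."""
    with open(path, "wb") as f:
        f.write(b"\x00" * (SLOT.size * slots))


def write_point(path: str, slots: int, step: int, timestamp: int, value: float) -> None:
    """Overwrite the slot this timestamp maps to; old data is recycled in place."""
    index = (timestamp // step) % slots
    with open(path, "r+b") as f:
        f.seek(index * SLOT.size)
        f.write(SLOT.pack(timestamp, value))


def read_points(path: str) -> list:
    """Return every slot as a (timestamp, value) tuple, in file order."""
    with open(path, "rb") as f:
        data = f.read()
    return [SLOT.unpack_from(data, offset) for offset in range(0, len(data), SLOT.size)]
```

The trade-off is that retention has to be chosen up front, whereas an append-only text file keeps everything and stays readable with ordinary shell tools.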


That makes sense, and speaks to me (I'm more of a Ruby guy myself.) Thanks for taking the time to reply.

Edit: and, "it looked like it would be easy" made my day :-)


Thank you, that's very helpful. Looking forward to the emitters and dashboard whenever they're ready - I suspect they will help drive adoption of Batsd and encourage development of additional emitters.


Definitely. There's a teaser screenshot of Flyash (the big, reusable chunk of our dashboard) towards the bottom of http://37signals.com/svn/posts/3091-pssst-your-rails-applica... (that post also details some of the major emitter components we use).


"457,739 different metrics in Flyash" Oh...kay... Flyash looks very nice from the screenshots, but I'm interested to learn how you solve the discoverability problem with that much data. (Too much of a good thing?)


Looks great, can't wait to see it open sourced. I've been dealing with the clunkiness of statsd/data -> graphite -> graphene for a dashboard, and more than a handful of times I've almost started writing exactly what it looks like you've already built.

Any idea when we'll be able to use/contribute to Flyash?


A couple of weeks, probably, depending on how much I feel like working on it. It was designed to be modular and easily extracted, but still needs some cleanup work and has a few nasty bugs I'd like to fix first.


(I work at 37signals, though not as a sysadmin or developer)

Just to clarify - we do have a 24/7 on-call system administrator who is the first line of defense for when things go wrong. They're the ones who get phone calls when things do go 'bump' in the night, and they're fantastic in every way.

Our "on call" developers fix customer problems; rarely do these arise suddenly in the middle of the night, but our software has bugs (like most pieces of software) that impact customers immediately, and we've found it helpful to have a couple of developers at a time who focus on fixing those during business hours rather than working on a longer term project. Most companies probably don't call this "on call", but rather something like (as a commenter on the original post pointed out) "second level support". This is what Nick was describing in his post.

Of course, fixing root causes is the best way to solve bugs, and we do a lot of this too. We've made a significant dent (>= 30% reduction) in our "on call" developer load over the last 6-12 months by going after these root cause issues.

Hope that clarifies the situation some.


This is the second long-form piece about Caro and his new book I've seen this week (the other is in tomorrow's New York Times magazine - http://www.nytimes.com/2012/04/15/magazine/robert-caros-big-...). Interesting how similar the facts they tell are, but the angle is very different in each.

Great PR by his agent no doubt, but both enjoyable reads -- well worth the time to read them both.


http://www.robertacaro.com/NYer.html

A sample of Caro's writing. His work was unknown to me before reading the links here.


The LBJ bios are among my favorite political bios ever. Classics of American History.


Thank you for supplying this second link - I read it too and enjoyed it very much.

From these pieces, Caro reminds me a little of Joseph Frank, who finished his great five-volume biography of Dostoevsky when he was in his 90s and whom I got to study with for a little while.


(author of the original post)

I'm not aware of anyone offering it as a service, and I don't know how well it would work. Since it's all UDP, there's a risk of data loss. We experience essentially no loss within our network (less than one one-hundredth of a percent in testing), but I'm not sure what loss would look like when sending to a service outside our network.

I know what you mean about fearing the maintenance -- in particular, getting the node.js statsd daemon, Graphite, and Whisper (the default storage and front-ends, in Python) running can be pretty tricky and intimidating. I'd suggest taking a look at the alternate implementations out there -- Etsy has a list at https://github.com/etsy/statsd/wiki, and there are a few more not on there as well -- to see if any of them seem more your speed. For us, we really didn't want to introduce a whole extra set of infrastructure requirements to run a Django app and accouterments, which is how we landed on using our own implementation (in Ruby).

