Goircd: Minimalistic simple IRC server written in Go (cypherpunks.ru)
124 points by stargrave on May 8, 2020 | 66 comments


500 lines of code IRC server I wrote in Tcl in 2004:

https://github.com/antirez/tclircd/blob/master/ircd.tcl

IRC simplicity is magical.


What impact did IRC have on the design of Redis? You have spoken about the simplicity of Tcl, are there other systems that you think have interesting design properties?


I believe IRC and other old school text based protocols definitely inspired the idea of the protocol itself. But there is also kind of a complementary aspect to that: in the past I had to write, for work, things like a non-blocking POP3d for busy mail servers. The way the normal implementations worked was pretty terrible, so I developed a lot of appreciation for event-driven designs like the one in Redis, which tend to be very efficient. About IRC and the other text based protocols of the early days of the Internet, I always thought that their inefficiency for certain tasks was never about the fact of being text-only, but because of the lack of prefixed length information. No prefixed length means two terrible things, an EOF signal of some kind but especially a way to quote such EOF in case it is present in the data part itself. Then HTTP arrived and showed that it was possible to do great things with text protocols, and now, the irony, it is a binary one as well :-D So Redis uses a prefixed-length but otherwise textual protocol for these reasons.
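
To illustrate the prefixed-length idea, here is a minimal Python sketch of a frame shaped like a Redis bulk string (`$<length>\r\n<payload>\r\n`). The function names `encode_bulk` and `decode_bulk` are made up for this example and are not taken from Redis itself; the point is only that the payload can contain CRLF or `$` without any quoting, because the reader counts bytes instead of scanning for a terminator.

```python
from typing import Tuple


def encode_bulk(payload: bytes) -> bytes:
    # Frame layout: '$' + decimal length + CRLF + payload + CRLF.
    return b"$" + str(len(payload)).encode("ascii") + b"\r\n" + payload + b"\r\n"


def decode_bulk(buf: bytes) -> Tuple[bytes, bytes]:
    # Returns (payload, remaining bytes). Raises ValueError when the
    # buffer is malformed or does not yet hold a complete frame.
    if not buf.startswith(b"$"):
        raise ValueError("expected '$' type marker")
    header_end = buf.find(b"\r\n")
    if header_end == -1:
        raise ValueError("incomplete header")
    length = int(buf[1:header_end])
    start = header_end + 2
    end = start + length
    if len(buf) < end + 2:
        raise ValueError("incomplete payload")
    if buf[end:end + 2] != b"\r\n":
        raise ValueError("missing trailing CRLF")
    # The payload is sliced by length, so its bytes are never inspected.
    return buf[start:end], buf[end + 2:]
```

Only the short header line is scanned for CRLF; the payload itself is taken as an opaque slice.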


That makes sense. When I was first starting to program and parse data formats, the quoting-the-quote problem baffled me for a while until I understood in-band vs out-of-band encodings.

I recently realized that multipart text encoding could be used as a container format like tar. If one embraces web technologies as foundational, lots of things become easy. I have a fondness for the IFF/RIFF format as being a kind of simpler binary XML (before I knew what XML is) that encodes those sized, tagged blocks of data, no EOF scanning.

Later I found netstrings [1,2]. I think they share lots of properties with the systems you mentioned. I do think having hybrid encodings/protocols can be really advantageous in terms of understanding and portability.

[1] https://cr.yp.to/proto/netstrings.txt

[2] https://wiki.tcl-lang.org/page/netstrings
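
For readers who have not seen netstrings, a rough Python sketch follows. The wire form is `<decimal length>:<bytes>,` as described in [1]; the helper names `netstring_encode` and `netstring_decode` are invented here, and the sketch does not enforce every rule of the spec (for example, it does not reject leading zeros in the length).

```python
from typing import Tuple


def netstring_encode(data: bytes) -> bytes:
    # b"hello world!" becomes b"12:hello world!,"
    return str(len(data)).encode("ascii") + b":" + data + b","


def netstring_decode(buf: bytes) -> Tuple[bytes, bytes]:
    # Returns (data, remaining bytes) or raises ValueError.
    colon = buf.find(b":")
    if colon == -1:
        raise ValueError("missing ':' after length")
    length_field = buf[:colon]
    if not length_field.isdigit():
        raise ValueError("length must be ASCII digits")
    start = colon + 1
    end = start + int(length_field)
    # The trailing comma is a cheap sanity check, not a terminator to scan for.
    if buf[end:end + 1] != b",":
        raise ValueError("missing ',' or truncated data")
    return buf[start:end], buf[end + 1:]
```

Because a netstring's payload is arbitrary bytes, one netstring can carry other netstrings, which is what gives the format its container-like feel.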


> No prefixed length means two terrible things, an EOF signal of some kind but especially a way to quote such EOF in case it is present in the data part itself.

For SMTP and NNTP messages, a CRLF.CRLF terminator is used to indicate the end of the message. If the message itself contains a line that started with a ., then the client would prefix that line with another . (dot-stuffing). The receiver on the other end would remove the extra dots from the line before storing the message. So, quoting part of the EOF is one viable solution and doesn't result in too much overhead in those cases.
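
A small Python sketch of the dot-stuffing described above, assuming the message body already uses CRLF line endings. The names `dot_stuff` and `dot_unstuff` are hypothetical, and this is not how any particular mail server implements it.

```python
def dot_stuff(body: bytes) -> bytes:
    # Sender side: double the leading dot of any line that starts with '.',
    # then append the end-of-message marker '.' CRLF.
    stuffed = body.replace(b"\r\n.", b"\r\n..")
    if stuffed.startswith(b"."):
        stuffed = b"." + stuffed
    if not stuffed.endswith(b"\r\n"):
        stuffed += b"\r\n"  # the marker must sit on a line of its own
    return stuffed + b".\r\n"


def dot_unstuff(wire: bytes) -> bytes:
    # Receiver side: drop the end marker, then remove one leading dot
    # from every line that begins with a dot.
    if not wire.endswith(b"\r\n.\r\n"):
        raise ValueError("message is not terminated by CRLF.CRLF")
    body = wire[:-3]
    if body.startswith(b"."):
        body = body[1:]
    return body.replace(b"\r\n..", b"\r\n.")
```

After stuffing, no line inside the body can consist of a single dot, so the receiver can search for CRLF.CRLF without ambiguity. Note that a body lacking a final CRLF comes back with one added.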


Unfortunately that's a lot of overhead: you want to read an email as a whole blob of data, not splitting it by lines (that is effectively processing every byte) just to tell if there is a ".." prefix.


What I do in perl is to set the line terminator to CRLF.CRLF, read from the socket into a buffer, and then try to read a line (where a line is now defined as something that ends in CRLF.CRLF) from the buffer and check whether it ends in the terminator. Once I've read the data including the line terminator into the buffer, I run a s|\r\n\.{2}|\r\n.|g on the buffer and then write it to disk.

Though it does consume more memory since I have to store the message in a buffer, it doesn't involve splitting the string line by line. It just processes the entire email message at once.

With a size prefix, I would still have to read into a buffer and check its size to see whether I've read enough data.

Though checking the length of a string or size of a file involves less work compared to checking for a line terminator and removing the dot-stuffing, if the amount of data we're dealing with is not large, then there probably isn't really that much difference in the amount of work done for those checks.


Another old but good one that seems to be forgotten is ASN.1 (BER/DER). It's the basis of X509 certificates, and just "deals" with that.


Isn't ASN.1 also the source of like half of SSL implementation vulnerabilities?


It's more a consequence of the bad design and implementation of the DER parser in OpenSSL, which is what most people use.

ASN.1 isn't actually a message format--it's the syntax for feeding to an ASN.1 parser generator, similar to protobufs.[1] There are multiple binary formats (BER, OER, PER, XER), but for X.509-based specs the binary format is DER. OpenSSL doesn't have a parser generator; just generic, low-level routines for slicing DER blobs.

If you used something like asn1c, you can simply feed it the X.509 ASN.1 specification and it generates a C-based parser and composer. Manipulating an X.509 certificate largely just becomes a matter of manipulating strongly typed C data structures.

For a recent Lua project I implemented an X.509 certificate parser using LPeg. Because PEGs are so expressive and I only cared about DER, it was much easier to skip dealing with ASN.1. DER is a TLV encoding, which isn't context free and thus not strictly compatible with PEGs, but LPeg has some extensions--e.g. match-time captures (http://www.inf.puc-rio.br/~roberto/lpeg/#matchtime)--which make it possible to create grammars for TLV encodings. Which, incidentally, is one reason not to use length-prefixed encodings: they're not a context-free grammar, which makes it more difficult and sometimes impossible to use a parser generator; length prefixing is a mitigation for a specific class of bugs that are common when open-coding a parser, but parsers for complex formats (i.e. many typed, compound objects with deep nesting) shouldn't be open-coded if you can help it.

[1] Except ASN.1 is better designed--it lacks all the ambiguous cases. See http://lionet.info/asn1c/blog/2010/07/18/thrift-semantics/ The downside to ASN.1 is that, especially at the time OpenSSL was originally written, documentation was expensive, and because the open source ecosystem typically used line-based protocols which can more easily be implemented with open-coded parsers, it's not surprising developers made some poor choices in the first open source X.509 certificate parser; choices which have haunted OpenSSL and open source projects more generally ever since.
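
To make the TLV point concrete, here is a hand-written Python sketch of reading DER tag-length-value triples (exactly the kind of open-coded parser the comment above warns about for complex formats, so treat it as illustration only). The names `read_tlv` and `walk` are invented for this example, and multi-byte tags are deliberately not handled. It shows why the format is not context free: how many bytes to consume next depends on a number parsed earlier.

```python
from typing import List, Tuple


def read_tlv(buf: bytes, offset: int = 0) -> Tuple[int, bytes, int]:
    # Returns (tag, value bytes, offset of the next element).
    if offset + 2 > len(buf):
        raise ValueError("truncated header")
    tag = buf[offset]
    if tag & 0x1F == 0x1F:
        raise ValueError("multi-byte tags are not handled in this sketch")
    first = buf[offset + 1]
    pos = offset + 2
    if first < 0x80:
        length = first  # short form: the byte is the length
    else:
        count = first & 0x7F  # long form: number of length bytes that follow
        if count == 0:
            raise ValueError("indefinite length is not allowed in DER")
        if pos + count > len(buf):
            raise ValueError("truncated length")
        length = int.from_bytes(buf[pos:pos + count], "big")
        pos += count
    end = pos + length
    if end > len(buf):
        raise ValueError("truncated value")
    return tag, buf[pos:end], end


def walk(buf: bytes, depth: int = 0) -> List[Tuple[int, int, int]]:
    # Lists (depth, tag, value length), descending into constructed
    # elements (tag bit 0x20), e.g. SEQUENCE and SET.
    out = []
    offset = 0
    while offset < len(buf):
        tag, value, offset = read_tlv(buf, offset)
        out.append((depth, tag, len(value)))
        if tag & 0x20:
            out.extend(walk(value, depth + 1))
    return out
```

For example, the nine bytes `30 07 02 01 05 04 02 68 69` are a SEQUENCE holding an INTEGER and an OCTET STRING, and `walk` reports one outer element with two children.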


Tcl is awesome. You get that *nix feeling in a programming language, instead of the shell. It's quirky, but in a good way.


> 500 lines of code IRC server

Without comment lines there are 488 lines of code ;)


Thanks for the Tcl memories.


IRC-anything gets a thumbs up from me. I always thought that IRC bots are a great way to learn a new language: a nice project that you could do in a day that would teach you the basics of the language, string manipulation, networking and some stuff here and there. An IRC daemon takes it a step further. This looks very nice, but it really is minimal, as the title claims. Personally, I've been working on one to learn Rust, which is more featureful at this point, but as it turns out the IRC protocol and server handling are much more complex once you get into the details than they seem for such a simple protocol.
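
As a taste of the string manipulation a bot starts with, here is a rough Python sketch that splits one IRC line into prefix, command and parameters, following the classic `[:prefix] COMMAND params [:trailing]` layout. The function name `parse_irc_line` is made up for this example, and IRCv3 message tags (lines starting with `@`) are not handled.

```python
from typing import List, Optional, Tuple


def parse_irc_line(line: str) -> Tuple[Optional[str], str, List[str]]:
    # Returns (prefix or None, upper-cased command, list of parameters).
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # The optional prefix names the sender, e.g. nick!user@host.
        prefix, _, line = line[1:].partition(" ")
    if " :" in line:
        # Everything after the first " :" is one trailing parameter
        # that may contain spaces.
        head, _, trailing = line.partition(" :")
        params = head.split()
        params.append(trailing)
    else:
        params = line.split()
    if not params:
        raise ValueError("empty IRC message")
    command = params.pop(0)
    return prefix, command.upper(), params
```

A bot's main loop is then mostly a dispatch on the command, such as answering `PING` with `PONG` and reacting to `PRIVMSG`.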


That's what I've been doing since I was 12 years old :)

My first "programming" experience was with mIRC scripts. I even wrote an IRC client using mIRC sockets at some point, and a dummy webserver / SMTP client. When you do things the wrong way, you pick up a lot of new skills. I learned to "reverse engineer" (my implementations were trivial, not fully fledged) protocols with a proxy/sniffer pretty early :)

From there, the first use case I had for every new language I wanted to learn was write an IRC bot. I had written one in VB6, VB.NET, Python and even C/C++ at some point.

It's funny to see this comment on HN, as this has been my mantra for a while now. On top of string manipulation and networking, another important piece it teaches you about a language is using data structures and manipulating data. IRC client is super stateful. For example, it needs to maintain the list of channels you're in, the list of users in those channels, their op/voice status, etc.

Highly recommended as a project for picking up a new language.
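
To give an idea of the state involved, here is a minimal Python sketch of the bookkeeping described above: channels, the users in them, and their op/voice flags. The class `IrcState` and its method names are hypothetical, and real clients also have to deal with things this skips, such as case-insensitive nick and channel comparison.

```python
from typing import Dict, Set


class IrcState:
    def __init__(self) -> None:
        # channel name -> nick -> set of mode letters ("o" for op, "v" for voice)
        self.channels: Dict[str, Dict[str, Set[str]]] = {}

    def join(self, channel: str, nick: str) -> None:
        self.channels.setdefault(channel, {}).setdefault(nick, set())

    def part(self, channel: str, nick: str) -> None:
        members = self.channels.get(channel, {})
        members.pop(nick, None)
        if not members:
            # Forget channels that no longer have any known member.
            self.channels.pop(channel, None)

    def quit(self, nick: str) -> None:
        # A QUIT removes the user from every channel at once.
        for channel in list(self.channels):
            self.part(channel, nick)

    def rename(self, old: str, new: str) -> None:
        # A NICK change keeps the user's flags in each channel.
        for members in self.channels.values():
            if old in members:
                members[new] = members.pop(old)

    def set_mode(self, channel: str, nick: str, mode: str, enabled: bool) -> None:
        modes = self.channels[channel][nick]
        if enabled:
            modes.add(mode)
        else:
            modes.discard(mode)
```

Each handler maps to one kind of server message (JOIN, PART, QUIT, NICK, MODE), which is why a bot ends up exercising a language's maps and sets quite thoroughly.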


Me too! I will never forget adding "runtime loadable modules" to my PHP IRC bot at age ca. 14. Oh my was that functionality a travesty: a user might say something like `!load poll` to activate the poll module. What would my code do? Oh it'd `include poll.php` (or whatever the syntax is, this was both my first and my last PHP project) in the main loop. On every iteration. Because hey, how else would the poll code be able to react to stuff?

I learned a lot of lessons and still have fond memories 20 years later :-)

Edit: Man, just thinking about this has Metallica playing as background music in my head even though I haven't listened to them in this decade. It's so strange how childhood memories can be so incredibly strong and cover so many sensory modalities. These days I can hardly remember what I did last week.


I suggest you check out http://brainrules.net/ by John Medina, he covers the fidelity of our memories as we age and much more.

https://www.youtube.com/watch?v=NujSdn1bg5k


> My first "programming" experience was with mIRC scripts. I even wrote an IRC client using mIRC sockets at some point, and a dummy webserver / SMTP client.

Same here! The mIRC scripting language was incredibly rich, and the ability to create native Windows dialogs with it was a powerful tool.

Back in the 90s, a friend and I reverse engineered the protocol behind a popular program used to play Magic the Gathering online called Apprentice. The protocol was insecure and allowed users to cheat by controlling the outcome of coin flips and other elements of the game. The purpose of the research was to prove definitively that the client could be manipulated, in order to resolve some cheating accusations that had been thrown around.

I set out to write a program that would hook into Apprentice through the Win32 API and allow you to change the program's output, while my friend created an mIRC man-in-the-middle server that you could configure through a dialog box to connect to Apprentice and modify incoming and outgoing protocol messages.

My project never shipped and his reached widespread availability, eventually leading to the development of better clients [1]. The fact that he was able to get so much out of mIRC script (dialog boxes, sockets, event handling, etc) and was able to make it simple for others to use was impressive.

[1] https://en.wikipedia.org/wiki/Apprentice_(software)#Backwash


May I promote the bot I have been running for two weeks? An IRC text game described there: https://pink-dragon.surge.sh/


I really wish I kept my mIRC scripts somewhere. It'd be interesting to see how I progressed.


im still trying to write my perfect irc client (the code has to be good). my latest approach is using functional programming a lot


> I always thought that IRC bots are a great way to learn a new language

I personally use the project “writing an authoritative DNS server” as mine. Forces you to think about data structures, TCP and UDP, bit packing, file/DB handling, parsing, concurrency, serialization, etc. If you don’t know DNS really well, it wouldn’t be as easy, but once you know it backwards and forwards it’s really a great way to push a language you’re learning.
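
As an example of the bit packing involved, here is a short Python sketch that packs and unpacks the fixed 12-byte DNS message header from RFC 1035 (ID, a 16-bit flags word, and four counts). The function names `pack_header` and `unpack_header` are invented for this illustration.

```python
import struct
from typing import Dict


def pack_header(msg_id: int, qr: int = 0, opcode: int = 0, aa: int = 0,
                tc: int = 0, rd: int = 0, ra: int = 0, rcode: int = 0,
                qdcount: int = 0, ancount: int = 0,
                nscount: int = 0, arcount: int = 0) -> bytes:
    # Flags word, most significant bit first:
    # QR(1) OPCODE(4) AA(1) TC(1) RD(1) RA(1) Z(3, zero) RCODE(4)
    flags = ((qr << 15) | (opcode << 11) | (aa << 10) | (tc << 9)
             | (rd << 8) | (ra << 7) | rcode)
    # "!" selects network byte order; "6H" is six unsigned 16-bit fields.
    return struct.pack("!6H", msg_id, flags, qdcount, ancount, nscount, arcount)


def unpack_header(data: bytes) -> Dict[str, int]:
    msg_id, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": msg_id,
        "qr": flags >> 15,
        "opcode": (flags >> 11) & 0xF,
        "aa": (flags >> 10) & 1,
        "tc": (flags >> 9) & 1,
        "rd": (flags >> 8) & 1,
        "ra": (flags >> 7) & 1,
        "rcode": flags & 0xF,
        "qdcount": qd, "ancount": an, "nscount": ns, "arcount": ar,
    }
```

An authoritative answer would set `qr=1` and `aa=1`; the question and resource record sections that follow the header are where name compression and the rest of the parsing work come in.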


meh.. i wrote my first IRC bot when i was like 12. I don't think i'd get very far writing a DNS server as a 12 yr old


If you can write an IRC bot from scratch, you can do an authoritative DNS server, don’t sell yourself short.


IRC is more immediate and self gratifying. Very important things for a self-taught 12 yo. Not that dns isnt cool, just probably not the best recommendation for a young programmer.


How would one start with learning those concepts first? I've been wanting a good way to learn networking that I can code along to in my preferred language


> I've been wanting a good way to learn networking

1) Do you learn best by reading books, following tutorials, or watching videos?

2) What programming language do you know best / second best?

Happy to dig up a couple sources for you if you limit the scope a bit via the two questions above.


I tend to combine books and tutorials... the more in-depth the better.

I know Python best but am very quickly getting just as comfortable with Go. Would probably want to do this work in Go

Also maybe Rust cuz I want to get around it someday? For Rust I would need more “here’s the literal code”. Python or Go I could probably do it just from pseudo code or high level descriptions


Ok, I’ll try to send you some stuff this week. Python/Go/Rust is easily doable.


Thank you! much appreciated


Shameless plug for https://robustirc.net/, which is also written in Go, and solves netsplits (between servers, and also between client and servers when clients use the RobustIRC bridge or a compatible client) :)


Had to check the authors file[0] to realise you're one of the authors. Wow! I'd like to say a resounding thanks.

I've been looking at this system recently and I'm going to be including robust IRC access to my network (shameless plug[1]).

The code is clean, I think I have a good understanding of how it works and from my testing it does seem to work really well!

That said, have you looked at how much of your original problem space is solved by IRCv3?

[0]: https://github.com/robustirc/bridge/blob/master/AUTHORS#L10

[1]: ircs://irc.darkscience.net:6697/darkscience


Clarification: I'm one of the maintainers of Oragono [1], an IRCv3 server project, but I don't speak for IRCv3 as a whole. In fact, I think my take is somewhat controversial in the IRCv3 community. But here goes:

1. Most of the big problems with IRC as a protocol (nicknames, ghosting, missed messages) come from the assumption that the core of the network should be stateless. If the core of the network is stateless, you can't have nickname reservation, you can't transparently attach two clients to the same nickname, and you can't replay missed messages to people who were disconnected (hence the need for clients to maintain a TCP connection to the server at all times, hence netsplits being disruptive, etc.).

2. Conventional IRC setups solve some of these problems by adding a single privileged node, the "services framework" (Anope and Atheme are examples), that stores some persistent state (typically, nickname and channel reservations). This actually abandons the high-availability properties of the original IRC design: it is theoretically possible to design a highly available services framework (using a clustered database or whatever) but AFAICT none of the frameworks that currently exist are HA.

3. Oragono is a single instance that provides an integrated IRC server, services framework, and "bouncer" (history retention and playback). Client connectivity problems are solved by allowing transparent reattach to the original nickname after authentication with SASL, then automatically replaying history. (Client support for this is still patchy [2], but anything that supports znc.in/playback [3], like Textual, will work well; you can also configure Oragono to try and track what you missed and replay it on reconnection.)

4. To make Oragono highly available, it can be deployed in Kubernetes (virtualizing the embedded database file and the external IP, spinning up a replacement instance on failure).

[1] https://github.com/oragono/oragono

[2] https://github.com/ircv3/ircv3-specifications/pull/393

[3] https://wiki.znc.in/Playback


Thanks, I’m very glad to hear!

I had looked at IRCv3 briefly when starting the project, but it looked like IRCv3 didn’t solve netsplits.

Have I overlooked anything (or has something changed), or did you just ask out of curiosity?

Keep me posted regarding how RobustIRC is going for your network!


> ...but it looked like IRCv3 didn’t solve netsplits

Darned CAP theorem.


Another IRC server written in Go, probably less minimalist though: https://github.com/oragono/oragono

Been running it for quite a while now without troubles (but only few internal users, so not particularly challenging)


I'm one of the Oragono maintainers. Thanks for the kind words :-)

You're correct, we're not trying to be minimal at all, more "batteries-included". The largest deployments so far are in the low hundreds of users, with no performance or stability problems. Based on my stress testing I'd say an instance can scale comfortably to 10k clients, with at most 2k clients per channel.


Thanks for making Oragono! I've been meaning to upgrade my setup with some of the newer features, but honestly the binary I installed years ago just runs so it hasn't been a priority :)


A very nice thing with oragono is that you can configure it to replay the last messages when a user (re)connects.


I think that's now a standardised feature in IRCv3 compatible servers.


I get an invalid certificate, as the root ca is ca.cypherpunks.ru, which is not trustworthy.

You might want to look into that, because I like the idea. :)


I also like the idea of being my own CA instead of pretending a bunch of random companies are trustworthy.


What's the point? This effectively makes the whole encryption worthless, might as well serve your site in plain HTTP


The visitor cannot be sure that he really is connected to git.cypherpunks.ru. But doesn't encryption work anyway?


Encryption is active with a host you don't know. It's TOFU, which is OK if you can verify the identity through persistent uses (ie with SSH you'll have connections with the same server over a long period of time, or with IM you'll talk to those people over a long period of time). In the case of HTTP you'll only get the content now and potentially not visit the site anymore for a long time, especially if it's a personal site.

A few years ago self-signed certificates made some sense, today with Let's Encrypt there is absolutely no good reason to continue doing this on the open web
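
For anyone unfamiliar with TOFU, the idea can be sketched in a few lines of Python: remember a fingerprint of the certificate the first time a host is seen, and complain if it ever changes. The class `TofuStore` is hypothetical and keeps pins only in memory; the certificate bytes are assumed to be the DER encoding, which in Python can be obtained from `ssl.SSLSocket.getpeercert(binary_form=True)`.

```python
import hashlib
from typing import Dict


class TofuStore:
    def __init__(self) -> None:
        # host name -> hex SHA-256 fingerprint seen on first contact
        self.pins: Dict[str, str] = {}

    def check(self, host: str, cert_der: bytes) -> str:
        fingerprint = hashlib.sha256(cert_der).hexdigest()
        known = self.pins.get(host)
        if known is None:
            # First use: nothing to compare against, so trust and remember.
            self.pins[host] = fingerprint
            return "first-use"
        return "match" if known == fingerprint else "mismatch"
```

The weak spot is visible in the code: the first call for a host always succeeds, so the scheme only protects visitors who come back.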


We have been running this for several years for the IRC service in our closed community "cloud". It is unfortunately subject to many bugs, some of which affect its stability. It crashes regularly.


Any specific reason why you keep up with that instead of just using another one of the very minimal servers like https://ngircd.barton.de?


The BOFH is a Go fanboy.


Then why not oragono?


He may not be aware of it. I will forward the project name to him.


Are there any startups or communities out there that are still using IRC? If so, what are the reasons?

I wonder what are the advantages of that over newer technologies like Slack, Discord, or even XMPP.


Realistically it's hard to beat IRC for low barrier to entry. Most major networks have web clients now, you pick a handle, type a hashtag, and you are in. The signup process for Discord makes me roll my eyes every time I have to use it. Also, the textual nature of the IRC interface and lack of Fisher-Price widgets and visuals to entertain the short attention span millennial makes for a more pleasant conversational experience.


> If so, what are the reasons?

It's simple, not proprietary, just works, and there are clients to please everyone.


After IRC we switched to SILC for many years. It was quite nice and very IRCish (+ crypto), but after a long time without new development and maintenance, we went with matrix. Two years ago it was still a bit bumpy. A lot of work went into it and now it's mostly smooth sailing -- can recommend.


community leader here, we have 500 or so people on the network, I can give my reasons.

IRC vs Slack:

No brainer, Slack is a hosted application that can change its client on a whim. Bot functions are nice but moderation tools are severely limited and the client is very heavy.

IRC vs Discord:

Similar to Slack in nearly all regards, however it's better at being what slack was supposed to be I think. It definitely is "eating my lunch" compared to Slack. But the same rules apply, it's very heavy and you have no control of the client.

IRC vs XMPP:

XMPP is great. The clients for it are relatively good, and the protocol, while XML-y, is good, but the bot tools are not very good, and federating is hard given the XEP fiasco.

Additionally, most existing clients treat XMPP as an IM platform and do not focus on chatrooms.

Why IRC then:

Well, the truth of it is that if you're sitting on a wired, stable connection then IRC as a text based instant messaging chatroom system is really good. It has no "frills" like emoji responses or threading. No avatars, nor does it force your client to render anything in a specific way. You're free to use what you want.

If you want inline images, there's a client that does that.

If you want to have a persistent presence, searchable backlog then there's bouncers, or a managed service that does that[0].

If you want a terminal experience that is lean and stripped, or a big fat client that grabs gravatar emails from nickserv registered accounts to display them as avatars, it's possible, anything is possible. And that possibility also exudes the hacker culture.

IRC also does not require an account of any kind. This can lead to some measure of abuse, but thanks to the very good (though, admittedly, hard to use) moderation tools you have many different ways of managing the community, from shadow banning people, to muting channels, to invitation-only channels or passwords, all the way to channel or network level blocks on usernames/IPs.

Also: nostalgia and I've configured the client how I like it[1]

[0]: irccloud.com

[1]: https://xkcd.com/1782/


> [1]: https://xkcd.com/1782/

That made me LOL so loud. Thanks for your very thoughtful response. Appreciated.


Lots of things going on with Go. Do you people recommend using Go for back-end, despite its controversial design decisions?


Reach for golang when performance is the priority. Reach for a dynamic lang when speed of feature iteration is the priority.


Any situation where you would use Go, use Java.


Imagine all the bandwidth that could have been saved over the years if Jarkko had called PRIVMSG MSG instead.


an irc message usually fits in an MTU, i doubt you can just count 4 bytes for every message


Cool!

I wrote an IRC server in Python Twisted some years ago, can’t seem to find the code anymore but I love how simple IRC is. I hope the protocol fragmentation has improved somewhat with efforts like IRCv3.


Link appears broken, but hopefully this is it: https://github.com/ThomasHabets/goircd


Repo linked in OP (git.cypherpunks.ru) has commits from 2019. Last commit to the GitHub repo was 6 years ago.


Last commit: 6 years ago


I wrote an IRC client with one line in a shell script back in the 90s.



