Native Minecraft servers with GraalVM Native Image (github.com/hpi-swa)
190 points by fniephaus on Sept 2, 2022 | 89 comments


"As such, it is supposed to require fewer CPU and memory resources, provide better startup times, and be easier and cheaper to deploy."

So we don't even know if it actually makes things faster? Startup is a non-issue; CPU/memory is, but you need proof for that.

Graal does not support ZGC or Shenandoah, so it's hard to say whether the G1 version from Graal is up to speed.


Disclaimer: I work on the GraalVM team.

The students "measured noticeable reductions in terms of memory footprint of up to 43%" [1] in some preliminary experiments. More from the accompanying blog post:

"We also hope that the Minecraft community builds on our work and helps benchmark different configurations for native Minecraft servers in more detail and in larger settings."

Please feel free to share any numbers on CPU/memory usage with us!

[1] https://medium.com/graalvm/native-minecraft-servers-with-gra...


Note that memory usage _could_ potentially be significantly improved for the JVM just by using an alternative allocator, such as jemalloc. In our system we saw native memory usage decrease by about 60% in some instances, and it also resolved a slow "leak" we had seen, since glibc was allocating memory and not returning it to the OS. In our case it was because we were opening a lot of class loaders, and hence zip files, from different threads.
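
For anyone who wants to try it, a minimal sketch of swapping in jemalloc via LD_PRELOAD; the library path here is an assumption and varies by distribution:

    # point the dynamic loader at jemalloc before starting the JVM (path varies by distro)
    export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
    java -Xmx4G -jar server.jar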


I can second what you wrote about jemalloc. Some internal services at Amazon are using it with solid outcomes. I also recommend trying out the 5.3.0 release from earlier this year.


Last time I did benchmarking, the vast majority of memory allocations were strings that were typically dereferenced right away and cleaned up in the gen-1 GC. I had contemplated whether string pooling would be useful, but never got around to it. It would be interesting to see whether you could get reduced memory usage, and potentially better performance, by decreasing pressure on the GC during the gen-1 phase.

(Side note: this was when I was co-maintaining MCPC so was typically with mods installed and they heavily use NBT which I suspect is where a lot of that string allocation was happening.)


This is very interesting. Could you share more details on this particular issue in glibc? Jar files get memory-mapped, so I'm really interested in where glibc failed to release memory.


Not the OP, but we had a similar issue: our service was leaking when allocating native memory through JNI. We brought in jemalloc because it has better debugging capabilities, but the leak disappeared and performance improved. We never got around to root-causing the original leak.


It's probably the same thing prestodb encountered: https://github.com/prestodb/presto/issues/8993


For performance reasons, glibc may not return freed memory to the OS. You can increase the incentive for it to do so by reducing MALLOC_ARENA_MAX to 2. https://github.com/prestodb/presto/issues/8993
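
A minimal sketch of that tunable; the heap size and jar name are placeholders:

    # cap glibc at two malloc arenas for this JVM process
    export MALLOC_ARENA_MAX=2
    java -Xmx4G -jar server.jar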


I was under the impression that most builds of the JVM used jemalloc by default.


Why is this? I thought the JVM already did somewhat decent JIT compilation ...

If I understand the article correctly, you're preempting all possibly unoptimized/expensive code paths (reflection) by attempting to literally execute all of them? While it's a cool experiment, isn't it a bit error-prone (besides being a lot of effort of course, but playing Minecraft on the side does sound pretty fun!)?


The JVM is likely to beat AOT-compiled Java code in almost all cases, but because Graal makes a closed-world assumption (e.g. no unknown class can be loaded, so a non-final class with no visible subclasses can be treated as effectively final, enabling better optimizations; limited reflection means less metadata has to be stored per class; etc.), it does allow for a significant memory reduction. Also, escape analysis is easier to do offline.


can't that all be done speculatively with de-optimization /s


JIT compilation requires additional CPU and memory resources at run-time, which AOT compilation can avoid. This also means that for a native executable, the compilation work only needs to be done once at build-time and not per process.
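
As a rough sketch of what that looks like in practice (the jar and image names here are placeholders, and a real build like the one in the repo needs extra reflection/resource configuration):

    # compile the bytecode once at build time...
    native-image -jar minecraft-server.jar minecraft-server-native

    # ...then run the resulting binary with no JIT work per process
    ./minecraft-server-native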


This is the first time I've seen someone bring up extra CPU and memory usage as a downside of JIT. It might matter in the embedded world, but this is Java we're talking about, so the cost is minuscule compared to what you're getting for it.


You’re not wrong, but it is funny how we got here from Gosling’s Oak addressing set top boxes.

The thing was built to address the burgeoning embedded-with-a-little-horsepower market, with its variety of hardware and OSes.

Now it runs Enterprise server software… and Minecraft.


Well, it does make sense: a controlled runtime failure is much better than a segfault or, worse, a silent failure corrupting the heap. Pair that with decent performance even back then, increased developer productivity, and some of the best observability tools, which is again helped by the VM semantics.


Those costs are usually pretty trivial, as the JVM hands them out judiciously based on hot code paths.

There are certainly pathological cases where it could cause major issues.

AOT suffers from not having runtime information, so anything involving dynamic dispatch (which is REALLY heavily used in java) will be a lot harder to optimize. JITs get to cheat because they know that the `void foo(Collection bar)` method is always or usually called with an `ArrayList`. PGO is the AOT world's answer to this problem, but it generally explodes build times and requires real world usage.
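
For reference, GraalVM's PGO flow (an Enterprise-only feature, as far as I know) looks roughly like this; jar and image names are placeholders:

    # 1. build an instrumented image and run it against a representative workload
    native-image --pgo-instrument -jar app.jar app-instrumented
    ./app-instrumented    # writes a default.iprof profile on exit

    # 2. rebuild the image using the collected profile
    native-image --pgo=default.iprof -jar app.jar app-optimized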

In java land, there's also the option of "AppCDS" which can cut down a large portion of that compilation time between processes.
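
A minimal AppCDS sketch using dynamic class-data sharing (JDK 13+); app.jar is a placeholder:

    # the first run records the loaded classes into an archive on exit
    java -XX:ArchiveClassesAtExit=app.jsa -jar app.jar

    # later runs map the archive instead of re-parsing and verifying those classes
    java -XX:SharedArchiveFile=app.jsa -jar app.jar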


GraalVM does have a better optimizer than C2 in the vanilla JDK under certain conditions, which can lead to better performance. Basically, the only way to know whether GraalVM will give you better performance is to try it and/or benchmark your code.

https://www.graalvm.org/22.2/examples/java-performance-examp...


Is there any benefit to simply running/JITing the client and server on GraalVM instead of the stock JVM?


It's much worse than this, because the free version of GraalVM only supports the serial garbage collector. Minecraft servers and clients should be using ZGC to get rid of garbage-collection pauses.


startup time is SUPER important

it lets you spawn new game instances on the fly and reduces time spent loading chunks and game data

you save a lot of money when you scale, and you improve latency; people don't complain about huge loading times and stutters on fresh servers

ask anybody working in the industry

fun fact, that's the first thing Riot did when they acquired Hytale

They rewrote their C# client to C++ for portability

And they rewrote their Java server to C++ for performance (and cost saving)

Tech Change: https://hytale.com/news/2022/7/summer-2022-development-updat...


Are they using Graal Enterprise? Last I checked, the Community Edition of Native Image uses the serial collector, not G1.


The students used both the Community and Enterprise editions of GraalVM. Indeed, G1 is an Enterprise feature: https://www.graalvm.org/22.2/reference-manual/native-image/o...
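
For reference, the collector is selected at image build time, something along these lines (assuming an Enterprise edition for G1; the Community edition defaults to the serial collector, and the jar/image names are placeholders):

    native-image --gc=G1 -jar minecraft-server.jar minecraft-server-native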


> Startup is a non-issue

Yes, it is. Writing any short-term job in Java -- one that runs for a few seconds and goes away, like a Lambda or k8s job -- is pointless for exactly this reason: the startup time is longer than the run time.


A Minecraft server is not a short term job.


I guess it can be in a specific case: minigames servers (such as Hypixel), which are just a bunch of servers "connected" together. Players start into a "lobby" server, where they can choose a minigame, and are then sent to another server where they spend a few minutes.


The game servers don't restart after the end of a round, though, do they? I'd imagine they kick the players back to the lobby, reset the in-server game, and then tell the lobby to send the next batch of players.


You assume that load is constant; it isn't. And load varies not only with the number of players on a minigames server, but also with changes in how players are distributed between minigames.


There's usually more than one server per minigame. You could see it in the URL you were redirected to; they had more servers running for the more popular minigames. Each minigame has a player limit, so the maximum load on any given minigame server is known (within the bounds of the Minecraft subset that makes up that minigame; the minigames are usually deliberately limited/bounded in how much computation they need, as opposed to vanilla Minecraft). Extra players get sent to the next available server. If there's consistent overflow, at that point you might turn on a whole new server, or change a server's game mode (I don't know to what degree Hypixel actually did/does this, or how often it's actually necessary).


This is almost certainly how it used to be done. Besides, you can have the server idle for a good bit before actually letting players in.


The JVM can start up in less than 0.1 seconds. Depending on the number of classes being loaded, it's not an issue even for Lambda and k8s jobs.
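
A quick sanity check on your own machine (this only measures bare JVM startup and teardown, not class loading for a real application):

    time java -version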


The VM starts up plenty fast. The slow part is when people use reflective dependency injection containers that take seconds to scan the classpath before executing anything.


This is why frameworks like Micronaut exist.


Micronaut, Quarkus, Avaje Inject, CDI Lite. Plenty of solutions if people would stop reaching for Spring.


You clearly haven't deployed enough classes on Lambda to see more than 10 seconds of warmup on a trivial Java-based Lambda function.


Warm-up is more a function of invocation count than of time, which is what you seem to be suggesting here.


This is a Minecraft server, so it's going to be running 24/7.


I see you aren't familiar with the modern state of Minecraft servers. Because Minecraft is essentially limited to one core, big servers actually aren't a single instance. They use proxy servers (such as BungeeCord and its forks) which distribute load between several lobby servers, and from there people join one of the custom game modes (Skyblock, Bedwars, etc.). This allows tens of thousands of people to play simultaneously, though not in a single world, while SMP (Survival Multiplayer) servers can handle a couple hundred at most. These giant servers are heavily containerized and automatically scale under load, so spinning up and shutting down servers is a pretty normal thing. And there have been some attempts to make Minecraft run a single world on multiple instances (MultiPaper and some private ones), so even for a usual SMP server this could become commonplace soon, as players join and leave.


> And there have been some attempts to make Minecraft run a single world on multiple instances (MultiPaper and some private ones)

This is the first time I've heard of MultiPaper; yet another idea I had that I didn't know someone was already working on, lol. It's a pretty promising idea considering the game's current performance problems. This could allow thousands of players on the same server, which would be AMAZING, almost a completely different game. Imagine if MultiPaper were compiled to native using GraalVM.


I encourage my competitors to keep thinking this.


I don't think this repo provides much value. It compiles only the vanilla server and doesn't provide any benchmarks, while spending a whole paragraph promoting GraalVM Enterprise and the Oracle Cloud Free Tier (the single worst cloud experience I've ever had; it took me two dozen attempts to register before I finally gave up).


Shout out for Cuberite as an alternative Minecraft server project that desperately needs more volunteers

https://github.com/cuberite/cuberite

"Cuberite is a Minecraft-compatible multiplayer game server that is written in C++ and designed to be efficient with memory and CPU"

Cuberite has been demoed running on old ARM Android phones, hosting multiple players at once. Its performance absolutely annihilates the Java-based 'vanilla' server.


In a similar vein, there is also a Rust-based Minecraft server implementation:

https://github.com/feather-rs/feather


Can the same trick be used with the Java client? My son runs Minetest on the Raspberry Pi 400 as Minecraft is too slow. I'd do anything for a bit more FPS.


There are mods which heavily optimize the Java client's performance; these would have a greater effect than ahead-of-time compilation alone. For instance, the modpack at https://github.com/Fabulously-Optimized/fabulously-optimized packages several of these performance mods together (see https://github.com/Fabulously-Optimized/fabulously-optimized... for the list).


Check out the Sodium mod [1], if you haven't already. I've had great success eking out a few more precious frames with it on older hardware. IIRC, it works on both x86 and ARM processors.

[1] https://github.com/CaffeineMC/sodium-fabric


> I'd do anything for a bit more FPS.

Native compilation usually makes things a little slower, not faster. Using the closed-source Enterprise version with PGO gets it back to around the same speed as the JIT-compiled version, I believe.


Minecraft Bedrock Edition runs better. It has feature parity but is not compatible with the Java server, and requires a separate purchase IIRC.


> Minecraft Bedrock Edition runs better. It has feature parity but is not compatible with the Java server, and requires a separate purchase IIRC.

That's no longer the case. If you have one, you can "purchase" the other for free. See https://www.minecraft.net/en-us/article/java---bedrock-editi... and https://help.minecraft.net/hc/en-us/articles/6657208607501 for details.

Also, there are mods for the Java server which allow both Java and Bedrock clients to connect to the same server and play together. I don't know the details, but I have played in a server which used these mods.


> Also, there are mods for the Java server which allow both Java and Bedrock clients to connect to the same server and play together.

This is correct. I am running a vanilla SMP for my son, and he plays primarily on the Switch. I use a Java server running Fabric and Geyser/Floodgate in order to allow his Switch to connect to the server. Everything runs smoothly so far.


> Also, there are mods for the Java server which allow both Java and Bedrock clients to connect to the same server and play together.

How exactly does that work? Afaik there are quite a few behavioral differences between the two, especially for technical things like redstone and pistons.


Most of these behavioral differences are in the server. So what happens is that it behaves as if you were playing the Java Edition, even when using a Bedrock client.


That sounds like the best of both worlds. Features of Java but performance of Bedrock.


This is misleading. Vanilla Bedrock Edition allows for a bigger render distance but has a much smaller simulation distance. There's a whole myriad of differences; they're not at all at feature parity.


Hit Shift+F3 to see a frame-time breakdown; then you can determine whether it's the graphics or the CPU that's slow. If it's the CPU, maybe Graal helps, but it's hard to tell upfront. Also check out mods dedicated to improving performance, like Sodium.


It's always the CPU with Minecraft. One thread can't do much more.


I don't think the student looked into that at all, but I guess it depends on what the Java client uses for drawing. GraalVM Native Image currently doesn't support AWT on Linux/JDK17+, but we are working on fixing that soon.


> it depends on what the Java client uses for drawing

AFAIK, the Java client uses LWJGL, which is a native library.


Thanks for the info! Seems like it's worth trying to compile the Java client with GraalVM Native Image then, given that this exists: https://github.com/chirontt/lwjgl3-helloworld-native


Apparently, someone has managed to compile the Minecraft client to native: https://medium.com/@kb1000/what-youve-done-with-the-server-i...


I've always had some questions about GraalVM, so I'd like to hijack this thread; forgive the off-topic comment, please!

I've got a number of Spring web applications from which I create an uberjar (a jar file with all dependencies) and run them on a CentOS server using something like java -jar server.jar (it's a little more complex than this, but you get the idea).

Would I be able to use graalvm to create native binaries from these jars? Is there some kind of tutorial describing the procedure?

Is this possible without a license/paying big money?

Finally, is this worth it? Will the apps become any faster?


Spring Boot 3 is expected to support native/Graal. There is a milestone release, I think.

There is a GraalVM Community Edition, which is free. Search for Graal and the Spring Pet Clinic demo; you will likely find an article about reducing startup time 100x (starting Pet Clinic in ~15ms) and reducing memory 2-3x.

I don't know about 'faster', but in my experience most Spring applications are RAM-bound, not CPU-bound. So the native binaries can result in scaling back to smaller and cheaper cloud instances, or smaller VMs. Imagine halving your monthly cloud instance bill, if you are looking for 'worth it'.

If you want to play with a framework where the native part works pretty okay, and still be able to use your dependency injection and dependencies, have a look at Quarkus. They even have some Spring 'polyfills'.


Startup time is the major problem I have with Spring; it can be ~1 minute in some apps. I'll definitely check out Quarkus, thank you!


I think right now this isn't possible with "normal" Spring, because Spring and various other libraries you'll normally use make heavy use of reflection.

Frameworks like Quarkus and Micronaut have been written with native in mind and I think Spring is also working on it (Spring Native).


Thank you for the suggestions I'll take a peek at them!


You would likely not be able to turn them into native binaries without a ton of work: Spring uses reflection very heavily, so you would have to list every class that gets reflectively accessed (including Spring internals).

There is Spring Native, which will solve it for the most part, but I'm not sure how hard it is to migrate an existing Spring web app to it.

GraalVM has a Community Edition, which is free; I'm not sure about the exact license terms.

And it is likely not worth it: peak performance will likely be worse, but memory usage and startup time will decrease. It can be worth it for command-line apps or some tiny microservice that is mostly idle.
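
For what it's worth, the tracing agent can generate most of that reflection/resource configuration by watching a normal run on the JVM; the jar name and output directory here are just conventional placeholders:

    # exercise the app's main code paths while the agent records reflective accesses
    java -agentlib:native-image-agent=config-output-dir=src/main/resources/META-INF/native-image -jar server.jar

    # the generated reflect-config.json, resource-config.json, etc. are picked up by native-image on the next build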


Thank you for the information! Doing some research myself, I found some things about the license and Spring integration here: https://www.graalvm.org/faq/ ; it seems that no license is needed for GraalVM!

I'll also take a peek at Spring Native; it seems to be available in beta: https://github.com/spring-projects-experimental/spring-nativ...


Is GraalVM a silver bullet? Ignoring startup times, will GraalVM outperform a classic JVM (IBM/Oracle, etc.)? I guess the optimizations of the classic JVM are hard to beat. Also, cross-compilation does not work with GraalVM (which makes it harder to deploy than a good old jar file).


Startup times (especially for 'on demand' cloud workloads) are kind of the point of GraalVM. Effectively, it shifts optimization to the compile phase. GraalVM builds take much more time than classic Java, but they run a bit faster (on some workloads dramatically) and use less memory. It's no silver bullet for development; if you want fast turnaround after changing your code, you want the classic JVM. GraalVM can help cut your production load a bit (although Oracle seems to keep the heavier performance gains behind the licensed GraalVM Enterprise edition).


> But they run a bit faster (on some workloads dramatically)

That's not true. For the majority of applications, the JIT compiler will be much faster (either Graal's JIT compiler or HotSpot's). The startup time and memory reductions do hold for AOT, though.


I haven't noticed compile times being any worse when using GraalVM to build Java projects.

Caveat: I also haven't been using Native Image yet, though, so I can't comment on whether it'll be dramatically different for that build target.


GraalVM is multiple projects and I feel there is often a bit of a mix-up around these:

GraalVM is first and foremost a JIT compiler written in Java that can be plugged into OpenJDK. Because it is written in a higher-level language than the original HotSpot compilers (which are written in C++), it is easier to write/maintain/experiment with. This mode of operation is used extensively by Twitter, for example, because on their workloads it provides better performance than HotSpot, but the two trade blows in general. But this uses the standard javac compiler, so it is basically just a slightly different JVM implementation.
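
As a sketch, on JDK builds that bundle the Graal compiler, that JIT mode can be switched on via the experimental JVMCI flags (newer OpenJDK releases no longer ship it, so this may not apply to your JDK):

    java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler -jar server.jar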

Since a JIT compiler outputs machine code, it can be "easily" modified to do so in an offline setting as well; this is Graal's AOT/native compilation mode. Compilation takes a long time compared to some other compilers (I don't exactly know why, probably Java's dynamic nature requiring more wide-reaching analysis?), but the result has lower memory usage and faster startup compared to the traditional execution mode (though rarely better performance).

There is also Truffle, which turns "naive" language interpreters into efficient JIT-compiled runtimes and allows polyglot execution; that's a whole other dimension.


Wow, yes this definitely was not clear to me as a (longtime) user of GraalVM.

Thanks a lot @kaba0; big-O would be smart to put your comment in the GraalVM site FAQ under "What is GraalVM".

Cheers.

EDIT: One request for a small clarification

> But this uses the standard javac compiler so it is basically just a slightly different JVM implementation.

What is "this"? Are you referring to TFA?


I use GraalVM as my standard non-native JDK (OpenJDK replacement) and I'd say the performance is somewhat better.

There are a lot of unbiased benchmarks you can find online, most of them showing that Graal (both CE and EE, though particularly EE) is more performant than OpenJDK.

You then also get, baked in, the option to compile to native or to embed/run code in other languages.

It's a no-lose scenario IMO.


Are there no downsides?


It usually needs a somewhat longer warmup period, in my experience. But for long-running processes it can be ideal; Twitter, for example, has used it in production for quite some time.

Also, not every GC is available, or some are only available in the Enterprise version.


Anecdotally, I found that recent releases of OpenJDK with HotSpot were a bit faster, both on my machine and for web services. If you don't need Native Image or Truffle, the huge installation size isn't really justified.

There are multiple benchmarks that show marginal gains using GraalVM CE for big data workloads; it might make sense if you're still stuck on Java 8 or 11. The enterprise edition shows more significant gains.


Especially as, the last time I checked, more "modern" garbage collectors (e.g. G1) are only available in the enterprise edition.

The community version has only the serial collector (or none at all), which is OK for small heaps or short-lived processes.


On some microbenchmarks Graal beats C2; on others it doesn't. It's not a silver bullet.

GraalVM is regular OpenJDK with the compiler switched out, AFAIK.


> GraalVM is regular OpenJDK with the compiler switched out, AFAIK.

Do you have a source for this? Or how do you know?


Just look at the 22.2 release notes [1]:

> Updated the OpenJDK release on which GraalVM Community Edition is built ...

and

> Updated the Oracle JDK release on which GraalVM Enterprise Edition is built ...

[1] https://www.graalvm.org/release-notes/22_2/


Got it, thank you @fniephaus. Really appreciate the info, and please keep up the fantastic work!


The whole advantage of GraalVM Native Image is startup time, which is important for containers, Lambda jobs, etc., because it doesn't have to compile bytecode on startup. It isn't supposed to be faster than the regular JVM, which has the advantage of being able to analyze and recompile hot spots.


> The native executable sometimes fails on startup. Restarting it a few times usually helps.

How would this be possible for a static native executable?


It fails for some reason when reading user data from disk. The error also goes away if you nuke the user data, but that's less convenient.


No need to go for the client; it's working fine on my machine: nearly 60 fps on a 12-core CPU with 32 GB of RAM and an RTX 2080 Ti, running Iris, Sodium, Phosphor, and Lithium.


Nearly 60, lol. Also, fps is not the real problem for the client. I had a modpack crash with OOM errors even with 16 GB assigned. Forge is awesome, but modding the hell out of MC requires extreme specs.


I upgraded my computer to 32 GB RAM just to play Minecraft.



