
> there's virtually no reason you would be exposed to its language implementation choices

Um, doesn't it still have a client that starts up a persistent server and you then have to wait for that server to start up (which, being Java, takes forever) and then deal with those server processes hanging around?

That sounds like being exposed to its language implementation choices.

As opposed to, say, having fast start-up times and not using a server design like basically every other build tool.

But my information might be out of date. That's what always soured me on it previously, though.



Bazel doesn't start a server for Java JIT reasons.

It starts a server for (1) concurrency control (2) management of worker processes some languages use (3) caching the build graph (recall that Bazel works with very large code bases).

These reasons are independent of implementation in C++, Go, Rust, Java AOT, etc.

(And yes it doesn't have to use a persistent process to solve these problems. That is the solution it chooses.)

> not using a server design like basically every other build tool

Buck, Pants, sbt, Gradle


> (And yes it doesn't have to use a persistent process to solve these problems. That is the solution it chooses.)

How much did the fact that Java is very slow starting up influence the decision to use a persistent process instead of some other solution?


Java starts up in ~1s, and something like Graal could make this faster. But I find this Java criticism more a symptom of Java-derangement syndrome, because python and node build systems have even worse startup times and no one says anything.


> python and node build systems have even worse startup times and no one says anything.

Noop build with Waf (build system written in Python) takes 0.13s on my system. Waf reports that the build actually takes 0.04s, so I guess 0.09s is Python's start-up time (and some other overhead).


Java Hello World takes 0.132s on my MacBook Pro. If I turn on -Xshare:on to use class data sharing, it drops to 0.119s. Ergo, Java startup time is a non-factor. Graal could make this even quicker: for example, a GraalVM AOT hello world can be reduced to 0.008s startup; see https://github.com/graalvm/graalvm-demos/tree/master/java-ko... for an example.

"time bazel" returns 0.098s

Running a null build took 0.84s, but Bazel does significantly more work, as it's working on a big monorepo, as both a build tool and a package manager.

In short, it is not a problem. But I've had npm take many many seconds for simple operations. "npm list" takes 2.9 seconds.

Python can be slow, depending on the tool. Waf sounds like it is simple and fast, but there are lots of other examples of slow python frameworks out there.


> "time bazel" returns 0.098s

For me it returns 1.6s on first run, .9s on second (bazel 0.29.1).


    $ time (echo 'quit()' | python)

    real    0m0.025s
    user    0m0.000s
    sys     0m0.031s

    $ time (echo 'quit()' | python3)

    real    0m0.036s
    user    0m0.031s
    sys     0m0.016s

(Both timings are from runs after the interpreter was already cached.)


Huh; I see < 0.15s.


> so I guess 0.09s is Python's start-up time (and some other overhead)

That is probably mostly Waf startup time. Python itself starts way faster than that (on the order of 10-20ms on my machine).


Startup time in any interpreted language very quickly becomes simply proportional to the amount of code loaded.

   time python -c ''                               # 0.024s
   time python -c 'import argparse'                # 0.036s
   time python -c 'import argparse, json'          # 0.040s
   time python -c 'import argparse, json, httplib' # 0.065s
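
For anyone who wants to reproduce this on their own machine, here is a rough sketch of the same measurement as a script. It is not how the numbers above were taken; the helper name `startup_seconds` and the best-of-5 choice are my own assumptions, and `http.client` stands in for `httplib`, which is the Python 2 name of that module.

```python
import subprocess
import sys
import time


def startup_seconds(snippet, runs=5):
    """Return the best-of-`runs` wall time for `python -c snippet`."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Launch a fresh interpreter so import costs are paid every time.
        subprocess.run([sys.executable, "-c", snippet], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    snippets = (
        "",
        "import argparse",
        "import argparse, json",
        "import argparse, json, http.client",
    )
    for snippet in snippets:
        print(f"{startup_seconds(snippet):.3f}s  python -c {snippet!r}")
```

The timings include process spawn overhead from `subprocess`, so treat the differences between rows as the interesting part rather than the absolute values.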


Indeed, but it isn't a cool toy among this crowd.


My guess is that startup time is not the issue, but loading a large data structure (its cache) from disk can be.

Especially if you can't or don't want to use memory mapping because it's hard to do well.

But more importantly, I think that the server can monitor file changes ahead of time with things like inotify, saving the time of stat()-ing files when the user wants to perform a build.
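
To make the stat() cost concrete, here is a minimal sketch of what a build tool without a watcher has to do on every invocation: walk the tree and stat every file, then compare against the previous snapshot. This is an illustration only, not Bazel's actual change detection; `snapshot` and `changed_paths` are hypothetical names.

```python
import os


def snapshot(root):
    """Map each file path under root to (mtime_ns, size) by stat()-ing it."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # the file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def changed_paths(before, after):
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

The scan costs one stat() per file no matter how few files changed, which is the work a long-lived process with file-change notifications could have done before the user asks for a build.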


Change monitoring with inotify is going to hit limits quickly.

I recently wanted to take a look at VSCode and it immediately shat itself over the maximum number of inotify watches being too low (the kernel I'm running restricts this to 8k for non-root users).

I then bumped the limit to the max (512k, I think, or about 7 copies of linux.git)…still no luck. Now imagine you have a Google-scale number of files.


> Especially if you can't or don't want to use memory mapping because it's hard to do well.

Why do new languages never address this issue?


In-kernel block caching of files is pretty crude - since the kernel has no knowledge of your data structures or which bits of your files you're going to be reading next, you end up having hundreds of page faults requiring single-sector disk reads for most workloads.


That’s often the case if your file format evolves without taking a mmap use case into account. But for a format that’s designed with mmapping in mind, it is often ridiculously faster and more effective.

But most languages and environments don’t let you do that easily.
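
As a small illustration of "designed with mmapping in mind", here is a sketch of a sorted table of fixed-width records that can be searched in place, with no parsing or loading step. The record layout (two little-endian 64-bit integers) and the names `write_table` and `lookup` are assumptions made up for this example, and it assumes a non-empty file, since an empty file cannot be mapped.

```python
import mmap
import struct

# Hypothetical record layout: (key, value) as two unsigned 64-bit integers.
RECORD = struct.Struct("<QQ")


def write_table(path, pairs):
    """Write records sorted by key so any record's offset is computable."""
    with open(path, "wb") as f:
        for key, value in sorted(pairs):
            f.write(RECORD.pack(key, value))


def lookup(path, key):
    """Binary-search the mapped file; only the pages touched get read."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            lo, hi = 0, len(m) // RECORD.size
            while lo < hi:
                mid = (lo + hi) // 2
                k, v = RECORD.unpack_from(m, mid * RECORD.size)
                if k == key:
                    return v
                if k < key:
                    lo = mid + 1
                else:
                    hi = mid
    return None
```

A lookup touches roughly log2(n) records instead of deserializing the whole file, which is where the speedup over a read-and-parse format would come from.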


Aren't the hot compilers kept around under this mechanism to speed up the next incremental build round? I think it's not just caching of intermediate build output but also not letting the tools start from a cold start.


Yes, and I included that in my point. Certain compilers (Java, Swift, TypeScript) are faster with long-running worker processes.

To be fair, that doesn't require a long-running Bazel process to manage them, but that does become a natural choice.


IDK.

I do know that Go wasn't an option :)

(Because it didn't exist)


JVM startup is sub-second. Still not great for e.g. small command line utilities, but completely fine otherwise.


JVM startup for `java -version` is around 150ms on my machine:

    $ time java -version
    openjdk version "11.0.1" 2018-10-16
    OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
    OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)

    real 0m0.159s
    user 0m0.116s
    sys 0m0.052s

This is how long it takes for a program like git to do actual work. Once you start running an actual program like Maven, it blows up to 500ms for `mvn -version`:

    $ time  mvn -version > /dev/null

    real 0m0.570s
    user 0m0.431s
    sys 0m0.145s

Bazel takes 150ms to do nothing:

    $ time bazel --version > /dev/null

    real 0m0.144s
    user 0m0.028s
    sys 0m0.063s

As a baseline, make takes under 30ms to do nothing:

    $ time make -version > /dev/null

    real 0m0.025s
    user 0m0.002s
    sys 0m0.016s


Seems similar on my system. Some more comparisons:

Waf written in Python:

    $ time ./waf --version >/dev/null

    real 0m0,068s
    user 0m0,056s
    sys 0m0,012s

Ninja written in C++:

    $ time ninja --version >/dev/null

    real 0m0,001s
    user 0m0,001s
    sys 0m0,000s


Java 13 takes 97ms on Windows 10, and I haven't bothered to produce a Java native image before attempting it.

    PS C:\Workdir> Measure-Command { java -version }
    openjdk version "13" 2019-09-17
    OpenJDK Runtime Environment (build 13+33)
    OpenJDK 64-Bit Server VM (build 13+33, mixed mode, sharing)


    Days              : 0
    Hours             : 0
    Minutes           : 0
    Seconds           : 0
    Milliseconds      : 97
    Ticks             : 975572
    TotalDays         : 1.12913425925926E-06
    TotalHours        : 2.70992222222222E-05
    TotalMinutes      : 0.00162595333333333
    TotalSeconds      : 0.0975572
    TotalMilliseconds : 97.5572

About 60 ms more than make.

Hurray Make still wins, now what are we going to do with those 60ms... /s


FWIW, I don't think "java -version" is a good benchmark of java startup time - why should it start up the JVM just to print the version?


I think that measuring anything less than 1s for a CLI is a silly school-playground measuring competition.


The original comment was "JVM startup is sub-second. Still not great for e.g. small command line utilities, but completely fine otherwise."

For tooling it matters. Some scripts will have backticks and $() and each use of these could be up to a second and be acceptable? Not really. And if you want to run a command like `go fmt` every time you save a file in $EDITOR then you want it to be fast. Maybe when we use the Language Server Protocol everywhere, it will be a different world. But a lot of editors shell out for these features still - and you would never dream of shelling out to e.g. `mvn dependency:tree` because it's too slow.


Unless I am missing something, 97ms looks pretty much sub-second to me.


With Graal AOT you can make command line utilities that start on the order of 0.008s

https://github.com/graalvm/graalvm-demos/tree/master/java-ko...


> make takes under 30ms to do nothing:

But in large code bases, make can take several seconds to do nothing, depending on the number of `$(shell find)`.

Whereas Bazel's time remains constant. Could it have been constantly 30ms instead of 150ms?

Sure. But that's nowhere near to being a problem for the usual code bases Bazel operates on.


    $ time ./java -version
    openjdk version "14-ea" 2020-03-17
    OpenJDK Runtime Environment (build 14-ea+19-824)
    OpenJDK 64-Bit Server VM (build 14-ea+19-824, mixed mode, sharing)

    real 0m0.074s
    user 0m0.063s
    sys 0m0.025s


Does the initial startup time matter?

I think the server is kept around for a number of reasons that have nothing to do with Java startup time. Probably the biggest one is file-watching, using inotify, FSEvents, etc.

https://stackoverflow.com/questions/57982028/how-does-bazel-...


The server shuts itself down after some idle time, I’ve never had to actively manage any instances of it.



