Well, this just highlights why Emacs needs to grow a namespace feature, because things have changed, and a lot of the Emacs packages being developed today are small, modular, and decoupled (even if the Emacs core isn't and maybe never will be).
Is there a real fundamental difference between the two approaches? It seems to me that it's all about API, not underlying concepts. A filesystem may organize things hierarchically, such that the full path to a file is composed of the names of the various folders that contain it plus the name of the file itself, glued together with a path separator. Or a filesystem may organize things as a pure key-value store, where the file's name is the full path. Code that uses the filesystem could then treat it either way. You can address files purely by full path on a hierarchical system, and you can pick a "path separator" and treat it like there were directories on a pure key-value store.
Applied to code, it's really a matter of how you look up names. Prefixing is just the situation where there's no language assistance in looking up names. Namespacing is just prefixes plus some automatic help in looking up partial names.
> Is there a real fundamental difference between the two approaches?
Yes. With true directories, you can implement directory rename() with O(a+b) time complexity, where a and b are the depths of the paths to the directories. With a flat key/value store, directory rename() requires at least O(n) time for n descendants, since each descendant's key must be rewritten.
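To make that concrete, here is a rough sketch (in Common Lisp, with an EQUAL-keyed hash table standing in for the flat store; the names are mine, not any real filesystem's):

    ;; Rename "directory" OLD-PREFIX to NEW-PREFIX on a flat key/value store.
    ;; Every key under the old prefix must be rewritten, so the cost grows
    ;; with the number of descendants, not with the depth of the path.
    (defun flat-rename (store old-prefix new-prefix)
      (let ((moves '()))
        (maphash (lambda (key value)
                   (when (and (>= (length key) (length old-prefix))
                              (string= old-prefix key
                                       :end2 (length old-prefix)))
                     (push (cons key value) moves)))
                 store)
        (loop for (key . value) in moves
              do (remhash key store)
                 (setf (gethash (concatenate 'string new-prefix
                                             (subseq key (length old-prefix)))
                                store)
                       value))
        store))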
I'd be curious to know how to make it more efficient. Either the descendants of a directory include the directory's name in their identifier, or they don't. If they do, then to implement rename() you need to do something such that all descendants of the directory can be queried by a different identifier than the one they were given upon creation (where the difference is exactly the name change of an ancestor directory). You can:
* rename all descendants on directory rename (expensive--takes O(n) time for n descendants).
* log each rename() and translate a key with a new directory name into a key with the original directory name (requires O(k) space for k renames and O(d) time in the worst case for translating a path of depth d whose directories have all been renamed; introduces the need to garbage-collect log records when all descendants of a renamed directory get deleted).
Granted, this is more nuanced than what I said before about rename() requiring O(n) time when implemented on top of a key/value store, but the second option is still more costly than rename() with true directories. I suspect (though I cannot prove it at this time) that we cannot do better than the second option above, but I would love to hear your thoughts.
I'm not debating that it's inefficient if you use the same data structure that is used now. That's why I suggested using a different one.
Instead of a flat list of file names for each directory, you store the file names in a trie, so that files with a shared prefix in their name have a shared ancestor node in the trie. To do a directory rename, you just have to find the node in the trie that corresponds to the directory, move that node to a different location in the trie and change the portion of the name that is stored in that node.
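A rough sketch of that idea in Common Lisp (the structure and names are mine, and I'm assuming the renamed prefix falls exactly on a node boundary and that edge labels are non-empty; splitting a node mid-edge takes a bit more code):

    ;; Each node stores the chunk of the key that leads to it (LABEL) and a
    ;; table of children keyed by the first character of each child's label.
    (defstruct tnode
      (label "" :type string)
      (children (make-hash-table :test #'eql))
      (value nil))

    (defun find-child (node label)
      "Return the child of NODE reached by the edge labelled LABEL, if any."
      (let ((child (gethash (char label 0) (tnode-children node))))
        (when (and child (string= (tnode-label child) label))
          child)))

    (defun rename-edge (parent old-label new-label)
      "Relabel the edge OLD-LABEL under PARENT as NEW-LABEL."
      ;; Every key starting with that prefix is implicitly renamed; no
      ;; descendant node is touched, so the work here is O(1) on top of
      ;; the O(depth) walk needed to reach PARENT.
      (let ((child (find-child parent old-label)))
        (unless child (error "no such prefix: ~a" old-label))
        (remhash (char old-label 0) (tnode-children parent))
        (setf (tnode-label child) new-label
              (gethash (char new-label 0) (tnode-children parent)) child)
        child))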
> Instead of a flat list of file names for each directory, you store the file names in a trie, so that files with a shared prefix in their name have a shared ancestor node in the trie.
But then, if they behave like directories, and the software treats them like directories, aren't you effectively implementing directories in everything but name? (I mean, of course current software doesn't actually treat directory names this way; but it seems like changing the data structure is going a long way to maintain a directory-less fiction.)
No, because you could also rename any shared prefix of file names, not just the parts that end with a slash. Instead of directories having a special significance, now any arbitrary prefix in a file name has the same significance.
I'm not seeing how this is any different from true directories in terms of time and space complexity, as far as rename and path resolution are concerned. Sure, the trie nodes don't have to be implemented as key/value pairs in the data store, but don't they otherwise serve the same purpose in the affected algorithms--pointing to key/value pairs or other directories? Doesn't this still mean that path resolution is O(d) for a path of depth d in the trie, and rename is O(a+b) for two paths of depth a and b?
The time and space complexity aren't different. That's the whole point I have been arguing: you can support paths-as-arbitrary-strings semantics without losing efficiency for a directory rename operation.
> No, because you could also rename any shared prefix of file names, not just the parts that end with a slash.
I think that
$ mkdir -p a/bc; touch a/bc/{d,e}
$ cd a; mv bc Bc
(changing just the 'b', not the whole 'bc') amounts to the same behaviour for directories. I guess you could argue that it's different because you still have to mention the `c`, even if you don't change it.
What if instead of just one file starting with b, you have multiple files starting with b?
If there are files with names bc, bd and be, `mv b B` now renames all of those to Bc, Bd and Be. With normal directories, there is no way to do that as one operation.
There"s no efficiency requirement on atomicity. Who said we need POSIX filesystem semantics anyway? Is it so important we should trade scalability? Obviously that wasn"t the s3 design choice.
Filesystem users (developers) already assume that renames are fast. Changing that now would mean that some applications would appear to block or could behave strangely.
S3 users already know and should handle that.
S3, as object storage, doesn't have a hierarchical structure the way most filesystems do, and it doesn't support all the operations expected of a POSIX filesystem (e.g., no appending writes; renaming is implemented as copy-then-delete; ...).
S3 works closer to a key/value store than to a file system.
To be honest, my comment was based on OpenStack Object Storage and I can't say if S3 works exactly like that, but since OpenStack is more or less a clone, I suspect it does.
Heh, and it doesn't change the syntax, either, except that `ls a/*/b/*` has the same effect as a bash-style `ls a/**/b/**/*`. Maybe free recursive globs is a plus!
The idea of packages has both advantages and disadvantages:
Advantages:
* the package prefix can be omitted if the symbol is accessible
* a package is an actual data structure. This allows me to do computations with it: for example, I can find all symbols in a package, or I can find all packages (see the sketch after this list).
Disadvantages:
* more complicated
* conflicts are possible when I want to use symbols from some other package in a package
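For example, that second advantage looks like this in plain ANSI CL (a small sketch, nothing implementation-specific assumed):

    ;; Collect the external (exported) symbols of a package, and list
    ;; every package known to the running image.
    (defun external-symbols (package)
      (let ((result '()))
        (do-external-symbols (s package (nreverse result))
          (push s result))))

    (external-symbols :common-lisp)  ; => 978 standard symbols, in no particular order
    (list-all-packages)              ; => (#<PACKAGE "COMMON-LISP"> ...)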
I've been using packages for so long in CL and I never think about them. I think about other stuff. Like, how do I get format to do that thing I want it to do? I always have to look it up. Packages? They just work and work well.
I think RMS is having the same problem really smart people have: if it can't be made perfect, I don't want it. I regularly see this in some smart people. They reject stuff because it's not the way they want it, without being able to articulate why they don't want it.
In the early 90s I (and others) wrote a not-small package, and we used prefixes to avoid problems. It was ugly, though, and a real package system would have been nice. For those interested, it's here: https://github.com/franzinc/eli
The more code I write, the more I fully write out namespace names (thus answering the question "where does this function get defined"), so the code starts to look like prefix based naming anyway.
(Not that I am against namespacing - I think being able to have namespaced packages is a really good idea).
Many namespace/module systems let you do something like
import really_long_name as rln
You still get the answer to your question, without sacrificing terseness. For the record, I also tend to use fully-qualified names when it makes sense. Some languages make this impossible though (I'm looking at you, Java and your files with twenty or thirty lines of IDE-generated imports).
Lots of module/namespace solutions manage the "prefix or conflict" issue by allowing aliased imports (as well as qualified imports), so you can manage conflicts that would otherwise occur without dragging the package name around with every use of the problem symbol.
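Common Lisp eventually grew the same convenience: package-local nicknames, an extension rather than ANSI, but supported by SBCL and ABCL among others. A sketch (the package and function names are made up):

    (defpackage #:my-app
      (:use #:cl)
      ;; Inside MY-APP, RLN abbreviates the long package name; other
      ;; packages never see the nickname.
      (:local-nicknames (#:rln #:really-long-name)))

    (in-package #:my-app)
    (rln:frobnicate 42)   ; the same symbol as really-long-name:frobnicate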
If they use prefixes, which they might not. I agree that they should, in order to make this kind of programmatic environment manipulation possible. The fact that they don't have prefixes or some other regular distinguishing feature is a design flaw: you either have namespaces, or you commit to prefixing/suffixing/metadata attached to everything and document how it works.
Although, `tcp-` and `window-` seem like they might be used in the way you expect.
RMS's reasoning seems pretty honest. He thinks namespaces are probably meant for small modules with few names referring to other small modules. He seems to think emacs is nowhere near having small decoupled modules with few dependencies.
RMS's position is that big packages = useless since you can't import and would have to qualify, and that both the Lisp Machines and GNU/Emacs would have mostly big packages.
My favorite is the way Go does it. There is 1 level. Names have a sane default, are short and descriptive and can be simply overridden in the unusual case of a conflict. Prefixing the (short) name is required which keeps me from having to guess as I'm reading.
I'm not an expert in Go, Emacs or Lisp in general. I think I should have said Go has 2 levels: local (no prefixes) and imported (always prefixed). While the import lines are super long (usually a URL), the prefixes tend to be incredibly short because, by default, the name is the last segment of the fully qualified name. From what I gathered here, the Emacs prefixes are only a convention, and each package has to worry about conflicting with the rest of the world? I think it's the short names and the required prefix (to avoid function name conflicts) that set the Go way apart. Like I said, I'm no expert, but when I saw how Go did it I thought it was a very good balance between all the tradeoffs that have to be considered for namespacing.
The difference is that in elisp there is no importing. Either the files have been loaded or they haven't. The convention is that you prefix functions from your package with a common name.
This has a few drawbacks that are obvious. It also has a few benefits that are perhaps not as obvious.
To list a few: if I want to override a function that a lot of other folks use, I can just redefine that function. I don't have to find a way to convince a module system that, for every importer of "Foo", I want to override "do-baz". I just provide my own definition of "foo-do-baz".
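In other words, something like this (a sketch with made-up names; since there is only one symbol, the override takes effect for every caller):

    ;; The package's own definition:
    (defun foo-do-baz (x)
      "Original behaviour."
      (* 2 x))

    ;; My override, loaded afterwards -- same symbol name, so every caller
    ;; of foo-do-baz picks up this definition:
    (defun foo-do-baz (x)
      "Replacement behaviour."
      (+ x 100))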
Similarly, if I want to find all of the places that this method is used, a very simple textual search works. (This is debated in that thread a little.)
Finally, since emacs is ultimately made for direct user experiences, if someone wants to try their hand at creating some things, they don't have to learn about any module system. Simply prefix your functions with a name and you are done.
That last really needs underscoring for some of us. You want to try your hand at writing an emacs extension? Simply start defining stuff in the scratch buffer. If you want to make sure it is saved, do so in a file. No need to set up any sort of module system.
Now... all of this sounds like I am against a module system. I honestly just don't know enough to care at the moment. I suspect they are oversold. I also suspect they are somewhat useful at times. Don't know which way the needle ultimately falls.
Edit: Looks like I removed an opening sentence. Apologies. I meant to lead this off with, "Ah, that makes sense. I think I can see the difference..."
> If I want to override a function that a lot of other folks use, I can just redefine that function.
That's possible and easy using Common Lisp packages. Just use DEFUN with the symbol naming that function, and anyone using that function will now get the new definition.
> Similarly, if I want to find all of the places that this method is used, a very simple textual search works.
That doesn't work, but a symbol-based search will.
> That last really needs underscoring for some of us. You want to try your hand at writing an emacs extension? Simply start defining stuff in the scratch buffer.
CL is not so great at that, but for other reasons.
The first case intrigues me. If a package A has a symbol Foo, and another package B also has a symbol Foo, how would I override only package B's?
Same for the second scenario? Isn't the symbol Foo essentially shadowed, depending on the package?
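Roughly, in code (package and function names are illustrative):

    (defpackage #:foo (:use #:cl) (:export #:func))
    (in-package #:foo)
    (defun func () "original")

    (defpackage #:bar (:use #:cl #:foo))
    (in-package #:bar)
    (defun call-it () (func))          ; FUNC without a prefix: BAR uses FOO

    (defpackage #:baz (:use #:cl))
    (in-package #:baz)
    (defun foo:func () "redefined")    ; redefines the one and only FOO:FUNC

    ;; (bar::call-it) now returns "redefined"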
Notice how package FOO defined a function FUNC, which package BAR used without the package prefix (because it :USEs FOO), while package BAZ DEFUNed FOO:FUNC, redefining it everywhere.
> Are you saying that if a package A has a symbol Foo, and another package B has a symbol Foo, how would I override only package B's?
Easily: A:FOO and B:FOO are completely different symbols (unless B imports A or vice-versa). Setting `(symbol-value a:foo)` has no effect at all on `(symbol-value b:foo)`. If package C imports package A, then C:FOO and A:FOO are the same symbol.
If C needs to use A, but wants to use its own FOO, then it can shadow FOO, which means it would import all the exported symbols of A except for FOO. If C wanted to use both A and B, then a correctable error would be signaled; in a typical implementation the corrections would include: using A:FOO in preference to B:FOO; using B:FOO in preference to A:FOO; and aborting the attempt.
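A sketch of those options in DEFPACKAGE form (the package names are, of course, illustrative):

    (defpackage #:a (:use #:cl) (:export #:foo))
    (defpackage #:b (:use #:cl) (:export #:foo))

    ;; C uses A but keeps its own FOO: C::FOO is a fresh symbol,
    ;; distinct from A:FOO.
    (defpackage #:c
      (:use #:cl #:a)
      (:shadow #:foo))

    ;; D uses both A and B; the A:FOO / B:FOO clash is resolved up front
    ;; instead of via the correctable error at use time.
    (defpackage #:d
      (:use #:cl #:a #:b)
      (:shadowing-import-from #:a #:foo))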
The only thing not portably possible to my knowledge is to rename a symbol in an importing package, e.g. making C:FOO2 be an alias for A:FOO. If one thinks about what a symbol is, that makes sense. What would `(symbol-name c:foo2)` be in that case?
All this, BTW, is why I am so keen on Common Lisp as opposed to Scheme: it really is an industrial-strength Lisp. They standardized a lot of stuff twenty years (and three days!) ago, after a lot of thought. Yeah, there's some compatibility cruft, but it's generally not a big deal (and can mostly be hidden, if one wanted to create a MODERN-LISP package).
Ok, I think that makes sense. Basically, importing it lets you refer to the symbol in an abbreviated way. But it is always that symbol. If there was some odd case where you wanted every ??:FOO to be overridden, it would be tougher. (No, I am not trying to make a case for that scenario. Just making sure I understand.)
And I understand what you are saying about CL being nice. I'm currently going through Land of Lisp and it is interesting to see how many things are covered. Granted, even scheme has this, to an extent. Reading SICP feels like reading a book on what was going to rise and fall in programming in the years to come.
Not being able to use a short name for a symbol, even if it's only used in its own package, is a big fail. Imagine having to call someone in your family by their full name every single time.
I think I would like very much a namespacing system similar to .NET's, but with a few additions:
"using System.*";
Import all of the types in the System namespace as "using System;" does now, while also cutting the "System." prefix off of all subnamespaces of System. So "System.Web.HttpContext" would just be "Web.HttpContext", and "System.Drawing.Drawing2D.QualityMode" becomes "Drawing.Drawing2D.QualityMode". Currently, truncating namespaces like this is only possible for classes within the root namespace.
"using Some.Namespace.SomeClass;"
imports only the SomeClass class out of Some.Namespace. This can be done right now with "using SomeClass = Some.Namespace.SomeClass;" but I don't like to repeat myself (though the ability to rename the class is nice, so I still want to keep the "=" syntax around). The syntax "using Some.Namespace.SomeClass;" is going to be available in the next version of C#, but it will be for importing all of the static methods from a static class and making them look like bare functions. For that, I'd like...
"using Some.Namespace.SomeStaticClass.*;"
this would be the syntax to import static methods as bare functions.
There are two ways you can sensibly name the publicly visible symbols of a lisp package.
One way is to assume that the user won't import the package; in this case you'd probably name your functions and types in very generic ways. For example your HTTP package would have a REQUEST class, which would be referred to from outside the package as HTTP:REQUEST. People trying to import packages written in this style will eventually run into symbol clashes, and resolving those through shadowing is no fun.
The other way is to design the package for importing. In this case the HTTP package would have a HTTP-REQUEST class, which would be referred to as HTTP-REQUEST from most places (and as HTTP:HTTP-REQUEST in the few cases where you didn't import the package -- but of course you would avoid that because HTTP:HTTP-REQUEST looks ridiculous).
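Concretely, the two styles look something like this (package names invented for illustration):

    ;; Style 1: generic names, meant to be used with an explicit prefix.
    (defpackage #:http (:use #:cl) (:export #:request))
    (in-package #:http)
    (defclass request () ())
    ;; callers write: (make-instance 'http:request)

    ;; Style 2: prefixed names, meant to be :USEd.
    (defpackage #:web (:use #:cl) (:export #:http-request))
    (in-package #:web)
    (defclass http-request () ())
    ;; callers who (:use #:web) write: (make-instance 'http-request)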
So in that sense it's totally true that a CL-style package system doesn't really solve the problem of name prefixing. It's either going to be part of the name, or it will be in the explicit package name that you'll end up using to refer to the symbol. But who cares? There are other problems that a flat namespace causes, which would actually be fixed by a package system.
> The idea sounds nice in theory, but in practice multiple name spaces do not fit into Lisp very well. Common Lisp packages are an unclean kludge; this was clear to me when I implemented them in the 1980s in the Lisp Machine. It is impossible to use them in the way one would wish to use them.
>
> In practice, you have to write the package prefix whenever you refer to a symbol that has one. It might as well be part of the symbol name itself. Thus, packages complicate the language definition while providing no benefit.
>
> So in GNU Emacs I decided to make them part of the symbol name itself and not have packages.
Agree or disagree as you like, but there is a reason: based on his experience with Lisp Machine Lisp, RMS didn't see the point.
I don't think his argument is very convincing, because it focuses on only one aspect of namespaces: how many characters do I have to type?
Old-style programmers who came of age in the 70's and 80's were absolutely obsessed with keeping the number of characters typed as low as possible. There was a reason for this: the keyboards in those days were terrible. I typed my Master's thesis and my PhD dissertation on VT100 terminal keyboards that had a throw of about 1 cm and required a few hundred newtons to get the damned keys to move at all, much less make a clean contact at the switch. It was more like punching a speed-bag for your fingers than typing on a modern keyboard.
But reducing the number of characters typed is not the only reason for having namespaces, nor is it the most important one. When you add a feature to a language you are saying to the users, "The people who designed this language felt that this feature was important enough to implement so we think it is worth your time to use it." That kind of moral persuasion is not to be under-estimated.
In the case of namespaces, having the feature encourages developers working in that language to think more carefully about how to modularize their code.
This argument may also not be enough to justify the complexity, but when evaluating such things there should be some attempt to cover all the arguments for and against, not just pull out the one you happened to find persuasive.
I can certainly see his point of view. I feel it falls on the Worse is Better side of decision making when he argues that leaving packages out keeps the language definition simpler.
While packages may not be a great solution, they do at least make the concept of namespacing explicit and a first-class citizen of the language.
In a way I agree with this at the level of a package's external interface, but I think it's nice that with CL style packages you can call your local functions by their natural name. E.g. it may be "my-package::string-split", but because it is only used within "my-package" it can always be referred to as "string-split".
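For instance (a sketch; the helper and its callers are hypothetical):

    (defpackage #:my-package
      (:use #:cl)
      (:export #:parse-line))   ; only the public entry point is exported
    (in-package #:my-package)

    (defun string-split (string separator)
      ;; Internal helper: plain STRING-SPLIT everywhere inside MY-PACKAGE,
      ;; but MY-PACKAGE::STRING-SPLIT (double colon) from anywhere else.
      (loop for start = 0 then (1+ pos)
            for pos = (position separator string :start start)
            collect (subseq string start pos)
            while pos))

    (defun parse-line (line)
      (string-split line #\,))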
I don't see how packages can be worse than the lack of packages: instead of foo-bar-baz (the emacs convention), one has at least the option of writing bar-baz at best or foo:bar-baz at worst (and, if doing something naughty, foo::bar-baz).
Moreover, packages enable one to do things like lock down certain packages, e.g. for security in a multi-user system.
I will say that for all its faults, RMS's dynamically scoped un-namespaced Emacs Lisp interpreter was far better designed than James Gosling's MLisp interpreter. He honestly copped to how terrible a language MockLisp was in the 1981 Unix Emacs release notes:
12.2. MLisp - Mock Lisp
Unix Emacs contains an interpreter for a language
that in many respects resembles Lisp. The primary
(some would say only) resemblance between Mock Lisp
and any real Lisp is the general syntax of a program,
which many feel is Lisp's weakest point. The
differences include such things as the lack of a
cons function and a rather peculiar method of
passing parameters.
The "rather peculiar method of passing parameters" was an understatement!
When you called a function and passed it parameters that were MLisp expressions, they were not actually evaluated until the CALLED function PULLED the values with the (arg index "prompt") function, at which time the MLisp expressions from the CALLING function were evaluated in the dynamic context of the CALLED function!!!
The official convention that we would use at UniPress to develop MLisp libraries that would work together without name clashes was:
Prefix all local variable names with an ampersand, then a (hopefully) unique abbreviation for the function name, then a dash, to make them unique across all known code.
So you could not use the same local variable names between functions that used each other's parameters, because expressions passed as parameters would be evaluated with the bindings of the called function!
(I've written a couple of ; comments in the code to explain how it works.)
MLisp has a rather strange (relative to other
languages) parameter passing mechanism. The arg
function, invoked as (arg i prompt), evaluates the
i'th argument of the invoking function if the
invoking function was not called interactively
or, if the invoking function was called
interactively, arg uses the prompt to ask you for
the value. Consider the following function:
(defun
  (in-parens                ; The name of the function.
    (insert-string "(")
    (insert-string (arg 1 "String to insert? "))
    (insert-string ")")
  )
)
If you type ESC-X in-parens to invoke in-parens
interactively then EMACS will ask in the minibuffer
"String to insert? " and then insert the string
typed into the current buffer surrounded by
parenthesis. If in-parens is invoked from an
MLisp function by (in-parens "foo") then the
invocation of arg inside in-parens will evaluate
the expression "foo" and the end result will be
that the string "(foo)" will be inserted into
the buffer.
The function interactive may be used to determine
whether or not the invoking function was called
interactively. Nargs will return the number of
arguments passed to the invoking function.
This parameter passing mechanism may be used to
do some primitive language extension. For example,
if you wanted a statement that executed a statement
n times, you could use the following:
(defun
  (dotimes n              ; n is not a parameter, it's
                          ; just being "declared" as a local.
    (setq n (arg 1))      ; The first argument to dotimes
                          ; is the number of times to do
                          ; the second parameter.
    (while (> n 0)
      (setq n (- n 1))
      (arg 2)             ; The second argument to dotimes is
                          ; an expression to evaluate.
                          ; Each time (arg 2) is called here,
                          ; the caller's second parameter
                          ; expression is evaluated in the
                          ; current context.
    )
  )
)
Given this, the expression
(dotimes 10 (insert-string "<>"))
will insert the string "<>" 10 times. [Note: The
prompt argument may be omitted if the function can
never be called interactively].
Here's an old Emacs manual with more information about the wonders of MLisp! The Emacs mascot (and icon) was a unicorn, because you needed one of your hands to hold down the control key, and your other hand to hold down the meta key, and a horn on your head to type the letter.
What is good about namespaces? Well, it is the ability to say something like
(require [blah-blah :as b])
(b/map f xxxx)
The key idea here is not to shadow `map` in the global namespace, if there is one.
Of course, we could just name procedures like, say, "list/map" and "this/that"; this is the simplest but clumsiest solution.
The great idea from Erlang is that we could export just a few procedures from a module, and hide all the helper procedures so they do not pollute the global namespace.
Together these ideas are good and perhaps worth implementing. But do not forget that mere naming issues could easily be solved by having packages and "qualified imports", or "full path" naming like "list:map" or "list/map".
Completely off-topic, but does the signature bug anyone else?
> Skype: No way! That's nonfree (freedom-denying) software.
> Use Ekiga or an ordinary phone call.
"Skype is a closed-source, proprietary infrastructure, and that's bad. Therefore you should avoid it. Please consider using AT&T's closed-source, proprietary infrastructure instead."
It's not the infrastructure that he cares about. He believes that any provider you use should be free to run their infrastructure as they see fit, releasing or not releasing any code as they see fit. It's the fact that the software that you run, on your end, is non-free.
With Skype, you cannot connect your own client to their network; you must run their proprietary, highly secretive client.
With the phone system, you can attach any device that you want. You can use a dumb, analog phone, or a digital phone running free software, or a digital phone running proprietary software, or modem, or a fax machine, or whatever, to the phone system.
Now, he also does have other concerns about the infrastructure; he would generally prefer to support infrastructure running on open tools and technologies, and may have various other concerns with particular providers.
But he sees that as a question separate from the ethical question of running non-free software on his own computers.
I've wondered about this before, and had it explained to me. Basically, RMS only considers software "eligible" to be free if it could, in theory, be moved to other hardware of the customer's choice. It is the ability to put the software somewhere else that is the "freedom" being abridged by closing the source. On the other hand, software that is fundamentally tied to hardware (such as firmware or infrastructure control) couldn't reasonably be moved somewhere else, so close-sourcing that software is not impinging on any freedom that would otherwise exist.
It's a fine distinction to make, but I get it. I don't necessarily agree with it, and I think it's a strange place to draw the line, but I can understand the logic behind it.
If I'm understanding this right, then the only part of Skype that RMS objects to is the desktop client. The network and infrastructure being closed-source does not remove freedom. (NSA is a separate consideration)
As has been said, if the software is utterly hardware-specific firmware, such as the software that runs inside a hard drive that handles bad blocks and head movement and so on, he's OK with it being proprietary. A phone which somehow had software of that sort (to run an integrated answering machine?) would presumably be OK with him as well.
RMS is eccentric, principled to a degree that most of us have a hard time understanding, and probably more than a little nuts. However, everything I've seen about him indicates that he's highly intelligent, even if he uses it in unconventional ways. If you think you've spotted such an elementary error, it might be worth stopping to think about what you might be misunderstanding, rather than going straight for the conclusion that RMS can't spot the obvious contradiction.
RMS is the first politician who codes. Once you understand that, most of what he rants about actually makes some sort of sense. He's appealing to his base by making political statements that make little sense to anyone who isn't an acolyte.
However, you'd think given his issues with UNIX he'd still be a staunch advocate of avoiding AT&T.
Mostly, the hilarity of RMS recommending AT&T for some reason makes me think of that scene in the Matrix where Agent Smith asks Neo what good a call is if you can't speak.
My apologies to the downvoters for questioning the dogma of St IGNUcius of the Church of Emacs.