Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Homoiconic Faq (c2.com)
49 points by networked on Sept 21, 2014 | hide | past | favorite | 31 comments


Conclusion on the page:

> Q: Is Homoiconic much ado about nothing ? A: Yes, after much wasted bandwidth on c2 this seems to be the only logical conclusion.

I would add my $.02 to the discussion. Homoiconicity is the "same" representation of program and data, and they give Lisp and Tcl as examples. I want add the following to the philosophical discussion.

Q: Are program & data representations really the "same" if data can represent invalid programs? E.g. the form (nil 3 4) is valid data, but invalid program in Lisp.

I would answer "no" to this question.


(nil 3 4) is a completely valid program if the "nil" symbol is bound to a lambda, so I'd argue that yes, it is still homoiconic.

If you're looking for a language where every valid data structure is a semantically valid program - it doesn't exist.


    $ sbcl
    * (defun nil (x y) (+ x y))
    * (nil 3 4)
    7
And there's really no way to systematically account for what may or may not be in the namespace without completely missing the point of homoiconicity.


> I would answer "no" to this question.

You're declaring a conclusion that goes against the outcomes of several pages of heavy argument. When you do this, it's customary to attempt to support your position with arguments. :)

An only-somewhat-related anecdot. I remember sitting in the car with my sister and mother when I was young. My sister was learning to write. On this day, she asked what a fictitious word meant. She showed us a word she had written down, and then said it out loud. We didn't know what she meant. But we worked out that was playing with an idea: does the fact that we can use the tools to represent an idea (in this case, that you can form apparently words from letters) mean that it maps to an idea. As it happens, no.


> You're declaring a conclusion that goes against the outcomes of several pages of heavy argument. When you do this, it's customary to attempt to support your position with arguments. :)

The argument is the conclusion from the bottom of the page, which I quoted: much ado about nothing.

Then there's a long discussion about how machine code is not homoiconic, which I absolutely disagree with. In von Neumann architecture, and modern CPUs data == code, otherwise you couldn't make a JIT compiler.


Some people deliberately miss the point. Oh well, replace "nil" above with "()".


This is my understanding of homoiconicity. There's a good chance that I'm wrong. Comments and clarifications welcome!

-----

Is homoiconicity about concrete syntax? I think the answer is no , but I also think that this confuses lots of people because it's not explicitly stated.

Then is homoiconicity about abstract syntax? Let's say we have some arbitrary language, for which there's an eval function (input: abstract syntax tree, output: some data structure) whose job is to execute/evaluate a program in that language. Homoiconicity, then, is the case where the output data structure is also an abstract syntax tree, or conversely, that an abstract syntax tree is expressed in data structures of the language.

So for homoiconicity, the input and output of your eval function are of the same type (of course the type may be a complicated algebraic sum type etc.). Thus the input and output have the same domain. This means that you can take the output from eval, and throw it back into eval again without getting a type error.

If the input and output types of the eval function don't match, then the language isn't homoiconic.

Is this correct?

What I don't get is whether the eval function is available from inside or outside the language, or both.

I'm also unsure of whether homoiconicity is a property of a language, or of an implementation of a language.


I'no authority on this but I think you are right and gave the best explanation, even definition of homoiconity I have read so far: You can take the output of eval and pass it back as further input to eval. In other words input and output of eval must have the same data-type.

A good example is JavaScript, often called a "lisp-like" language. In JavaScript you have the function Function() which takes a String as input and produces a Function. Thus a JavaScript program can create another program which it executes within itself. But it is not homoiconic: input of Function() is a String, and its output is a Function. You can not pass the output of Function() as an input to another call of Function().


The most coherent definition I can come up with is that homoiconicity describes the relationship between language syntax and a data representation, such that all valid programs can be parsed by a conforming data parser, and the parser output contains all the information required to compile or execute the program.

Thus, all programs are homoiconic to some data format (if only flat text files). The interesting thing about Lisp is that it's homoiconic to s-expressions, which it also provides powerful tools to manipulate. This paves the way for its macro system which provides compile-time support for processing s-expressions and treating the output as input to the compiler.


Your definition (already rather loose) covers only one direction. The point of homoiconicity is that code and data are the same, so they must be easily interchangeable both ways, and also during runtime.

Even if you would consider flat text as an acceptable data format for homoiconicity, most languages don't fit that criteria. Can a random program in a random language (not expressly written as a quine) easily obtain itself in this representation? Can I easily write a function that, when passed any arbitrary function as an argument, obtain the flat text representation of that function, modify it, and execute that modified function?


Does this even work properly for Lisps with lexical scoping? I'm not that familiar...

Modulo scoping, you can do that in JavaScript.


I think the word syntax is ambiguous. Is homoiconicity about concrete syntax? Abstract syntax? Both? Some other kind of syntax? Does parsing have something to do with homoiconicity?

I don't know the answers to these questions.


A simple example of homoiconicity:

(+ 1 2) is a list containing the 'plus' symbol the number 1 and the number 2.

(+ 1 2) is a program which when ran gives the number 3 as the result.

They are the exact same thing.


If one is looking for degrees of homoiconicity, a useful question to ask is how hard it is for a program to modify its code and execute the result. In Lisp, it's very easy, so Lisp is meaningfully homoiconic.

For a C program to modify its code, it needs access to its source code (which it might not have) and a compiler (also not necessary) and the programmer has to write rather contrived code to modify the source, compile it, fork, and start executing the result. So C is not meaningfully homoiconic.

Or I suppose the C program could modify an executable directly, but that is even more contrived since it requires the programmer to work with a completely different representation (machine code) than the language of the original program.


That has little to do with homoiconicity.

What you wrote about is called 'reflection'

https://en.wikipedia.org/wiki/Reflection_(computer_programmi...

> reflection is the ability of a computer program to examine (see type introspection) and modify the structure and behavior (specifically the values, meta-data, properties and functions) of the program at runtime.

Ignore the rest of the Wikipedia page, it is not really useful, since the examples show only very simple reflection capabilities.


That wikipedia definition, IMO, is weird. I would limit reflection to modifying values, metadata and properties, not to modifying functions.

Limiting it to that is what allows strongly-typed somewhat ahead of time compiled languages such as Java and C# to have reflection.

And yes, C# has support for creating new functions at runtime, but that is not reflection, not in my book, and not in Microsofts's (if it was, they would have put it in the reflection namespace)

Modifying code is a step up. For example, in lisp, it is fairly easy to grab a function, walk its tree for use of integers, and replace every '3' by '4' and every usage of the '+' function by '-'

And the wikipedia examples are amazingly bad, indeed. eval, as in the php example, has nothing to do with reflection. Neither does Objective-C's performSelector. You want to show respondsToSelector; that's what does the introspection part.


In Lisp it is relatively easy to change a function. A typical thing is 'advising', where some code is added to a function. This had been done to ordinary functions decades ago and Common Lisp added it as a standard to its object system.

> Modifying code is a step up. For example, in lisp, it is fairly easy to grab a function, walk its tree for use of integers, and replace every '3' by '4' and every usage of the '+' function by '-'

Actually it is not easy. Actually walking Lisp code is relatively complex. The code walker needs to understand built-in Lisp constructs, macros and more.


To be fair. The point, I believe, is that for a lisp program to alter itself, it only needs access to its source code. Which is also all it really needs to execute itself.

More impressively, the modified lisp program could easily serialize itself out for study and modification. Languages that use reflection can't easily do this.


Is x86 machine code homoiconic? Or, generalizing, is any/all VN-arch machine code homoiconic? It's pretty obvious that (pure) Harvard-arch machine code isn't, but VN?

(I'm talking about the interface here. Yes, I know that modern x86 processors are not really VN-arch internally. But they present the interface of a VN-arch processor to the programmer, which is what I mean here.)


I would argue: Yes.

When people write machine code, you'll note they find it very easy to resort to code-that-writes-code and data-as-code-templates.


After reading this FAQ I got an impression that language with built-in AST reflection capabilities (i.e. equivalent of quasiquotes) should be considered homoiconic in a sense that the language exposes code as data. For example: Scala [1], Elixir [2] and Julia [3] all have native AST manipulation capabilities. But are those languages homoiconic?

[1] http://docs.scala-lang.org/overviews/quasiquotes/intro.html

[2] http://elixir-lang.org/getting_started/meta/1.html

[3] http://julia.readthedocs.org/en/latest/manual/metaprogrammin...


'Homoiconic' might want to require that the code and the code as data have the same internal (internal code and internal data) AND external representation (external code and external data). Also it would be really good if we don't need special AST data structures.

Thus we can read source code and then printing it makes it look the same (usually with some delta).

    CL-USER 6 > (let (program)
                  (format t "~%Enter a program: ")
                  (finish-output)
                  (setq program (read))
                  (format t "~%~%The entered program is: ~a" program)
                  (finish-output)
                  (format t "~%~%The result is: ~a~%~%" (eval program))
                  (values))

    Enter a program: (+ 1 2)  ; user input of a program

    The entered program is: (+ 1 2)  ; program output

    The result is: 3


Elixir's creator (Jose Valim) says no.

"PS: Elixir is not homoiconic but we do have Lisp-style macros."

https://news.ycombinator.com/reply?id=8337087&whence=threads...


I found, that implementing a LISP in a language you know, helps understanding.

I saw one which worked with JavaScript arrays and eval(): ['add', 1 , 2] etc.

It was rather crude, but it helped me.

But I didn't go deeper and tried to understand how Homoiconicity could improve my code.


True. I've seen very succint python eval, and my 'favorite' would be https://news.ycombinator.com/item?id=3511100


Homoiconicity is either a trivial property (we can define an AST!) or a not-relevant-to-macros one (probably best expressed as "we have quote", which is basically true of JS).

This post by Dave Herman nicely expresses the problems with homoiconicity as a concept: http://calculist.org/blog/2012/04/17/homoiconicity-isnt-the-...


As for some kind of material test, I propose that a language which can be expressed in itself in a page, has the desired properties behind what people mean by homoiconic.

The fact that the small set of primitive abstraction LISPs (even though cons, read aren't necessarily trivial) use are enough to express/manipulate LISP. Doesn't have to be a data-structure thing.


Doesn't make much sense. Anyway, the whole concept of 'homoiconic' is underdefined. Plus: the word 'homoiconic' does not really fit well.

In Lisp we talk about that code is data, other than a string.

Thus we can construct programs by constructing data using the usual primitive data structure functions:

    ? (list '+ 1 2)
    (+ 1 2)
Note that Lisp does not use any special purpose data-structures for code. Nothing like special constructors for AST data structures. One does NOT have to call a MAKE-FUNCTION-CALL procedure. Lisp also does not need to construct programs as strings, where I would have to parse the language with string/stream operations to access individual elements of the program. A number 2 would be represented as text, not as numeric data as in Lisp.

Above is a valid program and it is a list of three elements. There are no strings involved. + is a symbol and 1 and 2 are actually real numbers.

Let's look at the 1:

    ? (describe (second (list '+ 1 2)))
    Fixnum: 1
The interface to executing code is the function EVAL. It takes code as data, and not a string.

    ? (eval (list '+ 1 2))
    3


I never mentioned or meant to imply strings anywhere.

I'm saying that lisp abstractions: cons, car, cdr, eq*, if, lambda... are enough to write eval in a page. To me the only reason it can be so short is that the abstractions and the interpreter are close to some sort of fix-point, an almost infinite regression in desugaring, and that's what people are thinking about when talking about code as data and homoiconicity.


Is FORTH homoiconic?


FORTH is iffy. But ultimately I'd say no.

Many implementations of FORTH allow something close to homoiconicity. But portable FORTH isn't homoiconic.

Portable FORTH has no issues with dynamic code generation. But it balks at dynamic code manipulation. SEE and such aren't portable. You cannot manipulate an already-compiled word in portable Forth.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: