Implementing strict three-point perspective (pomax.github.io)
133 points by TheRealPomax on June 5, 2021 | 51 comments


This is not how geometrically-correct pinhole-camera perspective projection works; Pomax is approaching the vanishing points exponentially rather than hyperbolically, which is why straight lines turn into curves. If you work out the geometry of the pinhole camera setup [edit: with a flat imaging plane], you find that you're just doing a division (conventionally, dividing by the Z-coordinate in camera space, though you don't have to set it up that way) instead of an exponential. This is a "projective transformation", and those do map straight lines to straight lines: https://en.wikipedia.org/wiki/Projective_transformation
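
A minimal sketch of the divide-by-Z claim, assuming a pinhole at the origin looking down +z with a flat image plane at z = focal_length (the names `project` and `collinear` are illustrative, not from the article). It samples points along one 3D line, projects them, and checks that the images stay on one straight 2D line while approaching the vanishing point:

```python
def project(point, focal_length=1.0):
    """Pinhole projection onto the flat image plane z = focal_length."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def collinear(p, q, r, tol=1e-9):
    """True if three 2D points lie on one straight line (zero cross product)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) <= tol


# Sample points along an arbitrary 3D line: origin + t * direction.
origin, direction = (1.0, -2.0, 3.0), (0.5, 1.0, 2.0)
samples = [tuple(o + t * d for o, d in zip(origin, direction)) for t in (0, 1, 5, 50)]
images = [project(p) for p in samples]

# Every projected sample lies on the 2D line through the first two images.
line_stays_straight = all(collinear(images[0], images[1], img) for img in images[2:])

# As t grows, the image approaches the vanishing point (dx/dz, dy/dz).
vanishing_point = (direction[0] / direction[2], direction[1] / direction[2])
```

The approach to the vanishing point here is hyperbolic in t, since each image coordinate is a ratio of two linear functions of t.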

If you have a sufficiently wide field of view, even the standard projection transformation gives you things that look pretty "distorted", which is a fun trick often used in art photography. Also fisheye lenses are often designed to do a non-projective transformation and get effects you can't theoretically get with a pinhole camera, like mapping straight lines to curved lines as Pomax does.

Pomax's invention is super cool, even though it isn't the usual 3D→2D mapping motivated by pinhole-camera geometry. Weird is good! I wonder if there's a version of it in which straight lines become curves, but polynomial curves of some reasonable degree, or maybe something involving square roots and division, so we can rasterize them without computing a bunch of exponentials? Is that even a reasonable way to seek optimizations in 02021?

Pomax may have a good point that projective transformations aren't "strict three-[vanishing]-point perspective" because, instead of having three vanishing points, they have an infinite number of vanishing points, a point others have made in the comments here. The profoundly weird and cool aspect of Pomax's invention is that every line converges to one of those three points.


You can get distorted projections with pinhole cameras too, as long as your sensor is curved. Fisheye is impractical, but cylindrical projections are easy. A common way to do pinhole photography is with an oatmeal box, which is a cylinder.

https://sdsu-physics.org/assets/PDFs/oatmeal_pinhole_camera....

I'm not really trying to nitpick here, just like talking about pinhole cameras. With a flat image plane the pinhole lens is "rectilinear", but the beauty of pinhole cameras is that you can project the image onto any shape you want without affecting focus.
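
A rough sketch of a cylindrical pinhole projection, under a simplifying assumption: the pinhole sits on the axis of the cylinder (the y axis), which is an idealized setup and not necessarily the geometry of the linked camera. The function name `project_cylindrical` is made up for illustration, and image inversion is ignored. It shows that a vertical edge stays straight on the unrolled film while a horizontal edge bends:

```python
import math


def project_cylindrical(point, radius=1.0):
    """Project a 3D point through a pinhole at the origin onto a cylinder
    around the y axis, returning coordinates on the unrolled film."""
    x, y, z = point
    horizontal = math.hypot(x, z)
    if horizontal == 0:
        raise ValueError("point lies on the cylinder axis")
    theta = math.atan2(x, z)  # angle around the axis, 0 means straight ahead
    return (radius * theta, radius * y / horizontal)


# A horizontal edge (constant y and z): its film height varies with x, so it curves.
horizontal_edge = [project_cylindrical((x, 1.0, 1.0)) for x in (-2.0, 0.0, 2.0)]

# A vertical edge (constant x and z): the film angle is constant, so it stays straight.
vertical_edge = [project_cylindrical((1.0, y, 2.0)) for y in (-1.0, 0.0, 3.0)]
```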


Oh, that's an excellent point! I guess I didn't even think about that, which shows how much experience I have doing actual pinhole photography in the physical world :)


> Pomax is approaching the vanishing points exponentially rather than hyperbolically

Exactly this. To put it in drawing terminology, the distance of a line towards the vanishing point is related to the angle from the camera (station point) projected back to the picture plane. Images like this one may help those with a computing rather than a hand-drawing background: https://guidetodrawing.com/site/assets/files/1082/gtd-235.79...

An exponential ratio as used by the article could maybe be described as assuming the camera is 0 distance from every point drawn.

[Edit: Really great article though - the kind of thing I love to see is looking at a complex problem with established solutions from first principles again]


That diagram is very confusing if you aren't aware that (1) the object being projected is a shed; (2) the perspective projection is incomplete.

Full article: https://www.guidetodrawing.com/linear-perspective/two-point-...

I'm taking an online drawing perspective class right now, and finding that much of what I've been learning is covered right on that site! Pretty neat stuff.


> An exponential ratio as used by the article could maybe be described as assuming the camera is 0 distance from every point drawn.

Hmm, wouldn't that give you an orthographic or isometric projection, or another member of that family, rather than this crazy exponential thing?


It seems to me that if you want some fancy curvilinear perspective, the simplest way is to render the scene normally (or make six renderings in a cubemap if needed), then texture-map it onto a suitable shape. That should work for any rearrangement of light rays, as long as they are straight and come to a point camera. Might get more fun if you want curved light or a camera extended in space :-)
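
One way the cubemap idea might look in code, as a sketch only: for each output pixel of a curvilinear image, compute the ray direction, then find which cube face (each an ordinary divide-by-Z rendering) that ray hits. The equidistant fisheye model, the face names, and the (s, t) orientation on each face are assumptions chosen for illustration and do not follow any particular graphics API:

```python
import math


def fisheye_ray(u, v, fov_degrees=180.0):
    """Ray direction for a point (u, v) in the unit disc of an
    equidistant fisheye image whose optical axis is +z."""
    r = math.hypot(u, v)
    if r > 1.0:
        return None  # outside the image circle
    if r == 0.0:
        return (0.0, 0.0, 1.0)
    angle = r * math.radians(fov_degrees) / 2.0  # angle from the optical axis
    s = math.sin(angle) / r
    return (u * s, v * s, math.cos(angle))


def cubemap_lookup(direction):
    """Return (face, s, t): the cube face the ray hits and where on it,
    with s and t in [-1, 1]. The direction must be non-zero."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if az >= ax and az >= ay:
        return ("+z" if z > 0 else "-z", x / az, y / az)
    if ax >= ay:
        return ("+x" if x > 0 else "-x", z / ax, y / ax)
    return ("+y" if y > 0 else "-y", x / ay, z / ay)


center = cubemap_lookup(fisheye_ray(0.0, 0.0))  # straight ahead
edge = cubemap_lookup(fisheye_ray(1.0, 0.0))    # 90 degrees to the right
```

A renderer would then sample the chosen face's image at (s, t) to colour the output pixel.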


Well, that won't give you the "strict three-point" property, where all 3-D lines asymptote to just one of those three points. More broadly, there are a wide range of mappings from 3-D to 2-D that use information that is lost in the standard divide-by-Z technique; some of them are used in automated optical inspection systems, for example.


I think a lot of this confusion stems from the author framing this project in a way that is, frankly, nonsense. Vanishing points have nothing to do with the clipping plane, and 3D computer graphics isn't trying to approximate the coordinate transform described in the post, it's trying to emulate real-life optics. The sense in which this definition is "strict" is based on ignoring that N-point perspective is just a tool for drawing lines parallel to the coordinate axes with correct perspective. If the author had framed this as just a fun "what if", making every line converge to the vanishing point, it might have been better received.

The observation that some lines look strange is about as meaningful as the same observation on any given map projection. It's definitely true, but it all depends on the choice of mapping function, nothing inherent to spherical geometry.

I chose map projections for a particular reason: as has already been pointed out, there's no connection between this "strict" definition and an exponential coordinate transform. Any function mapping the axes that grows faster than polynomial would produce the desired behavior (essentially, you need the ratio between z and x on a line a*x + b*z = c to approach zero or infinity), but would produce different "line" shapes. This choice is entirely arbitrary, and is in fact equivalent to simply mapping the underlying space by a related transformation and then using a normal projective transformation.


> The sense in which this definition is "strict" is based on ignoring that N-point perspective is just a tool for drawing lines parallel to the coordinate axes with correct perspective.

The thing is, the author claims that students in art classes are taught that in n-point perspective, all parallel lines converge at one of n points. In defense of this, they have [1], which says

> in basic one-point perspective, lines are either vertical, horizontal or recede toward the vanishing point. In two-point, lines are either horizontal or recede toward one of the two vanishing points. In three-point perspective all lines recede toward one of the three vanishing points.

I don't think I've ever heard this before. Maybe what's happening here is that artists are used to thinking about works like [2], in other words cases where the point of the work is to show off 1, 2, or 3 point perspective and so you deliberately minimize lines in the image that are not parallel to the scene's axes. Of course, artists surely realize that this can't possibly apply to all parallel lines in the scene, as the artist's own drawing in [1] illustrates: see my edit here [3].

Still, I find the OP's work interesting as a possible implementation of what it would mean to take this mistaken description of n-point perspective seriously.

[1] https://www.craftsy.com/post/three-point-perspective/

[2] https://upload.wikimedia.org/wikipedia/commons/6/6b/One_poin...

[3] https://ipfs.io/ipfs/QmTLmrBQz21cgUX3qZJvrhS4zHrENyWyvbNJRrU...


If you fix your incorrect exponential formula

    1.0 - 1.0 / pow(base, step)

to the correct inverse-proportional formula

    1.0 - 1.0 / (1.0 + log(base) * step)

(where I’ve written log(base) just to match the effect of your existing parameter near the origin), you should find that your curved lines become straight.

If the “base” parameters are chosen correctly, you should also find that these fixed results match the ones from usual computer graphics.
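
A small numerical sketch comparing the two formulas quoted above. The function names are invented for illustration, and `pinhole_fraction` is an assumed reference case: a line receding straight away from a pinhole camera, starting at depth `start_depth`, with steps of one unit of depth. Under that assumption the inverse-proportional formula reproduces divide-by-Z exactly when log(base) equals 1 / start_depth, while the exponential formula runs ahead of it:

```python
import math


def exponential_fraction(step, base=2.0):
    """Exponential formula as quoted: fraction of the way to the vanishing point."""
    return 1.0 - 1.0 / pow(base, step)


def inverse_proportional_fraction(step, base=2.0):
    """Inverse-proportional formula as quoted, same slope at step = 0."""
    return 1.0 - 1.0 / (1.0 + math.log(base) * step)


def pinhole_fraction(step, start_depth):
    """Fraction of the way to the vanishing point under divide-by-Z for the
    line (1, 0, start_depth) + step * (0, 0, 1); its vanishing point is x = 0."""
    start_x = 1.0 / start_depth
    x = 1.0 / (start_depth + step)
    return (start_x - x) / start_x


start_depth = 2.0
matching_base = math.exp(1.0 / start_depth)  # makes log(base) == 1 / start_depth

# Rows of (step, exponential, inverse-proportional, pinhole reference).
table = [
    (
        s,
        exponential_fraction(s, matching_base),
        inverse_proportional_fraction(s, matching_base),
        pinhole_fraction(s, start_depth),
    )
    for s in (0, 1, 2, 4, 8)
]
```

With base 2 the exponential version halves the remaining distance on every step, whereas under divide-by-Z the remaining distance halves only when the depth doubles.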


Except this article isn't interested in "the usual computer graphics". We already know how to do that one, it looks great, it barely has any problems, and on the whole if you want to render things in a cool perspective, use a camera.

The whole point of the article is to show what happens if we don't follow conventional computer graphics wisdom and instead implement it based on how it gets taught in art class, and what happens if you break the fundamental "don't get too close to your vanishing points" rule you get taught there.

Mind you, it's great advice in general, it just doesn't match up with art class, which was the whole point of this article. That said, if you want to file an issue over on the repo (https://github.com/Pomax/three-point-perspective/issues), I'm sure a new section can be added at the end going "how can we fix this?" and then show this is one of the several ways in which real world 3D graphics handle perspective.


In my technical drawing classes, I was taught that in three point perspective, lines that aren't parallel to the three vanishing axes are still straight, and you can in fact use that fact to construct additional vanishing points.

So for example, once you've established an axis-aligned cube, you can draw parallel lines across the diagonals of two opposite faces and they will meet at a new vanishing point, which you can use to construct more lines parallel to them, lying in that 45 degree plane.

It turns out that that vanishing point will lie on the line through the vanishing points that you used to define the cube axes, which is kind of interesting - each orientation of a plane in 3d space has a straight 'horizon' line in the drawing space, and parallel lines on those planes all vanish at a particular point on that horizon line.
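
The horizon-line property can be checked numerically. This is a sketch under assumed conventions (camera looking down +z, a yaw then a pitch applied to world directions, image plane at z = 1; all function names are illustrative): the vanishing point of a face diagonal lands on the line through the vanishing points of the two axes spanning that face.

```python
import math


def rotate_into_camera(direction, yaw, pitch):
    """Rotate a world direction by yaw (about y), then pitch (about x)."""
    x, y, z = direction
    x, z = x * math.cos(yaw) + z * math.sin(yaw), -x * math.sin(yaw) + z * math.cos(yaw)
    y, z = y * math.cos(pitch) - z * math.sin(pitch), y * math.sin(pitch) + z * math.cos(pitch)
    return (x, y, z)


def vanishing_point(direction, yaw, pitch):
    """Image of the point at infinity in the given direction (divide by z)."""
    x, y, z = rotate_into_camera(direction, yaw, pitch)
    if z == 0:
        raise ValueError("direction is parallel to the image plane")
    return (x / z, y / z)


def collinear(p, q, r, tol=1e-9):
    """True if three 2D points lie on one straight line."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) <= tol


YAW, PITCH = math.radians(-30), math.radians(20)
vp_x = vanishing_point((1, 0, 0), YAW, PITCH)
vp_y = vanishing_point((0, 1, 0), YAW, PITCH)
vp_z = vanishing_point((0, 0, 1), YAW, PITCH)

# Diagonals of the cube's faces, lying in the xz and yz planes respectively.
vp_diag_xz = vanishing_point((1, 0, 1), YAW, PITCH)
vp_diag_yz = vanishing_point((0, 1, 1), YAW, PITCH)

xz_diagonal_on_horizon = collinear(vp_x, vp_z, vp_diag_xz)
yz_diagonal_on_horizon = collinear(vp_y, vp_z, vp_diag_yz)

# A space diagonal is not in the xz plane, so it misses that horizon line.
space_diagonal_off_horizon = not collinear(vp_x, vp_z, vanishing_point((1, 1, 1), YAW, PITCH))
```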

I was specifically taught the diagonal projection technique particularly to transfer measurements throughout the perspective space - once you've established a diagonal baseline you can use it to transfer a length from one axis to another, which is great for doing things like counting out distances along an architectural facade to space windows. Laying out perspective-correct squares also turns out to be important for being able to correctly size ellipses for perspective-correct circles.

So I'm not sure where you got the idea that three-point-perspective as usually taught is a weird curved geometry. It's meant to be straight-line-preserving, which is precisely what makes it good for technical and architectural illustration. There's no such rule as 'don't get too close to your vanishing points' - it works just fine.


One thing I didn't see mentioned in the article is the motivation for choosing the particular (exponential) mapping. If you imagine replacing your function with f(s)=1-1/g(s) for some g with g(0)=1 which goes to infinity as s does, then as long as g grows faster than polynomial you'll obtain the same properties. Something like x^x, or just 2^sqrt(x). 2^x certainly seems the simplest, but was there a further reason?


Art classes don’t teach that objects appear exponentially smaller as they recede into the distance. To make something appear half the size, you need to move it twice as far away, not move it by a constant amount.


Either you never had art class in high school, or you forgot that it does not get explained that nicely. Most kids, as well as adults taking casual courses, will never learn this.

Can we go "hey we can fix all of this with a switch in function"? Absolutely. Should I amend the article with that? Yeah, also absolutely. Is it how the vast majority of people get taught three point perspective? Very much not.

Again, please file an issue so that I don't forget to write that section, because pages on the internet well outlast their life on hackernews, but I work on a million projects and will forget about this if there's no issue =)


I did have art class in high school, and I’m aware that most artists won’t have been taught the formulas in full mathematical detail. But you’re the one describing your method as “based on how it gets taught in art class” and calling it “strict” and “true” three-point perspective (or at least calling the usual method “not true three-point perspective”). What I’m suggesting is that maybe you should examine the extra assumptions you made in addition to the rules from art class before drawing conclusions from them.

(Filed https://github.com/Pomax/three-point-perspective/issues/1 as you asked.)


And that's fair, please file an issue. I will literally not remember this conversation tomorrow, let's not lose it.


I think you might have just had a bad class on the topic. For what it's worth I watched Erik Olson's artistic linear perspective class lectures on New Masters Academy and it definitely didn't teach perspective working the way you describe.

https://www.nma.art/courses/a-complete-guide-to-perspective-...

The first lesson is free to watch too: https://www.nma.art/videolessons/perspective-1-an-introducti...


The perspective class I saw did not teach what you say.


For anyone finding this discussion in the future: the article has been completely rewritten since this posting. While perhaps historically interesting for people who find their way here in the future (if any?), note that pretty much none of the comments apply anymore.


That last image is not how artists draw rotated cubes in three-point perspective. So, I have to conclude that the author's formalization only works for lines parallel to the axes.


Indeed. As you stated, mutual vanishing points, as used in illustration, are when parallel lines of the scene are extended to infinity and converge. Animation of a rotating object requires animation of a set of vanishing points for each frame.

This is nearly deja vu for me, as a conversation that I've seen several times at the intersection of arts and computer graphics. I can only conclude that a sub-population has received instruction on "vanishing points" that was overly specific and missed the footnote on how it generalizes. Specifically: the artist and draftsman idioms of 2 or 3-point perspective or vanishing points are short cuts for perspective rendering of subjects which have cartesian layouts, such as rectilinear buildings and street scenes built on a cartesian grid, with lots of parallel or perpendicular edges. The missing footnote is that this does not work for arbitrary subjects.

Real perspective rendering has infinite potential vanishing points in one scene. Each point represents nothing more than the infinite continuation of any line segment in the scene, as it would be rendered by a true perspective rendering. The vanishing point of any one line is when the viewing ray converges with that line, to the limit of angular resolution in your rendering. All more distant segments project into the same small picture element, whether formed by raster graphics or your finest pencil, pen, brush, or engraving tool.

For a complex scene, the skilled artist would choose [edited to delete typo "three"] different vanishing points appropriate for each set of parallel lines. E.g. a vineyard with parallel rows of plantings might use different points than a road passing by at an odd angle, and a set of high-tension powerlines crossing the scene would have its own vanishing points as well. If the power lines follow a ragged course, each segment between two towers would need its own vanishing points. Furthermore, the artist would have to approximate the parabolic sag of the wire below these projected line segments, perhaps using parallel lines to locate the envelope within which the line sags.


Artist is my day job and this is a decent approximation of what a complex perspective drawing entails. And, really, it probably entails a lot of eyeballing it, too - sketch in something that feels roughly right against whatever guidelines you may have constructed, and leave it at that.

"Two-point perspective" and "three-point perspective" are good enough approximations for the common problem of "drawing a scene set within a place built around a lot of right angles".

I feel like the computing analogy to make here is "a very opinionated framework": it makes life a lot easier, if you want to do things the same way it's built to do. If you want to do stuff outside of what it's built to do you're gonna have to do a lot of the work yourself.


Artists don't draw all the way to the vanishing points, we only draw in the part that looks normal. Look at the figure showing off how straight lines behave again, and you'll see that everything starts off actually looking like normal straight lines, and things don't actually go crazy until we start getting close to our vanishing points.

Which is why on paper, you place the vanishing points outside your drawing, so your drawing can stay constrained to the part of exponential space that looks normal. E.g. if you constrain the viewport to the unit area (the triangle between 1/0/0, 0/1/0, and 0/0/1) and scale your scene down to fit inside of that, things will actually look perfectly fine, with a gorgeous perspective (but will also be visually indistinguishable from a wide angle camera positioned close to your major scene point, so just use that).

However, with computers we can trivially see what will happen if you do try to use the full space, rather than only working inside an (incredibly sensible!) crop. The result is pretty wild.


> Artists don't draw all the way to the vanishing points, we only draw in the part that looks normal

I am a professional artist and I would just like to confirm this line. 3-point perspective tends to start looking weird once you get outside of a certain sweet spot; I did so many beginner drawings with the vanishing points too close, which resulted in a weirdly exaggerated set of shapes.


The effect is also pretty dramatic and I think this is a big part of illustration work. Comic books in particular use three point perspective to add drama.

https://www.youtube.com/watch?v=wyIKZhIAl0k

The cropping aspect is made pretty explicit in Stan Lee's video. ;-)


I think there are two issues making it this way. The big one is the use of 'exponential space' based on each distance being halved along an axis, which is not accurate to how artists' perspective works if drawn properly (someone smarter than me may do the math to figure out what the real formula is here).

The second point is the vanishing point for verticals is ridiculously close to the cube which makes it an incredibly tortured perspective for the eye to understand to start with.


If you're a professional artist, and you draw on actual paper, you will almost never work outside of the very first grid space. This article shows you what happens if you were to use the full exponential space, which (and let me stress this yet again) *no one should ever do*.

Any tutorial on perspective drawing teaches you to keep your vanishing points out of your picture, and to stay the hell away from them. That's solid advice, and in this article we investigate why that's solid advice.


This is just not the case. You can have three orthogonal vanishing points all inside the bounds of an image and it can look just fine. It just corresponds to having a field of view wider than 90 degrees (which looks a bit unnatural but can be a dramatic way of presenting an image).

Imagine standing on the corner of a city block, such that without turning your head you can look down both streets and up at the sky. You can see the two streets recede off into the distance and the skyscraper above you foreshorten towards a vanishing point far overhead.

Of course a vanishing point can be inside a perspective drawing, after all where do you put the vanishing point for one-point perspective?


I did set out to make a wireframe two-point perspective dungeon crawl in PICO-8 once. It worked (I recall the math for making the height of walls feel correct at different distances was hard to understand and I probably did resort to the CG approach), but I eventually abandoned it because I was spending way too much time trying to deal with occlusion and optimize it for PICO-8's CPU.

Since then the system constraints have changed and I would probably just make a tline() raycaster.


PICO-8 is such a fun piece of software. Getting lost in graphics optimization for it feels like it's more the norm than the exception =D


Getting lost in optimization is the opposite of actually making things, and as an artist, it's at the top of my reasons why PICO-8 isn't fun to use.


That's a fair point. PICO-8 never appealed to me for artistic expression, it's purely a programming challenge to cram as much functionality as possible into a tiny, tiny program.


My experiences with PICO-8 led me in the direction of making my own fantasy console...which then turned into a more thorough examination of computing itself.

I reached the conclusion that what I want is actually a lot of small source mediums that would cooperate well together and be bundled in arbitrary fashion as a single runtime engine, vs the "familiar computer platform but less" model that is intrinsic to the PICO-8 approach, which puts a lot of pressure on the platform to address all needs.

So I started working on that problem instead. What I have now is a binary format that might suffice as a common source document type. It's binary because it let me declare more precise meanings for the bits instead of offloading those meanings to a character parser, but it's also text-like because it doesn't enforce a lot of structure, it has notions of "lines" and "strings".

I'm now just about to where I can test it on media I might want to write and work in and see how well it edits.


Back in college I wrote a 3D rendering engine using what seems to be a similar method as the OP. I always thought the results looked a little strange... https://github.com/mpetrovich/CUB3D


Neat!


> Except that’s not “true” three point perspective

Is “strict 3-point perspective” a known concept, or did the author just invent it?

> That’s the easy-for-computer-graphics version of three point perspective

It was devised for people, not computers, over 600 years ago.

I can’t find any other references or why you’d want this odd infinite representation of space, it makes little sense except as a mathematical curiosity.


In drawing, it's just "three point perspective", and it's this. In computer graphics, every resource you look for explains three point perspective as a plane intersection (because then computers can draw it using a camera), which is not the same thing as pen and paper three point perspective. The word "strict" here is just the natural language version of the word, not a technical term. We're implementing three point perspective following the pen and paper rules, not the "let's subtly change how it works so a 3D camera can do this" rules =)

> I can’t find [...] why you’d want this odd infinite representation of space.

Like I say in the post: you almost certainly don't ever want it. When drawing three point perspective using pen and paper we typically pick vanishing points that aren't even on the paper itself (you tape down your paper and mark them on your desk instead) to get sweet looking faux architectural drawings while effectively working on a "crop" where the effects of exponential space are subtle, instead of super obvious, so it never gets weird. (heck, even adding secondary vanishing points that are further apart for working at different scales for different parts of your picture so people will never see the effect of exponential space is pretty common)

Similarly, you can get something "close enough" in any 3D software with a wide angle camera positioned close to your subject, so unless you're very explicitly setting out to do exponential space graphics, there is literally no reason for you to ever need, let alone implement this.

But it is fun to work out what the real behaviour is if we try to implement strict three point perspective on a computer, because we like programming puzzles, and (also as mentioned) there aren't any pages on the web that I've been able to find that cover this extremely niche projection so now there is at least one.


Do you have animations generated from this? I would love to see rotations in motion.

I feel like there is some game that could be created from needing to manipulate objects in this space.



Agree, none of this makes sense. Three point perspective is best explained by the concept of a pinhole camera with a planar film surface, and simply assumes that we are most interested in the theoretical directions of the 3 major axes, X Y and Z.

For it to really duplicate the geometry you see with your eyes, you need to look at the resulting photo from the spot where the pinhole is (relative to the paper, and presumably turning it upside down first). This of course means you need to close one eye (or show a different image to each eye).

But computer graphics, to my knowledge (see [1] below), almost never thinks of it in terms of vanishing points; this is a convenient concept (essentially, a shortcut) for humans who are drawing on paper. Do computers ever even calculate where the vanishing points are on the drawing plane (other than niche uses, such as an art composition app or the like)? I have never seen computer graphics software "care" about the concept of vanishing points, such as by having a variable that represents said point.

I feel weird having such a negative reaction to this article since I have used the author's bezier library for ages and have a lot of respect for his writing regarding beziers and related curves.

[1] I implemented view controls in CAD systems 25 years ago that are still in use today, and which concentrated especially on perspective views, so I have some knowledge of the subject. Also I learned perspective drawing skills in my industrial design education prior to that, and previous to that was into photography and mechanical drawing and obsessed over such geometrical stuff, starting 40 years ago now.


Computer graphics based on linear algebra can't do true vanishing points, so... no?

This isn't a tutorial on how to implement a useful three point perspective, this is an analysis of how three point perspective behaves if we don't make any computing concessions and examine the full space. You're never going to use that in 3D graphics, it looks terrible and I can't even think of a fun game mechanic that could be based on it. Just use a wide FOV camera in your software of choice and you'll get something much better. But it is a programming exercise that is worth running through.

Remember, when we draw perspective on paper, we never draw all the way up to the vanishing points, we keep them far away enough that every straight line we draw still behaves like a straight line. Things don't get crazy until you get close enough to the vanishing points for the exponential mapping to become really pronounced and start doing really wild things.

So obviously for an analysis of the space I'm going to draw something that is intentionally close enough to the vanishing points to show that insanity off =)


I don't know what your first sentence means. What are "true" vanishing points? What does it mean for computer graphics to "do" them?

Basic computer graphics (linear algebra etc) does indeed create images that adhere to the rules of perspective. Lines that are parallel in 3d space, when projected onto the drawing plane, will now all intersect at a point on the plane. Etc. Whether or not the program actually calculates where that point is (typically, it doesn't) is not relevant.

So what do you even mean by this? Have you defined vanishing points in some oddly obscure way that by definition can't be "done" by computer graphics?

Maybe if you started your article with explaining how 3 point perspective is simply based on pinhole camera geometry (which is closely approximated by most camera lenses), it would help convince us that you are not simply stating a bunch of nonsense. I'm sorry but I don't know what else to say. The article doesn't seem to understand the basic theory of how perspective works, or has some odd idea of what it is that doesn't align with how others think about it. If somehow this aided understanding or insight, great, but it doesn't. Instead it simply tells people "don't bother understanding this thing, it is too complicated", but for no good reason.

The article would do well to at least discuss this basic theory before delving into... weirdness.

https://en.wikipedia.org/wiki/Pinhole_camera_model


Of course computer graphics based on linear algebra can do true vanishing points. It wouldn’t look right at all if it couldn’t.

Perhaps what you’re missing is that 3D computer graphics actually uses 4×4 matrices acting on four-component homogeneous coordinates. The extra dimension allows perspective projections to be represented, and also allows us to assign coordinates to vanishing points (points at infinity). The usual finite points are represented by (x, y, z, 1), and the vanishing point of (say) lines parallel to the x axis is represented by (1, 0, 0, 0).

https://en.wikipedia.org/wiki/Homogeneous_coordinates#Use_in...


> Do computers ever even calculate where the vanishing points are on the drawing plane?

I don't see why they would, but the vanishing points for the X, Y and Z axes are just the homogeneous coordinates (1, 0, 0, 0), (0, 1, 0, 0) and (0, 0, 1, 0), and putting those through your normal view transform and projection will get you the corresponding positions on screen. Note that the fourth component is 0 to represent a point at an infinite distance.
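
A sketch of that calculation in plain Python, with assumed conventions that are not taken from any particular API: the camera looks down +z, the projection matrix simply copies z into w, and `view_matrix` / `to_screen` are illustrative names. It pushes (1, 0, 0, 0) through the view and projection matrices and compares the result with the image of a very distant finite point on the x axis:

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def view_matrix(yaw, pitch, translation):
    """World-to-camera transform: yaw about y, then pitch about x, then translate."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    tx, ty, tz = translation
    return [
        [cy, 0.0, sy, tx],
        [sp * sy, cp, -sp * cy, ty],
        [-cp * sy, sp, cp * cy, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def projection_matrix(focal_length=1.0):
    """Minimal perspective matrix: scales x and y, and copies z into w."""
    f = focal_length
    return [[f, 0.0, 0.0, 0.0], [0.0, f, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0]]


def to_screen(point_h, view, proj):
    """Apply view and projection, then do the perspective divide by w."""
    clip = mat_vec(proj, mat_vec(view, point_h))
    return (clip[0] / clip[3], clip[1] / clip[3])


proj = projection_matrix()
view = view_matrix(math.radians(-30), math.radians(20), (2.0, -1.0, 5.0))

# The point at infinity along the x axis has w = 0, so translation cannot move it.
vp_x = to_screen([1.0, 0.0, 0.0, 0.0], view, proj)

# A finite point very far along the x axis projects almost onto vp_x.
far_point = to_screen([1e6, 0.0, 0.0, 1.0], view, proj)

# Moving the camera (without rotating it) leaves the vanishing point in place.
moved_view = view_matrix(math.radians(-30), math.radians(20), (-7.0, 3.0, 40.0))
vp_x_moved = to_screen([1.0, 0.0, 0.0, 0.0], moved_view, proj)
```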


It looks like the OP has come up with their own variation on some sort of curvilinear perspective (which is a known concept in visual design, and may even be useful to model, e.g. distortion introduced by a lens) but this is not how actual 3D projection works.


Very much not, that's the whole point of this post. This is what happens when you take the maths associated with 2 and 3 point perspective, and work out what the non-cartesian properties actually mean if you were to implement it "the way it really is" on a computer.

At which point you should go "this is silly, let's never do this" because: it's really silly, let's never do this. And now we know why.


Well, you can do actual 3D projection using a fisheye lens! It might even be a worthwhile thing to do.


Isn't regular point perspective that all axis-aligned lines meet at one of the three points, whereas here all lines whatsoever eventually converge there?

This reminds me of the mathematical trick to make a "hollow earth" work, with us living on the inside: All rays of light are bent. You can never see the curvature of the earth (or it seems that we are living on the outside). The stars that seem to be at ~infinity are at the center of the sphere, and the sun is a ball of fire orbiting around the center, and so on.


Almost: all equal ratio lines (x=z+c, x=y+c, y=z+c where c is some constant) head off in a straight line towards their respective horizons, but all other non-axis-aligned lines converge at the vanishing points.



