I'm still waiting for someone to do a modal abstract syntax tree editor. You would only type text when naming something; the other mode would be for navigating the AST.
Have you used paredit-mode in Emacs with a Lisp dialect? Getting proficient with it can feel a lot like what you describe. Paredit lets you navigate and edit the tree structure of Lisp code pretty effectively. It's not inherently a modal paradigm, but I've used it with evil-mode successfully. I'd imagine what you describe could be a refinement of that technique. Lisp lends itself well to this type of editing due to its lack of syntax. Other languages are more difficult.
Visual scripting is nice in that you can have a separate presentation layer on top of the language syntax. I think there's a lot of potential for improved readability that way. Most visual scripting depends on mouse actions, which are kind of slow, and I suspect that's a big reason why programmers dislike it so much.