
Employee of Cycorp here. Aside from the current ML hype-train (and the complementary unfashionability of symbolic AI), I think the reason symbolic AI doesn't get as much attention is that it's much more "manual" in a lot of ways. You get more intelligent results, but that's because more conscious human thought went into building the system. As opposed to ML, where you can pretty much just throw data at it (and today's internet companies have a lot of data). Scaling such a system is obviously a major challenge. Currently we support loading "flat data" from DBs into Cyc - the general concepts are hand-crafted and then specific instances are drawn from large databases - and we hope that one day our natural language efforts will enable Cyc to assimilate new, more multifaceted information from the web on its own, but that's still a ways off.

I (and my company) believe in a hybrid approach; it will never be a good idea to use symbolic AI for extracting structured data from speech audio or raw images, for example. But once you have those sentences, or those lists of objects, symbolic AI can do a better job of reasoning about them. Paired together, ML and symbolics can cover each other's weaknesses.



Many aspects of this topic deserve extensive study. For example, ML is all about generalizability; ever since deep learning swept the field, it seems like numeric representations (tensors) always yield better generalizability than symbolic representations. Is that true? Or, under what circumstances does symbolic representation help?

In the past couple of years, several papers have shown that predefined symbolic relationships can improve over vanilla DL. For example, recognizing a picture of a numerical arithmetic equation and computing the result is very difficult for a neural network to parametrize over the pixel space.
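
To make that division of labor concrete, here's a minimal sketch: an ML model reads each character, and a symbolic evaluator computes the exact answer. The vision model is stubbed out so the snippet runs on its own; everything here is illustrative, not drawn from any particular paper.

    OPS = {"+": lambda a, b: a + b,
           "-": lambda a, b: a - b,
           "*": lambda a, b: a * b}

    def classify_symbol(patch):
        # Stand-in for a trained vision model that maps an image patch
        # to a character label; stubbed so the sketch is self-contained.
        return patch

    def evaluate_equation(patches):
        # Perception (ML's job): pixels -> discrete tokens.
        tokens = [classify_symbol(p) for p in patches]
        # Reasoning (the symbolic job): exact arithmetic over the tokens,
        # evaluated left to right (no operator precedence, for brevity).
        result = int(tokens[0])
        for op, digit in zip(tokens[1::2], tokens[2::2]):
            result = OPS[op](result, int(digit))
        return result

    print(evaluate_equation(["3", "+", "4", "*", "2"]))  # prints 14

The point is that the arithmetic is computed exactly by the symbolic half, no matter how unusual the equation, while the network only has to solve the perception problem it's good at.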

Moreover, statistical generalizability is currently all derived from concentration inequalities. This means the knowledge encoded by a neural network depends on the statistical distribution of the data: if two people have two different datasets, they may end up with two very different models. Symbolic generalizability is quite different. A rigorous mathematical proof holds as long as we all live in the same world with the same set of axioms. For example, no person can be in two physical places at the same time; or, if A causes B, then A has to occur before B. This kind of knowledge can't be learned through statistical methods with no symbolic priors: we postulate the logic first, then verify it through observation.
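
That "two places at once" axiom can be stated as a hard symbolic constraint rather than something a model is hoped to learn. A tiny sketch, with all facts and names invented for the example:

    # Facts are (person, place, time) triples; names are made up.
    facts = [
        ("alice", "austin", "09:00"),
        ("alice", "boston", "09:00"),  # violates the axiom below
        ("bob",   "austin", "09:00"),
    ]

    def violations(facts):
        # Axiom: a person is in at most one place at any given time.
        seen = {}
        for person, place, time in facts:
            prior = seen.setdefault((person, time), place)
            if prior != place:
                yield person, time, prior, place

    for v in violations(facts):
        print("axiom violated:", v)  # ('alice', '09:00', 'austin', 'boston')

No amount of data is needed for this check to hold; it is true by construction, for any individuals, seen or unseen.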

Lastly, the problems that statistical learning handles well so far are essentially interpolations of the collected data. Whether statistical models can extrapolate well is still an open question. Would inductive logic programming work better in this scenario?
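
One way to make the ILP question concrete is a brute-force clause search over background facts. This is only a toy flavor of the idea (real ILP systems such as Progol or Metagol are far more sophisticated), and every predicate and name below is invented:

    from itertools import product

    # Background knowledge and examples; all names invented.
    background = {
        "parent": {("ann", "bob"), ("bob", "carol"),
                   ("dee", "eve"), ("eve", "fay")},
        "likes":  {("ann", "carol"), ("bob", "ann")},
    }
    positives = {("ann", "carol")}                 # grandparent pairs
    negatives = {("ann", "bob"), ("bob", "ann")}
    people = {p for rel in background.values() for pair in rel for p in pair}

    def holds(clause, x, z):
        # clause = (p, q) encodes: target(X,Z) :- p(X,Y), q(Y,Z).
        p, q = background[clause[0]], background[clause[1]]
        return any((x, y) in p and (y, z) in q for y in people)

    # Keep the first two-literal chain clause covering all positives
    # and no negatives.
    for clause in product(background, repeat=2):
        if all(holds(clause, *e) for e in positives) and \
           not any(holds(clause, *e) for e in negatives):
            print("induced: target(X,Z) :- %s(X,Y), %s(Y,Z)." % clause)
            print("unseen pair:", holds(clause, "dee", "fay"))  # True
            break

Note the extrapolation: the induced rule immediately applies to ("dee", "fay"), a family that contributed no training examples at all, which is exactly the behavior that pure interpolation over data points doesn't give you.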

Symbols are the signature of human intelligence. All of our scientific breakthroughs are encoded in symbols, even the DL (deep learning) stuff. Symbolic methods won't be completely replaced by the numerical paradigm any time soon.


I've been following Cyc since the Lenat papers in the 80s. Wondering what happened to OpenCyc, if you guys changed your thinking about the benefits of an open ecosystem, and if there are any future plans there?


I've only been here for a couple years, so my perspective on that is limited. My understanding is that we still have some form of it available (I believe it's now called "ResearchCyc"), but there isn't a lot of energy around supporting it, much less promoting it.

As to why that is, my best guess is a combination of not having enough man-hours (we're still a relatively small company) and how difficult it has historically been for people to jump in and play with Cyc. There could also be a cultural lack of awareness that people still have interest in tinkering with it, which is something I've thought about bringing up for discussion.

As to the accessibility issue, that's been one of our greatest hurdles in general, and it's something we're actively working on reducing. The inference engine itself is something really special, but in the past most of our contracts have been pretty bespoke; we essentially hand-built custom applications with Cyc at their core. That isn't because Cyc wasn't generic enough; it's because Cyc was hard enough to use that only we could do it.

We're currently working to bridge that gap. I'm personally part of an effort to modernize our UIs/development tools and to add things like JSON APIs, for example. Others are working on much-needed documentation, and on sanding off the rough edges to make the whole thing more of a "product". We also have an early version of containerized builds. Currently these quality-of-life improvements are aimed at improving our internal development process, but many of them could translate easily to opening things up more generally in the future. I hope we do so.


Good write-up, confirms my suspicions. Thanks for your thoughts.


There's an official statement of sorts here: https://www.cyc.com/opencyc/

That meshes with what I've heard at conferences, that Cyc management was worried people were treating OpenCyc as an evaluation version of Cyc, even though it was significantly less capable, and using its capabilities to decide whether to license Cyc or not. The new approach seems to be that you can get a free version of Cyc (the full version) for evaluation or research purposes, and the open-source version was discontinued.


What kind of experiments have you guys done that combine symbolic and statistical/ML methods? It sounds like an area ripe for research.


I've built Bayesian non-parametric methods that performed inference over certain formulae, FOPL subsets, or even Turing-complete programs. IMHO, it's a very exciting field that will bloom in the medium term.


This sounds fun. Do you mind giving a longer description?


I can't say much due to contractual constraints. But I can point you to a good source to get started:

https://v1.probmods.org/learning-as-conditional-inference.ht...
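
The linked chapter frames learning as conditional inference in a probabilistic programming language; the same idea fits in a few lines of plain Python. Here a uniform prior over a coin's weight is conditioned on observed flips via rejection sampling (a pedagogical sketch, not the book's code):

    import random

    observed = [True, True, True, False, True]  # the flips we condition on

    def posterior_samples(n):
        # Uniform prior over the coin's weight; rejection sampling keeps
        # only the weights whose simulated flips match the observations.
        accepted = []
        while len(accepted) < n:
            weight = random.random()
            flips = [random.random() < weight for _ in observed]
            if flips == observed:
                accepted.append(weight)
        return accepted

    samples = posterior_samples(5000)
    print(sum(samples) / len(samples))  # ~0.71, the Beta(5, 2) posterior mean

"Learning" here is nothing but asking what parameter values are consistent with what was observed, which is why it composes so naturally with symbolic structure.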


I know we use ML to "grease the wheels" of inference; i.e., Cyc gains an intuition about what kinds of paths of reasoning to follow when searching for conclusions. I don't know of any higher-level hybridization experiments; I think we only have one ML person on staff and mostly our commercial efforts focus on accentuating what we can do that ML can't, so we haven't had the chance to do many projects where we combine the two as equals.


To clarify the above:

"Cyc gains an intuition about what kinds of paths of reasoning to follow when searching for conclusions"

The possible paths come purely from symbolics. But that creates a massive tree of possibilities to explore, so ML is used simply to prioritize among those subtrees.
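
Not Cyc's actual machinery, but the pattern described is easy to sketch: the symbolic side enumerates the legal expansions, and a learned scorer only decides which branch to try first. Here `score` is a stub standing in for a trained ranker:

    import heapq

    def expand(node):
        # Symbolic step: enumerate the logically valid successors.
        return [node + (i,) for i in range(3)] if len(node) < 4 else []

    def score(node):
        # Learned step, stubbed: a trained model would estimate how
        # promising this subtree is; here "small sums look promising".
        return -sum(node)

    def guided_search(root, is_goal):
        frontier = [(-score(root), root)]
        while frontier:
            _, node = heapq.heappop(frontier)  # best-ranked subtree first
            if is_goal(node):
                return node
            for child in expand(node):
                heapq.heappush(frontier, (-score(child), child))

    print(guided_search((), lambda n: n == (0, 0, 0, 0)))

The key property: a bad scorer only makes the search slower, never unsound, because every candidate still comes from the symbolic expansion step.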


Basically you're learning the heuristic? Do you have any public information on that? That's something I've always wanted to work on, and I really think it could be a shortcut to AGI...


Hmmm.

> I (and my company) believe in a hybrid approach

> I don't know of any higher-level hybridization experiments

That contradiction, and the admission that Cyc only has "one ML person on staff", signal to me, an outsider, that the belief in parity between Machine Learning and "Symbolic" might be predicated more on faith than on reason.


I would say "more on theory than on empirical evidence". It's entirely reasonable; the way your eye "thinks" is entirely different from how your higher cognition "thinks", but you need both. If you want something more concrete, here's a recent experiment done by MIT in this realm:

https://news.mit.edu/2019/teaching-machines-to-reason-about-...

We aren't an ML shop ourselves; we don't claim to be. Given that we have around 100 people, we focus on what we have that's special instead of trying to compete in an overcrowded market. The idea of hybrid AI is something we see as a future role for us to play in the bigger picture of machine intelligence.


Wow, I should have read further ahead in the comments before dumping my first thoughts [1] as a standalone comment. How do you interface between the distinct parts of your machinery? Do you use deeper-level neural network representations/activations as symbol embeddings?

[1] https://news.ycombinator.com/item?id=19717680


We're beginning to run up against what I may not be allowed to talk about :)

But I will affirm that Cyc is fundamentally symbolics-based. We don't position ourselves as anti-ML, because it's seriously good at a certain subset of things, but Cyc would still be fully functional without any ML in the picture at all.



I wish Eurisko got more love back in the day...

I'd love to experiment with an automated planner (a good old symbolic AI technique) but use deep learning to design the heuristic. It feels like a lot of reinforcement learning techniques are getting close to this kind of thing.

AlphaGo was a rough implementation of that. Do you know of any efforts from symbolic AI in that respect? A minimal sketch of what I mean is below.
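
Sketching the pairing I have in mind: a classical A* planner whose heuristic slot is filled by a learned model. Here `learned_h` is stubbed with Manhattan distance on a small grid; a real experiment would swap in a trained network's prediction. Purely illustrative:

    import heapq

    GOAL = (3, 3)

    def learned_h(state):
        # Stand-in for a deep-learned cost-to-go estimate; a real
        # experiment would call the trained model here instead.
        return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])

    def neighbors(state):
        x, y = state
        steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
        return [(x + dx, y + dy) for dx, dy in steps
                if 0 <= x + dx <= 3 and 0 <= y + dy <= 3]

    def plan(start):
        frontier = [(learned_h(start), 0, start, [start])]
        best_g = {start: 0}
        while frontier:
            _, g, state, path = heapq.heappop(frontier)
            if state == GOAL:
                return path
            for nxt in neighbors(state):
                if g + 1 < best_g.get(nxt, float("inf")):
                    best_g[nxt] = g + 1
                    heapq.heappush(frontier, (g + 1 + learned_h(nxt),
                                              g + 1, nxt, path + [nxt]))

    print(plan((0, 0)))  # a shortest path from (0, 0) to (3, 3)

The planner stays symbolic and verifiable; the learned part only orders the search, much like the policy/value networks guiding tree search in AlphaGo.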

And do you still accept candidates? :-)


Doug loves to reminisce about that battling ships game; if I didn't know better I'd think he was prouder of that tournament than of Cyc itself ;)

> And do you still accept candidates? :-)

If you mean job candidates, then yes, we definitely do!


Well, does Cyc have similar success at outperforming humans?

I'm employed now, but I'll probably send a CV when I'm looking for a change!


I honestly don't know if we've done any comparable experiments with it; we still operate somewhat like a research shop, but we've been dependent on real-world contracts since the 90s, so we haven't had as much opportunity for pure research. Not for lack of interest, of course.


I'd love to see you face off against AlphaStar in a StarCraft tournament. If you have Doug's ear, pitch it to him; he may miss playing with battlecruisers :-)



