That's a big question that I'm not going to suggest I can answer. But I do have some comments about it.
To provide some of my perspective (or bias), I'm going to talk briefly about the Artificial Intelligence in my background.
My research background is in Artificial Intelligence, with a graduate degree from the University of Illinois at Urbana-Champaign. It's a bit dated now, seeing as I graduated in 1988 - wow, 20 years already. After a few years working in artificial intelligence research and development, I've been in the commercial world since.
I haven't noticed a lot of change, although it is now hard to find people talking about or working on artificial intelligence as such. It no longer seems to be the main goal - aspects of it survive behind the scenes, but in specialized and limited ways. I think this is partly because when one talks about artificial intelligence one thinks of trying to mimic the human, which is still a big task and not immediately relevant to most companies given the importance of their bottom line.
In my background I've seen two basic types of artificial intelligence: the computational and the symbolic. The computational type can be exemplified by neural nets and fuzzy logic - knowledge is defined in an implicit manner that is very difficult to introspect upon. The symbolic type is best exemplified by expert systems. There have been some successes with expert systems, but many have found them difficult to maintain as they grow.
Without suggesting that either type of AI is better, my interest is in symbolic programming and the symbolic approach to AI and to knowledge. It seems to me that in many cases people reason symbolically - it's a more abstract level of the brain, I suppose.
When one goes out on the weekend to run a bunch of errands, one typically goes through a planning process to determine what order to do them in. There are obviously different goals involved for different people - but those goals are, in a sense, constraints on the planning process.
When a salesperson asks a customer questions to help select a product, the salesperson is providing expert advice. Similarly, when a doctor asks questions about symptoms, the doctor is reasoning toward a diagnosis.
These are the sorts of reasoning I am interested in (as opposed to other AI topics such as vision, pattern recognition, and robotics). These types of reasoning are nicely modeled using symbolic programming.
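As a toy illustration of the errand-planning case above (my own sketch, not any particular AI system - the errands and constraints are invented), the goals-as-constraints idea can be written in a few lines of Python: precedence constraints prune which orderings of the errands are acceptable.

```python
# A minimal sketch of planning as constrained ordering. Each constraint
# (a, b) means errand a must happen before errand b.
from itertools import permutations

errands = ["bank", "hardware store", "groceries", "post office"]

constraints = [
    ("bank", "hardware store"),   # get cash before buying supplies
    ("post office", "groceries"), # frozen food should be bought last
]

def satisfies(plan, constraints):
    """True if every (earlier, later) pair appears in that order."""
    return all(plan.index(a) < plan.index(b) for a, b in constraints)

def plan_errands(errands, constraints):
    """Return the first ordering of the errands meeting all constraints."""
    for plan in permutations(errands):
        if satisfies(plan, constraints):
            return list(plan)
    return None  # over-constrained: no acceptable ordering exists
```

Brute-force enumeration is fine for a weekend's worth of errands; the point is only that the goals act as a filter on the space of plans, not that this is how a real planner scales.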
So how does all this relate to knowledge? In my opinion this symbolic AI is all about knowledge or reasoning. Computational AI seems to me like a different aspect that isn't as closely related to knowledge.
Knowledge is what I know and the ability to reason with what I know (without trying to get philosophical). What I am interested in is the latter part - the reasoning aspect of knowledge. If what I know is my data, then reasoning is the type of knowledge that processes that data in various ways.
By that definition the reasoning form of knowledge includes almost any computer program. There is no problem with that (compilers were once considered AI). Some have said that the better we understand a particular AI program, the less it is considered AI. In that sense AI is always pushing the boundary of reasoning in a computer program.
Data is all about us. The web is full of data. A textbook is data. There is good data and bad data. And this is the fundamental problem: too much data and no ability to process it. If you want to find out why your fish has white spots, you have to search the web and weed through all the data, trying to assimilate the good data and make sense of it. Or you can read a book on fish diseases. Or you can go to the fish store, where the people ask you questions to help diagnose the problem.
I will suggest that data, by itself, is fairly useless; data is useful when reasoned upon. Knowledge on the web is Wikipedia, or Google's Knol, or about.com. We go to Google search in an attempt to find the knowledge we need.
And this is where it starts to get interesting. Google is all about information - data. But they want to know what it means, and this is hard. Data can mean multiple things; it's all about how it's being used during some reasoning. Google wants to provide accurate search results with the information they have collected and analyzed. It is unclear, in its current guise, whether this is possible, since there isn't necessarily enough context (or situation) provided in a search box for Google to provide the right results, no matter how thoroughly they have analyzed the data - because they don't know which reasoning on the data you are interested in. This is an interesting and potentially difficult problem, but one that Google is working on.
Knowledge isn't just data. I can look out the window and see a blade of grass, a tree, a deck that is falling apart and the road. Knowledge is data interrelated and reasoned about in a large variety of ways.
Back to Wikipedia and Knol. These articles are knowledge: they have already analyzed a large amount of data and made sense of it with essentially one form of reasoning. There are likely to be other articles that use a different set of data, even one intersecting with the first, that make sense of the data for a different analysis.
What is a textbook but a large amount of data organized in a particular manner for the purpose of understanding the subject matter, so that you can reason about that sort of data on your own in other contexts or situations?
Knowledge is data that has been organized, and in that sense some reasoning has taken place to organize it. Reasoning is essential to knowledge.
And this is the current state of affairs on the web: a very large amount of pre-organized data (or not). I need to solve a problem, and now I need to find the right set of organized data to help me do so - or multiple similar sets, so that I can find the solution myself by further reasoning on the data. This is the very difficult part of using the web. The organized data is not necessarily designed for my situation, and I have to do a lot of work to apply it to my situation. In harder cases, this requires a lot of learning. And when the problems for which I need an answer become really hard, I typically do not have the time and will need to consult an expert.
Expert knowledge is the ability to process data on some subject matter in various ways to understand what it means. The expert has done a lot of learning in order to be able to do this. Salt water aquariums are interesting in this regard: there are few experts, and the available books are incomplete - not much is really known about how salt water aquariums work. The aficionado typically must learn by experience, or, if so inclined and able, work in a fish store to gain experience faster. Picking a product from a store, by contrast, tends to be straightforward with some salesperson help.
I will now submit that knowledge is data! Knowledge is the ability to reason on data, and this ability can be encoded, stored, and executed. Doing so makes it data that can then be used in various ways (I won't pretend to know what all those ways are). Now we are talking about expert systems, case-based reasoning systems, decision tree systems, and the like (even spreadsheets - anything that processes data).
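As a hedged sketch of what "knowledge encoded as data" can mean (the rules below are illustrative, not taken from any real diagnostic system), if-then rules can be stored as plain data structures and executed against a case:

```python
# Knowledge-as-data sketch: each rule is a dict with a condition part
# ("if") and a conclusion ("then"). The rules themselves are invented
# for illustration - they are data, and reason() executes them.
rules = [
    {"if": {"white_spots": True, "scratching": True},
     "then": "possible ich (white spot disease)"},
    {"if": {"white_spots": True, "scratching": False},
     "then": "inconclusive - gather more data"},
]

def reason(case, rules):
    """Return the conclusion of every rule whose conditions all match."""
    conclusions = []
    for rule in rules:
        if all(case.get(key) == val for key, val in rule["if"].items()):
            conclusions.append(rule["then"])
    return conclusions
```

Because the rules are data, they can be stored, shared, and edited like any other data on the web - which is the point of the argument above.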
This is what is now needed on the web. In addition to nicely analyzed data in places such as Wikipedia and Knol, we need the ability to encode the knowledge we have and make it available. The online merchant with 50 DMT sharpening stones needs an advisor box on the web page that asks me questions about my situation and needs, and suggests the 2 or 3 sharpening stones most likely to be what I want - just what a salesperson would do. The specific applications for knowledge are myriad: how to put in a window, when and how to plant a tree, what the white spots on my fish are, picking a new dog for your home.
Obvious areas are medical diagnosis systems. These can be large and complex - I am not clear whether they must be - and there are a lot of specific web sites working on that problem. Another area we have seen is large-scale international securities trading.
Knowledge is the ability to analyze data in various situations in order to reach conclusions. This knowledge may be codified and available on the web just as other data is available.
This is what the Jnana Logic Server is all about.
Monday, September 15, 2008
Open Beta and What is JLS anyway?
The Jnana Logic Server (JLS) is available for anyone to build their own smart web applications. JLS is used entirely from your browser - no applications are installed on your own computer.
You can build a decision tree application using the QART (Question-Answer-Result-Tree) editor, or a case-based reasoning (CBR) application using the CBR editor. Both kinds of applications may be extended by what we call special cases: situations where someone thinks a different result should be provided at the end of the reasoning - perhaps after some additional questions. In this way your application can be dynamically extended, while you as the author retain the ability to manage these special cases and update your application.
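To make the idea concrete, here is a hypothetical sketch (not actual JLS code - all names, questions, and results are invented) of a QART-style tree, with special cases modeled as result overrides keyed on the path of answers taken:

```python
# QART-style sketch: a node either asks a question (with answers
# mapping to child nodes) or holds a result. Special cases override
# the result for a particular answer path.
tree = {
    "question": "Is the fish scratching against objects?",
    "answers": {
        "yes": {"result": "Suspect ich; treat the tank."},
        "no":  {"result": "Observe for 48 hours."},
    },
}

special_cases = {
    # Someone decided the "no" path deserves a different result.
    ("no",): {"result": "Check water quality first."},
}

def run(tree, answer_fn, special_cases=None):
    """Walk the tree, calling answer_fn(question) for each answer;
    apply a special-case override if one matches the answer path."""
    path, node = [], tree
    while "result" not in node:
        answer = answer_fn(node["question"])
        path.append(answer)
        node = node["answers"][answer]
    if special_cases and tuple(path) in special_cases:
        node = special_cases[tuple(path)]
    return node["result"]
```

Keeping overrides separate from the tree is one plausible way an author could review and fold special cases back into the main application, as the paragraph above describes.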
These JLS applications can then be used as iGoogle gadgets, or on Facebook, or linked to in any way you please from your own web pages. JLS is great for building applications for advice, how-to, or product selection. How many times does one go to an online store and find too many options or products available? Wouldn't it be nice if there was an automated advisor available to ask you some questions and help you choose the 2 or 3 options that are relevant to you? Much as a salesperson in a brick-and-mortar store might help.
The possibilities are open to the imagination.
A more technical view of JLS
This version of JLS is hosted on the Google App Engine (GAE) hosting environment. GAE is still in a preview status: currently all usage of it is free, but subject to various resource usage quotas. It's been mentioned that a paid version will be available before the end of the year. Until then it is entirely possible that JLS maxes out some resource quota at various times; we will be doing everything we can to avoid that, and if anyone notices any problems, please let us know.
This version of the Jnana Logic Server on GAE currently supports decision trees and case-based reasoning, though the implementation is more general. Currently, rules and the left-hand sides of rules (situations) are used to support the operation of decision trees and of a JLS application. There are also JLS engines we call alerts and orientations that are used for some simple output.
JLS on GAE is a small subset of our general artificial intelligence JLS version, which has been developed over the last 12 years. We have taken a general approach to declarative programming that doesn't require a JLS application author to understand imperative programming (control structures). In all cases we take a completely graphical approach to authoring - no actual programming syntax is encountered.
We take the point of view that one may use the type of logic one needs for a JLS application: basic mapping rules, decision trees, CBR, constraint reasoning, distributed applications, natural language processing, factor lists, even spreadsheets. Logic can be thought of simply as taking some data as input and writing outputs to some data. Writing a rule doesn't require knowing anything about writing a constraint, for example. All logic can be worked on independently, and in fact one could have different bits of logic that produce outputs for the same data. How the system decides what to do is managed by JLS's prioritization algorithms, which operate much as a human might ask questions of a user in order to reach a goal result. A human would try to ask the smallest number of questions, but as some questions are answered, different options or bits of logic may become relevant and different questions are then asked. It's all very complex :-)
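One simple way to sketch that prioritization idea (this illustrates the general notion only - it is not JLS's actual algorithm, and the candidates are invented) is to greedily ask the unanswered question relevant to the most remaining candidates, narrowing the candidates after each answer:

```python
# Advisor sketch: each candidate result depends on some attributes.
# We ask about the attribute the most remaining candidates care
# about, then narrow the candidates with each answer.
candidates = {
    "coarse stone": {"use": "repair", "budget": "low"},
    "fine stone":   {"use": "finish", "budget": "low"},
    "stone set":    {"use": "repair", "budget": "high"},
}

def next_question(candidates, known):
    """Pick the unanswered attribute relevant to the most candidates."""
    counts = {}
    for attrs in candidates.values():
        for attr in attrs:
            if attr not in known:
                counts[attr] = counts.get(attr, 0) + 1
    return max(counts, key=counts.get) if counts else None

def narrow(candidates, known):
    """Keep only candidates consistent with every answer so far."""
    return {name: attrs for name, attrs in candidates.items()
            if all(attrs.get(k) == v for k, v in known.items())}
```

As the paragraph above notes, each answer can change which questions are still worth asking - here that falls out of re-running `next_question` on the narrowed candidate set.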
Our full JLS platform also includes more inputs and outputs. As usual, these are modeled as logics inside the system, but the application author doesn't need to understand that in most cases. They include REST and web service logics, database integration logic, chart and table logic utilities, HTML report outputs, PDF logic (filling PDF forms automatically), email logic (in and out), and much more. Many of these logics can become quite specific. Part of the point is that all these logics are essentially independent of each other, so new logics can be added as necessary (whether for some specific need or as a generally useful addition to JLS). Due to the general way logic interacts with data, this model is effective as a general-purpose declarative programming environment while also providing artificial intelligence tools to the everyday JLS application author.
Who could ask for more :-?
JLS on GAE is a subset of our full JLS system for two reasons. First, we are interested in simplifying JLS to the point that it becomes more attractive and easier for anyone to use. It's possible we have simplified too much and will need to add some of our features back.
The other reason is that this is a re-implementation in the Python language under the restrictions that GAE imposes. This re-implementation includes the core of our JLS system with only a couple of features not implemented. The interesting aspect of this is that we must keep response time in the neighborhood of a third of a second. If and when we add more logic engines or core features to JLS on GAE, we will watch closely how that affects our CPU performance.
Some may find it interesting that we did this in Python; our full JLS is 100% Java. We have noticed that Python lends itself readily to symbolic programming. We (OK, I) have been pleasantly surprised at the simplicity and effectiveness of using Python for symbolic programming, and find that many of Python's language features map readily to the kinds of programming we need in JLS. We've found that the need to build support utilities for JLS in Python has been significantly less than in Java - and less code generally means better maintainability. And I dare say that some aspects of programming in Python are very much like the days of programming in Common LISP.
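A small illustration of why Python feels LISP-like for this kind of work (my own example, not JLS code): symbolic expressions as nested tuples, processed by recursive case analysis - the classic symbolic differentiation exercise from the Common LISP days.

```python
# Expressions are nested tuples like ("*", "x", "x"); strings are
# variables, numbers are constants - much like LISP s-expressions.
def diff(expr, var):
    """Symbolic derivative of expr with respect to var (unsimplified)."""
    if expr == var:
        return 1
    if not isinstance(expr, tuple):   # constant or other variable
        return 0
    op, a, b = expr
    if op == "+":                     # sum rule
        return ("+", diff(a, var), diff(b, var))
    if op == "*":                     # product rule
        return ("+", ("*", diff(a, var), b), ("*", a, diff(b, var)))
    raise ValueError("unknown operator: %r" % op)

# d/dx (x * x) = 1*x + x*1
print(diff(("*", "x", "x"), "x"))
```

The data-is-code flavor here - walking and building the same tuple structures - is the kind of symbolic programming the post is about, and Python handles it with very little supporting machinery.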