Proof Markup Language (PML) Primer

KSL Working Draft 28 November 2005

This version:
http://iw.stanford.edu/doc/WD/2005/pml-primer-20051128.htm
Latest version:
http://iw.stanford.edu/doc/TR/pml-primer
Editors:
Deborah L. McGuinness (Stanford University ) dlm@ksl.stanford.edu
Paulo Pinheiro da Silva  (Stanford University) pp@ksl.stanford.edu
Li Ding (University of Maryland Baltimore County) dingli1@umbc.edu

Copyright ©2005 Stanford University Knowledge Systems, Artificial Intelligence Laboratory.


Abstract
This document provides a brief introduction to the Proof Markup Language (PML), which is part of the Inference Web (IW) infrastructure for explanations. PML was designed to function as a proof interlingua. It can be used to encode justification collections in a declarative and interoperable format. PML may be used to support applications that need to interoperate and provide access information to components that require justifications for answers. PML has been used in a number of government and corporate projects to encode justification collections. Users and agents can benefit of justification collections encoded in PML by using the Inference Web framework. IW provides a number of web tools and off-line tools for browsing, manipulating, checking, searching, combining, summarizing and rendering PML. When employing engines to generate PML, users may use the framework to validate PML encodings. This document introduces PML by motivating its need through use of simple examples. The examples are functional and can be reused to test IW tools.

Contents

Introduction

The Proof Markup Language consists of a specification of terms for encoding collections of justifications to answers in question answering scenarios. Some applications will view each justification in the collections as a formal proof and others may view each justification as an information manipulation trace describing both a reasoning process and provenance information that led a system to make a conclusion.

With PML, collections of justifications can be encoded in the interlingua, and then be shared for interoperation and information integration among heterogeneous distributed applications especially for question answering systems.

Motivating Example

Many applications work in settings where answers will not be trusted or used unless they can provide at least one justification that supports the answer that end users (people or programs) can inspect. Consider the following example:

Question & Query

John is visiting Stanford and is aware of an online tour guide agent that has information about local restaurants. John is aware of a seafood restaurant nearby called Tony's, so he asks a question in English.

 What type of food is Tony's specialty?

Upon receiving the question, the agent translated it into an internal representation, i.e. a Knowledge Interchange Format (KIF) query:

 (type TonysSpecialty ?x) 

Answer

Then the agent will run a question-answering process and come up with an answer in KIF.

 (type TonysSpecialty ShellFish) 

It then translate the answer above into English and present that to the end user.

Tony's specialty is Shellfish.

Justification Collection

Upon John's request for justifying the answer, the agent can provide a justification collection where the answer is the final conclusion of the collection. The collection may consist of one or more justifications and is encoded in PML. We use justification collections to encode justifications because one answer may have many justifications depending, for example, on the variety of information sources and the agent's run-time inference processes. In this scenario, we will first show two representative types of justification collections where only only one justification can be extracted for the the final conclusion.

Using PML to Encode Justification Collections

The justification collection for an answer is a directed acyclic graph of node sets connected by inference steps. Each node set has a conclusion and any number of inference steps including zero. An inference step is an application of an inference rule on the the conclusions of other node sets, which are called the immediate antecedents of the step. The conclusion of the node set hosting an inference step is the conclusion of applying the inference rule of the step on the immediate antecedents of the step.

A justification collection can encode one or more justifications for a given answer. The answer is the final conclusion of a collection, which is the conclusion of the node set that is not an antecedent of any inference step in the collection. A collection is said to have a single justification for the answer if every node set in the collection has exactly one inference step. If every node set in the collection has at least one inference step and ate at least one node set in the collection has two or more inference step, it is said that the collection has alternative justifications for the answer.

We will first discuss how to use PML to encode justification collections with a single justification for the answer. Then we will explain how to encode justification collections with alternative justifications for the answer. Finally, we will describe how to encode information about the query and possible question originating the answer justified by the collection.

Encoding Collections With A Single Justification for the Answer

In this section we discuss multiple ways of encoding single justifications for the answer in our scenarion. We first present a way of encoding a trivial justification for the answer and then we present a more complex justification for the same answer.

Answer Justified by a Direct Assertion

In what follows, we show how to represent justification collections for the conclusion '(type TonysSpecialty SHELLFISH)' that have been directly asserted. We have created several justification collections supporting the same conclusion with increasing amounts of provenance information.


A short justification

The justification means:

'(type TonysSpecialty SHELLFISH)' has being directly asserted.

The justification can be encoded in the following PML: (browse it in IWBrowser).

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans1/answer.owl#answer">

    <iw:hasConclusion>(type TonysSpecialty SHELLFISH)</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF" />
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told" />
        </iw:InferenceStep>
    </iw:isConsequentOf>

</iw:NodeSet>

NOTE: the font colors in the gray boxes are used in the following manner: (i) black for new content; (ii) red for highlighted new content; and (iii) green for visited content.

One of the most basic elements in PML is a NodeSet. A NodeSet is used to encode information about an information manipulation process that was used to derive a conclusion.

This example shows a simple instance of NodeSet which captures one justification for information that has been directly asserted. A NodeSet usually has the following properties:

An instance of InferenceStep encodes one justification for the conclusion of NodeSet. It usually has the following properties:


A refinement of the above justification (with source usage information)

The justification means:

(type TonysSpecialty SHELLFISH)' has been directly asserted in Stanford's Tony's Specialty Example as of 10:30 on 2005-10-17

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans1/answer.owl#answer">
    <iw:hasConclusion>(type TonysSpecialty SHELLFISH)</iw:hasConclusion>

    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF"/>
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told" />

            <iw:hasSourceUsage>
                <iw:SourceUsage>
                    <iw:usageTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-10-17T10:30:00Z</iw:usageTime>
                    <iw:hasSource rdf:resource="http://inferenceweb.stanford.edu/ksl/registry/PUB/STE.owl#STE"/>
                </iw:SourceUsage>
            </iw:hasSourceUsage>
        </iw:InferenceStep>

    </iw:isConsequentOf>
</iw:NodeSet>

This example expands the example above by adding additional source information. The PML utilizes the SourceUsage construct. An instance of SourceUsage caputures the source and how the source has been used, and it usually has the following properties:


A refinement of the above justification (with enriched source usage information)

The justification means:

'(type TonysSpecialty SHELLFISH)' has been directly asserted in Stanford's Tony's Specialty Example as a span of text between byte offset 62 and byte offset 92 as of 10:30 on 2005-10-17

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans1/answer.owl#answer">

    <iw:hasConclusion>(type TonysSpecialty SHELLFISH)</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF" />
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
           <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told"/>
           <iw:hasSourceUsage>
                <iw:SourceUsage>
                   <iw:usageTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-10-17T10:30:00Z</iw:usageTime>
                   <iw:hasSource rdf:resource="http://inferenceweb.stanford.edu/ksl/registry/PUB/STE.owl#STE"/>

                    <iw:spanFromByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">62</iw:spanFromByte>
                    <iw:spanToByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">92</iw:spanToByte>
                </iw:SourceUsage>
           </iw:hasSourceUsage>
        </iw:InferenceStep>
    </iw:isConsequentOf>
</iw:NodeSet>

This example includes additional information using SourceUsage. It specifies the portion of the source text being used. These two properties are not required since they are auxiliary and often used for display purposes.

Answer Justified as an Inference Result

In what follows, we show how to represent justifications derived from inference. A very simple justification with three node sets are elaborated below. We will quickly mention representing the premises and elaborate the construction


The premises (from direct assertion)

Premise 1:

'(subClassOf CRAB SHELLFISH)' has being directly asserted in Stanford's Tony's Specialty Ontology as a span of text between byte offset 56 and byte offset 82 as of 10:30 on 2005-10-17

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1">

    <iw:hasConclusion>(subClassOf CRAB SHELLFISH)</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF" />
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told" />
            <iw:hasSourceUsage>
                <iw:SourceUsage>
                    <iw:usageTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-10-17T10:30:00Z</iw:usageTime>
                    <iw:hasSource rdf:resource="http://inferenceweb.stanford.edu/ksl/registry/ONT/STOntology.owl#STOntology"/>
                   <iw:spanFromByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">56</iw:spanFromByte>
                    <iw:spanToByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">82</iw:spanToByte>
                </iw:SourceUsage>
            </iw:hasSourceUsage>
        </iw:InferenceStep>
    </iw:isConsequentOf>

</iw:NodeSet>


Premise 2:

'(or (not(subClassOf CRAB ?x)) (type TonysSpecialty ?x))' has being directly asserted in Deborah as of 10:30 on 2005-10-17

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2">

    <iw:hasConclusion>(or (not(subClassOf CRAB ?x)) (type TonysSpecialty ?x))</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF" />
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told" />
            <iw:hasSourceUsage>
                <iw:SourceUsage>
                    <iw:usageTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-10-17T10:30:00Z</iw:usageTime>
                    <iw:hasSource rdf:resource="http://inferenceweb.stanford.edu/registry/PER/DLM.owl#DLM"/>
                </iw:SourceUsage>
            </iw:hasSourceUsage>
        </iw:InferenceStep>
    </iw:isConsequentOf>

</iw:NodeSet>

A simple derived justification

The justification means:

'(type TonysSpecialty SHELLFISH)' is derived from the application of General Modus Ponens rule on the two premises linked by http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1 and http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/answer.owl#answer">

    <iw:hasConclusion>(type TonysSpecialty SHELLFISH)</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF"/>
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP" />
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1"/>
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2"/>
        </iw:InferenceStep>
    </iw:isConsequentOf>

</iw:NodeSet>

The derived justification differs from direct assertion in that (i) it uses uses rules other than Direct Assertion and (ii) it specifies addition information including:


A refinement of the above justification (with inference engine specified)

The justification means:

'(type TonysSpecialty SHELLFISH)' is derived by JTP from the application of General Modus Ponens rule on the two premises linked by http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1 and http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/answer.owl#answer">

    <iw:hasConclusion>(type TonysSpecialty SHELLFISH)</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF" />
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP" />

            <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/JTP.owl#JTP" />
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1" />
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2" />
        </iw:InferenceStep>
    </iw:isConsequentOf>

</iw:NodeSet>

A refinement of the above justification (with variable mapping specified)

The justification means:

'(type TonysSpecialty SHELLFISH)' is derived from the application of General Modus Ponens rule on the two premises linked by http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1 and http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2 , and we also record that the inference step binds the variable ?x to SHELLFISH.

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/answer.owl#answer">

    <iw:hasConclusion>(type TonysSpecialty ?x)</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF" />
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP" />

            <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/JTP.owl#JTP" />
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1" />
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2" />
            <iw:hasVariableMapping>
                <iw:Mapping>
                    <iw:mapTo>SHELLFISH</iw:mapTo>
                    <iw:mapFrom>?x</iw:mapFrom>
                </iw:Mapping>
            </iw:hasVariableMapping>
        </iw:InferenceStep>
    </iw:isConsequentOf>

</iw:NodeSet>

PML also enable us to preserve the generic conclusion with associated Variable Mapping. An instance of Mapping records the variable bindings obtained in inference, and the conclusion does not have to be grounded. The semantics of the conclusion is in fact the one after variable mappings have been applied.

Encoding Answer with Multiple Justifications

A NodeSet may contain two inference steps, which lead to two alternative justifications of the same conclusion.

The justification means:

'(type TonysSpecialty SHELLFISH)' is derived from the application of General Modus Ponens rule or, alternatively, from a direct assertion.

The justification can be encoded in the following PML: (browse it in IWBrowser)

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans3/answer.owl#answer">

    <iw:hasConclusion>(type TonysSpecialty SHELLFISH)</iw:hasConclusion>
    <iw:hasLanguage rdf:resource="http://inferenceweb.stanford.edu/registry/LG/KIF.owl#KIF" />
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">0</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/Told.owl#Told" />
            <iw:hasSourceUsage>
                <iw:SourceUsage>
                    <iw:usageTime rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-10-17T10:30:00Z</iw:usageTime>
                    <iw:hasSource rdf:resource="http://inferenceweb.stanford.edu/ksl/registry/PUB/STE.owl#STE"/>
                    <iw:spanFromByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">802</iw:spanFromByte>
                    <iw:spanToByte rdf:datatype="http://www.w3.org/2001/XMLSchema#int">830</iw:spanToByte>
                </iw:SourceUsage>
            </iw:hasSourceUsage>
        </iw:InferenceStep>
    </iw:isConsequentOf>
    <iw:isConsequentOf>
        <iw:InferenceStep>
            <iw:hasIndex rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</iw:hasIndex>
            <iw:hasRule rdf:resource="http://inferenceweb.stanford.edu/registry/DPR/GMP.owl#GMP" />
            <iw:hasInferenceEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/JTP.owl#JTP" />
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1" />
            <iw:hasAntecedent rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns2.owl#ns2" />

            <iw:hasVariableMapping>
                <iw:Mapping>
                <iw:mapTo>SHELLFISH</iw:mapTo>
               <iw:mapFrom>?x</iw:mapFrom>
                </iw:Mapping>
            </iw:hasVariableMapping>
        </iw:InferenceStep>
    </iw:isConsequentOf>
</iw:NodeSet>

Encoding Proofs

Encoding the Question

An instance of question records the original user's question, which is an English sentence in this scenario.

"What type of food is Tony's specialty?"

The question is represented in PML as below:

<iw:Question rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/question.owl#question">

    <iw:questionContent>What type of food is Tony's specialty?</iw:questionContent>

</iw:Question>

Encoding the Query/Answer

An instance of query records the translated internal representation of user's question, which is a KIF query in this scenario.

(type TonysSpecialty ?x)

The query is represented in PML as below: (browse it in IWBrowser)

<iw:Query rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/query.owl#query">

    <iw:queryContent>(type TonysSpecialty ?x)</iw:queryContent>
    <iw:isFromEngine rdf:resource="http://inferenceweb.stanford.edu/registry/IE/JTP.owl#JTP" />
    <iw:isQueryFor rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/question.owl#question" />

    <iw:hasAnswer rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans1/answer.owl#answer"/>
    <iw:hasAnswer rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/answer.owl#answer"/>
    <iw:hasAnswer rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans3/answer.owl#answer"/>

</iw:Query>

In this query, we need to specify the following information that captures how the query has been answered.

Linking NodeSets to Query and Answer

<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/answer.owl#answer">
    ...
    <iw:InferenceStep>
        ...
       <iw:fromQuery rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/query.owl#query" />
        ...
    </iw:InferenceStep>
    ...

</iw:NodeSet>


<iw:NodeSet rdf:about="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/ns1.owl#ns1">
    ...
    <iw:InferenceStep>
        ...
       <iw:fromAnwer rdf:resource="http://iw.stanford.edu/proofs/2005/11/tonys/ans2/answer.owl#answer" />
        ...
    </iw:InferenceStep>
    ...

</iw:NodeSet>

References

All references are available from the Inference Web publication page: http://iw.stanford.edu/documents_papers.html

Best PML reference:
Paulo Pinheiro da Silva, Deborah L. McGuinness and Richard Fikes. A Proof Markup Language for Semantic Web Services Information Systems http://www.ksl.stanford.edu/KSL_Abstracts/KSL-04-01.html

Best Inference Web reference:
Deborah L. McGuinness and Paulo Pinheiro da Silva. Explaining Answers from the Semantic Web: The Inference Web Approach Journal of Web Semantics Vol.1 No.4., pages 397-413, October 2004. http://www.ksl.stanford.edu/KSL_Abstracts/KSL-04-03.html