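Resolving Namespaces in Java

Author: Anonymous 2009-06-30 13:54:00

This article presents three different ways to supply the prefix-to-namespace mapping in Java. It also includes sample code to help you write your own NamespaceContext.

To use namespaces in an XPath expression, you must supply the link between each prefix used and the namespace URI it stands for.

Prerequisites and examples

All examples in this article use the following XML file: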

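Listing 1. Sample XML

<books:booklist
  xmlns:books="http://univNaSpResolver/booklist"
  xmlns="http://univNaSpResolver/book"
  xmlns:fiction="http://univNaSpResolver/fictionbook">
  <science:book xmlns:science="http://univNaSpResolver/sciencebook">
    
    <author>Michael Schmidt</author>
  </science:book>
  <fiction:book>
    
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
  <fiction:book>
    
    <author>Johann Wolfgang von Goethe</author>
  </fiction:book>
</books:booklist>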

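This XML example contains three namespaces declared on the root element and one namespace declared on an element deeper in the structure. You will see the difference this setup makes.

The second interesting aspect of this XML example is that the element booklist has three children, all named book. The first child has the namespace science, while the others have the namespace fiction. This means that, to XPath, these elements are completely different. The following examples show the consequences.

One note on the example source code: it is optimized for readability, not for maintainability, so it contains some redundancy. Output is produced in the simplest possible way, through System.out.println(). In this article, the lines of code that produce output are abbreviated to “...”.

Theoretical background

What are namespaces for, and why pay so much attention to them? A namespace is part of the identifier of an element or attribute. Elements or attributes can share a local name yet belong to different namespaces, in which case they are completely different (see science:book and fiction:book in the example above). Namespaces are needed to resolve naming conflicts when you combine XML files from different sources. Take an XSLT file as an example. It contains elements of the XSLT namespace, elements from your own namespace, and (often) elements of the XHTML namespace. With namespaces, you avoid the ambiguity of elements that share a local name.

A namespace is defined by a URI (in this example, http://univNaSpResolver/booklist). To avoid repeating this long string, you define a prefix associated with the URI (in this example, books). Remember that the prefix is like a variable: its name does not matter. If two prefixes refer to the same URI, the namespaces of the prefixed elements are the same (see Example 1 in Listing 5).

XPath expressions use prefixes (for example, books:booklist/science:book), and you must supply the URI associated with each prefix. This is where NamespaceContext comes in; it serves exactly this purpose.

This article shows different ways to provide the mapping between prefixes and URIs.

In the XML file, the mapping is provided by xmlns attributes such as xmlns:books="http://univNaSpResolver/booklist", or by xmlns="http://univNaSpResolver/book" (the default namespace).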
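
The claim that a prefix is only a local label can be checked directly on the DOM. The following is a minimal sketch, not part of the original sample code: the class PrefixIsJustALabel and its helper methods are hypothetical names, and the one-element documents are made up for illustration. It compares elements by namespace URI plus local name, which is what identifies them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class PrefixIsJustALabel {

    /** Parses an XML string with namespace awareness and returns its root element. */
    public static Element rootOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default; needed for namespace URIs
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Two elements carry the same name if namespace URI and local name both match. */
    public static boolean sameQualifiedName(Element a, Element b) {
        return a.getNamespaceURI().equals(b.getNamespaceURI())
                && a.getLocalName().equals(b.getLocalName());
    }

    public static void main(String[] args) {
        Element science = rootOf(
                "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\"/>");
        Element technical = rootOf(
                "<technical:book xmlns:technical=\"http://univNaSpResolver/sciencebook\"/>");
        Element fiction = rootOf(
                "<fiction:book xmlns:fiction=\"http://univNaSpResolver/fictionbook\"/>");

        // Different prefixes, same URI: the same element name.
        System.out.println(sameQualifiedName(science, technical)); // true
        // Same local name, different URI: different element names.
        System.out.println(sameQualifiedName(science, fiction));   // false
    }
}
```

The prefix never enters the comparison, which is why an XPath expression is free to use prefixes that differ from those in the XML file.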

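The necessity of providing namespace resolution

If the XML uses namespaces and you do not provide a NamespaceContext, XPath expressions fail. Example 0 in Listing 2 demonstrates this. The XPath object is built and evaluated against the loaded XML document. First, the expression is written without any namespace prefixes (result1). Then it is written with namespace prefixes (result2).

Listing 2. Example 0 without namespace resolution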

private static void example0(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Zero example - no namespaces provided ***");

        XPath xPath = XPathFactory.newInstance().newXPath();

...
        NodeList result1 = (NodeList) xPath.evaluate("booklist/book", example,
                XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example, XPathConstants.NODESET);
...
    }

The output looks like this.

Listing 3. Output of Example 0

*** Zero example - no namespaces provided ***
First try asking without namespace prefix:
--> booklist/book
Result is of length 0
Then try asking with namespace prefix:
--> books:booklist/science:book
Result is of length 0
The expression does not work in both cases.

In both cases shown in Listing 3, the XPath evaluation returns no nodes and raises no exception. XPath cannot find the nodes because the prefix-to-URI mapping is missing.
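
The unprefixed case can be reproduced in isolation. The sketch below is an illustration under stated assumptions: UnprefixedXPathDemo and its two inline documents are hypothetical stand-ins for the sample file, one with namespaces and one without. It only covers the unprefixed expression; how an implementation reacts to an unresolvable prefix may differ between Java versions, so that case is left out here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UnprefixedXPathDemo {

    // Trimmed-down stand-ins for the sample file (assumed, not the original data).
    public static final String WITH_NAMESPACES =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
            + " xmlns:fiction=\"http://univNaSpResolver/fictionbook\">"
            + "<fiction:book/><fiction:book/></books:booklist>";
    public static final String WITHOUT_NAMESPACES =
            "<booklist><book/><book/></booklist>";

    /** Counts the nodes an XPath expression selects; no NamespaceContext is set. */
    public static int count(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without namespaces, the unprefixed names match.
        System.out.println(count(WITHOUT_NAMESPACES, "booklist/book")); // 2
        // With namespaces, an unprefixed name only matches elements in no namespace.
        System.out.println(count(WITH_NAMESPACES, "booklist/book"));    // 0
    }
}
```

In XPath 1.0, an unprefixed name test matches only elements that are in no namespace, so the second query silently selects nothing.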

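Hard-coded namespace resolution

You can also provide the namespaces as hard-coded values, as in the class in Listing 4:

Listing 4. Hard-coded namespace resolution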

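import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class HardcodedNamespaceResolver implements NamespaceContext {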

    /**
     * This method returns the uri for all prefixes needed. Wherever possible
     * it uses XMLConstants.
     * 
     * @param prefix
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        } else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return "http://univNaSpResolver/book";
        } else if (prefix.equals("books")) {
            return "http://univNaSpResolver/booklist";
        } else if (prefix.equals("fiction")) {
            return "http://univNaSpResolver/fictionbook";
        } else if (prefix.equals("technical")) {
            return "http://univNaSpResolver/sciencebook";
        } else {
            return XMLConstants.NULL_NS_URI;
        }
    }

    public String getPrefix(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

    public Iterator getPrefixes(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

}

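Note that the namespace http://univNaSpResolver/sciencebook is bound to the prefix technical (not science, as before). You will see the result in the output that follows (Listing 6). In Listing 5, the code that uses this resolver also uses the new prefix.

Listing 5. Example 1 with hard-coded namespace resolution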

private static void example1(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** First example - namespacelookup hardcoded ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new HardcodedNamespaceResolver());

...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/technical:book", example,
                XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate("books:booklist/technical:book/:author",
                example);
...
    }

Here is the output of this example.

Listing 6. Output of Example 1

*** First example - namespacelookup hardcoded ***
Using any namespaces results in a NodeList:
--> books:booklist/technical:book
Number of Nodes: 1

  
    
    Michael Schmidt
  
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/technical:book/:author
Michael Schmidt

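As you can see, XPath now finds the nodes. The advantage is that you can rename the prefixes as you like, which is what I did with the prefix science. The XML file contains the prefix science, while XPath uses a different prefix, technical. Because the URIs are the same, XPath finds the nodes. The disadvantage is that you must maintain the namespaces in several places (the XML, the XSD, the XPath expressions, and the namespace context).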
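
One way to ease that maintenance burden is to keep the hard-coded mapping in a single map instead of an if/else chain. This is a sketch of an alternative, not the article's implementation: RenamedPrefixDemo, contextOf and the inline XML (a condensed stand-in for the sample file) are hypothetical, and the contract handling is deliberately minimal.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RenamedPrefixDemo {

    // Condensed stand-in for the sample file; the XML itself uses the prefix "science".
    public static final String XML =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
            + " xmlns=\"http://univNaSpResolver/book\""
            + " xmlns:fiction=\"http://univNaSpResolver/fictionbook\">"
            + "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\">"
            + "<author>Michael Schmidt</author></science:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "</books:booklist>";

    /** Hypothetical helper: a NamespaceContext backed by a prefix-to-URI map. */
    public static NamespaceContext contextOf(final Map<String, String> prefixToUri) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("No prefix provided!");
                }
                String uri = prefixToUri.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                Iterator<String> prefixes = getPrefixes(namespaceURI);
                return prefixes.hasNext() ? prefixes.next() : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                List<String> result = new ArrayList<String>();
                for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
                    if (entry.getValue().equals(namespaceURI)) {
                        result.add(entry.getKey());
                    }
                }
                return result.iterator();
            }
        };
    }

    /** Evaluates the expression against XML using the prefixes chosen in this class. */
    public static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));

            Map<String, String> mapping = new HashMap<String, String>();
            mapping.put("books", "http://univNaSpResolver/booklist");
            mapping.put("fiction", "http://univNaSpResolver/fictionbook");
            mapping.put("technical", "http://univNaSpResolver/sciencebook"); // renamed

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(contextOf(mapping));
            NodeList nodes = (NodeList) xPath.evaluate(
                    expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("books:booklist/technical:book")); // 1
        System.out.println(count("books:booklist/fiction:book"));   // 2
    }
}
```

The map keeps the prefix choices in one visible place, and unlike Listing 4 this sketch returns iterators rather than null from getPrefixes.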

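Reading the namespaces from the document

The namespaces and their prefixes are documented in the XML file itself, so you can use them from there. The simplest way to do this is to delegate the lookup to the document.

Listing 7. Namespace resolution directly from the document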

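import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import org.w3c.dom.Document;

public class UniversalNamespaceResolver implements NamespaceContext {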
    // the delegate
    private Document sourceDocument;

    /**
     * This constructor stores the source document to search the namespaces in
     * it.
     * 
     * @param document
     *            source document
     */
    public UniversalNamespaceResolver(Document document) {
        sourceDocument = document;
    }

    /**
     * The lookup for the namespace uris is delegated to the stored document.
     * 
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return sourceDocument.lookupNamespaceURI(null);
        } else {
            return sourceDocument.lookupNamespaceURI(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return sourceDocument.lookupPrefix(namespaceURI);
    }

    public Iterator getPrefixes(String namespaceURI) {
        // not implemented yet
        return null;
    }

}

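Note the following points:

•If the document is changed before XPath is used, the change is reflected in the namespace lookup, because the delegation happens when it is needed, against the current version of the document.

•The lookup for namespaces or prefixes is done on the ancestors of the node used, in our case the node sourceDocument. This means that, with the code provided, you only get the namespaces declared on the root node. In our example, the namespace science is not found.

•The lookup is invoked when the XPath is evaluated, so it costs some extra time.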
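
The second point can be observed by calling the DOM lookup method directly. The following sketch is illustrative only: DocumentLookupDemo and its inline XML (a condensed stand-in for the sample file) are assumptions, while lookupNamespaceURI is the standard DOM Level 3 method that Listing 7 delegates to.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DocumentLookupDemo {

    // Condensed stand-in for the sample file; science is declared on a nested element.
    public static final String XML =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
            + " xmlns=\"http://univNaSpResolver/book\""
            + " xmlns:fiction=\"http://univNaSpResolver/fictionbook\">"
            + "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\">"
            + "<author>Michael Schmidt</author></science:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "</books:booklist>";

    /** Parses the stand-in document with namespace awareness switched on. */
    public static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document document = parse();
        // Declared on the root element: found.
        System.out.println(document.lookupNamespaceURI("fiction"));
        // prints http://univNaSpResolver/fictionbook

        // null asks for the default namespace, also declared on the root element.
        System.out.println(document.lookupNamespaceURI(null));
        // prints http://univNaSpResolver/book

        // Declared only on the nested science:book element: not visible from the top.
        System.out.println(document.lookupNamespaceURI("science"));
        // prints null
    }
}
```

A lookup started at the document only sees declarations on the document element, which is the limitation the second point describes.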

Here is the example code:

Listing 8. Example 2 with namespace resolution directly from the document

private static void example2(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Second example - namespacelookup delegated to document ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceResolver(example));

        try {
...
            NodeList result1 = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", example,
                    XPathConstants.NODESET);
...
        } catch (XPathExpressionException e) {
...
        }
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

The output of this example is:

Listing 9. Output of Example 2

*** Second example - namespacelookup delegated to document ***
Try to use the science prefix: no result
--> books:booklist/science:book
The resolver only knows namespaces of the first level!
To be precise: Only namespaces above the node, passed in the constructor.
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

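As the output shows, the namespace declared on the book element with the prefix science is not resolved. The evaluate method throws an XPathExpressionException. To solve this, you would have to extract the node science:book from the document and use that node as the delegate. But that would mean additional parsing of the document, and it is not elegant.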
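
For completeness, here is roughly what delegating to the deeper node could look like. This is a sketch under assumptions, not code from the original samples: NodeDelegateDemo, delegateTo and the inline XML (a condensed stand-in for the sample file) are hypothetical. Note that finding the node in the first place already requires knowing its namespace URI, which is one reason the approach is awkward.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeDelegateDemo {

    // Condensed stand-in for the sample file.
    public static final String XML =
            "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
            + " xmlns=\"http://univNaSpResolver/book\""
            + " xmlns:fiction=\"http://univNaSpResolver/fictionbook\">"
            + "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\">"
            + "<author>Michael Schmidt</author></science:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "<fiction:book><author>Johann Wolfgang von Goethe</author></fiction:book>"
            + "</books:booklist>";

    /** Hypothetical resolver that delegates every lookup to a single DOM node. */
    public static NamespaceContext delegateTo(final Node node) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("No prefix provided!");
                }
                // DOM uses null, not "", to ask for the default namespace.
                String uri = node.lookupNamespaceURI(prefix.isEmpty() ? null : prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return node.lookupPrefix(namespaceURI);
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                String prefix = getPrefix(namespaceURI);
                return prefix == null
                        ? Collections.<String>emptyList().iterator()
                        : Collections.singletonList(prefix).iterator();
            }
        };
    }

    /** Evaluates the expression with science:book as the lookup delegate. */
    public static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));

            // The extra step: locate the nested element that carries the declaration.
            Node scienceBook = document.getElementsByTagNameNS(
                    "http://univNaSpResolver/sciencebook", "book").item(0);

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(delegateTo(scienceBook));
            NodeList nodes = (NodeList) xPath.evaluate(
                    expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("books:booklist/science:book")); // 1
        System.out.println(count("books:booklist/fiction:book")); // 2
    }
}
```

Because the lookup walks from the delegate node up through its ancestors, the prefixes declared on the root element still resolve.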

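Reading the namespaces from the document and caching them

The next version of the NamespaceContext is a bit better. It reads the namespaces only once, in advance, in the constructor. Every call for a namespace is answered from the cache. As a result, later changes in the document have no effect, because the list of namespaces is cached when the Java object is created.

Listing 10. Caching the namespace resolution from the document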

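import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class UniversalNamespaceCache implements NamespaceContext {
    private static final String DEFAULT_NS = "DEFAULT";
    // Typed maps: the String-typed loop and return statements below need them to compile.
    private Map<String, String> prefix2Uri = new HashMap<String, String>();
    private Map<String, String> uri2Prefix = new HashMap<String, String>();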

    /**
     * This constructor parses the document and stores all namespaces it can
     * find. If toplevelOnly is true, only namespaces in the root are used.
     * 
     * @param document
     *            source document
     * @param toplevelOnly
     *            restriction of the search to enhance performance
     */
    public UniversalNamespaceCache(Document document, boolean toplevelOnly) {
        examineNode(document.getFirstChild(), toplevelOnly);
        System.out.println("The list of the cached namespaces:");
        for (String key : prefix2Uri.keySet()) {
            System.out
                    .println("prefix " + key + ": uri " + prefix2Uri.get(key));
        }
    }

    /**
     * A single node is read, the namespace attributes are extracted and stored.
     * 
     * @param node
     *            to examine
     * @param attributesOnly,
     *            if true no recursion happens
     */
    private void examineNode(Node node, boolean attributesOnly) {
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            storeAttribute((Attr) attribute);
        }

        if (!attributesOnly) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE)
                    examineNode(child, false);
            }
        }
    }

    /**
     * This method looks at an attribute and stores it, if it is a namespace
     * attribute.
     * 
     * @param attribute
     *            to examine
     */
    private void storeAttribute(Attr attribute) {
        // examine the attributes in namespace xmlns
        if (attribute.getNamespaceURI() != null
                && attribute.getNamespaceURI().equals(
                        XMLConstants.XMLNS_ATTRIBUTE_NS_URI)) {
            // Default namespace xmlns="uri goes here"
            if (attribute.getNodeName().equals(XMLConstants.XMLNS_ATTRIBUTE)) {
                putInCache(DEFAULT_NS, attribute.getNodeValue());
            } else {
                // The defined prefixes are stored here
                putInCache(attribute.getLocalName(), attribute.getNodeValue());
            }
        }

    }

    private void putInCache(String prefix, String uri) {
        prefix2Uri.put(prefix, uri);
        uri2Prefix.put(uri, prefix);
    }

    /**
     * This method is called by XPath. It returns the default namespace, if the
     * prefix is null or "".
     * 
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null || prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return prefix2Uri.get(DEFAULT_NS);
        } else {
            return prefix2Uri.get(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return uri2Prefix.get(namespaceURI);
    }

    public Iterator getPrefixes(String namespaceURI) {
        // Not implemented
        return null;
    }

}

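Note that the code contains debug output. The attributes of each node are examined and stored. The child nodes are not examined, however, because the boolean toplevelOnly in the constructor is set to true. If the boolean is set to false, the examination of the child nodes starts after the attributes have been stored. One thing to note about the code: in DOM, the first node represents the whole document, so to reach the element that carries the namespace declarations, you have to step down to the child node exactly once (the call to document.getFirstChild() in the constructor).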
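
The step down through getFirstChild() assumes that the root element is the first child of the document. The sketch below shows a case where that assumption does not hold. It is an illustration only: FirstChildPitfallDemo and its inline XML with a leading comment are hypothetical, and getDocumentElement() is offered as a possible alternative rather than as what the original code does.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class FirstChildPitfallDemo {

    // Hypothetical input: a comment precedes the root element.
    public static final String XML =
            "<!-- book catalog -->"
            + "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\"/>";

    /** Parses the input with namespace awareness; comments are kept by default. */
    public static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document document = parse();
        Node first = document.getFirstChild();

        // The first child is the comment, not the root element.
        System.out.println(first.getNodeType() == Node.COMMENT_NODE); // true
        // Only elements have an attribute map; iterating this one would fail.
        System.out.println(first.getAttributes() == null);            // true

        // getDocumentElement() always returns the root element itself.
        Node root = document.getDocumentElement();
        System.out.println(root.getAttributes().getLength());         // 1
    }
}
```

With such input, passing document.getFirstChild() to a method like examineNode would hit a null attribute map, whereas document.getDocumentElement() would not.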

In this case, using the NamespaceContext is simple:

Listing 11. Example 3 with cached namespace resolution (top level only)

private static void example3(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Third example - namespaces of toplevel node cached ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceCache(example, true));

        try {
...
            NodeList result1 = (NodeList) xPath.evaluate(
                    "books:booklist/science:book", example,
                    XPathConstants.NODESET);
...
        } catch (XPathExpressionException e) {
...
        }
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

This results in the following output:

Listing 12. Output of Example 3

*** Third example - namespaces of toplevel node cached ***
The list of the cached namespaces:
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Try to use the science prefix:
--> books:booklist/science:book
The cache only knows namespaces of the first level!
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

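The code above finds only the namespaces of the root element. More precisely, it finds the namespaces of the node that the constructor passes to the method examineNode. This speeds up the constructor, because it does not have to iterate over the whole document. However, as you can see from the output, the science prefix cannot be resolved. The XPath expression results in an exception (XPathExpressionException).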

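Reading the namespaces from the document and all its elements and caching them

This version reads all namespace declarations from the XML file. Now even the XPath with the prefix science works. One situation complicates this version, though: if a prefix is overloaded (declared in nested elements with different URIs), the last one found wins. In practice, this is usually not a problem.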
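
The "last one wins" behavior follows from storing the declarations in a plain map keyed by prefix. The sketch below reproduces it in isolation. It is a simplified illustration, not the cache class itself: PrefixOverloadDemo, the prefix p and the urn:example URIs are made up, and only prefixed declarations are considered.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PrefixOverloadDemo {

    // Hypothetical input: the prefix p is redeclared with another URI on a nested element.
    public static final String XML =
            "<p:outer xmlns:p=\"urn:example:outer\">"
            + "<p:inner xmlns:p=\"urn:example:inner\"/>"
            + "</p:outer>";

    /** Stores the xmlns:* declarations of this element, then visits its child elements. */
    static void collect(Node node, Map<String, String> cache) {
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())) {
                // put() overwrites an earlier entry for the same prefix.
                cache.put(attribute.getLocalName(), attribute.getNodeValue());
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                collect(children.item(i), cache);
            }
        }
    }

    /** Builds the prefix-to-URI cache for all elements of the given XML. */
    public static Map<String, String> cacheOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, String> cache = new HashMap<String, String>();
            collect(document.getDocumentElement(), cache);
            return cache;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The declaration on the nested element is visited last and replaces the outer one.
        System.out.println(cacheOf(XML).get("p")); // urn:example:inner
    }
}
```

With such a document, an XPath step using p: would no longer match the outer element, so a cache like this suits documents in which each prefix is bound to one URI throughout.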

In this example, the NamespaceContext is used in the same way as in the previous example. The boolean toplevelOnly in the constructor must be set to false.

Listing 13. Example 4 with cached namespace resolution (all levels)

private static void example4(Document example)
            throws XPathExpressionException, TransformerException {
        sysout("\n*** Fourth example - namespaces all levels cached ***");

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new UniversalNamespaceCache(example, false));
...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example, XPathConstants.NODESET);
...
        NodeList result2 = (NodeList) xPath.evaluate(
                "books:booklist/fiction:book", example, XPathConstants.NODESET);
...
        String result = xPath.evaluate(
                "books:booklist/fiction:book[1]/:author", example);
...
    }

The output is as follows:

Listing 14. Output of Example 4

*** Fourth example - namespaces all levels cached ***
The list of the cached namespaces:
prefix science: uri http://univNaSpResolver/sciencebook
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Now the use of the science prefix works as well:
--> books:booklist/science:book
Number of Nodes: 1

  
    
    Michael Schmidt
  
The fiction namespace is resolved:
--> books:booklist/fiction:book
Number of Nodes: 2

  
    
    Johann Wolfgang von Goethe
  

  
    
    Johann Wolfgang von Goethe
  
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe

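Conclusion

There are several ways to implement namespace resolution in Java, and most of them are better than a hard-coded implementation:

•If the example is small and all namespaces are declared on the top element, delegating to the document works well.

•If the XML file is large, deeply nested, and evaluated with multiple XPath expressions, it is best to cache the list of namespaces.

•But if you have no control over the XML file and others can send you any prefixes, it is best to be independent of their choices. You can code your own namespace resolution, as in Example 1 (HardcodedNamespaceResolver), and use your own prefixes in your XPath expressions.

In the other cases above, a NamespaceContext resolved from the XML file makes your code shorter and more generic.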
