Parsing XML in TypeScript: usage and comparison

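At work I needed to parse data in XML format.
I found two libraries, libxmljs and xml2js. This post records how to use each and how they compare, for future reference: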

To make the comparison easier, both libraries are tested against the same data:

<rootNode>
   <child foo="bar">
      <grandchild baz="fizbuzz">grandchild content</grandchild>
   </child>
   <sibling foo="call">with content!</sibling>
</rootNode>

libxmljs

The parsed result is a node tree similar to C#'s XElement. It supports XPath queries and lets you read attributes and node content.
Example:

    import * as libxmljs from 'libxmljs';

    const xmlDoc = libxmljs.parseXmlString(this.xml);

    // xpath query
    const gchild = xmlDoc.get('//grandchild');
    console.log(gchild.text());
    // prints "grandchild content"

    const children = xmlDoc.get('//rootNode').childNodes();   // get child nodes
    const child = children[0];
    console.log(child.attr('foo').value()); // get attribute
    // prints "bar"

Recursively walking every node:

    private populateForlibxmljs(node: any, level: number) {
        console.log(new Array(level + 1).join('-') + node.name()); // node name
        console.log(new Array(level + 1).join(' ') + "'" + node.text() + "'");  // node text (includes descendants' text)

        node.attrs().forEach(attr => {
            console.log(new Array(level + 1).join(' ') + attr.name() + '=' + attr.value()); // attributes
        });
        node.childNodes().forEach(element => {
            this.populateForlibxmljs(element, level + 1);
        });
    }

Result:

-rootNode
 'grandchild contentwith content!'
--child
  'grandchild content'
  foo=bar
---grandchild
   'grandchild content'
   baz=fizbuzz
----text
    'grandchild content'
--sibling
  'with content!'
  foo=call
---text
   'with content!'

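The second line of the output is surprising at first:
rootNode has no text of its own, only child elements, yet .text() returns the text of all its descendants. That is because calling .text() on an element returns the concatenated text of everything beneath it.
The last lines of the output show where the text actually lives: sibling has a child node named text (a text node, not an attribute) whose content is the node text “with content!”.
So the code can be adjusted to print text only when it reaches a text node: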

    private populateForlibxmljs(node: any, level: number) {
        console.log(new Array(level + 1).join('-') + node.name()); // node name

        node.attrs().forEach(attr => {
            console.log(new Array(level + 1).join(' ') + attr.name() + '=' + attr.value()); // attributes
        });
        node.childNodes().forEach(element => {
            if (element.name() === 'text') {
                // text content shows up as a child node named 'text'
                console.log(new Array(level + 1).join(' ') + "'" + element.text() + "'");
            }
            else {
                this.populateForlibxmljs(element, level + 1);
            }
        });
    }

Result:

-rootNode
--child
  foo=bar
---grandchild
   baz=fizbuzz
   'grandchild content'
--sibling
  foo=call
  'with content!'

PERFECT!!
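
To illustrate why .text() on rootNode returned the text of its descendants, here is a minimal sketch of that concatenation rule. The `XmlNode` type and `textOf` function are hypothetical names made up for this illustration, not part of the libxmljs API; the tree is the sample document written out by hand.

```typescript
// Hypothetical minimal tree model for illustration only (not the libxmljs API).
type XmlNode =
    | { kind: 'element'; name: string; children: XmlNode[] }
    | { kind: 'text'; value: string };

// Text of a node = its own text if it is a text node,
// otherwise the text of all its descendants joined in document order.
function textOf(node: XmlNode): string {
    if (node.kind === 'text') {
        return node.value;
    }
    return node.children.map(textOf).join('');
}

// The sample document, written out by hand (whitespace between elements omitted).
const tree: XmlNode = {
    kind: 'element', name: 'rootNode', children: [
        { kind: 'element', name: 'child', children: [
            { kind: 'element', name: 'grandchild', children: [
                { kind: 'text', value: 'grandchild content' },
            ] },
        ] },
        { kind: 'element', name: 'sibling', children: [
            { kind: 'text', value: 'with content!' },
        ] },
    ],
};

console.log(textOf(tree));
// prints "grandchild contentwith content!"
```

This matches the second line of the first result, which is why the adjusted version prints text only at the text nodes themselves.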

xml2js

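The parsed result is a plain JSON-style object, so friends who are comfortable working with JSON can use it happily.
I am not that familiar with it, so it took me quite a while to figure out. Here is how I handled it: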

    import * as util from 'util';
    import * as xml2js from 'xml2js';

    xml2js.parseString(this.xml, function (err, result) {
        console.dir(result);
        // Result (as shown in my console):
        // Object {rootNode: Object}

        // Nested objects are cut off past a default depth when printed,
        // so use util.inspect with depth = null to see the whole object
        console.log(util.inspect(result, false, null));
        // Result:
        // { rootNode: 
        //    { child: 
        //       [ { '$': { foo: 'bar' },
        //           grandchild: [ { _: 'grandchild content', '$': { baz: 'fizbuzz' } } ] } ],
        //      sibling: [ { _: 'with content!', '$': { foo: 'call' } } ] } }
    });
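
The depth cutoff can be seen without xml2js at all. The sketch below assumes Node.js and copies the parsed object from the output above by hand as a literal; `shallow` and `full` are just illustrative variable names.

```typescript
import { inspect } from 'util';

// The parsed object from the output above, written out by hand as a literal.
const result = {
    rootNode: {
        child: [
            {
                $: { foo: 'bar' },
                grandchild: [{ _: 'grandchild content', $: { baz: 'fizbuzz' } }],
            },
        ],
        sibling: [{ _: 'with content!', $: { foo: 'call' } }],
    },
};

// With the default depth, deeper levels are collapsed to [Object].
const shallow = inspect(result);
// depth: null removes the limit, so every level is printed.
const full = inspect(result, { depth: null });

console.log(shallow);
console.log(full);
```

The options-object form `{ depth: null }` is equivalent to the positional `util.inspect(result, false, null)` call used above.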

Recursively walking every node:

    // Assumption: the initial call is populateForxml2js(result, 0)
    private populateForxml2js(node: any, level: number) {
        let keys = Object.keys(node);   // get all keys

        keys.forEach(key => {
            let value = node[key];
            if (key === '$') {
                // key of attributes
                let subKeys = Object.keys(value);
                subKeys.forEach(element => {
                    console.log(new Array(level).join(' ') + element + "=" + value[element]);
                });
            }
            else if (key === '_') {
                // key of content
                console.log(new Array(level).join(' ') + "'" + value + "'");
            }
            else if (typeof (value) === 'string') {
                console.log(new Array(level).join(' ') + key);
                console.log("key:" + key + " value:" + value);
            }
            else if (Array.isArray(value)) {
                // child elements arrive as an array, one entry per element with this name
                value.forEach(element => {
                    // print the element name; without this line child and sibling never appear
                    console.log(new Array(level).join(' ') + "-" + key);
                    this.populateForxml2js(element, level + 1);
                });
            }
            else if (value instanceof Object) {
                console.log(new Array(level).join(' ') + "-" + key);
                this.populateForxml2js(value, level + 1);
            }
        });
    }

Result:

-rootNode
-child
 foo=bar
 -grandchild
  'grandchild content'
  baz=fizbuzz
-sibling
 'with content!'
 foo=call

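Each key of the JSON object is the name of an XML node;
if the key is ‘$’, the value holds that node's attributes;
if the key is ‘_’, the value is that node's text content.
The main point is that you have to walk the object level by level to get the complete content.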
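
When the structure is known in advance, direct access can replace the level-by-level walk. The sketch below is only an illustration: the interfaces (`Attrs`, `Grandchild`, `Child`, `Sibling`, `Parsed`) are hypothetical names written to match the util.inspect output shown earlier, and the object is copied by hand rather than produced by xml2js.

```typescript
// Hypothetical types describing the shape seen in the util.inspect output.
interface Attrs { [name: string]: string }
interface Grandchild { _: string; $: Attrs }
interface Child { $: Attrs; grandchild: Grandchild[] }
interface Sibling { _: string; $: Attrs }
interface Parsed { rootNode: { child: Child[]; sibling: Sibling[] } }

// The parsed object, written out by hand as a literal.
const parsed: Parsed = {
    rootNode: {
        child: [
            {
                $: { foo: 'bar' },
                grandchild: [{ _: 'grandchild content', $: { baz: 'fizbuzz' } }],
            },
        ],
        sibling: [{ _: 'with content!', $: { foo: 'call' } }],
    },
};

const grandchild = parsed.rootNode.child[0].grandchild[0];
console.log(grandchild._);                      // prints "grandchild content"
console.log(grandchild.$.baz);                  // prints "fizbuzz"
console.log(parsed.rootNode.sibling[0].$.foo);  // prints "call"
```
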
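Note:
Elements with the same name are collected into a single array under that key (as the output above shows, even a single child element is wrapped in an array),
so that case needs its own check. It got complicated, so I did not pursue it any further...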
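
For the repeated-element case, one possible approach is to treat every non-special key as a list and emit one entry per element. This is a rough sketch, not tested against xml2js itself: `Xml2jsNode`, `describe`, and the `list`/`item` sample data are all hypothetical names invented for the illustration, and the string branch assumes a text-only element may appear as a bare string.

```typescript
// Hypothetical, loosely typed model of an xml2js-style node.
type Xml2jsNode = { [key: string]: unknown };

// Collect one output line per attribute, text value, and element,
// indenting by depth. Repeated elements each get their own "-name" line.
function describe(node: Xml2jsNode, depth = 0, out: string[] = []): string[] {
    const pad = ' '.repeat(depth);
    for (const [key, value] of Object.entries(node)) {
        if (key === '$') {
            // attributes
            for (const [name, attr] of Object.entries(value as Xml2jsNode)) {
                out.push(`${pad}${name}=${String(attr)}`);
            }
        } else if (key === '_') {
            // text content
            out.push(`${pad}'${String(value)}'`);
        } else {
            // treat a single value and an array of values the same way
            const items = Array.isArray(value) ? value : [value];
            for (const item of items) {
                out.push(`${pad}-${key}`);
                if (typeof item === 'string') {
                    out.push(`${pad} '${item}'`);
                } else {
                    describe(item as Xml2jsNode, depth + 1, out);
                }
            }
        }
    }
    return out;
}

// Made-up sample with two elements of the same name.
const sample: Xml2jsNode = {
    list: {
        item: [
            { _: 'first', $: { id: '1' } },
            { _: 'second', $: { id: '2' } },
        ],
    },
};

console.log(describe(sample).join('\n'));
```

Returning the lines as an array instead of logging inside the recursion makes the walk easier to check against an expected result.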

[Conclusion]:

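Judging by code complexity alone, libxmljs wins easily!!
These are just my own notes. If you have suggestions for other libraries, or comments on how the code is written,
feel free to share!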

LEE's BLOG