Pike v8.1 release 6

Class Parser.SGML

Description

This is a handy, simple parser for SGML-like syntax such as HTML. It does nothing advanced beyond finding the corresponding end tags.

It's used like this:

array res=Parser.SGML()->feed(string)->finish()->result();

The result is an array of atoms, where each atom is either a string or a tag. A tag in turn holds a similar array as its data.
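
The nested structure can be explored with a small recursive walk. The following is a minimal sketch, not part of the module: the helper name dump and the sample input are made up for illustration, and it calls feed(), finish() and result() as separate statements instead of chaining them.

```pike
// Recursively print an atom tree as produced by Parser.SGML.
// "dump" is a hypothetical helper, not part of the module.
void dump(array atoms, int depth)
{
  foreach (atoms, mixed atom) {
    if (stringp(atom)) {
      // Plain text atom.
      write("%s%O\n", " " * depth, atom);
    } else {
      // Tag atom: print its name, then descend into its data.
      write("%stag %O\n", " " * depth, atom->name);
      dump(atom->data, depth + 2);
    }
  }
}

int main()
{
  object p = Parser.SGML();
  p->feed("<a><b>text</b></a>");   // sample input, chosen for illustration
  p->finish();
  dump(p->result(), 0);
  return 0;
}
```

Each level of indentation in the output corresponds to one level of container nesting.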

Example

A string "<gat>&nbsp;<gurka>&nbsp;</gurka>&nbsp;<banan>&nbsp;<kiwi>&nbsp;</gat>" results in

({
   tag "gat" object with data:
   ({
       tag "gurka" object with data:
       ({
           " "
       })
       tag "banan" object with data:
       ({
           " "
           tag "kiwi" object with data:
           ({
              " "
           })
       })
   })
})

That is, simple "tags" (not containers) are not detected, but containers are ended implicitly by a surrounding container _with_ an end tag.

The 'tag' is an object with the following variables:

	 string name;           - name of tag
	 mapping args;          - arguments to tag
	 int line,char,column;  - position of tag
	 int eline,echar,ecolumn;  - end position of tag; src[char..echar-1] is the block (added by Xuesong Guo)
	 string file;           - filename (see create)
	 array(SGMLatom) data;  - contained data
	 int open;		- set if the tag is not an empty element and has no end tag (added by Xuesong Guo)
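
As an illustration of reading these variables, the sketch below prints a few of them for each top-level tag. It is a hedged example only: the input string is invented, and it sticks to name, args, line, column and data.

```pike
int main()
{
  object p = Parser.SGML();
  p->feed("<p align=\"left\">hello</p>");   // invented sample input
  p->finish();

  foreach (p->result(), mixed atom) {
    if (stringp(atom))
      continue;                             // skip plain text atoms
    write("name=%O args=%O\n", atom->name, atom->args);
    write("line=%O column=%O\n", atom->line, atom->column);
    write("contained atoms: %d\n", sizeof(atom->data));
  }
  return 0;
}
```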
     


Variable file

string Parser.SGML.file


Method create

Parser.SGML Parser.SGML()
Parser.SGML Parser.SGML(string filename, function(:void)|void name_formater, function(:void)|void argname_formater)

Description

The object is created with this filename, which is passed to all created tags for debugging and tracing purposes. Every tag name is replaced by name_formater(name), and every argument name is replaced by argname_formater(arg_name).

Note

No, it doesn't read the file itself. See feed().
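
Because the parser does not read the file, the caller has to load the contents and pass them to feed(). A minimal sketch, assuming the file is read with Stdio.read_file and that "example.html" is just a placeholder default; the optional formatter arguments are left out:

```pike
int main(int argc, array(string) argv)
{
  // Hypothetical default; pass a real path as the first argument.
  string filename = argc > 1 ? argv[1] : "example.html";

  string src = Stdio.read_file(filename);
  if (!src) {
    werror("Cannot read %s\n", filename);
    return 1;
  }

  // The filename is only a label stored on each tag.
  object p = Parser.SGML(filename);
  p->feed(src);
  p->finish();

  foreach (p->result(), mixed atom)
    if (objectp(atom))
      write("%O: tag %O at line %O\n", atom->file, atom->name, atom->line);
  return 0;
}
```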