THTMLParser v1.02
THTMLParser is a Delphi class for parsing an HTML file.
The file is split into tag and text
objects (useful for validating tags or for automatic code corrections).
A sample application is included (a very simple web browser!).
Supports HTML 3.2 entities and the Western Latin-1 charset.
How to use the HTMLParser
Create an instance of THTMLParser with
HTMLParser:=THTMLParser.Create;
then load an HTML file, e.g.
HTMLParser.Lines.LoadFromFile(filename);
where Lines is an ordinary TStringList.
Calling
HTMLParser.Execute;
parses the file into HTMLParser.Parsed.
This TList holds objects of two classes:
type THTMLText = class
  property Line:string; // text with HTML 3.2 entities and Western Latin-1 characters converted
  property Raw:string; // raw text line as read from input file
type THTMLTag = class
  Params:TList; // see below
  property Name:string; // uppercased tag name
  property Raw:string; // raw tag (parameters included) as read from input file
Params is a list of all parameters for the TAG (if any):
type THTMLParam = class
  property Key:string; // parameter name
  property Value:string; // parameter value
  property Raw:string read fRaw; // raw parameter text
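The steps above can be put together roughly as follows. This is a minimal console sketch, not code taken from the component: the unit name HTMLPars and the file name sample.html are assumptions, so adjust them to your installation. It also assumes that the parser owns the objects in Parsed and releases them itself, which this document does not state.

```pascal
program DumpHTML;
{$APPTYPE CONSOLE}

uses
  Classes,   // TList
  HTMLPars;  // ASSUMED unit name for THTMLParser, THTMLTag, THTMLText, THTMLParam

var
  Parser: THTMLParser;
  Item: TObject;
  Tag: THTMLTag;
  Param: THTMLParam;
  i, j: Integer;
begin
  Parser := THTMLParser.Create;
  try
    Parser.Lines.LoadFromFile('sample.html'); // hypothetical file name
    Parser.Execute;

    // Parsed is a plain TList, so every entry must be cast before use.
    for i := 0 to Parser.Parsed.Count - 1 do
    begin
      Item := TObject(Parser.Parsed[i]);
      if Item is THTMLTag then
      begin
        Tag := THTMLTag(Item);
        WriteLn('TAG  ', Tag.Name);
        for j := 0 to Tag.Params.Count - 1 do
        begin
          Param := THTMLParam(Tag.Params[j]);
          WriteLn('       ', Param.Key, ' = ', Param.Value);
        end;
      end
      else if Item is THTMLText then
        WriteLn('TEXT ', THTMLText(Item).Line);
    end;
  finally
    // ASSUMPTION: the parser frees the objects held in Parsed.
    Parser.Free;
  end;
end.
```

Checking each entry with "is" before casting keeps the loop safe if the list ever holds an object of an unexpected class.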
Example
The HTML file
<html>
<BODY LINK="#FF00FF" border=0>
Hello You &amp;amp; Co!
</html>
results in 4 objects (HTMLParser.Parsed.Count=4):
[0] HTMLTag.Name = "HTML"
           .Params.Count = 0
[1] HTMLTag.Name = "BODY"
           .Params.Count = 2
           [0] HTMLParam.Key = "LINK"
                        .Value = "#FF00FF"
           [1] HTMLParam.Key = "BORDER"
                        .Value = "0"
[2] HTMLText.Line = "Hello You & Co!"
[3] HTMLTag.Name = "/HTML"
           .Params.Count = 0
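Since Params is a plain list, reading one parameter means searching it by key. The unit below is one possible helper for that. It is a sketch only: the unit name HTMLParamUtils, the function ParamValue and the parser unit name HTMLPars are all hypothetical and not part of THTMLParser.

```pascal
unit HTMLParamUtils; // hypothetical helper unit, not shipped with THTMLParser

interface

uses
  HTMLPars; // ASSUMED unit name for THTMLTag and THTMLParam

// Returns the value of the first parameter whose key matches Key,
// or an empty string if the tag has no such parameter.
function ParamValue(Tag: THTMLTag; const Key: string): string;

implementation

uses
  Classes, SysUtils; // TList, CompareText

function ParamValue(Tag: THTMLTag; const Key: string): string;
var
  i: Integer;
  Param: THTMLParam;
begin
  Result := '';
  for i := 0 to Tag.Params.Count - 1 do
  begin
    Param := THTMLParam(Tag.Params[i]);
    // Case-insensitive comparison, so 'link' and 'LINK' both match.
    if CompareText(Param.Key, Key) = 0 then
    begin
      Result := Param.Value;
      Exit;
    end;
  end;
end;

end.
```

Going by the listing above, ParamValue(THTMLTag(HTMLParser.Parsed[1]), 'link') would yield #FF00FF for the example file. An empty result cannot distinguish a missing parameter from one with an empty value; if that matters, a variant returning a Boolean would be the safer design.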
Comments and Bugs
Please send comments and bug reports to
dennis@spreendigital.de.
Known Bugs and Problems
There are some problems with characters written as hexadecimal character references, e.g. &amp;#x67;.
Important!
Please do NOT report any bugs concerning the WebBrowser sample!
The sample is not meant to be a fully HTML-compatible browser; it was
written only to display this help file.
©1999 Dennis D. Spreen