HTMLParser.HTMLParser
class HTMLParser.HTMLParser
Find tags and other markup and call handler functions.
Usage:

    p = HTMLParser()
    p.feed(data)
    ...
    p.close()

Start tags are handled by calling self.handle_starttag() or self.handle_startendtag(); end tags by self.handle_endtag(). The data between tags is passed from the parser to the derived class by calling self.handle_data() with the data as argument (the data may be split up in arbitrary chunks). Entity references are passed by calling self.handle_entityref() with the entity reference as the argument. Numeric character references are passed to self.handle_charref() with the string containing the reference as the argument.
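
The handler calls described above can be observed with a small subclass. The following is a minimal sketch, not part of the library: `EventLogger` and its `events` list are hypothetical names, and the fallback import assumes the same class is available as `html.parser.HTMLParser` on Python 3. It also assumes that on Python 3.5+ character references are converted inside `handle_data()` by default, so the sketch switches that off to see the separate reference handlers.

```python
try:
    from html.parser import HTMLParser  # Python 3 location of the class
except ImportError:
    from HTMLParser import HTMLParser  # Python 2 module documented here


class EventLogger(HTMLParser):
    """Hypothetical subclass that records each handler call as a tuple."""

    def __init__(self):
        # Explicit base-class call; also works where super() is unavailable.
        HTMLParser.__init__(self)
        # Assumption: Python 3.5+ converts references inside handle_data()
        # by default. Disabling that keeps handle_entityref() and
        # handle_charref() in play; the attribute is unused on Python 2.
        self.convert_charrefs = False
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Text may arrive in several chunks, so each chunk is logged as-is.
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # name is the reference without "&" and ";", e.g. "amp".
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # name is the reference without "&#" and ";", e.g. "62".
        self.events.append(("charref", name))


p = EventLogger()
p.feed('<p class="intro">Fish &amp; chips &#62; <b>salad</b></p>')
p.close()

for event in p.events:
    print(event)
```

Recording events in a list rather than printing inside each handler keeps the handlers trivial and makes the order of calls easy to inspect afterwards.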
Methods
__init__() | Initialize and reset this instance.
_parse_doctype_attlist(i, declstartpos) |
_parse_doctype_element(i, declstartpos) |
_parse_doctype_entity(i, declstartpos) |
_parse_doctype_notation(i, declstartpos) |
_parse_doctype_subset(i, declstartpos) |
_scan_name(i, declstartpos) |
check_for_whole_start_tag(i) |
clear_cdata_mode() |
close() | Handle any buffered data.
error(message) |
feed(data) | Feed data to the parser.
get_starttag_text() | Return full source of start tag: ‘<...>’.
getpos() | Return current line number and offset.
goahead(end) |
handle_charref(name) |
handle_comment(data) |
handle_data(data) |
handle_decl(decl) |
handle_endtag(tag) |
handle_entityref(name) |
handle_pi(data) |
handle_startendtag(tag, attrs) |
handle_starttag(tag, attrs) |
parse_bogus_comment(i[, report]) |
parse_comment(i[, report]) |
parse_declaration(i) |
parse_endtag(i) |
parse_html_declaration(i) |
parse_marked_section(i[, report]) |
parse_pi(i) |
parse_starttag(i) |
reset() | Reset this instance.
set_cdata_mode(elem) |
unescape(s) |
unknown_decl(data) |
updatepos(i, j) |
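
As an illustration of `getpos()` and `get_starttag_text()` from the table above, the sketch below calls both from inside a handler. `PositionLogger` and its `seen` list are hypothetical names, and the fallback import assumes the Python 3 location `html.parser`. The example relies on `handle_startendtag()` being the handler invoked for an XHTML-style empty tag such as `<br/>`, as the class description states.

```python
try:
    from html.parser import HTMLParser  # Python 3 location of the class
except ImportError:
    from HTMLParser import HTMLParser  # Python 2 module documented here


class PositionLogger(HTMLParser):
    """Hypothetical subclass that notes where each empty tag was found."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.seen = []

    def handle_startendtag(self, tag, attrs):
        # get_starttag_text() gives the tag exactly as written in the input;
        # getpos() gives a (line number, offset) pair for the current position.
        self.seen.append((tag, self.get_starttag_text(), self.getpos()))


p = PositionLogger()
p.feed("first line\n<br/>")
p.close()

for tag, text, pos in p.seen:
    print(tag, text, pos)
```

Line numbers reported by `getpos()` start at 1, so the tag on the second input line is reported on line 2.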
Attributes
CDATA_CONTENT_ELEMENTS |
entitydefs |
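
The listing gives no description for `CDATA_CONTENT_ELEMENTS`. In the standard library it names the elements (`script` and `style`) whose content is treated as raw text rather than markup. The sketch below is one way to see that effect; `DataCollector` is a hypothetical name and the fallback import assumes the Python 3 location `html.parser`.

```python
try:
    from html.parser import HTMLParser  # Python 3 location of the class
except ImportError:
    from HTMLParser import HTMLParser  # Python 2 module documented here


class DataCollector(HTMLParser):
    """Hypothetical subclass that collects start tags and text chunks."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.tags = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.chunks.append(data)


print(HTMLParser.CDATA_CONTENT_ELEMENTS)

p = DataCollector()
# The "<" inside the script body is not treated as the start of a tag.
p.feed("<script>if (a < b) { run(); }</script>")
p.close()

# Join the chunks, since data may be delivered in more than one piece.
script_text = "".join(p.chunks)
print(p.tags)
print(script_text)
```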