File: //lib64/python3.8/html/__pycache__/parser.cpython-38.opt-1.pyc
U
e5d9E ã @ sÀ d Z ddlZddlZddlZddlmZ dgZe d¡Ze d¡Z e d¡Z
e d¡Ze d ¡Ze d
¡Z
e d¡Ze d¡Ze d
¡Ze dej¡Ze d
¡Ze d¡ZG dd„ dejƒZdS )zA parser for HTML and XHTML.é N)ÚunescapeÚ
HTMLParserz[&<]z
&[a-zA-Z#]z%&([a-zA-Z][-.a-zA-Z0-9]*)[^a-zA-Z0-9]z)&#(?:[0-9]+|[xX][0-9a-fA-F]+)[^0-9a-fA-F]z <[a-zA-Z]ú>z--\s*>z+([a-zA-Z][^\t\n\r\f />\x00]*)(?:\s|/(?!>))*z]((?<=[\'"\s/])[^\s/>][^\s/=>]*)(\s*=+\s*(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?(?:\s|/(?!>))*aF
<[a-zA-Z][^\t\n\r\f />\x00]* # tag name
(?:[\s/]* # optional whitespace before attribute name
(?:(?<=['"\s/])[^\s/>][^\s/=>]* # attribute name
(?:\s*=+\s* # value indicator
(?:'[^']*' # LITA-enclosed value
|"[^"]*" # LIT-enclosed value
|(?!['"])[^>\s]* # bare value
)
\s* # possibly followed by a space
)?(?:\s|/(?!>))*
)*
)?
\s* # trailing whitespace
z#</\s*([a-zA-Z][-.a-zA-Z0-9:_]*)\s*>c @ sè e Zd ZdZdZddœdd„Zdd„ Zd d
„ Zdd„ Zd
Z dd„ Z
dd„ Zdd„ Zdd„ Z
dd„ Zd9dd„Zdd„ Zdd„ Zdd „ Zd!d"„ Zd#d$„ Zd%d&„ Zd'd(„ Zd)d*„ Zd+d,„ Zd-d.„ Zd/d0„ Zd1d2„ Zd3d4„ Zd5d6„ Zd7d8„ Zd
S ):r aE Find tags and other markup and call handler functions.
Usage:
p = HTMLParser()
p.feed(data)
...
p.close()
Start tags are handled by calling self.handle_starttag() or
self.handle_startendtag(); end tags by self.handle_endtag(). The
data between tags is passed from the parser to the derived class
by calling self.handle_data() with the data as argument (the data
may be split up in arbitrary chunks). If convert_charrefs is
True the character references are converted automatically to the
corresponding Unicode character (and self.handle_data() is no
longer split in chunks), otherwise they are passed by calling
self.handle_entityref() or self.handle_charref() with the string
containing respectively the named or numeric reference as the
argument.
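
The usage pattern described in the docstring above can be sketched as follows. This is a minimal illustration, not part of the module: `TagCollector` and `collect_events` are hypothetical names introduced here, and the sketch assumes the default `convert_charrefs=True`, so character references arrive already converted inside `handle_data()`.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical subclass that records every handler call it receives."""

    def __init__(self):
        # convert_charrefs=True is the default; it is spelled out for clarity.
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def collect_events(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = TagCollector()
    parser.feed(markup)   # may be called repeatedly with partial input
    parser.close()        # flush any buffered data
    return parser.events
```

Calling `collect_events('<p class="x">a &amp; b</p>')` should record one start-tag event, one data event with the `&amp;` reference already converted to `&`, and one end-tag event.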
)ZscriptZstyleT)Úconvert_charrefsc C s || _ | ¡ dS )zÆInitialize and reset this instance.
If convert_charrefs is True (the default), all character references
are automatically converted to the corresponding Unicode characters.
N)r Úreset)Úselfr © r ú#/usr/lib64/python3.8/html/parser.pyÚ__init__W s zHTMLParser.__init__c C s( d| _ d| _t| _d| _tj | ¡ dS )z1Reset this instance. Loses all unprocessed data.Ú z???N)ÚrawdataÚlasttagÚinteresting_normalÚinterestingÚ
cdata_elemÚ_markupbaseÚ
ParserBaser ©r r r r r ` s
zHTMLParser.resetc C s | j | | _ | d¡ dS )z‘Feed data to the parser.
Call this as often as you want, with as little or as much text
as you want (may include '\n').
r N)r Úgoahead©r Údatar r r Úfeedh s zHTMLParser.feedc C s | d¡ dS )zHandle any buffered data.é N)r r r r r Úcloseq s zHTMLParser.closeNc C s | j S )z)Return full source of start tag: '<...>'.)Ú_HTMLParser__starttag_textr r r r Úget_starttag_textw s zHTMLParser.get_starttag_textc C s$ | ¡ | _t d| j tj¡| _d S )Nz</\s*%s\s*>)Úlowerr ÚreÚcompileÚIr )r Úelemr r r Úset_cdata_mode{ s
zHTMLParser.set_cdata_modec C s t | _d | _d S ©N)r r r r r r r Úclear_cdata_mode s zHTMLParser.clear_cdata_modec C sJ | j }d}t|ƒ}||k rÚ| jrv| jsv| d|¡}|dk r | dt||d ƒ¡}|dkrpt d¡ ||¡spqÚ|}n*| j
||¡}|r’| ¡ }n| jrœqÚ|}||k rÞ| jrÌ| jsÌ| t
|||… ƒ¡ n| |||… ¡ | ||¡}||kröqÚ|j}|d|ƒrJt ||¡r"| |¡} n†|d|ƒr:| |¡} nn|d|ƒrR| |¡} nV|d|ƒrj| |¡} n>|d |ƒr‚| |¡} n&|d
|k rÚ| d¡ |d
} nqÚ| dk r<|s¼qÚ| d|d
¡} | dk rú| d|d
¡} | dk r|d
} n| d
7 } | jr*| js*| t
||| … ƒ¡ n| ||| … ¡ | || ¡}q|d|ƒrðt ||¡}|r²| ¡ d
d… }
| |
¡ | ¡ } |d| d
ƒs¢| d
} | || ¡}qn<d||d … krÚ| |||d
… ¡ | ||d
¡}qÚq|d|ƒrt ||¡}|rN| d
¡}
| |
¡ | ¡ } |d| d
ƒs@| d
} | || ¡}qt ||¡}|r¨|rÚ| ¡ ||d … krÚ| ¡ } | |kr’|} | ||d
¡}qÚn.|d
|k rÚ| d¡ | ||d
¡}nqÚqq|r8||k r8| js8| jr| js| t
|||… ƒ¡ n| |||… ¡ | ||¡}||d … | _ d S )Nr ú<ú&é"