Skip to content

picomet/htmst

Repository files navigation

htmst

PyPI - Python Versions PyPI - Version GitHub

htmst is a python library for parsing html into AST with positions.

Installation

uv add htmst

or

pip install htmst

Usage

from htmst import HtmlAst

html = """<span foo="bar">hi</span>"""
ast = HtmlAst(html)

print(ast.root.children[0].tag) # span

print(ast.root.children[0].start.row) # 0
print(ast.root.children[0].start.col) # 0

print(ast.root.children[0].end.row) # 0
print(ast.root.children[0].end.col) # 25

print(ast.root.children[0].attrs[0].name) # foo
print(ast.root.children[0].attrs[0].value) # bar

print(ast.root.children[0].attrs[0].start.row) # 0
print(ast.root.children[0].attrs[0].start.col) # 6

print(ast.root.children[0].attrs[0].end.row) # 0
print(ast.root.children[0].attrs[0].end.col) # 15

Nodes

  • DoubleTagNode: represents double tags
  • SingleTagNode: represents single tags
  • AttrNode: represents attributes
  • TextNode: represents texts
  • CommentNode: represents comments
  • DoctypeNode: represents doctypes

Each node has a start and end position.

Contributing

Contributions are welcome! Please read the contributing guidelines for more information.

License

This project is licensed under the MIT License.

About

HTML to AST with positions

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

No packages published

Languages