Module net.html stdlib

net.html
Version: 0.3.3
License: MIT
Dependencies from vmod: 0
Imports: 3
Imported by: 0

Dependencies defined in v.mod

This section is empty.

Imports

Imported by

This section is empty.

Overview

net/html is an HTML parser written in pure V.

Usage

import net.html

fn main() {
    doc := html.parse('<html><body><h1 class="title">Hello world!</h1></body></html>')
    tag := doc.get_tag('h1')[0] // <h1 class="title">Hello world!</h1>
    println(tag.name) // h1
    println(tag.content) // Hello world!
    println(tag.attributes) // {'class': 'title'}
    println(tag.str()) // <h1 class="title">Hello world!</h1>
}

More examples can be found in parser_test.v and html_test.v.

Aliases

This section is empty.

Constants

This section is empty.

Sum types

This section is empty.

Functions

#fn parse

fn parse(text string) DocumentObjectModel

parse parses and returns the DOM from the given text.

#fn parse_file

fn parse_file(filename string) DocumentObjectModel

parse_file parses and returns the DOM from the contents of a file.
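
As a minimal sketch of parse_file, the program below writes a small document to a temporary file and parses it back. The file name net_html_example.html is a hypothetical placeholder chosen for this illustration, and the os calls are assumed to behave as in current V releases.

```v
import net.html
import os

fn main() {
	// Hypothetical file path, used only for this illustration.
	path := os.join_path(os.temp_dir(), 'net_html_example.html')
	os.write_file(path, '<html><body><p>Hi</p></body></html>') or { panic(err) }
	defer {
		os.rm(path) or {}
	}
	// parse_file reads the file and returns the DOM built from its contents.
	doc := html.parse_file(path)
	println(doc.get_tag('p').len)
}
```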

Structs

#struct DocumentObjectModel

pub struct DocumentObjectModel {
mut:
	root           &Tag = unsafe { nil }
	constructed    bool
	btree          BTree
	all_tags       []&Tag
	all_attributes map[string][]&Tag
	close_tags     map[string]bool // add a counter to track how many times each tag is closed, so it parses correctly
	attributes     map[string][]string
	tag_attributes map[string][][]&Tag
	tag_type       map[string][]&Tag
	debug_file     os.File
}

The W3C Document Object Model (DOM) is a platform and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document.

https://www.w3.org/TR/WD-DOM/introduction.html

#fn (&DocumentObjectModel) print_debug

[if debug_html ?]
fn (mut dom &DocumentObjectModel) print_debug(data string)

#fn (&DocumentObjectModel) where_is

fn (mut dom &DocumentObjectModel) where_is(item_name string, attribute_name string) int

#fn (&DocumentObjectModel) add_tag_attribute

fn (mut dom &DocumentObjectModel) add_tag_attribute(tag &Tag)

#fn (&DocumentObjectModel) add_tag_by_type

fn (mut dom &DocumentObjectModel) add_tag_by_type(tag &Tag)

#fn (&DocumentObjectModel) add_tag_by_attribute

fn (mut dom &DocumentObjectModel) add_tag_by_attribute(tag &Tag)

#fn (&DocumentObjectModel) construct

fn (mut dom &DocumentObjectModel) construct(tag_list []&Tag)

#fn (&DocumentObjectModel) get_tag_by_attribute_value

fn (mut dom &DocumentObjectModel) get_tag_by_attribute_value(name string, value string) []&Tag

get_tag_by_attribute_value retrieves all tags in the document that have the given attribute name and value.
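
A short sketch of a lookup by attribute value. Because the signature above declares a mut receiver, the document is bound with mut here; the markup and the id values are made up for the example.

```v
import net.html

fn main() {
	// The receiver is declared `mut`, so the document must be mutable.
	mut doc := html.parse('<html><body><p id="intro">Hello</p><p id="outro">Bye</p></body></html>')
	matches := doc.get_tag_by_attribute_value('id', 'intro')
	for tag in matches {
		println(tag.content)
	}
}
```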

#fn (&DocumentObjectModel) get_tag

fn (dom &DocumentObjectModel) get_tag(name string) []&Tag

get_tag retrieves all tags in the document that have the given tag name.

#fn (&DocumentObjectModel) get_tag_by_attribute

fn (dom &DocumentObjectModel) get_tag_by_attribute(name string) []&Tag

get_tag_by_attribute retrieves all tags in the document that have the given attribute name.
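
For illustration, the sketch below collects every tag that carries an href attribute, regardless of its value. The sample markup is an assumption made for the example.

```v
import net.html

fn main() {
	doc := html.parse('<html><body><a href="docs.html">Docs</a><a href="faq.html">FAQ</a><span>plain</span></body></html>')
	// Only the two anchors carry an href attribute; the span does not.
	links := doc.get_tag_by_attribute('href')
	for link in links {
		println(link.attributes['href'])
	}
}
```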

#fn (&DocumentObjectModel) get_root

fn (dom &DocumentObjectModel) get_root() &Tag

get_root returns the root of the document.

#fn (&DocumentObjectModel) get_tags

fn (dom &DocumentObjectModel) get_tags() []&Tag

get_tags returns all of the tags stored in the document.
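
A small sketch that walks the whole document: it prints the name of the root tag and then the name of every tag stored in the DOM. The exact list depends on how the parser records text and closing tags, so no output is shown.

```v
import net.html

fn main() {
	doc := html.parse('<html><head><title>T</title></head><body><p>a</p><p>b</p></body></html>')
	root := doc.get_root()
	println(root.name)
	// Iterate over every tag the document stores.
	for tag in doc.get_tags() {
		println(tag.name)
	}
}
```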

#fn (&DocumentObjectModel) get_tags_by_class_name

fn (dom &DocumentObjectModel) get_tags_by_class_name(names []string) []&Tag

get_tags_by_class_name recursively retrieves all tags in the document that have the given class name(s).

#struct Parser

pub struct Parser {
mut:
	dom                DocumentObjectModel
	lexical_attributes LexicalAttributes = LexicalAttributes{
		current_tag: &Tag{}
	}
	filename    string = 'direct-parse'
	initialized bool
	tags        []&Tag
	debug_file  os.File
}

Parser is responsible for reading the HTML strings and converting them into a DocumentObjectModel.

#fn (&Parser) add_code_tag

fn (mut parser &Parser) add_code_tag(name string)

add_code_tag registers a tag whose content the parser should ignore.

For example, if your HTML or XML contains a tag such as <script>, calling add_code_tag('script') makes the parser skip over the content of every script tag. The content is still kept, but its > and < characters no longer confuse the parser.
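
The sketch below shows one way this might be used with a Parser built by hand. The formula tag is a hypothetical custom tag invented for the example, and constructing html.Parser{} directly from outside the module is an assumption based on the struct being public.

```v
import net.html

fn main() {
	mut parser := html.Parser{}
	// 'formula' is a hypothetical custom tag whose body contains < and >.
	parser.add_code_tag('formula')
	parser.parse_html('<html><body><formula>a < b && b > c</formula><p>after</p></body></html>')
	doc := parser.get_dom()
	println(doc.get_tag('formula').len)
	println(doc.get_tag('p').len)
}
```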

#fn (Parser) builder_str

[inline]
fn (parser Parser) builder_str() string

#fn (&Parser) print_debug

[if debug_html ?]
fn (mut parser &Parser) print_debug(data string)

#fn (&Parser) verify_end_comment

fn (mut parser &Parser) verify_end_comment(remove bool) bool

#fn (&Parser) init

fn (mut parser &Parser) init()

init initializes the parser.

#fn (&Parser) generate_tag

fn (mut parser &Parser) generate_tag()

#fn (&Parser) split_parse

fn (mut parser &Parser) split_parse(data string)

split_parse parses the given HTML fragment.

#fn (&Parser) parse_html

fn (mut parser &Parser) parse_html(data string)

parse_html parses the given HTML string.

#fn (&Parser) finalize

[inline]
fn (mut parser &Parser) finalize()

finalize finishes the parsing stage.

#fn (&Parser) get_dom

fn (mut parser &Parser) get_dom() DocumentObjectModel

get_dom returns the parser's current DOM representation.
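
Taken together, init, split_parse, finalize and get_dom suggest an incremental workflow in which the input arrives in pieces. The following is a sketch under that assumption; the chunk boundaries are arbitrary and deliberately cut through tags and text.

```v
import net.html

fn main() {
	mut parser := html.Parser{}
	parser.init()
	// Feed the document in arbitrary chunks, e.g. as they arrive from a stream.
	parser.split_parse('<html><body><h3>Nice ')
	parser.split_parse('Test!</h3></bo')
	parser.split_parse('dy></html>')
	parser.finalize()
	doc := parser.get_dom()
	for tag in doc.get_tag('h3') {
		println(tag.content)
	}
}
```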

#struct Tag

[heap]
pub struct Tag {
pub mut:
	name               string
	content            string
	children           []&Tag
	attributes         map[string]string // attributes will be like map[name]value
	last_attribute     string
	class_set          datatypes.Set[string]
	parent             &Tag = unsafe { nil }
	position_in_parent int
	closed             bool
	close_type         CloseTagType = .in_name
}

Tag holds the information of an HTML tag.

#fn (&Tag) add_parent

fn (mut tag &Tag) add_parent(t &Tag, position int)

#fn (&Tag) add_child

fn (mut tag &Tag) add_child(t &Tag) int

#fn (&Tag) text

fn (tag &Tag) text() string

text returns the text contents of the tag.
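
To illustrate the difference between the content field and text(), the sketch below prints both for a paragraph that contains a nested tag. It assumes text() also gathers the text of child tags; the markup is made up for the example.

```v
import net.html

fn main() {
	doc := html.parse('<html><body><p>Hello <b>bold</b></p></body></html>')
	paragraphs := doc.get_tag('p')
	if paragraphs.len == 0 {
		return
	}
	p := paragraphs[0]
	println(p.content) // the tag's own content field
	println(p.text()) // text of the tag, assumed to include nested tags
}
```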

#fn (&Tag) str

fn (tag &Tag) str() string

#fn (&Tag) get_tags

fn (tag &Tag) get_tags(name string) []&Tag

get_tags recursively retrieves all child tags of the tag that have the given tag name.

#fn (&Tag) get_tags_by_attribute

fn (tag &Tag) get_tags_by_attribute(name string) []&Tag

get_tags_by_attribute recursively retrieves all child tags of the tag that have the given attribute name.

#fn (&Tag) get_tags_by_attribute_value

fn (tag &Tag) get_tags_by_attribute_value(name string, value string) []&Tag

get_tags_by_attribute_value recursively retrieves all child tags of the tag that have the given attribute name and value.

#fn (&Tag) get_tags_by_class_name

fn (tag &Tag) get_tags_by_class_name(names []string) []&Tag

get_tags_by_class_name recursively retrieves all child tags of the tag that have the given class name(s).
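
The Tag-level getters search only below a given tag, which makes scoped queries possible. A minimal sketch with two lists, searching inside the first one only; the markup and the class value are assumptions made for the example.

```v
import net.html

fn main() {
	doc := html.parse('<html><body><ul><li class="done">A</li><li>B</li></ul><ul><li>C</li></ul></body></html>')
	lists := doc.get_tag('ul')
	if lists.len == 0 {
		return
	}
	first := lists[0]
	// Both calls only look at descendants of the first list.
	items := first.get_tags('li')
	done := first.get_tags_by_attribute_value('class', 'done')
	println(items.len)
	println(done.len)
}
```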

Interfaces

This section is empty.

Enums

This section is empty.