html5lib/html5lib-php
html5lib - php flavour
This is an implementation of the tokenization and tree-building parts
of the HTML5 specification in PHP. Potential uses of this library
include web scrapers and HTML filters.
Warning: This is a pre-alpha release, and as such, certain parts of
this code are not up to snuff (e.g. error reporting and performance).
However, the code is very close to spec and passes 100% of tests
not related to parse errors. Nevertheless, expect to have to update
your code on the next upgrade.
Usage notes:
<?php
require_once '/path/to/HTML5/Parser.php';
$dom = HTML5_Parser::parse('<html><body>...');
$nodelist = HTML5_Parser::parseFragment('<b>Boo</b><br>');
$nodelist = HTML5_Parser::parseFragment('<td>Bar</td>', 'table');
Documentation:
HTML5_Parser::parse($text)
    $text : HTML to parse
    return : DOMDocument of parsed document
HTML5_Parser::parseFragment($text, $context)
    $text : HTML to parse
    $context : String name of context element (optional)
    return : DOMNodeList of parsed nodes
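As a rough sketch of the web-scraper use case, the results of these calls can be walked with PHP's standard DOM API. The helpers collect_links() and fragment_to_html() below are hypothetical and not part of html5lib. The sketch assumes that parseFragment() hands back a DOMNodeList, as the variable name $nodelist in the usage notes suggests. To keep it runnable without the library, the input document is built with PHP's bundled DOMDocument::loadHTML() as a stand-in for HTML5_Parser::parse().

```php
<?php
// Hypothetical helper (not part of html5lib): gather href values
// from every <a> element in a parsed document.
function collect_links(DOMDocument $dom)
{
    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Hypothetical helper: serialize each node of a node list back to HTML.
function fragment_to_html(DOMNodeList $nodes)
{
    $html = '';
    foreach ($nodes as $node) {
        $html .= $node->ownerDocument->saveHTML($node);
    }
    return $html;
}

// Stand-in input so the sketch runs on its own. With the library
// installed, this would instead be: $dom = HTML5_Parser::parse($text);
$dom = new DOMDocument();
$dom->loadHTML('<html><body><a href="/a">A</a><a>no href</a><a href="/b">B</a></body></html>');

$links = collect_links($dom);

// Stand-in for a parseFragment() result: the child nodes of <body>.
$nodelist = $dom->getElementsByTagName('body')->item(0)->childNodes;
$fragmentHtml = fragment_to_html($nodelist);

print_r($links);
echo $fragmentHtml, "\n";
```

Because both helpers accept only standard DOM types, they should work unchanged on whatever document or node list the parser returns.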
Developer notes:
* To set up unit tests, you need to add a small stub file test-settings.php
  that contains $simpletest_location = 'path/to/simpletest/'; SimpleTest
  must be version 1.1 (or, until that is released, SVN trunk).
* Ultimately, we don't want to use PHP's DOM because it is not tolerant
  of certain types of errors that HTML5 allows (for example, an element
  named "foo@bar"). But the current implementation uses it, since it's easy.
  Eventually, this html5lib implementation will get a version of SimpleTree
  and may start using that by default.
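
The DOM limitation described in the notes above can be reproduced with PHP's bundled DOM extension alone. This is a minimal sketch, not code from html5lib: it asks DOMDocument to create an element named "foo@bar" and records whether the call is rejected. The variable names are illustrative.

```php
<?php
// Try to create an element whose name is not a valid XML name.
$dom = new DOMDocument();
$rejected = false;
$code = null;

try {
    $dom->createElement('foo@bar');
} catch (DOMException $e) {
    // PHP's DOM refuses the name instead of tolerating it.
    $rejected = true;
    $code = $e->getCode();
}

var_dump($rejected);
var_dump($code === DOM_INVALID_CHARACTER_ERR);
```

A tree builder backed by PHP's DOM therefore has to work around such names in some way, which is the motivation given above for an eventual SimpleTree implementation.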
vim: et sw=4 sts=4
About
PHP port of html5lib, currently unmaintained.