René Nyffenegger's collection of things on the web

Validating HTML with onsgmls

Valid HTML

I once decided to validate the HTML of all my pages on www.adp-gmbh.ch. nsgmls looked like the way to go, but after I had downloaded and compiled it, it took me a while to figure out how to use it, although it is in fact quite easy:
onsgmls -s -E0 -c /path/to/catalog /path/for/file.html
find comes in handy for checking all HTML pages within a directory and its subdirectories; the error messages, which onsgmls writes to stderr, are collected in the file wrong_htmls:
find . -name '*.html' -exec onsgmls -s -E0 -c ~/htmlv/sgml-lib/catalog {} \; 2> wrong_htmls
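The -c flag specifies the catalog. Such a catalog (and its associated files) can be downloaded from validator.w3.org. The -s flag suppresses the normal output so that only error messages are printed, and -E0 removes the limit on the number of errors that are reported.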
nsgmls is part of openjade, from which it can be downloaded.
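Once the find command above has filled wrong_htmls, it can be handy to see which pages cause the most complaints. The following is only a sketch: the function name count_errors_per_file is made up for this example, and it assumes that each message in the log has the form onsgmls:FILE:LINE:COLUMN:TYPE: text and that no file name contains a colon.

```shell
# Hypothetical helper (not part of onsgmls): count messages per file in a log
# that was written by redirecting onsgmls' stderr.
# Assumption: every line looks like  onsgmls:FILE:LINE:COLUMN:TYPE: text
# Assumption: file names do not contain a colon.
count_errors_per_file() {
    # Field 2 is the file name; count identical names, most frequent first.
    cut -d: -f2 "$1" | sort | uniq -c | sort -rn
}
```

Calling count_errors_per_file wrong_htmls then prints one line per file, with the number of messages in front of the file name.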

Broken links

I use checkbot to check for broken links. From its documentation: "Checkbot is a tool to verify links on a set of HTML pages. Checkbot can check a single document, or a set of documents on one or more servers. Checkbot creates a report which summarizes all links which caused some kind of warning or error."

Other HTML validators

Other HTML validators (that I haven't tried, however) are: