Sunday, April 1, 2007

Charlie's Tidy Add-ons

By Charles Reitzel
Please send bug reports or comments to mailto:creitzel@rcn.com?Subject=Tidy%20Add-ons
This is a brief page showing a couple of additions to Tidy that I have written.
Latest: 20 February, 2003, Exposed the TidyOutputBOM option in TidyATL (and through to Tidy.NET) and sped up syntax highlighting. Removed the DLL build, which is now included in TidyLib proper.
3 February, 2003, Added required DLL to tidyui.zip
1 February, 2003, Removed stuff that you can now get from the Tidy Project Page. Added .NET wrapper and syntax highlighting and other goodies to Tidy UI.
Enjoy!
Table Of Contents
The following are current as of 20 February, 2003:
Tidy UI
C++ Wrapper
Perl Wrapper
COM/ATL Wrapper
.NET Wrapper
Tidy UI
This is a Windows executable that provides a GUI for the HTML Tidy library.


COM/ATL
The latest is a simple COM/ATL wrapper for the library. The basic operations are supported: parse file, parse from memory, cleanup, diagnostics, save file and save to memory. You can also set options in the usual ways. I got just a bit fancy and supported the I/O and error handling callbacks. TidyLib fixes for Unicode/UTF-16 are also included.
20 February, 2003: Fixed the TidyOptionId enum so that all the conditionally compiled options are included, especially TidyOutputBOM. Note that for XHTML/XML output, the BOM is required. However, if you set TidyOutputBOM to false after the parse, that setting will be respected.
1 February, 2003: Some IDL updates, but no interface/UUID changes. IDL changes are purely for the benefit of generating the .NET wrapper.
The previous fix for character conversion in the ATL wrapper now works fine with UTF16, thanks to fixes in the core library. Thanks to Moshe Plotkin for identifying the problem and testing the updates. Previously, Parse/Save String worked only if the current code page and the desired encoding matched. Now, the "String" methods temporarily force the encoding to UTF16LE to work with COM/OLE Unicode strings. This didn't break the test on my Latin1 system. Feedback is still appreciated on non-Western European systems and content.
There is an example of redirecting Tidy output to a static control in the VB test driver. Note, this is still a rough draft. If there is demand, I may flesh it out a bit.
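To make the encoding notes above concrete, here is a minimal Python sketch using only the standard library. It does not call TidyATL at all. The helper names to_com_bytes, from_com_bytes and with_bom are hypothetical and exist only for this illustration. It shows why UTF16LE is a safe common format for COM/OLE strings, what goes wrong when the code page and the real encoding differ, and what the BOM looks like in the output bytes.

```python
import codecs


def to_com_bytes(text):
    """Encode text as UTF-16 little-endian, the layout COM/OLE strings use."""
    return text.encode("utf-16-le")


def from_com_bytes(raw):
    """Decode UTF-16 little-endian bytes back into text."""
    return raw.decode("utf-16-le")


def with_bom(raw, output_bom):
    """Prefix the UTF-16LE byte order mark (FF FE) only when requested."""
    return codecs.BOM_UTF16_LE + raw if output_bom else raw


sample = "Grüße, Привет"

# UTF-16LE round-trips any text, whatever the current code page is.
assert from_com_bytes(to_com_bytes(sample)) == sample

# When the code page and the real encoding differ, text is garbled:
# UTF-8 bytes read as Windows-1252 turn "ü" into two characters.
garbled = "ü".encode("utf-8").decode("cp1252")
print(garbled)  # Ã¼

# The BOM is two extra bytes at the front of the output.
print(with_bom(to_com_bytes("<p/>"), True)[:2])   # b'\xff\xfe'
print(with_bom(to_com_bytes("<p/>"), False)[:2])  # b'<\x00'
```

The output_bom flag here only mimics the effect of an output-BOM switch on the saved bytes; it is not how TidyLib implements the option.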
.NET
You can download a pre-generated .NET wrapper here.
20 February, 2003: Regenerated to use latest TidyATL.
I have also been examining how to call Tidy from .NET. So far, I have found three options. Which is best depends, as always, on your requirements.
Quick and Dirty
With a few simple declarations, you can call directly into a DLL build of TidyLib. See a simple example VB.NET program sent to me by Phil Weber.
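The same "few declarations, then call the DLL" idea can be sketched in Python with ctypes. This is not Phil Weber's VB.NET program, just an illustration of the approach under some assumptions: that a shared-library build of TidyLib can be found under the name "tidy", and that it exports tidyCreate, tidyOptParseValue, tidyParseString, tidyCleanAndRepair, tidySaveString and tidyRelease with the signatures declared below. The function names load_tidy and tidy_html are hypothetical. If the library is not found, the sketch simply returns None.

```python
import ctypes
import ctypes.util


def load_tidy():
    """Return a ctypes handle to a TidyLib shared library, or None if absent."""
    path = ctypes.util.find_library("tidy")  # assumed library name
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # The "few simple declarations": argument and return types per entry point.
    lib.tidyCreate.restype = ctypes.c_void_p
    lib.tidyCreate.argtypes = []
    lib.tidyRelease.restype = None
    lib.tidyRelease.argtypes = [ctypes.c_void_p]
    lib.tidyOptParseValue.restype = ctypes.c_int
    lib.tidyOptParseValue.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_char_p]
    lib.tidyParseString.restype = ctypes.c_int
    lib.tidyParseString.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    lib.tidyCleanAndRepair.restype = ctypes.c_int
    lib.tidyCleanAndRepair.argtypes = [ctypes.c_void_p]
    lib.tidySaveString.restype = ctypes.c_int
    lib.tidySaveString.argtypes = [
        ctypes.c_void_p, ctypes.c_char_p, ctypes.POINTER(ctypes.c_uint)]
    return lib


def tidy_html(markup):
    """Parse, clean up and re-serialize markup (bytes); None if unavailable."""
    lib = load_tidy()
    if lib is None:
        return None
    doc = lib.tidyCreate()
    try:
        # Keep diagnostics off stderr for this demonstration.
        lib.tidyOptParseValue(doc, b"quiet", b"yes")
        lib.tidyOptParseValue(doc, b"show-warnings", b"no")
        if lib.tidyParseString(doc, markup) < 0:
            return None
        if lib.tidyCleanAndRepair(doc) < 0:
            return None
        # First call only reports the required size; second call fills the buffer.
        size = ctypes.c_uint(0)
        lib.tidySaveString(doc, None, ctypes.byref(size))
        out = ctypes.create_string_buffer(size.value)
        if lib.tidySaveString(doc, out, ctypes.byref(size)) < 0:
            return None
        return out.raw[:size.value]
    finally:
        lib.tidyRelease(doc)


if __name__ == "__main__":
    print(tidy_html(b"<title>demo</title><p>hello"))
```

Negative return values are treated as failures here; non-negative values (success, warnings or errors found in the markup) still produce output where the library allows it.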