Introduction

HTMLslicer is a Java application that slices a long HTML source document into either or both:

You can create a long technical document with a word processor that can generate an HTML-formatted file, such as Microsoft Word (I am using Word 97), and so benefit from all of its advantages (e.g. WYSIWYG editing, spelling and grammar checkers, etc.). Once processed by HTMLslicer, all the resulting HTML pages share a common style taken from your own templates. If you prefer, you can also use the simple templates provided with this software.

HTMLslicer was created by, and is owned by, Marcel St-Amant. You may, however, use a copy of this application for free (See: License).

Here is a typical screenshot:


Application Main Window

 

Features

The HTMLslicer:

Note 1: This application is not robust; the application (not your system) may crash if the source file contains HTML tag errors or is not HTML compliant. If this happens, simply stop the application, correct your source document and restart HTMLslicer.

Note 2: It does not handle heading tags that contain attributes (e.g. <h2 class="subtitle">), only simple heading tags (e.g. <H2>). A future version may be able to handle them.

 

Requirements

 

Installation

Simply unzip the htmlslicer.zip file under a sub-directory (or folder) of your choice.

 

Concepts

In this section, the specialized terminology (in bold) will be introduced and the slicing process will be explained.

HTMLslicer (the application) takes, as inputs:

It searches your source document for the headings whose level is equal to or higher than the heading level you set for segmentation. For example, if you choose to slice at <H2> heading tags, your document will be segmented at each point where an <H1> or <H2> tag is found (its cleavage points). There are two special segments (a segment is the part of the document between two cleavage points) that may be used as a potential body template source:
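As a rough illustration of the rule above, the following Java sketch locates cleavage points in an HTML string. The class and method names are hypothetical and are not taken from the HTMLslicer source code; like the application, the sketch only recognizes simple heading tags without attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HTMLslicer): finds cleavage points.
class CleavagePointFinder {

    // Returns the character offset of every simple heading tag whose level
    // is equal to or higher than the slicing level (<H1> .. <Hlevel>).
    static List<Integer> find(String html, int level) {
        if (level < 1 || level > 6) {
            throw new IllegalArgumentException("heading level must be 1..6");
        }
        // Only simple tags such as <H2> match; <h2 class="x"> does not.
        Pattern heading = Pattern.compile("<H[1-" + level + "]>",
                                          Pattern.CASE_INSENSITIVE);
        Matcher m = heading.matcher(html);
        List<Integer> points = new ArrayList<>();
        while (m.find()) {
            points.add(m.start());
        }
        return points;
    }

    public static void main(String[] args) {
        String html = "<HTML><BODY><H1>A</H1>x<H2>B</H2>y<H3>C</H3>z<H2>D</H2></BODY></HTML>";
        // Slicing at level 2 keeps the <H1> and <H2> tags and skips <H3>.
        System.out.println(find(html, 2)); // prints [12, 23, 45]
    }
}
```

Each segment is then the text between two consecutive offsets (or between the last offset and the end of the document body).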

It then:

The figure below illustrates how these templates (except the transition one) are used for generating your resulting HTML pages set:


Input documents used for generating the resulting HTML pages set.

As shown in the figure above, a resulting HTML page (e.g. the page related to segment N of your source document) contains the "Top part" of a body template (your source document may be used for that), a header (if requested), the segment content, a footer (if requested) and the "Bottom part" of the body template. If your source document has been used as the body template, the "Bottom part" contains only this text: "</BODY></HTML>".

Not illustrated above is the impact of the transition template. It generates a sub-table of contents with a hyperlink for each sub-section HTML page that follows. This resulting part (the transition part) is placed between the segment part and the footer part of such pages. In this document, the Introduction, User Interface, Template Reference and Appendix sections contain such a transition part.
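The page layout described above can be sketched in Java as a simple concatenation. This is only an illustration under my own assumptions: the class name, the method name and the use of null for a part that was not requested are hypothetical, not the actual HTMLslicer implementation.

```java
// Hypothetical sketch (not HTMLslicer code): assembles one resulting page.
class PageAssembler {

    // Order of the parts: top, header, segment, transition, footer, bottom.
    // A null header, transition or footer means "not requested / not present".
    static String assemble(String top, String header, String segment,
                           String transition, String footer, String bottom) {
        StringBuilder page = new StringBuilder(top);
        if (header != null) {
            page.append(header);
        }
        page.append(segment);
        if (transition != null) {
            page.append(transition); // sits between the segment and the footer
        }
        if (footer != null) {
            page.append(footer);
        }
        page.append(bottom);
        return page.toString();
    }

    public static void main(String[] args) {
        System.out.println(assemble("<HTML><BODY>", null, "<H2>N</H2>text",
                                    null, "<HR>footer", "</BODY></HTML>"));
    }
}
```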

 

Tutorial

This section introduces the use of the application.

 

User Interface

The HTMLslicer user interface has the following components:

There are other components that can be viewed as textual user input interfaces; they are in the form of templates (See Template Reference).

 

Menus

There are two menus:

 

File Menu

 

Help Menu

 

Tabbed-Panels

There are four panels:

 

Drop-In Panel


Drop-In panel display

This panel contains a large text area. It plays two roles:

 

Slice Setup Panel


Slice Setup panel display

This panel contains two panes:

Note: The name of each segment file is based on its heading title; to avoid name conflicts with other segments, a unique number might be appended. If you have a sub-heading called "Introduction" under several main headings, you will get "Introduction.htm" for the first file, "Introduction1.htm" for the second file, "Introduction2.htm" for the third file, etc. To generate the file name, the underscore character replaces every space, punctuation mark and other character of the heading title that is neither a letter nor a digit. For example, a segment whose main heading title is "Top & Bottom" will be found in a resulting HTML file named "Top___Bottom.htm" (with three consecutive underscores).
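The naming rule in the note above can be sketched in Java as follows. The class and method names are hypothetical, and the sketch assumes that "letter" and "digit" mean what Character.isLetterOrDigit accepts; the application itself may use a narrower character set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not HTMLslicer code): derives file names from titles.
class SegmentFileNamer {

    // Remembers how many times each base name has already been handed out.
    private final Map<String, Integer> seen = new HashMap<>();

    String nameFor(String headingTitle) {
        StringBuilder base = new StringBuilder();
        for (char c : headingTitle.toCharArray()) {
            // Anything that is not a letter or a digit becomes an underscore.
            base.append(Character.isLetterOrDigit(c) ? c : '_');
        }
        String key = base.toString();
        Integer count = seen.get(key);
        seen.put(key, count == null ? 1 : count + 1);
        // First use: no suffix; later uses: 1, 2, 3, ...
        String suffix = (count == null) ? "" : String.valueOf(count);
        return key + suffix + ".htm";
    }

    public static void main(String[] args) {
        SegmentFileNamer namer = new SegmentFileNamer();
        System.out.println(namer.nameFor("Top & Bottom")); // Top___Bottom.htm
        System.out.println(namer.nameFor("Introduction")); // Introduction.htm
        System.out.println(namer.nameFor("Introduction")); // Introduction1.htm
    }
}
```

This sketch does not guard against a heading that is literally titled "Introduction1"; a complete implementation would also have to check the suffixed name for conflicts.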

 

TOC Setup Panel


TOC Setup panel display

This panel contains three items:

 

Style Setup Panel


Style Setup panel display

This panel contains four combined items (check box with its related text field):

 

Feedback System

This application contains an extensive feedback system; it has four components:

The messages generated by this application contain two parts: An alert-level tag as a prefix and a body part that follows. There are four possible alert-level tags:

 

Drop Area and Activity Report

 This text area contains a long list of messages; each corresponds to a processing step of your source document. The steps are in the following general sequence:

 

Generated Report File

A report file named JHSplitterLog.txt is created (replacing the previous one, if any) when a source document file has been selected. Each time the File > Slice HTML command is given (on the same file), a report of the activity is appended to the file. Each report is separated by the following message:

"INFO: ========================================="

Each report contains three sections, separated from one another by a series of dashes (i.e. --------), in this order:

  1. The section of collected information. It is a list of main headings of the segments, followed by a list of heading titles with their bookmark IDs, if any (i.e. the NAME attribute of the HTML <A> tag).
  2. The content of the item type identifiers (See TOC Template) from the selected TOC template (an item type identifier describes the format to be used for generating an item of the table of contents).
  3. A copy of the Drop Area and Activity Report text area display

Any error found will also be reported in the report file.

 

Integrated Help System

The HTMLslicer documentation set is presented in two formats:

 

Template Reference

 The HTMLslicer application uses four types of templates while processing your source HTML document:

Each template contains a static part and variables. All variables share a common format:

An example: @PREV@

Each template has its own recognized set of variables, as explained in the following sections.

 

Body Template

 The body template is a simple HTML file that may contain one variable: @DOC@. If the variable is absent, HTMLslicer assumes that an implicit @DOC@ variable has been placed just before the last HTML tags: "</BODY></HTML>". Each resulting HTML page file contains (See also Concepts):

Indeed, @DOC@ represents the current HTML segment from your source document (with, optionally, its header and footer parts). A body template might come from three possible sources:

  1. Your source document: it does not contain any variable. The "top" part of such a template is the part between the beginning of your document file and the first heading, where the segmentation starts. The bottom part is simply the string: "</BODY></HTML>".
  2. The file named BodyTemplate.htm that is located in the application directory (the "default template"). The simplest one to use is the one I am providing with the installation package. Examine it, because it is a very good introduction. Later, you can replace it with your own default template, but it must keep the same file name, "BodyTemplate.htm".
  3. The file whose name you specified in the Style Setup panel; it should be located in your source document directory.
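To make the role of @DOC@ concrete, here is a minimal Java sketch that splits a body template into its "top" and "bottom" parts, including the implicit @DOC@ placement described above. The class and method names are hypothetical, and the sketch assumes that the closing tags appear exactly as "</BODY></HTML>" with no whitespace in between; a real template file may need a more tolerant search.

```java
// Hypothetical sketch (not HTMLslicer code): splits a body template.
class BodyTemplateSplitter {

    static final String DOC = "@DOC@";
    static final String CLOSING = "</BODY></HTML>";

    // Returns a two-element array: { top part, bottom part }.
    static String[] split(String template) {
        int doc = template.indexOf(DOC);
        if (doc >= 0) {
            // Explicit variable: everything before it is the top part.
            return new String[] { template.substring(0, doc),
                                  template.substring(doc + DOC.length()) };
        }
        // No variable: behave as if @DOC@ stood just before the closing tags.
        int closing = template.lastIndexOf(CLOSING);
        if (closing < 0) {
            // Assumption of this sketch: with no closing tags, all is "top".
            return new String[] { template, "" };
        }
        return new String[] { template.substring(0, closing),
                              template.substring(closing) };
    }

    public static void main(String[] args) {
        String[] parts = split("<HTML><BODY><H1>Doc</H1></BODY></HTML>");
        System.out.println(parts[0]); // <HTML><BODY><H1>Doc</H1>
        System.out.println(parts[1]); // </BODY></HTML>
    }
}
```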

 

Header and Footer Templates

 The header and footer parts are generated with the help of related templates. Both templates recognize the following variables:

An example will illustrate how such a template works. Here is a simple footer template:

<HR><A HREF="@PREV@">Prev</A>&nbsp;&nbsp;&nbsp;<A HREF="@NEXT@">Next</A>

If your document had the segments "Chapter1", "Chapter2", etc., the resulting files would be Chapter1.htm, Chapter2.htm, etc. For your third segment, the footer part would become:

<HR><A HREF="Chapter2.htm">Prev</A>&nbsp;&nbsp;&nbsp;<A HREF="Chapter4.htm">Next</A>

Probably, you get the idea.
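For readers who prefer code, the substitution in the footer example above amounts to two string replacements. The following Java sketch is only an illustration; the class and method names are hypothetical and it handles only the two variables shown in the example.

```java
// Hypothetical sketch (not HTMLslicer code): expands @PREV@ and @NEXT@.
class NavigationTemplate {

    // Replaces each variable with the file name of the neighbouring page.
    static String expand(String template, String prevFile, String nextFile) {
        return template.replace("@PREV@", prevFile)
                       .replace("@NEXT@", nextFile);
    }

    public static void main(String[] args) {
        String footer = "<HR><A HREF=\"@PREV@\">Prev</A>&nbsp;&nbsp;&nbsp;"
                      + "<A HREF=\"@NEXT@\">Next</A>";
        // Footer of the third segment, between Chapter2.htm and Chapter4.htm.
        System.out.println(expand(footer, "Chapter2.htm", "Chapter4.htm"));
    }
}
```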

A header and/or footer template might come from two possible sources:

  1. The files named HeaderTemplate.txt and FooterTemplate.txt that are located in the application directory; they are the default header and footer templates. You can replace these files with your own version as long as they have the same names.
  2. The files whose names you specified in the Style Setup panel; they should be located in your source document directory.

 

Transition Template

The Transition template contains variables and a special pair of tags, <X1> and </X1>, that delimits the part that is repeated for each item of the sub-table of contents. The top and bottom parts of the template are not repeated and contain no variable. The part that is delimited by the <X1> and </X1> tags contains the following variables:

Here is a simple transition template, the one used for this documentation:

<HR><H4>Table of Contents</H4><UL>
<X2>   <LI><A HREF="@FILEPATH@">@TITLE@</A></LI>
</X2></UL>
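The way such a template is expanded can be sketched in Java as follows: the part between the delimiting tags is repeated once per item, with @FILEPATH@ and @TITLE@ replaced each time. The class and method names are hypothetical, and the delimiting tag name is passed as a parameter because this sketch makes no assumption about which tag a given template uses.

```java
// Hypothetical sketch (not HTMLslicer code): expands a transition template.
class TransitionExpander {

    // items[i][0] is the file path and items[i][1] the title of one sub-section.
    static String expand(String template, String tag, String[][] items) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = template.indexOf(open);
        int end = template.indexOf(close);
        if (start < 0 || end < start) {
            return template; // no repeated part found: leave the text as-is
        }
        String top = template.substring(0, start);
        String repeated = template.substring(start + open.length(), end);
        String bottom = template.substring(end + close.length());

        StringBuilder out = new StringBuilder(top);
        for (String[] item : items) {
            out.append(repeated.replace("@FILEPATH@", item[0])
                               .replace("@TITLE@", item[1]));
        }
        return out.append(bottom).toString();
    }

    public static void main(String[] args) {
        String template = "<HR><H4>Table of Contents</H4><UL>\n"
                + "<X2>   <LI><A HREF=\"@FILEPATH@\">@TITLE@</A></LI>\n"
                + "</X2></UL>";
        String[][] items = { { "File_Menu.htm", "File Menu" },
                             { "Help_Menu.htm", "Help Menu" } };
        System.out.println(expand(template, "X2", items));
    }
}
```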

 

TOC Template

The TOC template contains variables and special item type identifier tags, which I call the "X" tags. As explained earlier, an item type identifier contains formatting information for generating an item that identifies a resulting HTML page that corresponds to a specific level of its related heading. A TOC page has the following parts:

There are seven recognized X tag pairs: three item tag pairs and four transitional tag pairs. The item tag pairs are:

The template part contained within these X tag pairs can have the following variables:

A typical, simple second-level item part is:

<X2>   <LI><A HREF="@FILEPATH@">@TITLE@</A></LI>
</X2>

The following X tags (recognized by the fact that they contain 2 digits) are "transitional" tags. They do not contain any variable.

A typical simple 1st to 2nd level transition is:

<X12><UL>
</X12>

The TOC.htm file of this document has been created with the help of the TOCTemplate.htm file that is located in the application directory. The TOC template might come from two possible sources:

  1. The file named TOCTemplate.htm that is located in the application directory (the "default TOC template"). You can replace this file with your own TOC template as long as it has the same name.
  2. The file whose name you specified in the TOC Setup panel; it should be located in your source document directory.

 

JavaHelp Templates

Four templates are required to create a corresponding set of JavaHelp metadata files; their related file names are:

Unless you are a skilled hacker or Java programmer, I recommend that you do not modify these files. With these templates the following files will be created in your source document directory:

NAMETOC.xml: The resulting TOC metadata file for your Java-based help document.

Map.jhm: The resulting map file for your Java-based help document.

NAME.hs: The resulting help set document for your Java-based help document.

The "NAME" part of the above file names is replaced by the name of your source document file. A Java programmer will be able to integrate your HTML file set and the metadata file set into the related Java application. This integration part is documented by Sun Microsystems and is out of the scope of this document.

 

Appendix

This appendix contains three sections:

 

License

HTMLslicer v. 1.0 freeware. 

Copyright ( C ) 2002 by Marcel St-Amant

By using this software you accept all the terms and conditions of this License Agreement

This software is the property of Marcel St-Amant and is protected by copyright law. There is no license fee, and registration is not required for HTMLslicer v.1.0 freeware. However, this Agreement does not grant you any rights to enhancements or updates, or support or maintenance of the product. Future versions of the product may not be freeware.

This software is provided 'as-is', without any express or implied warranty. In no event will the author be held liable for any damages arising from the use of this software.

Permission is granted to anyone to use this software for document generation purpose, including commercial applications and create and publish derivative work from it (the resulting HTML page file set, TOC page file and JavaHelp compliant help document).

You may not modify this software in any way, nor reverse-engineer it. If you find a bug or have a suggestion, please contact the author by Email, however, this does not imply that the requested correction or enhancement will be applied.

You may distribute this software to anyone with the following restrictions:

  1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software.
  2. You must provide an EXACT copy of the original package; no modification is allowed.
  3. If you provide the software on your Internet site, the author of this software must be clearly identified and a link to his site be provided. Please inform the author by E-mail.

Author: Marcel St-Amant
Email: bigfeet@videotron.ca
Big Feet Software
Montreal, Quebec, CANADA.

 

About Big Feet Software

Big Feet Software provides Java-based applications in the following domains:

 

About Me

I am Marcel St-Amant, a physicist and geophysicist by training (a BSc and MSc in these fields), and I worked as such for the first part of my career. I gradually became more involved in computers, then in software development. I am a Sun-Certified Programmer for the Java 2 platform.

As hobbies, I have done many electronics and optics projects (IR vision system, UV laser, various detectors, remote control and amateur scientific instrumentation, etc.). I like to write software for controlling and monitoring various electronic gadgets. I also like to create Java applications. I draw comics (European style).

 
