detroff


detroff, a C++ code which makes a copy of a file from which every pair of character+backspace has been removed.

Given the string of 8 characters:

AB#C#D#E
where we are using "#" to represent a backspace, the returned string will be
AE
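
The rule can be sketched on an in-memory string as follows. This is only an illustration, not the detroff source: the function name strip_overstrikes is hypothetical, and the handling of a backspace that has no character before it (it is simply dropped) is an assumption.

```cpp
#include <string>

// Hypothetical helper, not the original detroff source: returns a copy of
// `text` with every "character + backspace" pair removed.
std::string strip_overstrikes(const std::string& text)
{
    const char BS = '\b';  // ASCII 8, written as "#" in the description
    std::string out;
    out.reserve(text.size());

    for (char c : text) {
        if (c != BS) {
            out.push_back(c);   // ordinary character: keep it for now
        } else if (!out.empty()) {
            out.pop_back();     // backspace: cancel the character before it
        }
        // Assumption: a backspace with nothing before it is simply dropped.
    }
    return out;
}
```

With real backspaces in place of "#", the call strip_overstrikes("AB\bC\bD\bE") returns "AE".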

This function was originally written for use in "cleaning up" MAN pages so that they would print properly. These MAN pages were formatted in the Paleolithic and Byzantine TROFF printing format. Although the files were text, and would seem to "print" correctly to the screen, an unholy mess would emerge if the same file were sent to the printer. This is because the screen handled the backspaces by backspacing, but most printers no longer know how to handle TROFF's backspaces, and so they just print them as blobs, instead of, say, spacing back.

Passages which are to be underlined were written so:

_#T_#e_#x_#t
when what is meant is that "Text" is to be underlined if possible. Note that the seemingly equivalent
T#_e#_x#_t#_
is NOT used. This is because, in the olden days, certain screen terminals could backspace, but would only display the new character, obliterating rather than overwriting the old one. On such a terminal, "_" + Backspace + "T" still leaves a readable "T" on the screen, whereas "T" + Backspace + "_" would leave only the underscore. This convention allows us to know that we want to delete "character" + Backspace, rather than Backspace + "character".

Passages which are meant to be in BOLDFACE are written so:

U#U#U#Ug#g#g#gl#l#l#ly#y#y#y
when what is meant is that "Ugly" is to be printed as boldly as possible. These boldface passages may also be cleaned up using the same rule of removing all occurrences of "character" + Backspace.

It is truly a fright to look at the text of one of these MAN pages with all the ugly backspaces, which sometimes display as ^H or as a reverse-video bugsplat.

Usage:

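detroff old new
where

old is the name of the file to be cleaned up;
new is the name of the cleaned-up copy to be created.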
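
A file-to-file copy matching this command line might look roughly like the sketch below. The function name detroff_file, its boolean return value, and the choice to collect the surviving characters in memory before writing are all assumptions made for illustration; none of this is taken from the detroff source.

```cpp
#include <fstream>
#include <string>

// Hypothetical wrapper mirroring "detroff old new"; returns false if either
// file cannot be opened or written. Not the original program's source.
bool detroff_file(const std::string& old_name, const std::string& new_name)
{
    std::ifstream in(old_name, std::ios::binary);
    if (!in) return false;
    std::ofstream out(new_name, std::ios::binary);
    if (!out) return false;

    const char BS = '\b';
    std::string kept;   // characters that have survived so far
    char c;
    while (in.get(c)) {
        if (c != BS) {
            kept.push_back(c);
        } else if (!kept.empty()) {
            kept.pop_back();   // remove the character that the backspace cancels
        }
    }
    out << kept;
    return static_cast<bool>(out);
}
```

A main program could pass its first two command-line arguments to detroff_file and report an error if the call returns false.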

Licensing:

The information on this web page is distributed under the MIT license.

Languages:

detroff is available in a C++ version.

Related Data and codes:

detroff_test

CR2CRLF, a C++ code which replaces carriage returns by carriage returns + line feeds.

CR2LF, a C++ code which reads a text file and replaces carriage returns by line feeds.

CRRM, a C++ code which removes all carriage returns from a file.

DEBLANK, a C++ code which makes a copy of a text file which contains no blank lines.

DECOMMENT, a C++ code which makes a copy of a text file which contains no "comment" lines (that begin with "#").

FILUM, a C++ code which performs various operations on files.

LF2CR, a C++ code which replaces linefeeds by carriage returns.

LF2CRLF, a C++ code which replaces linefeeds by carriage returns + line feeds.

LFRM, a C++ code which removes all linefeeds from a file.

REFORMAT, a FORTRAN90 code which reads a text file that contains only real values, and writes a copy which has a fixed number of real values on each line.

REWORD, a C++ code which reads a text file and writes a copy which has a fixed number of "words" per line.

UNCONTROL, a C++ code which makes a copy of a text file which contains no control characters.

WRAP, a C++ code which makes a copy of a text file in which no line is longer than a user-specified wrap length.

WRAP2, a C++ code which wraps long lines in a text file, but which wraps some lines "early", so as to avoid breaking words.

Source Code:


Last revised on 27 January 2009.