


brede_str_ps2txt - Convert a PostScript file to a text file.
function brede_str_ps2txt(filename)
Input: Filename
This function tries to translate a PostScript file to plain
(ASCII) text. The text is written to a file with the .txt extension.
There are usually problems because of kerning, two-column
layout and line breaks. Another problem is that specific
sections cannot be cut out, e.g., Nature and Science articles
might start in the middle of a page. There are a number of
programs that do PostScript conversion:
pdftotext maintains the columns by default, but the -raw
switch can make the output one column.
ps2a, ps2ascii, ps2txt, ps2ascii.ps or ps2ascii.pl.
prescript requires Python.
ps2ascii does not work well for two-column files: the text of
the two columns is interleaved. Landscape files might fail.
Kerning can be a problem.
pstotext handles two-column files but is slow, and one version
did not terminate. Furthermore, kerning can become a problem,
so that words are not separated.
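As an illustration of how one of the converters above might be called from
MATLAB, the following is a minimal sketch and not the code of
brede_str_ps2txt itself. The file names and the choice of output name are
assumptions; it relies only on the usual command-line forms
"ps2ascii input output" and "pdftotext -raw input output".

```matlab
% Sketch only: the file names and the output naming are assumptions,
% not taken from brede_str_ps2txt itself.
psfile = 'paper.ps';                         % hypothetical input file
[folder, name] = fileparts(psfile);          % directory and base name
txtfile = fullfile(folder, [name '.txt']);   % assumed output name

% ps2ascii (Ghostscript) takes the input and output file names.
cmd = sprintf('ps2ascii "%s" "%s"', psfile, txtfile);

% pdftotext works on PDF input; -raw keeps the content-stream order.
pdffile = fullfile(folder, [name '.pdf']);   % hypothetical PDF version
rawcmd = sprintf('pdftotext -raw "%s" "%s"', pdffile, txtfile);

% Only run the converter if the input file is actually present.
if exist(psfile, 'file')
  [status, output] = system(cmd);
  if status ~= 0
    warning('ps2ascii failed: %s', output);
  end
end
```

Building the command string separately from the system call makes it easy
to inspect the command, or to swap in another converter from the list, before
anything is run.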
References:
PreScript, http://www.nzdl.org/html/prescript.html
ps2ascii, distributed with Ghostscript
pstotext, http://research.compaq.com/SRC/virtualpaper/pstotext.html
See also BREDE, BREDE_STR, BREDE_STR_PDF2TXT.
$Id: brede_str_ps2txt.m,v 1.4 2003/07/08 15:29:49 fnielsen Exp $