My résumé is typeset using LaTeX. It generates nice PDFs, but many online job applications require plain-text versions to be submitted.
Here is a sequence that converts my LaTeX file to plain text:
$ latex zdpurvis_resume.tex
$ catdvi -e 1 -U zdpurvis_resume.dvi | sed -re "s/\[U\+2022\]/*/g" | sed -re "s/([^^[:space:]])\s+/\1 /g" > zdpurvis_resume.txt
The -e 1 option tells catdvi to output ASCII. If you use 0 instead of 1, it will output Unicode, which includes all the special characters like bullets, em dashes, and Greek letters. It also includes ligatures for some letter combinations like "fi" and "fl". You may not like that, so use -e 1 instead. The -U option tells catdvi to print the Unicode value of any unknown character so that you can easily find and replace it.
The second part of the command finds the string [U+2022], which designates a bullet character (•), and replaces each occurrence with an asterisk (*).
The third part eats up all the extra whitespace catdvi threw in to make the text full-justified while preserving spaces at the start of lines (indentation).
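To see what the two sed steps do without running latex or catdvi, here is a small sketch that applies them to a single made-up line. The sample text is hypothetical and only stands in for real catdvi output; the sed expressions are the ones from the command above. Note that -r and \s are GNU sed extensions, so this assumes GNU sed.

```shell
# Hypothetical line standing in for catdvi output: indented, with a bullet
# marker and the extra padding added for full justification.
sample='  [U+2022]  Designed    and   built   widgets'

# Second part of the pipeline: replace the bullet marker with an asterisk.
step2=$(printf '%s\n' "$sample" | sed -re 's/\[U\+2022\]/*/g')

# Third part: collapse each run of whitespace that follows a non-space
# character into one space. Leading indentation has no character before it,
# so it is left alone.
cleaned=$(printf '%s\n' "$step2" | sed -re 's/([^^[:space:]])\s+/\1 /g')

printf '%s\n' "$cleaned"
```

The printed line should keep its two-space indent while the bullet becomes an asterisk and the padding between words shrinks to single spaces.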
After running these commands, you would be wise to search the .txt file for the string [U+ to make sure no characters that couldn't be mapped to ASCII were left behind, and to fix any that were.
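That last check can be scripted. The following is a minimal sketch, not part of the original sequence: check_leftovers is a hypothetical helper name, and it relies only on grep's standard -n (show line numbers) and -F (treat the pattern as a fixed string) options.

```shell
# check_leftovers FILE
# Hypothetical helper: lists every line of FILE that still contains a
# "[U+" marker and returns non-zero if any were found.
check_leftovers() {
  # -F matches "[U+" literally; -n prefixes each hit with its line number.
  if grep -nF '[U+' "$1"; then
    echo "Unmapped characters remain in $1" >&2
    return 1
  fi
}
```

Running check_leftovers zdpurvis_resume.txt after the conversion would list the lines that still need a manual replacement, or print nothing if the file is clean.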
- Converting LaTeX to plain text
Thank goodness...
(Anonymous)
2009-07-16 12:31 am (UTC)
2009-07-16 12:53 am (UTC)
2009-07-16 01:12 am (UTC)
detex, unicode
(Anonymous)
2009-08-02 09:18 am (UTC)
Re: detex, unicode
2009-08-02 03:58 pm (UTC)
I tried out detex, and it leaves a little to be desired for me. It just strips out the TeX commands that it recognizes, and leaves the ones it doesn't. I wind up with some spacing commands being ignored, some measurement units still being displayed, and \kill lines still showing up. For example: UniversityRaleigh, NCAugust 2008. Custom commands aren't parsed properly, either:
[3]
#1, #2 #3
And I appreciate the irony of "résumé" containing non-ASCII characters in a post where I'm suggesting people convert to ASCII. Some online forms choke on the Unicode characters, though. :(
Thanks for the info
2011-07-23 04:05 am (UTC)