How to quickly convert file formats

published 2023-08-08 (last changed on 2024-03-14) by Lukas Winkler

This is a collection of a few useful commands that I use all the time for converting files between formats.

# Audio

# Audio → FLAC

This is lossless when converting from an uncompressed format such as WAV.

$ flac --best input.wav
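For a whole directory of recordings, the same command can be wrapped in a loop. The following is only a minimal sketch: the function name `wav_to_flac` is made up for this example, and it assumes the `flac` command-line tool is installed. The additional `--verify` flag asks flac to decode the encoded stream again and compare it with the input.

```shell
# Convert every .wav file in a directory to FLAC (hypothetical helper).
# flac writes the output next to the input: song.wav -> song.flac
wav_to_flac() {
    dir="${1:-.}"
    for f in "$dir"/*.wav; do
        # Skip the unexpanded pattern if the directory contains no .wav files
        [ -e "$f" ] || continue
        # --best: highest compression level, --verify: check the result while encoding
        flac --best --verify "$f" || return 1
    done
}
```

$ wav_to_flac path/to/recordings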

# Audio → Opus

Check the Opus Recommended Settings for the ideal bitrate for your use case. 24 kbit/s is a good starting point for voice recordings: the files are very small, but the quality is still high. Only use --downmix-mono if you don’t lose information by merging stereo audio to mono.

$ opusenc --bitrate 24 --downmix-mono input.wav output.opus
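If the bitrate changes from file to file, a small wrapper saves some typing. This is a sketch under assumptions, not a fixed recipe: the function name `to_opus` and its argument order are invented here, and the output name is derived by swapping the file extension for `.opus`.

```shell
# Encode an audio file to Opus (hypothetical helper).
# Usage: to_opus INPUT [BITRATE_IN_KBIT] [mono]
to_opus() {
    src="$1"
    bitrate="${2:-24}"          # default: 24 kbit/s, as in the command above
    out="${src%.*}.opus"        # recording.wav -> recording.opus
    if [ "${3:-}" = "mono" ]; then
        # Only downmix when explicitly requested
        opusenc --bitrate "$bitrate" --downmix-mono "$src" "$out"
    else
        opusenc --bitrate "$bitrate" "$src" "$out"
    fi
}
```

$ to_opus interview.wav 24 mono
$ to_opus concert.wav 96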

# PDFs

# PDF → Extracted Images

$ mkdir tmp
$ pdfimages input.pdf -all tmp/name

This will extract all images that are contained in the input pdf to tmp/name-000.png, tmp/name-001.jpg, etc.
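Before extracting, it can help to see which images a PDF actually contains. The sketch below combines both steps. The function name `extract_pdf_images` is hypothetical, and it assumes the `pdfimages` tool from poppler-utils. `pdfimages -list` only prints a table of the embedded images and writes no files.

```shell
# Extract all embedded images of a PDF into a directory (hypothetical helper).
# Usage: extract_pdf_images INPUT.pdf [OUTDIR]
extract_pdf_images() {
    pdf="$1"
    outdir="${2:-tmp}"
    # -p: do not fail if the directory already exists
    mkdir -p "$outdir" || return 1
    # Print a table of the embedded images (type, size, encoding) first
    pdfimages -list "$pdf" || return 1
    # -all: keep the native format where possible, PNG otherwise
    pdfimages -all "$pdf" "$outdir/$(basename "$pdf" .pdf)"
}
```

$ extract_pdf_images input.pdf tmp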


# PDF → Images of Pages

$ pdftoppm input.pdf slides -png -scale-to 1080 -progress

This will generate one image per page of the PDF, named like slides-1.png, scaled so that the longer side of the page (the width for landscape slides) has the specified number of pixels. With e.g. -scale-to 3840, one can quickly convert a PDF of presentation slides to images for a high-quality 4K video.
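When only some of the pages are needed, pdftoppm can be limited to a page range with `-f` (first page) and `-l` (last page). The following wrapper is a sketch. Its name `pdf_to_png` and its argument order are assumptions made for this example.

```shell
# Render pages of a PDF as PNG files (hypothetical helper).
# Usage: pdf_to_png INPUT.pdf PREFIX [SIZE] [FIRST LAST]
pdf_to_png() {
    pdf="$1"
    prefix="$2"
    size="${3:-1080}"           # pixels for the longer side of each page
    if [ -n "${4:-}" ] && [ -n "${5:-}" ]; then
        # -f / -l: first and last page to render
        pdftoppm -f "$4" -l "$5" -png -scale-to "$size" "$pdf" "$prefix"
    else
        pdftoppm -png -scale-to "$size" "$pdf" "$prefix"
    fi
}
```

$ pdf_to_png input.pdf slides 3840 2 5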

# PDF → compressed PDF


Sometimes I have a PDF that is far too large (hundreds of MB for a simple document) because it was generated in an inefficient way. In many cases, Ghostscript can reduce the file size dramatically while decreasing the quality only a bit.

$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dPrinted=false -dNOPAUSE -dBATCH -sOutputFile=small.pdf input.pdf 

Depending on the required quality, /ebook can be replaced with /printer, /prepress or /default (or with /screen for very low quality). See the documentation for more information.
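Which preset gives the best trade-off depends on the document, so one option is to simply try all of them and compare the sizes. This is a rough sketch: the function name `compare_pdf_presets` and the output naming scheme are made up here, and `-dQUIET` is added only to suppress Ghostscript's startup messages.

```shell
# Compress a PDF once per Ghostscript preset and print the resulting sizes
# (hypothetical helper). Output files are named INPUT-screen.pdf, INPUT-ebook.pdf, ...
compare_pdf_presets() {
    pdf="$1"
    base="${pdf%.pdf}"
    for preset in screen ebook printer prepress default; do
        out="$base-$preset.pdf"
        gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
           -dPDFSETTINGS="/$preset" -dPrinted=false \
           -dNOPAUSE -dBATCH -dQUIET \
           -sOutputFile="$out" "$pdf" || return 1
        # wc -c prints the file size in bytes
        printf '%s\t%s bytes\n' "$preset" "$(wc -c < "$out" | tr -d ' ')"
    done
}
```

$ compare_pdf_presets input.pdf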

# PDF → PDF with OCR

$ ocrmypdf -cdr --force-ocr input.pdf ocr.pdf -l deu

Check the documentation for more information.
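The short flags in the command above are compact but not very readable. As a sketch, the same call can be written with ocrmypdf's long option names. The wrapper name `ocr_pdf` is hypothetical, and as far as I know `--clean` additionally requires the unpaper program to be installed.

```shell
# OCR a PDF with the same options as above, in long form (hypothetical helper).
# Usage: ocr_pdf INPUT.pdf OUTPUT.pdf [LANGUAGE]
ocr_pdf() {
    # --clean:        clean up the page image before it is passed to OCR
    # --deskew:       straighten crooked scans
    # --rotate-pages: detect and fix wrongly rotated pages
    # --force-ocr:    rasterize and OCR every page, even if it already contains text
    # --language:     Tesseract language code, e.g. deu (German) or eng (English)
    ocrmypdf --clean --deskew --rotate-pages --force-ocr \
        --language "${3:-deu}" "$1" "$2"
}
```

$ ocr_pdf input.pdf ocr.pdf eng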

# PDF → Plaintext → Spellcheck

$ pdftotext main.pdf - | pylanguagetool

This uses pylanguagetool, my own command-line interface to LanguageTool.

Do you have any feedback or ideas to improve this? Contact me by email or edit the source directly. You can find my other projects at