Translating texts from images with Bash

Across the company, I have co-workers who speak different languages. This can be inconvenient. For example, in group chats we have to write the same announcement in several languages so that everyone understands it.

But it’s 2020 and there is a tool for everything. I came across this very cool article, which describes how to write a script that translates any text selection on your desktop. When a non-native speaker receives a text message, they can select it with the mouse and translate it into any preconfigured language.

This led me to the idea of translating text from an image (or from a screenshot) into any language. So here is how to create a Bash script that does just that.

First of all, we need a tool that reads text from an image. One of the best open-source projects for this is Tesseract. To install it on Ubuntu-based distros, simply run:

sudo apt install tesseract-ocr

To install Tesseract on other Linux platforms, see this link. To add support for a specific language, install the tesseract-ocr package with the three-letter ISO language code appended. For example, to install support for English, run sudo apt install tesseract-ocr-eng, and to add Italian support, run sudo apt install tesseract-ocr-ita, and so on.
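Before relying on a language pack, it can help to confirm that Tesseract actually sees it. The sketch below wraps tesseract --list-langs, which prints the installed language codes one per line. The function name lang_installed is my own and is not part of the script built in this article.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed if the given Tesseract language pack is installed.
# Some older Tesseract releases reportedly print the list to stderr,
# so both streams are searched.
lang_installed() {
    tesseract --list-langs 2>&1 | grep -qxF -- "$1"
}
```

For example, lang_installed ita || sudo apt install tesseract-ocr-ita would install the Italian pack only when it is missing.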

Now let’s start building our script, which we will name transimg.sh (for “translate image”).

First, we define script arguments:

#!/usr/bin/env bash
# Image with text to translate.
image=$1
# Three-letter ISO code of the language in the image (used by Tesseract).
from_lang=$2
# Three-letter ISO code of the language to translate into.
to_lang=$3
# If set to "true", the translation is displayed in a message box.
display=$4
# Two-letter language code, used for Google Translate.
to_lang=${to_lang:0:2}
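One caveat with cutting the code down to its first two letters: it works for ita (it) and eng (en), but not for every language. Spanish, for instance, is spa in the three-letter scheme and es in the two-letter one. A small lookup could cover such cases. This is only a sketch: the function name to_google_code and the handful of languages listed are my own choices, and the fallback keeps the truncation used above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map a three-letter ISO 639-2 code to the two-letter
# code passed in the tl= parameter. Only a few languages are listed here.
to_google_code() {
    case $1 in
        spa) echo es ;;
        jpn) echo ja ;;
        por) echo pt ;;
        pol) echo pl ;;
        swe) echo sv ;;
        # Fallback: the same truncation the script uses (ita -> it, eng -> en).
        *)   echo "${1:0:2}" ;;
    esac
}
```

With this in place, the last assignment could become to_lang=$(to_google_code "$3").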

Next, we use tesseract to read text from an image and save it in a variable:

# Extract text from an image using a language code.
# 2> /dev/null - is used to suppress possible errors/warnings.
# So, if you want to debug the script, just remove it.
text="$(tesseract -l "$from_lang" "$image" stdout 2> /dev/null)"
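Because errors are silenced here, a missing file or an unreadable image simply yields an empty $text. If you want the script to fail loudly instead, the OCR step could be wrapped roughly like this. The function name ocr_image and the error messages are assumptions of mine, not part of the original script.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper around the OCR step with basic error handling.
ocr_image() {
    local image=$1 lang=$2 text

    if [[ ! -r $image ]]; then
        echo "Cannot read image: $image" >&2
        return 1
    fi
    if ! command -v tesseract > /dev/null; then
        echo "tesseract is not installed" >&2
        return 1
    fi

    text=$(tesseract -l "$lang" "$image" stdout 2> /dev/null)

    # Treat whitespace-only output as "nothing recognized".
    if [[ -z ${text//[[:space:]]/} ]]; then
        echo "No text recognized in $image" >&2
        return 1
    fi
    printf '%s\n' "$text"
}
```

It could then be used as text=$(ocr_image "$image" "$from_lang") || exit 1.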

At this point, we can already translate the text and save the translation in a variable:

# Translate the text using Google API.
translation="$(wget -U "Mozilla/5.0" -qO - \
"http://translate.googleapis.com/translate_a/single?client=gtx&sl=auto&tl=$to_lang&dt=t&q=\
$(echo $text | sed "s/[\"'<>]//g")" | sed "s/,,,0]],,.*//g" | awk -F'"' '{print $2, $6}')"

You probably noticed $to_lang in the URL: that is how we tell Google Translate which language to translate into. The source language is detected automatically by Google Translate (sl=auto), so $from_lang is only used by Tesseract. And $text is the text extracted from the image.
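Note that the sed call only strips a few characters, so spaces, ampersands, and non-ASCII letters still go into the URL as they are. A more careful variant would percent-encode the text first. Here is a minimal sketch that relies only on od, tr, and printf. The name urlencode is my own, and I have not tested it against the translation endpoint itself.

```shell
#!/usr/bin/env bash

# Hypothetical helper: percent-encode a string for use in a URL query.
urlencode() {
    local hex
    # od prints every input byte as two hex digits; tr upper-cases them.
    for hex in $(printf '%s' "$1" | od -An -v -tx1 | tr 'a-f' 'A-F'); do
        case $hex in
            # Unreserved characters (- . _ ~, 0-9, A-Z, a-z) pass through as-is.
            2D|2E|5F|7E|3[0-9]|4[1-9A-F]|5[0-9A]|6[1-9A-F]|7[0-9A])
                printf "\\x$hex" ;;
            # Every other byte becomes %XX.
            *)
                printf '%%%s' "$hex" ;;
        esac
    done
}
```

The query part of the URL could then be built as q=$(urlencode "$text") instead of passing the text through sed.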

Let’s format the translated text to make it more readable:

# Remove last word from the translation (a long hash).
translation=$(echo "$translation" | awk '{$1=$1};1' | sed s/'\w*$'//)
# Remove empty lines.
translation=$(echo "$translation" | sed '/^$/d')

By default, the translation is printed in the terminal. But if we pass true as the display parameter, the translation appears in a message box. For that, we use a very nice tool called zenity. In case it’s not installed on your system, please follow the instructions from this article.

Printing out the translation in a dialog box:

if [[ $display == true ]]; then
    # Show translation in a zenity box.
    zenity --width=400 --height=400 --info --text "<span font='18'>$translation</span>"
else
    # Print the translated text.
    echo "$translation"
fi
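One thing to keep in mind: zenity interprets the --text value as Pango markup (that is why the span tag works), so a translation containing & or < may not render as expected. A small escaping step could guard against that. The helper name pango_escape is an assumption of mine.

```shell
#!/usr/bin/env bash

# Hypothetical helper: escape characters that are special in Pango markup.
# The ampersand is replaced first so the entities added later are not re-escaped.
pango_escape() {
    printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}
```

The zenity call would then use --text "<span font='18'>$(pango_escape "$translation")</span>".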

Finally, here’s what the complete script looks like:

#!/usr/bin/env bash

# Image with text to translate.
image=$1
# Three-letter ISO code of the language in the image (used by Tesseract).
from_lang=$2
# Three-letter ISO code of the language to translate into.
to_lang=$3
# If set to "true", the translation is displayed in a message box.
display=$4
# Two-letter language code, used for Google Translate.
to_lang=${to_lang:0:2}

# Extract text from an image using a language code.
# 2> /dev/null - is used to suppress possible errors/warnings.
# So, if you want to debug the script, just remove it.
text="$(tesseract -l "$from_lang" "$image" stdout 2> /dev/null)"

# Translate the text using Google API.
translation="$(wget -U "Mozilla/5.0" -qO - \
"http://translate.googleapis.com/translate_a/single?client=gtx&sl=auto&tl=$to_lang&dt=t&q=\
$(echo $text | sed "s/[\"'<>]//g")" | sed "s/,,,0]],,.*//g" | awk -F'"' '{print $2, $6}')"

# Remove last word from the translation (a long hash).
translation=$(echo "$translation" | awk '{$1=$1};1' | sed s/'\w*$'//)
# Remove empty lines.
translation=$(echo "$translation" | sed '/^$/d')

if [[ $display == true ]]; then
    # Show translation in a zenity box.
    zenity --width=400 --height=400 --info --text "<span font='18'>$translation</span>"
else
    # Print the translated text.
    echo "$translation"
fi

Just don’t forget to make the script file executable:

chmod +x transimg.sh

Let’s try it out

Now suppose we have the following image with English text and we want to translate it into Italian:

Translation source image

Run the script:

./transimg.sh message.png eng ita true

…and we’ll see the result in a dialog.

Translation result

If you run it without the last argument (true), the translation will be printed directly to the terminal.
