To use Joshua as a standalone decoder (with language packs), you only need to download and install the runtime version of the decoder. If you also wish to build translation models from your own data, you will want to install the full version. See the instructions below.
Set up some basic environment variables.
You need to define $JAVA_HOME:

    export JAVA_HOME=/path/to/java

    # JAVA_HOME is not very standardized. Here are some places to look:
    # OS X:  export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home
    # Linux: export JAVA_HOME=/usr/java/default
If you are installing the full version of Joshua, you also need to define
$HADOOP to point to your Hadoop installation.
(Joshua looks for the Hadoop executable in
If you don’t have a Hadoop installation, Joshua’s pipeline can install a standalone version for you.
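Before building, it can help to confirm that these variables are actually set in the current shell. The following is a minimal sketch, not part of Joshua: the function name `check_joshua_env` is hypothetical, and it assumes a standard JDK layout in which the `java` executable lives at `$JAVA_HOME/bin/java`. For Hadoop it only checks that `$HADOOP` names a directory, since a missing `$HADOOP` is acceptable for the runtime version.

```shell
# Hypothetical helper (not shipped with Joshua): report whether the
# environment variables described above look usable.
check_joshua_env() {
    rc=0

    # JAVA_HOME is required; assume the usual JDK layout with bin/java.
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        rc=1
    elif [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME is set, but $JAVA_HOME/bin/java is not executable"
        rc=1
    else
        echo "JAVA_HOME ok: $JAVA_HOME"
    fi

    # HADOOP is only needed for the full version, so an unset value is
    # reported but not treated as an error.
    if [ -z "${HADOOP:-}" ]; then
        echo "HADOOP is not set (only needed for the full version)"
    elif [ ! -d "$HADOOP" ]; then
        echo "HADOOP is set, but $HADOOP is not a directory"
        rc=1
    else
        echo "HADOOP ok: $HADOOP"
    fi

    return $rc
}
```

Calling `check_joshua_env` prints one line per variable and returns a non-zero status if something that is set looks wrong, so it can be chained as `check_joshua_env && ant`.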
To install just the runtime version of Joshua, type

    wget -q http://cs.jhu.edu/~post/files/joshua-runtime-6.0.5.tgz

Then build everything

    tar xzf joshua-runtime-6.0.5.tgz
    cd joshua-runtime-6.0.5

    # Add this to your init files
    export JOSHUA=$(pwd)

    # build everything
    ant

To instead install the full version, type

    wget -q http://cs.jhu.edu/~post/files/joshua-6.0.5.tgz
    tar xzf joshua-6.0.5.tgz
    cd joshua-6.0.5

    # Add this to your init files
    export JOSHUA=$(pwd)

    # build everything
    ant

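Both install variants ask you to add the `JOSHUA` export to your init files. One way to do that without creating duplicate entries is sketched below. The helper name `persist_export` is hypothetical, and the choice of `~/.bashrc` in the usage line assumes a bash login setup; substitute whichever init file your shell reads.

```shell
# Hypothetical helper: append 'export NAME="VALUE"' to an init file unless
# that file already contains an export line for NAME.
persist_export() {
    name=$1
    value=$2
    file=$3

    if grep -q "^export $name=" "$file" 2>/dev/null; then
        echo "$name is already exported in $file"
    else
        printf 'export %s="%s"\n' "$name" "$value" >> "$file"
        echo "added $name to $file"
    fi
}

# Example, run from inside the unpacked Joshua directory:
#   persist_export JOSHUA "$(pwd)" "$HOME/.bashrc"
```

The check only looks for an existing `export JOSHUA=` line; it does not update a stale path, so edit the file by hand if you later move the installation.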
If you wish to build models for new language pairs from existing data (such as the WMT data), you need to install some additional dependencies.
For learning hierarchical models, Joshua includes a tool called Thrax, which
is built on Hadoop. If you have a Hadoop installation, make sure that the environment variable
$HADOOP is set and points to it. If you don’t, Joshua will roll one out for you in standalone
mode. Hadoop is only needed if you plan to build new models with Joshua.
You will need to install Moses if either of the following applies to you:
You wish to build phrase-based models (Joshua 6 includes a phrase-based decoder, but not the tools for building such a model)
Follow the instructions for installing Moses
here, and then define the
environment variable to point to the root of the Moses installation.
For more detail on the decoder itself, including its command-line options, see the Joshua decoder page. You can also learn more about other steps of the Joshua MT pipeline, including grammar extraction with Thrax and Joshua’s efficient grammar representation.
A bundled configuration, which is a minimal set of configuration, resource, and script files, can be created and easily transferred and shared.