Optimize Linear Algebra Calculations with Native Libraries


Building large-scale machine-learning systems often involves a massive amount of linear algebra computation under the hood. Whether you use Spark, R, or even plain old MapReduce code written in Java, you will probably end up doing some operations on big matrices and vectors. And those operations can be done 5x-7x faster!

Having designed and implemented large-scale machine-learning systems, I found that there are great alternatives to the default linear algebra libraries of whatever environment you use. The best part is that they are free to use and easy to install once you have a good tutorial. A good tutorial is not easy to come by, though, so this post is going to be short and technical, but very practical for those of you who seek to speed things up and save some expensive computation time. I’m going to try to fill in this missing step-by-step tutorial.

Although examples will be given only for a few alternative libraries and for Spark on Scala, links to sources of information for doing the same in other environments are given at the end of the post. Feel free to comment or ask questions; I’ll do my best to answer.

BLAS and LAPACK Interfaces

BLAS (Basic Linear Algebra Subprograms) is an interface for libraries whose routines provide standard building blocks for vector and matrix calculations. It is divided into three levels (look at this cheatsheet for more details):

  1. Level 1 – includes only scalar and vector operations (dot products,
    vector norms, summations, multiplications, etc.)
  2. Level 2 – includes vector-matrix operations
  3. Level 3 – includes matrix-matrix operations
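To make the three levels concrete, here is a minimal plain-Scala sketch with one representative operation per level. The function names loosely follow the BLAS routine names (dot, gemv, gemm), but the signatures are simplified assumptions of mine and the loops are naive reference code, not what an optimized library does internally.

```scala
object BlasLevels {
  // Level 1 (vector-vector, O(n) work): dot product of x and y.
  def dot(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length, "vectors must have equal length")
    var sum = 0.0
    var i = 0
    while (i < x.length) { sum += x(i) * y(i); i += 1 }
    sum
  }

  // Level 2 (matrix-vector, O(n^2) work): returns A * x,
  // where A is given as an array of rows.
  def gemv(a: Array[Array[Double]], x: Array[Double]): Array[Double] =
    a.map(row => dot(row, x))

  // Level 3 (matrix-matrix, O(n^3) work): returns A * B,
  // both matrices given as arrays of rows.
  def gemm(a: Array[Array[Double]], b: Array[Array[Double]]): Array[Array[Double]] = {
    require(a.forall(_.length == b.length), "inner dimensions must match")
    val bColumns = b.transpose // columns of B, so each entry of C is one dot product
    a.map(row => bColumns.map(col => dot(row, col)))
  }
}
```

The amount of arithmetic per element of data grows with the level, which is why Level 3 routines usually benefit the most from a tuned native implementation.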

LAPACK (Linear Algebra Package), on the other hand, is aimed at solving systems of linear equations, linear least squares problems (like in ALS), eigenvalue problems, and singular value decomposition (SVD). A user guide for this interface can be found here.

Implementations of both interfaces are naturally written in low-level languages such as Fortran and C to achieve top performance, and serve as standard low-level routines for optimized systems.

BLAS and LAPACK Implementations

This is a partial list of common BLAS and LAPACK implementations:

There are more of them, and of course there’s the default Java one (although in most cases it is slower than any of them). Since different hardware setups affect the performance and relative ranking of those alternatives, building the libraries locally on the hardware you use is often essential for optimized results. Another option is to pick a pre-built version of the library that matches your processor family, where such versions exist.

Benchmarks of those 3 common libraries can be found in:

Again, it is really hardware-dependent, but Intel’s solution seems to outperform the others on Intel’s own, commonly used hardware.

Wrappers (Netlib, MTJ, Breeze)

Migrating your linear algebra calculations to a new BLAS and LAPACK library is not straightforward and involves some programmatic overhead. Luckily, at least for Java/Scala programmers, there are neat wrappers for easier optimized matrix and vector calculations.

Netlib-Java is a wrapper for low-level BLAS and LAPACK implementations. It includes a few components, among them the JNILoader, a simple library loader. It enables seamless integration with the native optimized linear algebra implementations, as long as they exist on your system (and are properly linked, on Linux).

Breeze is a wrapper for Scala and is built on top of Netlib-Java. It is in heavy use by Spark, and specifically by MLlib. Using Breeze in Scala is really as easy as importing it and wrapping your Scala arrays in Breeze objects such as DenseMatrix, DenseVector, etc. Those objects expose the linear algebra methods you ultimately want to use. However, a full link to native libraries does not come out of the box when installing Spark on-premise, or even on Amazon’s popular AWS cloud; in the next section I’ll show you how to fill the gaps.
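As a small illustration of wrapping arrays in Breeze objects, here is a minimal sketch. It assumes the Breeze library (org.scalanlp breeze) is already on your classpath, as it is when you depend on Spark MLlib; the object and value names are made up for the example.

```scala
import breeze.linalg.{DenseMatrix, DenseVector}

object BreezeSketch {
  // Wrap a plain Scala array in a Breeze vector.
  val v: DenseVector[Double] = new DenseVector(Array(1.0, 2.0, 3.0))

  // DenseMatrix stores its data column by column (column-major),
  // so this array describes the 2x3 matrix [[1, 2, 3], [4, 5, 6]].
  val m: DenseMatrix[Double] =
    new DenseMatrix(2, 3, Array(1.0, 4.0, 2.0, 5.0, 3.0, 6.0))

  // Matrix-vector product: (1*1 + 2*2 + 3*3, 4*1 + 5*2 + 6*3) = (14, 32).
  val mv: DenseVector[Double] = m * v

  // Dot product of v with itself: 1 + 4 + 9 = 14.
  val vv: Double = v dot v

  def main(args: Array[String]): Unit = {
    println(mv)
    println(vv)
  }
}
```

Whether these calls end up in a native library or in the pure-Java fallback depends on the setup described in the rest of this post; the Scala code stays the same either way.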

MTJ (Matrix Toolkits Java) is also built on top of Netlib-Java. It is a wrapper made for Java programmers, and it is well known for its performance on large matrices compared to other Java-based alternatives.

Connecting The Dots Together

As you have already figured out, there are a few ways to use native BLAS and LAPACK libraries. On Windows, installing the libraries and linking your app to them should be short and easy: install Netlib for Windows here, and don’t forget to get some native implementation, such as the pre-built OpenBLAS for Windows.

To get a Linux-based Scala/Spark application to use native BLAS/LAPACK libraries, follow the steps below (if you are going to use a distributed environment such as Spark, make sure to run them on all of your nodes):

Step 1: Set up ATLAS (option 1)

Let’s start with the basics. You must set up some BLAS+LAPACK implementation. To get your hands on ATLAS on-premise, download its source, extract it, run its configure script from a separate build directory, and then run

make build

This step is going to take some time, so leave your computer running and go get some coffee. Once it has finished, execute the following

make install DESTDIR=<native-lib-directory>

where you replace <native-lib-directory> with your desired destination. Done. Note that this step is not necessary if you want to use ATLAS on Amazon’s AWS EMR, since by default ATLAS is pre-installed on EMR 4.x.x in:


Step 1: Set up OpenBLAS (option 2)

To get OpenBLAS, run the following commands:

# Download and unpack the OpenBLAS 0.2.14 sources
wget -P /tmp http://github.com/xianyi/OpenBLAS/archive/v0.2.14.tar.gz
mkdir /tmp/openblas
tar -xvzf /tmp/v0.2.14.tar.gz -C /tmp/openblas
cd /tmp/openblas/OpenBLAS-0.2.14/
# Build the library first, then install it
make
make install PREFIX=<native-lib-directory>

The OpenBLAS library will be installed in <native-lib-directory>. If you don’t specify PREFIX=<native-lib-directory>, it will be installed at:


Step 2: Linking your libraries

Now you should register your freshly built native library with the dynamic linker. To do that, run (as root):

echo <native-lib-directory> >> /etc/ld.so.conf
ldconfig
ldconfig -p | grep 'blas\|lapack'

replacing <native-lib-directory> with the previous step’s destination folder, where you installed either ATLAS or OpenBLAS (note that for OpenBLAS you should point to <native-lib-directory>/lib). The plain ldconfig call rebuilds the linker cache, and the second call only prints it. In the output you should see libblas.so.3 and liblapack.so.3 pointing to files in your <native-lib-directory>.

Step 3: Set up Netlib

Download all the JARs listed under com.github.fommil.netlib >> all >> 1.1.2 and copy them to a folder, which we’ll call <netlib-jars-directory>, on all of the nodes in your cluster (or on your execution machine).

Step 4: Add pointers to Netlib and Native Libraries

With Spark installed, navigate to <spark>/conf and edit the spark-defaults.conf file to add the following values:

spark.driver.extraClassPath=<netlib-jars-directory>/*:...
spark.driver.extraLibraryPath=<native-lib-directory>:...
spark.executor.extraClassPath=<netlib-jars-directory>/*:...
spark.executor.extraLibraryPath=<native-lib-directory>:...

replacing <native-lib-directory> with the path you supplied at Step 1, and <netlib-jars-directory> with the path you supplied at Step 3. The /* wildcard is there because a bare directory on a Java classpath does not pick up the JARs inside it. Note that the “…” stands for the previous values of those keys.
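If you prefer to set the executor-side values from code instead, the following is a rough sketch using SparkConf. The object and helper names (NativePathsConf, prepend, withNativeLibs) are hypothetical, and it assumes Spark is on the classpath and that you call it before creating the SparkContext. The driver-side keys should stay in spark-defaults.conf, because in client mode the driver JVM has already started by the time your application code runs.

```scala
import org.apache.spark.SparkConf

object NativePathsConf {
  // Prepend `entry` to an existing colon-separated value, keeping previous entries.
  def prepend(entry: String, previous: Option[String]): String =
    previous.filter(_.nonEmpty).map(p => s"$entry:$p").getOrElse(entry)

  // Hypothetical helper: applies the two executor-side settings to a SparkConf.
  def withNativeLibs(conf: SparkConf, netlibJarsDir: String, nativeLibDir: String): SparkConf = {
    val classPathKey = "spark.executor.extraClassPath"
    val libraryPathKey = "spark.executor.extraLibraryPath"
    conf
      // "/*" makes the JVM pick up every JAR in the folder.
      .set(classPathKey, prepend(s"$netlibJarsDir/*", conf.getOption(classPathKey)))
      .set(libraryPathKey, prepend(nativeLibDir, conf.getOption(libraryPathKey)))
  }
}
```

A call such as NativePathsConf.withNativeLibs(new SparkConf(), "<netlib-jars-directory>", "<native-lib-directory>") would then return a configuration you can pass to the SparkContext constructor.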

Finally, you should add the following to the end of the spark-env.sh file:

export LD_LIBRARY_PATH=<native-lib-directory>:$LD_LIBRARY_PATH

Again, replace <native-lib-directory> with your own path.
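To confirm from inside a JVM that the directory is actually visible, a small check like the one below can help. It is only a sketch: the object name LibraryPathCheck is made up, and it simply inspects the LD_LIBRARY_PATH environment variable and the java.library.path system property of the process it runs in.

```scala
object LibraryPathCheck {
  // True if `dir` is one of the entries of `path`, where entries are separated
  // by the platform path separator (':' on Linux).
  def containsDir(path: String, dir: String): Boolean =
    path.split(java.io.File.pathSeparator).contains(dir)

  def main(args: Array[String]): Unit = {
    val dir = args.headOption.getOrElse(
      sys.error("usage: LibraryPathCheck <native-lib-directory>"))
    val ldLibraryPath = sys.env.getOrElse("LD_LIBRARY_PATH", "")
    val javaLibraryPath = sys.props.getOrElse("java.library.path", "")
    println(s"LD_LIBRARY_PATH contains $dir: ${containsDir(ldLibraryPath, dir)}")
    println(s"java.library.path contains $dir: ${containsDir(javaLibraryPath, dir)}")
  }
}
```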

Step 5: Importing Netlib to your application

The last step should be importing Netlib into your project. This step is really easy assuming you use Maven:


or SBT:

libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()

If you only need MLlib to use Netlib-Java and the native BLAS/LAPACK libraries, you are pretty much done. This should work in both client and cluster deploy modes (as long as all of the above was executed on all nodes). If you want to use BLAS/LAPACK functions in your own Spark/Scala code, add these imports to your code:

import com.github.fommil.netlib.BLAS.{getInstance => blas}
import com.github.fommil.netlib.LAPACK.{getInstance => lapack}

Now you can finally use the methods of the blas and lapack objects in your code!
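As a minimal sketch of what such calls look like, the example below uses two standard BLAS routines exposed by netlib-java: ddot (Level 1) and dgemm (Level 3). It assumes netlib-java 1.1.2 is on the classpath; the object name and the sample data are made up. Note that BLAS expects matrices as flat arrays in column-major order.

```scala
import com.github.fommil.netlib.BLAS.{getInstance => blas}

object NetlibSketch {
  val x = Array(1.0, 2.0, 3.0)
  val y = Array(4.0, 5.0, 6.0)

  // Level 1: ddot(n, x, incx, y, incy) = 1*4 + 2*5 + 3*6 = 32
  val d: Double = blas.ddot(x.length, x, 1, y, 1)

  // 2x2 matrices stored column by column.
  val a = Array(1.0, 3.0, 2.0, 4.0) // [[1, 2], [3, 4]]
  val b = Array(5.0, 7.0, 6.0, 8.0) // [[5, 6], [7, 8]]
  val c = new Array[Double](4)      // receives the result

  // Level 3: C = alpha * A * B + beta * C with alpha = 1 and beta = 0.
  // Arguments: transA, transB, m, n, k, alpha, A, lda, B, ldb, beta, C, ldc.
  blas.dgemm("N", "N", 2, 2, 2, 1.0, a, 2, b, 2, 0.0, c, 2)
  // c now holds [[19, 22], [43, 50]] in column-major order: 19, 43, 22, 50

  def main(args: Array[String]): Unit = {
    println(s"dot = $d")
    println(s"C (column-major) = ${c.mkString(", ")}")
  }
}
```

The same code runs with or without native libraries installed; only the speed differs, since netlib-java falls back to a pure-Java implementation when no native one can be loaded.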

To check that you’ve succeeded, execute:
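One way to do this, assuming netlib-java 1.1.2 is on the classpath, is to print the class names of the loaded instances. In that version, a name such as com.github.fommil.netlib.NativeSystemBLAS means the system library from the previous steps was picked up, while com.github.fommil.netlib.F2jBLAS means the pure-Java fallback is in use. The object name NativeCheck and the isNative helper are made up for this sketch.

```scala
import com.github.fommil.netlib.{BLAS, LAPACK}

object NativeCheck {
  // Class names starting with "Native" (e.g. NativeSystemBLAS, NativeRefBLAS)
  // are JNI-backed; F2jBLAS and F2jLAPACK are the pure-Java fallbacks.
  def isNative(className: String): Boolean =
    className.startsWith("com.github.fommil.netlib.Native")

  def main(args: Array[String]): Unit = {
    val loaded = Seq(
      BLAS.getInstance().getClass.getName,
      LAPACK.getInstance().getClass.getName)
    loaded.foreach(name => println(s"$name -> native: ${isNative(name)}"))
  }
}
```

On a cluster, keep in mind that this only tells you about the JVM it runs in (for example the driver); executors load their own instance.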


Netlib for Windows



  • Still stuck on Java? Here is one tutorial you might want to read.


  • Using a Mac? Do yourself a favour and switch to PC

