Debugging TensorFlow ImportError: DLL load failed Exception

I have encountered this error at least twice on two different machines and have spent too much time tracking down all the different reasons it can occur.

import tensorflow
Traceback (most recent call last):
File "C:\...\site-packages\tensorflow\python\", line 18, in swig_import_helper
return importlib.import_module(mname)
File "C:\...\", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 986, in _gcd_import
File "<frozen importlib._bootstrap>", line 969, in _find_and_load
File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
File "<frozen importlib._bootstrap>", line 577, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 906, in create_module
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
ImportError: DLL load failed: The specified module could not be found.

So if you are facing this issue (especially with TensorFlow 1.4.0), here is how to debug it:

  1. First make sure you have the correct versions of the CUDA Toolkit and cuDNN. NVidia offers newer versions as the default downloads and they won't work. See my post on TensorFlow installation.
  2. I would highly recommend using Python 3.5 instead of 3.6 with TensorFlow 1.4. If you have the latest Anaconda version, you probably have Python 3.6. You can check this with the command conda info. If you do have Python 3.6, you can downgrade to 3.5 with the command conda install python=3.5.
  3. Make sure both NVidia's CUDA Toolkit path and the cuDNN path are listed before the Anaconda path in the Path environment variable. Anaconda now seems to supply the same DLLs in its own folder, but they seem to cause the ImportError.
  4. Use the where command to check whether these DLLs can actually be found on the path:
                 where cuDNN64_6.dll
                 where curand64_80.dll

    The first path should be where you downloaded cuDNN 6 and the second path should be in NVidia's CUDA Toolkit folder.

  5. If you still get this error, download Process Monitor from Sysinternals. You will see toolbar icons to monitor registry, disk, etc. Disable all of them except the icon that says "Show Process and Thread Activity". Then click the filter icon and add a filter for Image Path contains python. Now you should see only process and thread activity from python.exe. Close all Python instances, open a new one, and execute
    import tensorflow as tf

    . Now Process Monitor will show you the DLLs being loaded by TensorFlow. The last DLL in this list is usually the one causing the problem.
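Steps 3 and 4 can also be scripted. Here is a small Python sketch (the helper functions are my own; the DLL names are the ones from step 4) that prints your PATH entries in search order and then asks the OS loader, via ctypes, whether it can locate each DLL, which is the same lookup that fails when TensorFlow raises this ImportError:

```python
import ctypes
import os

def can_load(dll_name):
    """Return True if the OS dynamic loader can find and load dll_name."""
    try:
        ctypes.CDLL(dll_name)
        return True
    except OSError:
        return False

def path_entries(path=None):
    """Split a PATH-style string into its entries, in search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    return [p for p in path.split(os.pathsep) if p]

# The CUDA Toolkit and cuDNN folders should appear before any Anaconda entry.
for i, entry in enumerate(path_entries()):
    print(i, entry)

# The DLLs TensorFlow 1.4 needs at import time (from step 4 above).
for dll in ("cudnn64_6.dll", "curand64_80.dll"):
    print(dll, "loadable:", can_load(dll))
```

If either DLL prints False, the loader cannot find it, and you should re-check the PATH ordering from step 3.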

Installing TensorFlow GPU Version on Windows

TensorFlow 1.4 installation on Windows is still not straightforward, so here are the quick steps:

  1. Install Anaconda. Grab the version that has Python 3.6. After installation, you will need to downgrade to Python 3.5 because quite a few libraries, such as OpenCV, still aren't compatible with Python 3.6, and TensorFlow also seems to have a few issues with 3.6. So after installing Anaconda, run the following command to downgrade:
    conda install python=3.5

    . Another alternative is to create a virtual environment, but I want to keep this post short.

  2. Next, install NVidia CUDA Toolkit 8.0. This is currently not the latest version, so you can only find it in the archives, and you might have to register on NVidia's website first. You will see a Base Installer and Patch 2; install both, one after another. Note that you must do this after step 1. If you did it before, go to the Environment Path editor and make sure NVidia CUDA Toolkit's path is listed before the Anaconda path. Currently Anaconda also supplies CUDA 8.0 DLLs, and they don't seem to be compatible, causing the DLL ImportError.
  3. Next, download cuDNN v6.0 (April 27, 2017) for CUDA 8.0. Extract the zip file to some folder and add the path cuDNN6\cuda\bin to your machine's Environment Path. Make sure this path comes before Anaconda's path in your Environment Path variable. TensorFlow 1.4.0 looks for cuDNN64_6.dll, which this folder contains. Again, this step should be done after step 1; otherwise Anaconda's cuDNN DLL will be found first on the path and you will get the DLL ImportError.
  4. Open Command Prompt as Administrator and install TensorFlow using command:
    pip install --ignore-installed --upgrade tensorflow-gpu

    . Sometimes this command might fail. In that case, update pip (if you see a warning) and run this command again.

  5. Validate your installation by starting Python and running import tensorflow as tf.

If you are getting ImportError for DLL load then see my post on how to debug it.

Writing Generic Container Functions in C++11

Let's say we want to write a function to append one vector to another. It can be done like this:

template<typename T>
void append(std::vector<T>& to, const std::vector<T>& from)
{
    to.insert(to.end(), from.begin(), from.end());
}

One problem with this approach is that we can only use this function with std::vector. So what about all the other container types? In languages such as C#, we have IEnumerable, which simplifies a lot of things, but C++ templates are duck typed, and it takes a bit more work to make the above function generic over various container types. A quick and dirty route is this:

template<typename Container>
void append(Container& to, const Container& from)
{
    to.insert(to.end(), from.begin(), from.end());
}

The problem with this approach is that any class with begin() and end() methods will now qualify for this call. If someone has a class with these methods that isn't actually iterable, you can get some nasty surprises. A simple modification is to make sure we call begin() and end() from the std namespace instead of the ones defined on the class:

template<typename Container>
void append(Container& to, const Container& from)
{
    using std::begin;
    using std::end;
    to.insert(end(to), begin(from), end(from));
}

Sure, this is better, but wouldn't it be nice if we could restrict the types passed to this function to only those which strictly behave like STL containers? Enter type traits! First, we need to define an SFINAE type trait for containers. Fortunately, Louis Delacroix, who developed the prettyprint library, has already fine-tuned this code extensively. Below is mostly his code with a slight modification of mine that allows it to pass GCC's strict mode compilation. This is a lot of code, so I would usually put it in a separate file, say, type_utils.hpp, so you can use it for many generic container methods:

#ifndef commn_utils_type_utils_hpp
#define commn_utils_type_utils_hpp

#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>
#include <valarray>

namespace common_utils { namespace type_utils {
    namespace detail
    {
        // SFINAE type trait to detect whether T::const_iterator exists.

        struct sfinae_base
        {
            using yes = char;
            using no  = yes[2];
        };

        template <typename T>
        struct has_const_iterator : private sfinae_base
        {
            template <typename C> static yes & test(typename C::const_iterator*);
            template <typename C> static no  & test(...);

            static const bool value = sizeof(test<T>(nullptr)) == sizeof(yes);
            using type =  T;

            void dummy(); //for GCC to suppress -Wctor-dtor-privacy
        };

        template <typename T>
        struct has_begin_end : private sfinae_base
        {
            template <typename C>
            static yes & f(typename std::enable_if<
                std::is_same<decltype(static_cast<typename C::const_iterator(C::*)() const>(&C::begin)),
                             typename C::const_iterator(C::*)() const>::value>::type *);

            template <typename C> static no & f(...);

            template <typename C>
            static yes & g(typename std::enable_if<
                std::is_same<decltype(static_cast<typename C::const_iterator(C::*)() const>(&C::end)),
                             typename C::const_iterator(C::*)() const>::value, void>::type*);

            template <typename C> static no & g(...);

            static bool const beg_value = sizeof(f<T>(nullptr)) == sizeof(yes);
            static bool const end_value = sizeof(g<T>(nullptr)) == sizeof(yes);

            void dummy(); //for GCC to suppress -Wctor-dtor-privacy
        };

    }  // namespace detail

    // Basic is_container template; specialize to derive from std::true_type for all desired container types

    template <typename T>
    struct is_container : public std::integral_constant<bool,
                                                        detail::has_const_iterator<T>::value &&
                                                        detail::has_begin_end<T>::beg_value  &&
                                                        detail::has_begin_end<T>::end_value> { };

    template <typename T, std::size_t N>
    struct is_container<T[N]> : std::true_type { };

    template <std::size_t N>
    struct is_container<char[N]> : std::false_type { };

    template <typename T>
    struct is_container<std::valarray<T>> : std::true_type { };

    template <typename T1, typename T2>
    struct is_container<std::pair<T1, T2>> : std::true_type { };

    template <typename ...Args>
    struct is_container<std::tuple<Args...>> : std::true_type { };

}}  //namespace

#endif

Now you can use these traits to enforce what types get accepted into your generic function:

#include "type_utils.hpp"

template<typename Container>
static typename std::enable_if<common_utils::type_utils::is_container<Container>::value, void>::type
append(Container& to, const Container& from)
{
    using std::begin;
    using std::end;
    to.insert(end(to), begin(from), end(from));
}

Much better!

How to use a Windows network share from a domain-joined machine on Linux

I'm seeing a lot of websites with somewhat outdated or incomplete instructions. So here are the full steps that work on Ubuntu 14 for mounting a Windows network file share through an Active Directory domain account:

First, you need to install cifs-utils. Check if you already have it:

dpkg -l cifs-utils

If not, just install it:

sudo apt-get install cifs-utils

You can mount Windows shares anywhere, but /mnt is generally preferred. Another often-used location is /media; in modern environments, though, /mnt is preferred for things users mount manually, while /media is preferred for things the system mounts for you. Regardless, you should create a folder where the content of your share will appear. Run the following command to do that:

Note: In this guide, replace ALL_CAPS words with values you want.

sudo mkdir -p /mnt/FOLDER

Then run the mount command:

sudo mount -t cifs //SERVER/FOLDER /mnt/FOLDER -o username=USER,domain=DOMAIN,iocharset=utf8

Note that,

  1. We set iocharset to utf8. This is optional, but better than the default charset ISO 8859-1 that Linux uses for the mount.
  2. Some websites set file_mode and dir_mode to 777 (i.e., grant all permissions). This is usually not necessary.

That's it! If you need to unmount share then run the command,

sudo umount //SERVER/FOLDER

One problem is that you will need to run the mount command again every time you restart. While there are ways to connect network shares at startup, they often involve storing your password and are not recommended. So usually I would just add a line in my ~/.bash_aliases file like this:

alias mountshare='sudo mount -t cifs //SERVER/FOLDER /mnt/FOLDER -o username=USER,domain=DOMAIN,iocharset=utf8'

So the next time I need the share, I just type mountshare on the command line.