A bug's life #1: "libiomp5md.dll already loaded" in Deep Learning on Windows
Fixing a DLL loading conflict so you can actually use your dependencies
Python is a great language for machine learning because it comes with so many libraries that have already solved more problems than we can imagine. However, while libraries help to solve problems fast, sometimes their usage is not seamless and we have to struggle against lower-level problems to make them work.
This recently happened to me while using CTranslate2, a great tool for running inference with encoder-decoder Transformer models, which can nevertheless make your software crash on Windows because of libiomp5md.dll.
What is libiomp5md.dll?
libiomp5md.dll is the dynamically linked library of Intel's OpenMP runtime. OpenMP is a multithreading API for C, C++ and Fortran, which is used in the context of deep learning to accelerate computation by using the power of multiple CPU cores. It is used by PyTorch and CTranslate2, but can also be a dependency of other libraries like NumPy or scikit-learn.
A dynamically-linked library is not linked together with the executable, unlike statically-linked libraries, but it is loaded at runtime. This reduces the size of the executable files by sharing libraries among many executables.
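To make "loaded at runtime" concrete, here is a minimal sketch using Python's ctypes module. The function name load_system_library is my own, and the libraries it picks (kernel32.dll on Windows, the C library elsewhere) are only convenient examples that should exist on most systems. It is not how CTranslate2 or PyTorch load their DLLs; it only shows a library being mapped into a process that is already running.

```python
import ctypes
import ctypes.util
import sys
from typing import Optional


def load_system_library() -> Optional[ctypes.CDLL]:
    """Load a common system shared library at runtime, or return None on failure."""
    if sys.platform == "win32":
        candidate = "kernel32.dll"  # example choice: present on every Windows install
    else:
        candidate = ctypes.util.find_library("c")  # e.g. "libc.so.6" on Linux
    if candidate is None:
        return None
    try:
        # The library is mapped into the running process here, not at build time.
        return ctypes.CDLL(candidate)
    except OSError:
        return None


if __name__ == "__main__":
    print(load_system_library())
```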
The Error
When I started my project, my only dependency that used libiomp5md.dll was CTranslate2, and it worked without problems. However, I then needed to install PyTorch to use Transformers, and this is when the problems began:
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
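For a quick diagnosis only, the workaround named in the hint can be sketched from Python as below. The helper name allow_duplicate_openmp is hypothetical, and I am assuming the variable has to be set before any library that loads an OpenMP runtime is imported. As the message itself says, this is unsafe and unsupported, so it is not a fix.

```python
import os


def allow_duplicate_openmp() -> None:
    """Opt in to the unsafe workaround described in the OMP hint."""
    os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"


allow_duplicate_openmp()

# Imports that load an OpenMP runtime would go after this point, for example:
# import ctranslate2
# import torch
```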
The error message clearly tells us that the library is loaded multiple times, which should not happen. So, let’s inspect where it exists (my virtual environment is called .venv):
> Get-ChildItem -Path .\.venv\ -Filter libiomp5md.dll -Recurse
Directory: C:\Users\path\.venv\Lib\site-packages\ctranslate2
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 08/07/2024 16:00 2030632 libiomp5md.dll
Directory: C:\Users\path\.venv\Lib\site-packages\torch
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 08/07/2024 16:00 1942464 libiomp5md.dll
Directory: C:\Users\path\.venv\Library\bin
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 08/07/2024 15:59 1942464 libiomp5md.dll
So we can see that there are three copies of the library in at least two different versions: one of 2,030,632 bytes (in the ctranslate2 folder) and one of 1,942,464 bytes (in the torch and Library\bin folders).
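The same inspection can be sketched in Python. File size alone does not prove that two copies are identical, so this version also groups the files by SHA-256 hash. The function name find_dll_copies and the default .venv root are my own choices for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_dll_copies(root, dll_name="libiomp5md.dll"):
    """Group every file named dll_name under root by (size in bytes, SHA-256)."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob(dll_name)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        key = (len(data), hashlib.sha256(data).hexdigest())
        groups[key].append(path)
    return dict(groups)


if __name__ == "__main__":
    # One block per distinct build of the DLL, followed by where it lives.
    for (size, digest), paths in find_dll_copies(".venv").items():
        print(f"{size} bytes, sha256 {digest[:12]}")
        for p in paths:
            print(f"    {p}")
```

Files that land in the same group are byte-for-byte identical; each extra group is a different build of the runtime.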
The solutions found on Stack Overflow (like this) suggest reinstalling the libraries until it works, or simply deleting the DLL. I tried them, and they resulted in the same error or in new ones.
The Solution
Just to be clear: if your error comes from an environment that is different from mine, for instance if it is not caused by CTranslate2, this solution may not work.
I found that the DLL cannot be removed from the CTranslate2 folder, or the program fails to load CTranslate2 itself. So what I did was to remove it from the torch folder and from the external Library folder:
mv C:\path\.venv\Library\bin\libiomp5md.dll C:\path\.venv\Library\bin\libiomp5md__bkup.dll
mv C:\path\.venv\Lib\site-packages\torch\lib\libiomp5md.dll C:\path\.venv\Lib\site-packages\torch\lib\libiomp5md__bkup.dll
I only renamed them, in case we need them later. This prevents the conflict when importing CTranslate2, but now torch fails to load. So torch needs to find the library in the external Library folder, but we cannot simply copy it there or we get the original error again.
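If you prefer to script the renaming, a minimal Python sketch could look like this. The helper name back_up_dll is hypothetical; it follows the same __bkup naming used in the commands above and refuses to overwrite an existing backup.

```python
from pathlib import Path


def back_up_dll(dll_path, suffix="__bkup"):
    """Rename e.g. libiomp5md.dll to libiomp5md__bkup.dll and return the new path."""
    dll_path = Path(dll_path)
    backup = dll_path.with_name(dll_path.stem + suffix + dll_path.suffix)
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    dll_path.rename(backup)
    return backup
```

Renaming instead of deleting keeps the step reversible: to undo it, rename the backup file to its original name.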
What we need there is a symbolic (soft) link pointing to the library in the ctranslate2 folder, which on Windows is not as straightforward as on *nix systems. First, we need to be allowed to create symbolic links, either by enabling Windows’s Developer Mode or by opening the command prompt as administrator (please note: cmd, not PowerShell, because mklink is a cmd built-in). Then run:
mklink C:\path\.venv\Library\bin\libiomp5md.dll C:\path\.venv\Lib\site-packages\ctranslate2\libiomp5md.dll
With the link in place, CTranslate2 and PyTorch can both be imported with no further issues!
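The linking step can also be sketched in Python with os.symlink, which on Windows is subject to the same requirement as mklink (Developer Mode or an elevated prompt). The helper name link_shared_dll is my own, and the paths in the usage comment are the placeholder paths from the commands above.

```python
import os
from pathlib import Path


def link_shared_dll(link_path, target_path):
    """Create link_path as a symbolic link that points at target_path."""
    link_path = Path(link_path)
    target_path = Path(target_path).resolve()  # absolute, so the link works from anywhere
    if not target_path.is_file():
        raise FileNotFoundError(target_path)
    if link_path.exists() or link_path.is_symlink():
        raise FileExistsError(link_path)
    # os.symlink takes the target first and the link name second,
    # the opposite order of mklink.
    os.symlink(target_path, link_path)
    return link_path


# Example with the placeholder paths used in this post:
# link_shared_dll(
#     r"C:\path\.venv\Library\bin\libiomp5md.dll",
#     r"C:\path\.venv\Lib\site-packages\ctranslate2\libiomp5md.dll",
# )
```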
Conclusion
This kind of error can be very hard to solve, and it is particularly annoying that it seems to show up only on Windows, while most of the machine learning community seems to use Linux boxes, so there is a lack of support on this side. This is why I think it is important to share such solutions and help an underserved part of the community save valuable hours. I also hope that the reasoning behind the fix can help you solve similar issues.
That is all for today and I hope that you like the series “A bug’s life” where I share my experience solving errors in real projects.