Python fork: 'Cannot allocate memory' if process consumes more than 50% avail. memory

I encountered a memory allocation problem when forking processes in Python. I know the issue has already been discussed in some other posts here; however, I couldn't find a good solution in any of them.
Here is a sample script illustrating the problem:
import os
import psutil
import subprocess
pid = os.getpid()
this_proc = psutil.Process(pid)
MAX_MEM = int(psutil.virtual_memory().free*1E-9) # in GB
def consume_memory(size):
    """ Size in GB """
    memory_consumer = []
    while get_mem_usage() < size:
        memory_consumer.append(" "*1000000)  # Adding ~1MB
    return(memory_consumer)

def get_mem_usage():
    return(this_proc.memory_info()[0]/2.**30)

def get_free_mem():
    return(psutil.virtual_memory().free/2.**30)

if __name__ == "__main__":
    for i in range(1, MAX_MEM):
        consumer = consume_memory(i)
        mem_usage = get_mem_usage()
        print("\n## Memory usage %d/%d GB (%2d%%) ##" % (int(mem_usage),
              MAX_MEM, int(mem_usage*100/MAX_MEM)))
        try:
            subprocess.call(['echo', '[OK] Fork worked.'])
        except OSError as e:
            print("[ERROR] Fork failed. Got OSError.")
            print(e)
        del consumer
The script was tested with Python 2.7 and 3.6 on Arch Linux and uses psutil to keep track of memory usage. It gradually increases the memory usage of the Python process and tries to fork a process using subprocess.call(). Forking fails if more than 50% of the available memory is consumed by the parent process.
## Memory usage 1/19 GB ( 5%) ##
[OK] Fork worked.
## Memory usage 2/19 GB (10%) ##
[OK] Fork worked.
## Memory usage 3/19 GB (15%) ##
[OK] Fork worked.
[...]
## Memory usage 9/19 GB (47%) ##
[OK] Fork worked.
## Memory usage 10/19 GB (52%) ##
[ERROR] Fork failed. Got OSError.
[Errno 12] Cannot allocate memory
## Memory usage 11/19 GB (57%) ##
[ERROR] Fork failed. Got OSError.
[Errno 12] Cannot allocate memory
## Memory usage 12/19 GB (63%) ##
[ERROR] Fork failed. Got OSError.
[Errno 12] Cannot allocate memory
## Memory usage 13/19 GB (68%) ##
[ERROR] Fork failed. Got OSError.
[Errno 12] Cannot allocate memory
[...]
Note that I had no swap activated when running this test.
There seem to be two options to solve this problem: adding swap of at least the size of the physical memory, or allowing the kernel to overcommit memory.
I tried the latter on my desktop machine and the above script finished without errors.
However, on the computing cluster I'm working on, I can't use either of these options.
Also, forking the required processes in advance, before consuming the memory, is unfortunately not an option.
Does anybody have another suggestion on how to solve this problem?
Thank you!
Best
Leonhard
Have you looked at the solution suggested in this answer? TL;DR: start a small side process at the start that you can then use to start the other subprocesses on demand.
– mbrig
Jun 26 at 14:22
Yes, I thought about this. However, it would require quite some changes in my code. Also, the big object consuming all the memory should be shared among the workers, so I'd like to have this object available when forking the workers. I was hoping that there is an easier way to solve this problem.
– Leo
Jun 26 at 14:32
1 Answer
The problem you are facing is not really Python related and also not something you could really do much to change with Python alone. Starting a forking process (executor) up front as suggested by mbrig in the comments really seems to be the best and cleanest option for this scenario.
Python or not, you are dealing with how Linux (or similar systems) create new processes. Your parent process first calls fork(2), which creates a new child process as a copy of itself. It does not actually copy itself elsewhere at that time (it uses copy-on-write); nonetheless, it checks whether sufficient space is available and, if not, fails, setting errno to 12 (ENOMEM) -> the OSError exception you're seeing.
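To see that the failure really comes from fork(2) itself and not from anything subprocess-specific, here is a minimal sketch (my own illustration, not from the original post) that reproduces the same error with os.fork() directly once the parent has grown large enough and overcommit is disabled:

import os

try:
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately; thanks to copy-on-write no pages are duplicated.
        os._exit(0)
    os.waitpid(pid, 0)  # Parent: reap the child.
    print("[OK] Fork worked.")
except OSError as e:
    # errno 12 (ENOMEM) when the kernel refuses to commit enough memory.
    print("[ERROR] Fork failed:", e)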
Yes, allowing the virtual memory subsystem to overcommit memory can keep this error from popping up, and if you exec a new program (which would also end up being smaller) in the child, it does not have to cause any immediate failures. But it sounds like possibly kicking the problem further down the road.
Growing memory (adding swap) pushes the limit, and as long as twice your running process still fits into available memory, the fork could succeed. With the follow-up exec, the swap would not even need to get utilized.
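As an aside (my addition, assuming a Linux box), the current overcommit policy can be inspected from Python without any extra tooling:

# 0 = heuristic overcommit (default), 1 = always overcommit, 2 = strict accounting.
with open("/proc/sys/vm/overcommit_memory") as f:
    print("vm.overcommit_memory =", f.read().strip())
# Switching to mode 1 requires root, e.g. sysctl vm.overcommit_memory=1.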
There seems to be one more option, but it looks... dirty. There is another syscall, vfork(), which creates a new process that initially shares memory with its parent, whose execution is suspended at that point. The newly created child process may only assign the variable returned by vfork(), call _exit, or call exec. As such, it is not exposed through any Python interface, and if you try (I did) loading it directly into Python using ctypes, it segfaults (I presume because Python would still do something other than just those three permitted actions between vfork and the point where I could exec something else in the child).
That said, you can delegate the whole vfork and exec to a shared object you load in. As a very rough proof of concept, I did just that:
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

char run(char * const arg[]) {
    pid_t child;
    int wstatus;
    char ret_val = -1;

    child = vfork();
    if (child < 0) {
        printf("run: Failed to fork: %i\n", errno);
    } else if (child == 0) {
        /* Child: exec the requested program, or bail out immediately. */
        printf("arg: %s\n", arg[0]);
        execv(arg[0], arg);
        _exit(-1);
    } else {
        /* Parent: wait for the child and return its exit status. */
        child = waitpid(child, &wstatus, 0);
        if (WIFEXITED(wstatus))
            ret_val = WEXITSTATUS(wstatus);
    }
    return ret_val;
}
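For reference (my addition, not part of the original answer): this can be built into the forker.so shared object that the Python code below loads, with something like gcc -shared -fPIC -o forker.so forker.c.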
And I've modified your sample code in the following way (the bulk of the change is in and around the replacement of subprocess.call):
import ctypes
import os
import psutil

pid = os.getpid()
this_proc = psutil.Process(pid)
MAX_MEM = int(psutil.virtual_memory().free*1E-9)  # in GB

def consume_memory(size):
    """ Size in GB """
    memory_consumer = []
    while get_mem_usage() < size:
        memory_consumer.append(" "*1000000)  # Adding ~1MB
    return(memory_consumer)

def get_mem_usage():
    return(this_proc.memory_info()[0]/2.**30)

def get_free_mem():
    return(psutil.virtual_memory().free/2.**30)

if __name__ == "__main__":
    forker = ctypes.CDLL("forker.so", use_errno=True)
    for i in range(1, MAX_MEM):
        consumer = consume_memory(i)
        mem_usage = get_mem_usage()
        print("\n## Memory usage %d/%d GB (%2d%%) ##" % (int(mem_usage),
              MAX_MEM, int(mem_usage*100/MAX_MEM)))
        try:
            cmd = [b"/bin/echo", b"[OK] Fork worked."]
            c_cmd = (ctypes.c_char_p * (len(cmd) + 1))()
            c_cmd[:] = cmd + [None]
            ret = forker.run(c_cmd)
            errno = ctypes.get_errno()
            if errno:
                raise OSError(errno, os.strerror(errno))
        except OSError as e:
            print("[ERROR] Fork failed. Got OSError.")
            print(e)
        del consumer
With that, I could still fork at 3/4 of available memory reported filled.
In theory it could all be written "properly" and also wrapped nicely to integrate well with Python code, but while it seems to be one additional option, I'd still go back to the executor process.
I've only briefly scanned through the concurrent.futures.process module, but once it spawns a worker process, it does not seem to get clobbered before it is done, so perhaps abusing an existing ProcessPoolExecutor would be a quick and cheap option. I've added these close to the script top (main part):
import concurrent.futures  # needed for ProcessPoolExecutor

def nop():
    pass

executor = concurrent.futures.ProcessPoolExecutor(max_workers=1)
executor.submit(nop)  # start a worker process in the pool
And then submit the subprocess.call to it:
proc = executor.submit(subprocess.call, ['echo', '[OK] Fork worked.'])
proc.result() # can also collect the return value
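To tie these fragments together, here is a minimal self-contained sketch of the pattern (my own assembly, not verbatim from the answer): the pool and its single worker are created while the parent is still small, and all later forks happen from that small worker rather than from the bloated parent.

import concurrent.futures
import subprocess

def nop():
    pass

if __name__ == "__main__":
    # Create the pool and force its single worker to start while the parent is still small.
    executor = concurrent.futures.ProcessPoolExecutor(max_workers=1)
    executor.submit(nop).result()

    # Now grow the parent; the worker created above stays small.
    big_object = [" " * 1000000 for _ in range(1000)]  # ~1 GB, just for illustration

    # Forks for external commands now happen inside the small worker process.
    proc = executor.submit(subprocess.call, ['echo', '[OK] Fork worked.'])
    print("echo returned", proc.result())

    executor.shutdown()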