ResourceExhaustedError With CNN

I am trying to complete this tutorial with this source code

I have tried using their large image data set as well as my own small data set of 52 images (46x46), but I keep running into ResourceExhaustedError:

ResourceExhaustedError OOM when allocating tensor with shape[1016064,1024]

Is there any way I can edit this code so that it trains on smaller training sets and I don't run into this error?

I have tried changing the batch sizes in the code, but this accomplished nothing. I also made sure I don't have any previous TensorFlow projects running (I restarted my computer).

My label.txt contains these two lines:

cat
dog

and my train and validation folders each contain 2 subfolders with the same names, which contain the images.

I am using:
GeForce GTX 850M major: 5 minor: 0 memoryClockRate(GHz): 0.9015

totalMemory: 4.00GiB freeMemory: 3.35GiB

Before I hit the error I get this printout:

Limit: 3235767910 InUse: 223232 MaxInUse: 223232 NumAllocs: 17 MaxAllocSize: 204800

Here is my full error:

2018-07-01 14:55:45.724585: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:279] *___________________________________________________________________________________________________
2018-07-01 14:55:45.725147: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\framework\op_kernel.cc:1202] OP_REQUIRES failed at random_op.cc:202 : Resource exhausted: OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1361, in _do_call
    return fn(*args)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: dense/kernel/Initializer/random_uniform/RandomUniform = RandomUniform[T=DT_INT32, _class=["loc:@dense/kernel"], dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense/kernel/Initializer/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 175, in <module>
    tf.app.run()
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 167, in main
    classifier.train(input_fn=lambda: train_input_fn(train_list), steps=10, hooks=[logging_hook])
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 352, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 888, in _train_model
    log_step_count_steps=self._config.log_step_count_steps) as mon_sess:
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 384, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 795, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 518, in __init__
    self._sess = _RecoverableSession(self._coordinated_creator)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 981, in __init__
    _WrappedSession.__init__(self, self._create_session())
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 986, in _create_session
    return self._sess_creator.create_session()
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 675, in create_session
    self.tf_sess = self._session_creator.create_session()
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\monitored_session.py", line 446, in create_session
    init_fn=self._scaffold.init_fn)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\training\session_manager.py", line 281, in prepare_session
    sess.run(init_op, feed_dict=init_feed_dict)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 905, in run
    run_metadata_ptr)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1137, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1355, in _do_run
    options, run_metadata)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1374, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: dense/kernel/Initializer/random_uniform/RandomUniform = RandomUniform[T=DT_INT32, _class=["loc:@dense/kernel"], dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense/kernel/Initializer/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'dense/kernel/Initializer/random_uniform/RandomUniform', defined at:
  File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 175, in <module>
    tf.app.run()
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 167, in main
    classifier.train(input_fn=lambda: train_input_fn(train_list), steps=10, hooks=[logging_hook])
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 352, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 812, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\estimator\estimator.py", line 793, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:/Users/Mac/Desktop/tensorflow/cnn_dog_vs_cat-master/cnn_dog_cat.py", line 50, in cnn_model_fn
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\core.py", line 248, in dense
    return layer.apply(inputs)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\base.py", line 809, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\base.py", line 680, in __call__
    self.build(input_shapes)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\core.py", line 134, in build
    trainable=True)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\layers\base.py", line 533, in add_variable
    partitioner=partitioner)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1297, in get_variable
    constraint=constraint)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 1093, in get_variable
    constraint=constraint)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 439, in get_variable
    constraint=constraint)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 408, in _true_getter
    use_resource=use_resource, constraint=constraint)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 800, in _get_single_variable
    use_resource=use_resource)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2157, in variable
    use_resource=use_resource)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2147, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2130, in default_variable_creator
    constraint=constraint)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variables.py", line 233, in __init__
    constraint=constraint)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variables.py", line 327, in _init_from_args
    initial_value(), name="initial_value", dtype=dtype)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 784, in <lambda>
    shape.as_list(), dtype=dtype, partition_info=partition_info)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\init_ops.py", line 472, in __call__
    shape, -limit, limit, dtype, seed=self.seed)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\random_ops.py", line 244, in random_uniform
    shape, dtype, seed=seed1, seed2=seed2)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\gen_random_ops.py", line 473, in _random_uniform
    name=name)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 3271, in create_op
    op_def=op_def)
  File "C:\Users\Mac\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1650, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1016064,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[Node: dense/kernel/Initializer/random_uniform/RandomUniform = RandomUniform[T=DT_INT32, _class=["loc:@dense/kernel"], dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense/kernel/Initializer/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

If this code is hard to fix, I would also accept a link to code that could work on my machine. I just want to get a CNN working on my own dataset.
– Xitcod13
Jul 1 at 23:13

Would you share your source code? The exhaustion happens on a rather large matrix for what looks like a simple trial: [1016064, 1024]. There may be a problem in the code that is not a syntax error. If you are sure that everything is fine, have you tried a batch size of 1? In its current shape the matrix holds roughly a billion elements, so about 1 GB at one byte per element and about 4 GB as float32, the type reported in the error.
– Eric Platon
Jul 1 at 23:47
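The size of the failing allocation can be worked out directly from the error message. The following is only a rough sketch: the helper name `tensor_bytes` is made up for illustration, the shape and the `float` type come from the error text, and the limit is the `Limit` value from the allocator printout in the question.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed to hold a dense tensor of the given shape."""
    return reduce(mul, shape, 1) * bytes_per_element


# Shape and dtype (float, i.e. 4 bytes per element) as reported in the error.
kernel_bytes = tensor_bytes([1016064, 1024])

# "Limit" value from the allocator printout in the question.
allocator_limit = 3235767910

GIB = 1024 ** 3
print(f"dense kernel: {kernel_bytes / GIB:.2f} GiB")
print(f"GPU limit:    {allocator_limit / GIB:.2f} GiB")
print("fits:", kernel_bytes <= allocator_limit)
```

Under these assumptions the kernel alone needs about 3.88 GiB, against a limit of about 3.01 GiB, so this single tensor cannot fit regardless of what else is on the GPU.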

@EricPlaton My source code is identical to the tutorial source code with the exception of directory names. I did try changing all the batch sizes to 1, in this line and in the same code a couple of lines below: return input_fn(True, file_path, 100, None, 10) -> return input_fn(True, file_path, 1, None, 10)
– Xitcod13
Jul 2 at 0:23
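A likely reason the batch size made no difference: the traceback shows the failing op is the dense kernel's initializer, run from prepare_session via sess.run(init_op), before any batch is fed. The number of weights in a fully connected layer depends only on its input width and its unit count. A small sketch with hypothetical helper names illustrates this:

```python
def dense_param_count(in_features, units):
    """Weights plus biases of a fully connected layer."""
    return in_features * units + units


def dense_input_bytes(batch_size, in_features, bytes_per_element=4):
    """Memory for one batch of flattened inputs fed to the layer."""
    return batch_size * in_features * bytes_per_element


IN_FEATURES = 126 * 126 * 64  # flatten size hard-coded in the tutorial code
UNITS = 1024

for batch_size in (100, 1):
    params = dense_param_count(IN_FEATURES, UNITS)
    inputs = dense_input_bytes(batch_size, IN_FEATURES)
    print(f"batch={batch_size:>3}  params={params:,}  input bytes={inputs:,}")
```

Only the input memory shrinks with the batch size; the parameter count stays the same, and it is the parameters that fail to allocate here.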

@EricPlaton I guess they make you scroll all the way down in the tutorial to download the source code. So here it is: github.com/Thumar/cnn_dog_vs_cat
– Xitcod13
Jul 2 at 0:24

1 Answer

The OOM is caused by the allocation of the dense layer at line 50:

pool2_flat = tf.reshape(pool2, [-1, 126 * 126 * 64])
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)

You can either:

BTW, I strongly recommend against using hard-coded shapes in tf.reshape. Consider using tf.layers.flatten instead, which is robust to architecture modifications.
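To illustrate why the hard-coded shape matters, here is a framework-free sketch (the helper names are hypothetical, and it does not call TensorFlow). The dense layer sizes its weight matrix from the last dimension of its input, which after the reshape is the constant 126 * 126 * 64, whatever the real feature map looks like. The 11 x 11 x 64 shape is the one quoted in the comments for a 46x46 input.

```python
from functools import reduce
from operator import mul


def flat_size(feature_shape):
    """Features per example after flattening a (height, width, channels) map."""
    return reduce(mul, feature_shape, 1)


def dense_kernel_shape(in_features, units):
    """Shape of the weight matrix a fully connected layer would create."""
    return (in_features, units)


HARDCODED = 126 * 126 * 64   # constant written into tf.reshape in the tutorial
pool2_shape = (11, 11, 64)   # assumed pool2 output for a 46x46 input

# With the hard-coded reshape, the kernel ignores the real feature map size.
print(dense_kernel_shape(HARDCODED, 1024))
# Deriving the size from the tensor itself, as a flatten would, follows the data.
print(dense_kernel_shape(flat_size(pool2_shape), 1024))
```

This would explain why shrinking the images alone did not change the shape in the error message: the first dimension of the kernel comes from the constant, not from the data.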

Hey, thanks for the answer. I have reduced the image size to 46 by 46; I am guessing that it is still too big for my GPU? And by the way, this isn't my code. This is the only tutorial I could find that allows me to use a CNN on my own images. If you have any other tutorials you could recommend, I would greatly appreciate it. I will mark the answer when I fix my code.
– Xitcod13
Jul 3 at 13:17

I would not recommend this tutorial at all. I didn't see it the first time I read the code, but there is quite a huge mistake: tf.reshape(pool2, [-1, 126 * 126 * 64]) should be tf.reshape(pool2, [-1, 62 * 62 * 64]), and in your case, with a 46x46 image, tf.reshape(pool2, [-1, 11 * 11 * 64]). Please use tf.layers.flatten(pool2) instead.
– Olivier Dehaene
Jul 3 at 13:49
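The 11 * 11 * 64 figure for a 46x46 image can be checked with a small sketch. It assumes two blocks of a 'same'-padded, stride-1 convolution followed by unpadded 2x2 pooling with stride 2, and 64 channels after the second convolution; that layout is an assumption about the tutorial's model, and the function names are hypothetical.

```python
def pooled(size, pool=2, stride=2):
    """Output length of an unpadded pooling window along one dimension."""
    return (size - pool) // stride + 1


def trace_spatial_size(size, num_blocks=2):
    """Follow one spatial dimension through conv ('same') + pool blocks."""
    sizes = [size]
    for _ in range(num_blocks):
        # A 'same'-padded, stride-1 convolution keeps the size unchanged,
        # so only the pooling step shrinks it.
        size = pooled(size)
        sizes.append(size)
    return sizes


sizes = trace_spatial_size(46)
side = sizes[-1]
flat = side * side * 64                   # 64 channels after the second convolution
kernel_mib = flat * 1024 * 4 / 2 ** 20    # float32 weights for 1024 units

print(sizes, flat, kernel_mib)
```

Under these assumptions the spatial size goes 46, 23, 11, giving 7744 flattened features and a dense kernel of about 30 MiB for 1024 units, far below the size of the tensor in the error message.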

As for good computer vision tutorials, the official TensorFlow documentation is quite good and will give you nice examples to modify for your own needs.
– Olivier Dehaene
Jul 3 at 13:53

Yes, I did change that piece of the code to tf.layers.flatten! And it works now with your other suggestions, but I had to use smaller images and 64 units on my dense layer. I guess I can't expect much from a laptop GPU. I hate this tutorial, but I couldn't find a better one that explained how to use my own images. I did the official TensorFlow tutorial, but I couldn't load anything but the MNIST dataset into it. (I guess I'll look around their website a bit more.)
– Xitcod13
Jul 3 at 13:55
