- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:238
- Time in 1 calls to Function.__call__: 1.692772e-05s
- Time in Function.fn.__call__: 5.006790e-06s (29.577%)
- Time in thunks: 1.907349e-06s (11.268%)
- Total compile time: 9.513402e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 6.597400e-02s
- Theano validate time: 4.982948e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 3.857851e-03s
- Import time 2.822876e-03s
- Node make_thunk time 3.688097e-03s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.707s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 1.91e-06s C 1 1 theano.compile.ops.Shape_i
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 1.91e-06s C 1 1 Shape_i{0}
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 1.91e-06s 1 0 Shape_i{0}(<GpuArrayType<None>(float32, (False,))>)
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.066s for 2/1 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.056725s - ('add_destroy_handler', 'AddDestroyHandler', 23, 1, 1) - 0.000s
- 0.001584s - ('canonicalize', 'EquilibriumOptimizer', 6, 2, 1) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.001s for 3 passes
- nb nodes (start, end, max) 2 1 3
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.001s 1 (0.000s in global opts, 0.000s io_toposort) - 2 nodes - ('local_shape_to_shape_i', 1)
- 1 - 0.000s 1 (0.000s in global opts, 0.000s io_toposort) - 3 nodes - ('local_subtensor_make_vector', 1)
- 2 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- times - times applied - nb node created - name:
- 0.000s - 1 - 2 - local_shape_to_shape_i
- 0.000s - 1 - 0 - local_subtensor_make_vector
- 0.000s - in 87 optimization that were not used (display only those with a runtime > 0)
- 0.000s - MergeOptimizer
- 0.000s - topo_constant_folding
- 0.000s - local_useless_subtensor
- 0.000s - local_subtensor_remove_broadcastable_index
- 0.000s - local_track_shape_i
- 0.000s - local_useless_slice
- 0.000s - local_subtensor_merge
- 0.000s - local_subtensor_lift
- 0.000s - local_subtensor_of_dot
- 0.000s - local_subtensor_of_alloc
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (3, 3, 0)
- init io_toposort 4.10079956055e-05
- loop time 8.10623168945e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.90734863281e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000795s - ('gpuarray_opt', 'SeqOptimizer', 16, 1, 1) - 0.000s
- SeqOptimizer gpuarray_opt time 0.001s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000347s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 1, 1) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000221s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 1, 1) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000092s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 1, 1) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000009s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 1, 1) - 0.000s
- 0.000573s - ('elemwise_fusion', 'SeqOptimizer', 19, 1, 1) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000196s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000190s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.19209289551e-06
- 0.000548s - ('specialize', 'EquilibriumOptimizer', 13, 1, 1) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 3.88622283936e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000539s - ('BlasOpt', 'SeqOptimizer', 12, 1, 1) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000153s - ('gemm_optimizer', 'GemmOptimizer', 1, 1, 1) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 1.50203704834e-05
- validate_time 0.0
- callback_time 0.0
- 0.000111s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 1, 1) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000052s - ('use_c_blas', 'TopoOptimizer', 4, 1, 1) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.47819519043e-05
- loop time 1.09672546387e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000045s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- 0.000041s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 1, 1) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- 0.000039s - ('use_scipy_ger', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000471s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 2, 2) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000078s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 2, 2) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.19480895996e-05
- loop time 1.28746032715e-05
- callback_time 0.0
- 0.000072s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 2, 2) - 0.000s
- 0.000057s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 2, 2) - 0.000s
- 0.000056s - ('scan_pushout_dot1', 'PushOutDot1', 3, 2, 2) - 0.000s
- 0.000053s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 2, 2) - 0.000s
- 0.000316s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- 0.000290s - ('stabilize', 'EquilibriumOptimizer', 8, 1, 1) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000277s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000261s - ('merge2', 'MergeOptimizer', 22, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000251s - ('ShapeOpt', 'ShapeOptimizer', 2, 2, 2) - 0.000s
- 0.000202s - ('merge3', 'MergeOptimizer', 51, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000200s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 1, 1) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.98023223877e-05
- loop time 4.79221343994e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000197s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000178s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 1, 1) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.00271606445e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000169s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 1, 1) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.00271606445e-05
- loop time 1.28746032715e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000125s - ('merge1', 'MergeOptimizer', 0, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000124s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 1, 1) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 4.10079956055e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000112s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 1, 1) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000111s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 1, 1) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000108s - ('useless', 'TopoOptimizer', 3, 2, 2) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.90870666504e-05
- loop time 4.60147857666e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_subtensor_of_alloc - 0
- -0.000s - 0 - 1 - local_subtensor_make_vector - 0
- -0.000s - 0 - 1 - local_useless_slice - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000103s - ('blas_opt_inplace', 'TopoOptimizer', 34, 1, 1) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 1.81198120117e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000096s - ('specialize_device', 'EquilibriumOptimizer', 17, 1, 1) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000073s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 1, 1) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.8133392334e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000073s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 1, 1) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000065s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 1, 1) - 0.000s
- 0.000061s - ('c_blas_destructive', 'TopoOptimizer', 37, 1, 1) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 8.10623168945e-06
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000061s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- 0.000060s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 1, 1) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000060s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 1, 1) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000059s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- 0.000059s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000057s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 2, 2) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.78949737549e-05
- loop time 6.91413879395e-06
- callback_time 0.0
- 0.000056s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000056s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 1, 1) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000048s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 1, 1) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 8.10623168945e-06
- callback_time 0.0
- 0.000047s - ('make_ger_destructive', 'TopoOptimizer', 41, 1, 1) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 9.05990600586e-06
- callback_time 0.0
- 0.000044s - ('cond_make_inplace', 'TopoOptimizer', 47, 1, 1) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000044s - ('AbstractConvCheck', 'TopoOptimizer', 18, 1, 1) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 7.15255737305e-06
- callback_time 0.0
- 0.000044s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 1, 1) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000043s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 1, 1) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000043s - ('local_destructive', 'TopoOptimizer', 48, 1, 1) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000041s - ('random_make_inplace', 'TopoOptimizer', 49, 1, 1) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000040s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 1, 1) - 0.000s
- 0.000039s - ('merge1.2', 'MergeOptimizer', 7, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000038s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 1, 1) - 0.000s
- 0.000038s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 1, 1) - 0.000s
- 0.000038s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 1, 1) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.31130218506e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000025s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 1, 1) - 0.000s
- 0.000023s - ('merge1.1', 'MergeOptimizer', 4, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000008s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 1, 1) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
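The percentages in the "Function profiling" header above appear to be each timing divided by the total time in `Function.__call__`. The following is a minimal sketch, using only the Python standard library, that re-derives them from three header lines copied from the first profile. The helper names `parse_header` and `recomputed_percent` are hypothetical and are not part of Theano.

```python
import re

# Header lines copied from the first profile above (leading "- " dropped).
HEADER = """\
Time in 1 calls to Function.__call__: 1.692772e-05s
Time in Function.fn.__call__: 5.006790e-06s (29.577%)
Time in thunks: 1.907349e-06s (11.268%)
"""

# Matches "<label>: <seconds>s" with an optional "(<percent>%)" suffix.
LINE_RE = re.compile(
    r"^(?P<label>.+?):\s*(?P<secs>[0-9.eE+-]+)s(?:\s*\((?P<pct>[0-9.]+)%\))?$"
)


def parse_header(text):
    """Return {label: (seconds, reported_percent_or_None)} per timing line."""
    out = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            pct = float(m.group("pct")) if m.group("pct") else None
            out[m.group("label")] = (float(m.group("secs")), pct)
    return out


def recomputed_percent(timings, label, total_label):
    """Share of total_label's time spent in label, as a percentage."""
    return 100.0 * timings[label][0] / timings[total_label][0]


timings = parse_header(HEADER)
total = "Time in 1 calls to Function.__call__"
for label, (secs, reported) in timings.items():
    if reported is not None:
        # Print the recomputed share next to the value the profiler reported.
        print(label, round(recomputed_percent(timings, label, total), 3), reported)
```

The same check can be run on the second profile by swapping in its header lines. With a single call lasting only microseconds, these shares mostly reflect call overhead rather than work done in the `Shape_i` op.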
- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:239
- Time in 1 calls to Function.__call__: 1.287460e-05s
- Time in Function.fn.__call__: 5.006790e-06s (38.889%)
- Time in thunks: 2.145767e-06s (16.667%)
- Total compile time: 2.890587e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 9.943962e-03s
- Theano validate time: 5.078316e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 5.960464e-04s
- Import time 0.000000e+00s
- Node make_thunk time 4.749298e-04s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.713s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 2.15e-06s C 1 1 theano.compile.ops.Shape_i
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 2.15e-06s C 1 1 Shape_i{0}
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 2.15e-06s 1 0 Shape_i{0}(<GpuArrayType<None>(float32, (False,))>)
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.010s for 2/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.001802s - ('canonicalize', 'EquilibriumOptimizer', 6, 2, 1) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.001s for 3 passes
- nb nodes (start, end, max) 2 1 3
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.001s 1 (0.000s in global opts, 0.000s io_toposort) - 2 nodes - ('local_shape_to_shape_i', 1)
- 1 - 0.000s 1 (0.000s in global opts, 0.000s io_toposort) - 3 nodes - ('local_subtensor_make_vector', 1)
- 2 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- times - times applied - nb node created - name:
- 0.000s - 1 - 2 - local_shape_to_shape_i
- 0.000s - 1 - 0 - local_subtensor_make_vector
- 0.001s - in 87 optimization that were not used (display only those with a runtime > 0)
- 0.000s - MergeOptimizer
- 0.000s - topo_constant_folding
- 0.000s - local_useless_subtensor
- 0.000s - local_subtensor_remove_broadcastable_index
- 0.000s - local_track_shape_i
- 0.000s - local_useless_slice
- 0.000s - local_subtensor_lift
- 0.000s - local_subtensor_of_dot
- 0.000s - local_subtensor_merge
- 0.000s - local_subtensor_of_alloc
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (3, 3, 0)
- init io_toposort 4.88758087158e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.19345092773e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000850s - ('gpuarray_opt', 'SeqOptimizer', 16, 1, 1) - 0.000s
- SeqOptimizer gpuarray_opt time 0.001s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000375s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 1, 1) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000224s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 1, 1) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000106s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 1, 1) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000010s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 1, 1) - 0.000s
- 0.000619s - ('BlasOpt', 'SeqOptimizer', 12, 1, 1) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000181s - ('gemm_optimizer', 'GemmOptimizer', 1, 1, 1) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 1.69277191162e-05
- validate_time 0.0
- callback_time 0.0
- 0.000119s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 1, 1) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000059s - ('use_c_blas', 'TopoOptimizer', 4, 1, 1) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 1.21593475342e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000047s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 1, 1) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000045s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.71661376953e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000044s - ('use_scipy_ger', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000602s - ('elemwise_fusion', 'SeqOptimizer', 19, 1, 1) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000282s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.90734863281e-06
- 0.000192s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 2.14576721191e-06
- 0.000530s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 2, 2) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000086s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 2, 2) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.81469726562e-05
- loop time 8.10623168945e-06
- callback_time 0.0
- 0.000079s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 2, 2) - 0.000s
- 0.000073s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 2, 2) - 0.000s
- 0.000067s - ('scan_pushout_dot1', 'PushOutDot1', 3, 2, 2) - 0.000s
- 0.000064s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 2, 2) - 0.000s
- 0.000508s - ('specialize', 'EquilibriumOptimizer', 13, 1, 1) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.81198120117e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000492s - ('add_destroy_handler', 'AddDestroyHandler', 23, 1, 1) - 0.000s
- 0.000369s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.81198120117e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.62124633789e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000337s - ('stabilize', 'EquilibriumOptimizer', 8, 1, 1) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000257s - ('ShapeOpt', 'ShapeOptimizer', 2, 2, 2) - 0.000s
- 0.000251s - ('merge2', 'MergeOptimizer', 22, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000226s - ('blas_opt_inplace', 'TopoOptimizer', 34, 1, 1) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 0.000138998031616
- loop time 1.31130218506e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000210s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000188s - ('merge3', 'MergeOptimizer', 51, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000187s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000160s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 1, 1) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000138s - ('merge1', 'MergeOptimizer', 0, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000115s - ('useless', 'TopoOptimizer', 3, 2, 2) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.2901763916e-05
- loop time 4.6968460083e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_subtensor_of_alloc - 0
- -0.000s - 0 - 1 - local_subtensor_make_vector - 0
- -0.000s - 0 - 1 - local_useless_slice - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000107s - ('specialize_device', 'EquilibriumOptimizer', 17, 1, 1) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000104s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 1, 1) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.78813934326e-05
- loop time 1.09672546387e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000103s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 1, 1) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000099s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 1, 1) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000095s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 1, 1) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000065s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 1, 1) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000065s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 2, 2) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.21865081787e-05
- loop time 6.91413879395e-06
- callback_time 0.0
- 0.000058s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000057s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 1, 1) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000057s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 1, 1) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000057s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 1, 1) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.47819519043e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000056s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 1, 1) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000055s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000055s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000054s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000050s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 1, 1) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.90734863281e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000050s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 1, 1) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000048s - ('c_blas_destructive', 'TopoOptimizer', 37, 1, 1) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 9.05990600586e-06
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000047s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 1, 1) - 0.000s
- 0.000046s - ('merge1.2', 'MergeOptimizer', 7, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000045s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 1, 1) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000044s - ('AbstractConvCheck', 'TopoOptimizer', 18, 1, 1) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000042s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 1, 1) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000041s - ('cond_make_inplace', 'TopoOptimizer', 47, 1, 1) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000040s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 1, 1) - 0.000s
- 0.000039s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 1, 1) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.19209289551e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000039s - ('local_destructive', 'TopoOptimizer', 48, 1, 1) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.28746032715e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000038s - ('make_ger_destructive', 'TopoOptimizer', 41, 1, 1) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- 0.000038s - ('random_make_inplace', 'TopoOptimizer', 49, 1, 1) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.31130218506e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000037s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 1, 1) - 0.000s
- 0.000030s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 1, 1) - 0.000s
- 0.000025s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 1, 1) - 0.000s
- 0.000024s - ('merge1.1', 'MergeOptimizer', 4, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000006s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 1, 1) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
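The optimizer profile above lists one timing entry per optimizer in the form `<time>s - (name, class, index, nodes before, nodes after) - <validate time>s`. To rank those entries by cost, a small parser along the following lines may help. It is only a sketch: the names `parse_optimizer_entries`, `slowest`, and `SAMPLE` are hypothetical helpers and not part of Theano, and the regular expression assumes the entry layout seen in this paste.

```python
import re

# Matches entries such as:
#   0.000530s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 2, 2) - 0.000s
# Assumption: the layout is exactly as printed in this paste.
ENTRY = re.compile(
    r"(?P<time>\d+\.\d+)s - "
    r"\('(?P<name>[^']+)', '(?P<cls>[^']+)', "
    r"(?P<index>\d+), (?P<before>\d+), (?P<after>\d+)\)"
    r" - (?P<validate>\d+\.\d+)s"
)


def parse_optimizer_entries(lines):
    """Return one dict per optimizer timing entry found in `lines`."""
    entries = []
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            continue  # detail line (passes, toposort time, ...), not an entry
        entries.append({
            "name": match.group("name"),
            "class": match.group("cls"),
            "index": int(match.group("index")),
            "nodes_before": int(match.group("before")),
            "nodes_after": int(match.group("after")),
            "time": float(match.group("time")),
            "validate_time": float(match.group("validate")),
        })
    return entries


def slowest(entries, n=5):
    """Return the `n` entries with the largest optimizer time."""
    return sorted(entries, key=lambda e: e["time"], reverse=True)[:n]


# A few lines copied from the profile above, including one non-entry line.
SAMPLE = """\
- 0.000530s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 2, 2) - 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000508s - ('specialize', 'EquilibriumOptimizer', 13, 1, 1) - 0.000s
- 0.000492s - ('add_destroy_handler', 'AddDestroyHandler', 23, 1, 1) - 0.000s
"""

for entry in slowest(parse_optimizer_entries(SAMPLE.splitlines()), n=3):
    print("%-24s %-22s %.6fs" % (entry["name"], entry["class"], entry["time"]))
```

Entries from nested SeqOptimizers (for example `gpuarray_opt` or `BlasOpt`) are flattened into the same list. Their `index` values are relative to their parent sequence, so indices are not unique across the whole profile.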
- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:240
- Time in 1 calls to Function.__call__: 1.502037e-05s
- Time in Function.fn.__call__: 3.814697e-06s (25.397%)
- Time in thunks: 1.907349e-06s (12.698%)
- Total compile time: 2.578807e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 8.777857e-03s
- Theano validate time: 4.410744e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 6.120205e-04s
- Import time 0.000000e+00s
- Node make_thunk time 4.870892e-04s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.721s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 1.91e-06s C 1 1 theano.compile.ops.Shape_i
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 1.91e-06s C 1 1 Shape_i{0}
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 1.91e-06s 1 0 Shape_i{0}(<GpuArrayType<None>(float32, (False,))>)
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.009s for 2/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.001467s - ('canonicalize', 'EquilibriumOptimizer', 6, 2, 1) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.001s for 3 passes
- nb nodes (start, end, max) 2 1 3
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.001s 1 (0.000s in global opts, 0.000s io_toposort) - 2 nodes - ('local_shape_to_shape_i', 1)
- 1 - 0.000s 1 (0.000s in global opts, 0.000s io_toposort) - 3 nodes - ('local_subtensor_make_vector', 1)
- 2 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- times - times applied - nb node created - name:
- 0.000s - 1 - 2 - local_shape_to_shape_i
- 0.000s - 1 - 0 - local_subtensor_make_vector
- 0.000s - in 87 optimization that were not used (display only those with a runtime > 0)
- 0.000s - MergeOptimizer
- 0.000s - topo_constant_folding
- 0.000s - local_useless_subtensor
- 0.000s - local_subtensor_remove_broadcastable_index
- 0.000s - local_track_shape_i
- 0.000s - local_useless_slice
- 0.000s - local_subtensor_lift
- 0.000s - local_subtensor_of_dot
- 0.000s - local_subtensor_merge
- 0.000s - local_subtensor_of_alloc
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (3, 3, 0)
- init io_toposort 3.71932983398e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.81198120117e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000796s - ('gpuarray_opt', 'SeqOptimizer', 16, 1, 1) - 0.000s
- SeqOptimizer gpuarray_opt time 0.001s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000393s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 1, 1) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000184s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 1, 1) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000086s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 1, 1) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000008s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 1, 1) - 0.000s
- 0.000520s - ('elemwise_fusion', 'SeqOptimizer', 19, 1, 1) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000197s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 2.14576721191e-06
- 0.000193s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.90734863281e-06
- 0.000513s - ('BlasOpt', 'SeqOptimizer', 12, 1, 1) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000150s - ('gemm_optimizer', 'GemmOptimizer', 1, 1, 1) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 1.4066696167e-05
- validate_time 0.0
- callback_time 0.0
- 0.000105s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 1, 1) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000047s - ('use_c_blas', 'TopoOptimizer', 4, 1, 1) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000039s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000038s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 1, 1) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000037s - ('use_scipy_ger', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.28746032715e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000504s - ('add_destroy_handler', 'AddDestroyHandler', 23, 1, 1) - 0.000s
- 0.000416s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.8835067749e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.31130218506e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.31130218506e-05
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000406s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 2, 2) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000066s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 2, 2) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.90870666504e-05
- loop time 7.15255737305e-06
- callback_time 0.0
- 0.000063s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 2, 2) - 0.000s
- 0.000053s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 2, 2) - 0.000s
- 0.000051s - ('scan_pushout_dot1', 'PushOutDot1', 3, 2, 2) - 0.000s
- 0.000050s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 2, 2) - 0.000s
- 0.000383s - ('specialize', 'EquilibriumOptimizer', 13, 1, 1) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000276s - ('stabilize', 'EquilibriumOptimizer', 8, 1, 1) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.62124633789e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000249s - ('merge2', 'MergeOptimizer', 22, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000219s - ('ShapeOpt', 'ShapeOptimizer', 2, 2, 2) - 0.000s
- 0.000191s - ('merge3', 'MergeOptimizer', 51, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000188s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000187s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 1, 1) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.90734863281e-06
- 0.000144s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 1, 1) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000102s - ('useless', 'TopoOptimizer', 3, 2, 2) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.09944152832e-05
- loop time 4.00543212891e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_subtensor_of_alloc - 0
- -0.000s - 0 - 1 - local_subtensor_make_vector - 0
- -0.000s - 0 - 1 - local_useless_slice - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000101s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 1, 1) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.59740447998e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000101s - ('blas_opt_inplace', 'TopoOptimizer', 34, 1, 1) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.31130218506e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000100s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 1, 1) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000100s - ('merge1', 'MergeOptimizer', 0, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000098s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 1, 1) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.09672546387e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000096s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 1, 1) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000095s - ('specialize_device', 'EquilibriumOptimizer', 17, 1, 1) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- 0.000062s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 1, 1) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.47819519043e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000057s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 1, 1) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000056s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 1, 1) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000055s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.28746032715e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000055s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000055s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 2, 2) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 6.19888305664e-06
- callback_time 0.0
- 0.000055s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 1, 1) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- 0.000054s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- 0.000054s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 1, 1) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000053s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 1, 1) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.31130218506e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000050s - ('cond_make_inplace', 'TopoOptimizer', 47, 1, 1) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.28881835938e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000050s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 1, 1) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.90734863281e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000049s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 1, 1) - 0.000s
- 0.000047s - ('c_blas_destructive', 'TopoOptimizer', 37, 1, 1) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 1.00135803223e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000046s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 1, 1) - 0.000s
- 0.000043s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 1, 1) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000041s - ('AbstractConvCheck', 'TopoOptimizer', 18, 1, 1) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000040s - ('local_destructive', 'TopoOptimizer', 48, 1, 1) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000039s - ('make_ger_destructive', 'TopoOptimizer', 41, 1, 1) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.31130218506e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000039s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 1, 1) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.4066696167e-05
- loop time 3.81469726562e-06
- callback_time 0.0
- 0.000039s - ('merge1.2', 'MergeOptimizer', 7, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000038s - ('random_make_inplace', 'TopoOptimizer', 49, 1, 1) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.28746032715e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000038s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 1, 1) - 0.000s
- 0.000038s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 1, 1) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.31130218506e-05
- loop time 1.90734863281e-06
- callback_time 0.0
- 0.000036s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 1, 1) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.38282775879e-05
- loop time 2.14576721191e-06
- callback_time 0.0
- 0.000034s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 1, 1) - 0.000s
- 0.000029s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 1, 1) - 0.000s
- 0.000020s - ('merge1.1', 'MergeOptimizer', 4, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000006s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 1, 1) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:260
- Time in 1 calls to Function.__call__: 8.409023e-04s
- Time in Function.fn.__call__: 8.080006e-04s (96.087%)
- Time in thunks: 7.898808e-04s (93.933%)
- Total compile time: 3.525209e-02s
- Number of Apply nodes: 2
- Theano Optimizer time: 1.099181e-02s
- Theano validate time: 1.215935e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 6.864071e-03s
- Import time 5.328894e-03s
- Node make_thunk time 6.590128e-03s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.762s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 98.0% 98.0% 0.001s 7.74e-04s C 1 1 theano.gpuarray.basic_ops.HostFromGpu
- 2.0% 100.0% 0.000s 1.60e-05s C 1 1 theano.gpuarray.subtensor.GpuSubtensor
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 98.0% 98.0% 0.001s 7.74e-04s C 1 1 HostFromGpu(gpuarray)
- 2.0% 100.0% 0.000s 1.60e-05s C 1 1 GpuSubtensor{:int64:}
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 98.0% 98.0% 0.001s 7.74e-04s 1 1 HostFromGpu(gpuarray)(GpuSubtensor{:int64:}.0)
- 2.0% 100.0% 0.000s 1.60e-05s 1 0 GpuSubtensor{:int64:}(<GpuArrayType<None>(float32, (False, False, False, False))>, Constant{128})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.011s for 2/2 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.001547s - ('gpuarray_opt', 'SeqOptimizer', 16, 2, 2) - 0.000s
- SeqOptimizer gpuarray_opt time 0.001s for 2/2 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- 0.000959s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 2, 2) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- times - times applied - Node created - name:
- 0.000s - 1 - 1 - local_gpua_subtensor_graph
- 0.000s - in 0 optimization that were not used (display only those with a runtime > 0)
- 0.000353s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 2, 2) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000113s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 2, 2) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000010s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 2, 2) - 0.000s
- 0.000818s - ('ShapeOpt', 'ShapeOptimizer', 2, 2, 2) - 0.000s
- 0.000686s - ('canonicalize', 'EquilibriumOptimizer', 6, 2, 2) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.90870666504e-05
- loop time 3.81469726562e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000629s - ('specialize', 'EquilibriumOptimizer', 13, 2, 2) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.78949737549e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000599s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 2, 2) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.001s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.001s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 0.000126123428345
- loop time 3.81469726562e-06
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 3.81469726562e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.40802764893e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000584s - ('BlasOpt', 'SeqOptimizer', 12, 2, 2) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000149s - ('gemm_optimizer', 'GemmOptimizer', 1, 2, 2) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 2.50339508057e-05
- validate_time 0.0
- callback_time 0.0
- 0.000120s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 2, 2) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000067s - ('use_c_blas', 'TopoOptimizer', 4, 2, 2) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 1.81198120117e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000053s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 2, 2) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 6.19888305664e-06
- callback_time 0.0
- 0.000053s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 2, 2) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000051s - ('use_scipy_ger', 'TopoOptimizer', 5, 2, 2) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000577s - ('add_destroy_handler', 'AddDestroyHandler', 23, 2, 2) - 0.000s
- 0.000558s - ('elemwise_fusion', 'SeqOptimizer', 19, 2, 2) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000228s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 5.41210174561e-05
- 0.000216s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 4.60147857666e-05
- 0.000421s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 2, 2) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000068s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 2, 2) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.00407409668e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- 0.000064s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 2, 2) - 0.000s
- 0.000055s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 2, 2) - 0.000s
- 0.000052s - ('scan_pushout_dot1', 'PushOutDot1', 3, 2, 2) - 0.000s
- 0.000051s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 2, 2) - 0.000s
- 0.000368s - ('stabilize', 'EquilibriumOptimizer', 8, 2, 2) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000252s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 2, 2) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 0.000180006027222
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000216s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 4.6968460083e-05
- 0.000212s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 4.38690185547e-05
- 0.000200s - ('merge3', 'MergeOptimizer', 51, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000199s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 2, 2) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 5.00679016113e-05
- loop time 5.48362731934e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000177s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 2, 2) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000119s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 2, 2) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 1.59740447998e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000116s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 2, 2) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 1.59740447998e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000116s - ('blas_opt_inplace', 'TopoOptimizer', 34, 2, 2) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 2.09808349609e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000115s - ('useless', 'TopoOptimizer', 3, 2, 2) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.31401824951e-05
- loop time 4.6968460083e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_subtensor_of_alloc - 0
- -0.000s - 0 - 1 - local_subtensor_make_vector - 0
- -0.000s - 0 - 1 - local_useless_slice - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000113s - ('specialize_device', 'EquilibriumOptimizer', 17, 2, 2) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000113s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 2, 2) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 1.59740447998e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000111s - ('merge2', 'MergeOptimizer', 22, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000109s - ('merge1', 'MergeOptimizer', 0, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000090s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 2, 2) - 0.000s
- 0.000090s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 2, 2) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.60012054443e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000086s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 2, 2) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.60012054443e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- 0.000081s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 2, 2) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.09944152832e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000079s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 2, 2) - 0.000s
- 0.000071s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 2, 2) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000070s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 2, 2) - 0.000s
- 0.000069s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000069s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 2, 2) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000068s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000067s - ('c_blas_destructive', 'TopoOptimizer', 37, 2, 2) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 1.59740447998e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000067s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000067s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000066s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 2, 2) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.2901763916e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000063s - ('make_ger_destructive', 'TopoOptimizer', 41, 2, 2) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.09944152832e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000060s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 2, 2) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.8133392334e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000059s - ('cond_make_inplace', 'TopoOptimizer', 47, 2, 2) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000057s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 2, 2) - 0.000s
- 0.000056s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 2, 2) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000055s - ('local_destructive', 'TopoOptimizer', 48, 2, 2) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000055s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 2, 2) - 0.000s
- 0.000054s - ('random_make_inplace', 'TopoOptimizer', 49, 2, 2) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000054s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 2, 2) - 0.000s
- 0.000054s - ('AbstractConvCheck', 'TopoOptimizer', 18, 2, 2) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 4.76837158203e-06
- callback_time 0.0
- 0.000054s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 2, 2) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000053s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 2, 2) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000030s - ('merge1.2', 'MergeOptimizer', 7, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000021s - ('merge1.1', 'MergeOptimizer', 4, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:261
- Time in 1 calls to Function.__call__: 8.106232e-05s
- Time in Function.fn.__call__: 6.794930e-05s (83.824%)
- Time in thunks: 6.198883e-05s (76.471%)
- Total compile time: 3.346705e-02s
- Number of Apply nodes: 2
- Theano Optimizer time: 1.266003e-02s
- Theano validate time: 1.287460e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 4.466057e-03s
- Import time 3.094912e-03s
- Node make_thunk time 4.228830e-03s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.768s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 93.8% 93.8% 0.000s 5.82e-05s C 1 1 theano.gpuarray.basic_ops.HostFromGpu
- 6.2% 100.0% 0.000s 3.81e-06s C 1 1 theano.gpuarray.subtensor.GpuSubtensor
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 93.8% 93.8% 0.000s 5.82e-05s C 1 1 HostFromGpu(gpuarray)
- 6.2% 100.0% 0.000s 3.81e-06s C 1 1 GpuSubtensor{:int64:}
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 93.8% 93.8% 0.000s 5.82e-05s 1 1 HostFromGpu(gpuarray)(GpuSubtensor{:int64:}.0)
- 6.2% 100.0% 0.000s 3.81e-06s 1 0 GpuSubtensor{:int64:}(<GpuArrayType<None>(float32, (False,))>, Constant{128})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.012s for 2/2 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.002204s - ('canonicalize', 'EquilibriumOptimizer', 6, 2, 2) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.002s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.001s
- time in cleanup optimizers 0.000s
- 0 - 0.002s 0 (0.002s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 0.00145483016968
- loop time 5.96046447754e-06
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.001610s - ('gpuarray_opt', 'SeqOptimizer', 16, 2, 2) - 0.000s
- SeqOptimizer gpuarray_opt time 0.001s for 2/2 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- 0.000982s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 2, 2) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- times - times applied - Node created - name:
- 0.000s - 1 - 1 - local_gpua_subtensor_graph
- 0.000s - in 0 optimization that were not used (display only those with a runtime > 0)
- 0.000393s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 2, 2) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000111s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 2, 2) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000010s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 2, 2) - 0.000s
- 0.000769s - ('ShapeOpt', 'ShapeOptimizer', 2, 2, 2) - 0.000s
- 0.000677s - ('BlasOpt', 'SeqOptimizer', 12, 2, 2) - 0.000s
- SeqOptimizer BlasOpt time 0.001s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000175s - ('gemm_optimizer', 'GemmOptimizer', 1, 2, 2) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 2.90870666504e-05
- validate_time 0.0
- callback_time 0.0
- 0.000139s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 2, 2) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000077s - ('use_c_blas', 'TopoOptimizer', 4, 2, 2) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.00407409668e-05
- loop time 1.90734863281e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000062s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 2, 2) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.00407409668e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- 0.000060s - ('use_scipy_ger', 'TopoOptimizer', 5, 2, 2) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.09944152832e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000060s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 2, 2) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.88486480713e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000647s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 2, 2) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.001s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.001s 0 (0.001s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000113s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 2, 2) - 0.000s
- 0.000105s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 2, 2) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.00407409668e-05
- loop time 6.91413879395e-06
- callback_time 0.0
- 0.000084s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 2, 2) - 0.000s
- 0.000084s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 2, 2) - 0.000s
- 0.000077s - ('scan_pushout_dot1', 'PushOutDot1', 3, 2, 2) - 0.000s
- 0.000638s - ('specialize', 'EquilibriumOptimizer', 13, 2, 2) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 4.41074371338e-05
- loop time 3.81469726562e-06
- callback_time 0.0
- 0.000580s - ('add_destroy_handler', 'AddDestroyHandler', 23, 2, 2) - 0.000s
- 0.000564s - ('elemwise_fusion', 'SeqOptimizer', 19, 2, 2) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 2/2 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000233s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 5.29289245605e-05
- 0.000217s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 4.60147857666e-05
- 0.000563s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 2, 2) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.09944152832e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.88486480713e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.8133392334e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.90870666504e-05
- loop time 3.81469726562e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.8133392334e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000438s - ('stabilize', 'EquilibriumOptimizer', 8, 2, 2) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.31401824951e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000216s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 4.60147857666e-05
- 0.000215s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 2, 2) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 4.60147857666e-05
- 0.000190s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 2, 2) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.69412994385e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000189s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 2, 2) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.09944152832e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000180s - ('merge3', 'MergeOptimizer', 51, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000135s - ('useless', 'TopoOptimizer', 3, 2, 2) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 4.10079956055e-05
- loop time 5.3882598877e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_subtensor_of_alloc - 0
- -0.000s - 0 - 1 - local_subtensor_make_vector - 0
- -0.000s - 0 - 1 - local_useless_slice - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000117s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 2, 2) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.90870666504e-05
- loop time 1.59740447998e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000115s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 2, 2) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.8133392334e-05
- loop time 1.59740447998e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000114s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 2, 2) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 1.81198120117e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000114s - ('blas_opt_inplace', 'TopoOptimizer', 34, 2, 2) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 2.00271606445e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000112s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 2, 2) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 1.47819519043e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000112s - ('specialize_device', 'EquilibriumOptimizer', 17, 2, 2) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 2 2 2
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 2 nodes -
- 0.000111s - ('merge2', 'MergeOptimizer', 22, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000109s - ('merge1', 'MergeOptimizer', 0, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000079s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 2, 2) - 0.000s
- 0.000077s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 2, 2) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.90870666504e-05
- loop time 3.81469726562e-06
- callback_time 0.0
- 0.000071s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 2, 2) - 0.000s
- 0.000070s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 2, 2) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.40938568115e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000070s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 2, 2) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.50475311279e-05
- loop time 6.91413879395e-06
- callback_time 0.0
- 0.000070s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 2, 2) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000068s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000068s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 2, 2) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000067s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000067s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000067s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 2, 2) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000066s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 2, 2) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.40802764893e-05
- loop time 3.09944152832e-06
- callback_time 0.0
- 0.000064s - ('c_blas_destructive', 'TopoOptimizer', 37, 2, 2) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.50339508057e-05
- loop time 1.4066696167e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000063s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 2, 2) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.09944152832e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000061s - ('local_destructive', 'TopoOptimizer', 48, 2, 2) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.50475311279e-05
- loop time 2.86102294922e-06
- callback_time 0.0
- 0.000061s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 2, 2) - 0.000s
- 0.000059s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 2, 2) - 0.000s
- 0.000058s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 2, 2) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 3.00407409668e-05
- loop time 3.81469726562e-06
- callback_time 0.0
- 0.000056s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 2, 2) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.78949737549e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000055s - ('make_ger_destructive', 'TopoOptimizer', 41, 2, 2) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.59876251221e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000054s - ('random_make_inplace', 'TopoOptimizer', 49, 2, 2) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.38418579102e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000054s - ('AbstractConvCheck', 'TopoOptimizer', 18, 2, 2) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.71797180176e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000053s - ('cond_make_inplace', 'TopoOptimizer', 47, 2, 2) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.40802764893e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000052s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 2, 2) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (2, 2, 0)
- init io_toposort 2.38418579102e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000050s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 2, 2) - 0.000s
- 0.000050s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 2, 2) - 0.000s
- 0.000035s - ('merge1.2', 'MergeOptimizer', 7, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000025s - ('merge1.1', 'MergeOptimizer', 4, 2, 2) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: /Users/Ramana/projects/SBRNN/sb/utils.py:56
- Time in 1 calls to Function.__call__: 2.908707e-05s
- Time in Function.fn.__call__: 1.001358e-05s (34.426%)
- Time in thunks: 5.006790e-06s (17.213%)
- Total compile time: 3.316283e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 1.118994e-02s
- Theano validate time: 7.295609e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 5.037069e-03s
- Import time 4.065990e-03s
- Node make_thunk time 4.762173e-03s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.808s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 5.01e-06s C 1 1 theano.compile.ops.DeepCopyOp
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 5.01e-06s C 1 1 DeepCopyOp
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 5.01e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.011s for 1/0 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.003724s - ('canonicalize', 'EquilibriumOptimizer', 6, 1, 0) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.003s for 2 passes
- nb nodes (start, end, max) 1 0 1
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.002s
- time in cleanup optimizers 0.000s
- 0 - 0.003s 4 (0.002s in global opts, 0.000s io_toposort) - 1 nodes - ('topo_constant_folding', 1) ('local_upcast_elemwise_constant_inputs', 1) ('local_dimshuffle_lift', 1) ('MergeOptimizer', 1)
- 1 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- times - times applied - nb node created - name:
- 0.002s - 1 - 0 - topo_constant_folding
- 0.001s - 1 - 3 - local_upcast_elemwise_constant_inputs
- 0.000s - 1 - 1 - local_dimshuffle_lift
- 0.000s - 1 - 1 - MergeOptimizer
- 0.000s - in 85 optimization that were not used (display only those with a runtime > 0)
- 0.000s - local_func_inv
- 0.000s - local_useless_elemwise
- 0.000s - local_fill_sink
- 0.000s - local_expm1
- 0.000s - local_track_shape_i
- 0.000s - local_merge_switch_same_cond
- 0.000s - local_cast_cast
- 0.000s - local_fill_cut
- 0.000s - local_useless_switch
- 0.000s - local_useless_elemwise_comparison
- 0.000s - local_lift_transpose_through_dot
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 0, 2)
- init io_toposort 2.59876251221e-05
- loop time 0.00179696083069
- callback_time 0.00019383430481
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.001101s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 0, 0) - 0.000s
- 0.000595s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s
- 0.000571s - ('elemwise_fusion', 'SeqOptimizer', 19, 0, 0) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000229s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000203s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 2.14576721191e-06
- 0.000536s - ('gpuarray_opt', 'SeqOptimizer', 16, 0, 0) - 0.000s
- SeqOptimizer gpuarray_opt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000300s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000068s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000045s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 0, 0) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000007s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 0, 0) - 0.000s
- 0.000424s - ('BlasOpt', 'SeqOptimizer', 12, 0, 0) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000135s - ('gemm_optimizer', 'GemmOptimizer', 1, 0, 0) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 6.91413879395e-06
- validate_time 0.0
- callback_time 0.0
- 0.000089s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000029s - ('use_c_blas', 'TopoOptimizer', 4, 0, 0) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000027s - ('use_scipy_ger', 'TopoOptimizer', 5, 0, 0) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000027s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 0, 0) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.19888305664e-06
- loop time 0.0
- callback_time 0.0
- 0.000027s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 0, 0) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000361s - ('specialize', 'EquilibriumOptimizer', 13, 0, 0) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000274s - ('stabilize', 'EquilibriumOptimizer', 8, 0, 0) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000241s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 0, 0) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.19888305664e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000234s - ('merge3', 'MergeOptimizer', 51, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000208s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 2.14576721191e-06
- 0.000205s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000061s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.09808349609e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000013s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 1, 1) - 0.000s
- 0.000008s - ('scan_pushout_dot1', 'PushOutDot1', 3, 1, 1) - 0.000s
- 0.000008s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 1, 1) - 0.000s
- 0.000008s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 1, 1) - 0.000s
- 0.000196s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000131s - ('merge2', 'MergeOptimizer', 22, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000124s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 0, 0) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000108s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 0, 0) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000104s - ('specialize_device', 'EquilibriumOptimizer', 17, 0, 0) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000096s - ('blas_opt_inplace', 'TopoOptimizer', 34, 0, 0) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000094s - ('merge1', 'MergeOptimizer', 0, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000085s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000085s - ('useless', 'TopoOptimizer', 3, 1, 1) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.81198120117e-05
- loop time 3.69548797607e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_useless_switch - 0
- -0.000s - 0 - 1 - local_useless_elemwise_comparison - 0
- -0.000s - 0 - 1 - local_useless_elemwise - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000084s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000083s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 0, 0) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000072s - ('ShapeOpt', 'ShapeOptimizer', 2, 1, 1) - 0.000s
- 0.000065s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 0, 0) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000058s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 0, 0) - 0.000s
- 0.000058s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000058s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.82148742676e-06
- loop time 0.0
- callback_time 0.0
- 0.000055s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000052s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 0, 0) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000051s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000050s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 0, 0) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000050s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000049s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000046s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 7.15255737305e-06
- callback_time 0.0
- 0.000045s - ('cond_make_inplace', 'TopoOptimizer', 47, 0, 0) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- 0.000045s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 0, 0) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.31130218506e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000043s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 0, 0) - 0.000s
- 0.000042s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 0, 0) - 0.000s
- 0.000039s - ('local_destructive', 'TopoOptimizer', 48, 0, 0) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000037s - ('random_make_inplace', 'TopoOptimizer', 49, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000036s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000036s - ('merge1.2', 'MergeOptimizer', 7, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000034s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 0, 0) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000033s - ('make_ger_destructive', 'TopoOptimizer', 41, 0, 0) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000032s - ('c_blas_destructive', 'TopoOptimizer', 37, 0, 0) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000030s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 0, 0) - 0.000s
- 0.000030s - ('AbstractConvCheck', 'TopoOptimizer', 18, 0, 0) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 0, 0) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000026s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 0, 0) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.19888305664e-06
- loop time 0.0
- callback_time 0.0
- 0.000020s - ('merge1.1', 'MergeOptimizer', 4, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000005s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 0, 0) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: /Users/Ramana/projects/SBRNN/sb/utils.py:27
- Time in 1 calls to Function.__call__: 1.621246e-05s
- Time in Function.fn.__call__: 5.960464e-06s (36.765%)
- Time in thunks: 4.053116e-06s (25.000%)
- Total compile time: 2.882910e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 9.731054e-03s
- Theano validate time: 8.273125e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 6.179810e-04s
- Import time 0.000000e+00s
- Node make_thunk time 4.899502e-04s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.814s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 4.05e-06s C 1 1 theano.compile.ops.DeepCopyOp
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 4.05e-06s C 1 1 DeepCopyOp
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 4.05e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.009s for 1/0 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.002802s - ('canonicalize', 'EquilibriumOptimizer', 6, 1, 0) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.002s for 2 passes
- nb nodes (start, end, max) 1 0 1
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.001s
- time in cleanup optimizers 0.000s
- 0 - 0.002s 4 (0.001s in global opts, 0.000s io_toposort) - 1 nodes - ('topo_constant_folding', 1) ('local_upcast_elemwise_constant_inputs', 1) ('local_dimshuffle_lift', 1) ('MergeOptimizer', 1)
- 1 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- times - times applied - nb node created - name:
- 0.001s - 1 - 3 - local_upcast_elemwise_constant_inputs
- 0.001s - 1 - 0 - topo_constant_folding
- 0.000s - 1 - 1 - local_dimshuffle_lift
- 0.000s - 1 - 1 - MergeOptimizer
- 0.000s - in 85 optimization that were not used (display only those with a runtime > 0)
- 0.000s - local_fill_sink
- 0.000s - local_useless_elemwise
- 0.000s - local_func_inv
- 0.000s - local_useless_elemwise_comparison
- 0.000s - local_merge_switch_same_cond
- 0.000s - local_track_shape_i
- 0.000s - local_cast_cast
- 0.000s - local_fill_cut
- 0.000s - local_expm1
- 0.000s - local_lift_transpose_through_dot
- 0.000s - local_useless_switch
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 0, 2)
- init io_toposort 3.31401824951e-05
- loop time 0.000442981719971
- callback_time 0.000193357467651
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000772s - ('BlasOpt', 'SeqOptimizer', 12, 0, 0) - 0.000s
- SeqOptimizer BlasOpt time 0.001s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000245s - ('gemm_optimizer', 'GemmOptimizer', 1, 0, 0) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 1.50203704834e-05
- validate_time 0.0
- callback_time 0.0
- 0.000178s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000059s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 0, 0) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.71661376953e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000056s - ('use_c_blas', 'TopoOptimizer', 4, 0, 0) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.50203704834e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000054s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 0, 0) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.50203704834e-05
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000048s - ('use_scipy_ger', 'TopoOptimizer', 5, 0, 0) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.28746032715e-05
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000682s - ('gpuarray_opt', 'SeqOptimizer', 16, 0, 0) - 0.000s
- SeqOptimizer gpuarray_opt time 0.001s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000387s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000085s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000066s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 0, 0) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000009s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 0, 0) - 0.000s
- 0.000558s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s
- 0.000556s - ('elemwise_fusion', 'SeqOptimizer', 19, 0, 0) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000215s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 2.14576721191e-06
- 0.000206s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000501s - ('specialize', 'EquilibriumOptimizer', 13, 0, 0) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- 0.000425s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 0, 0) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.50203704834e-05
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000274s - ('stabilize', 'EquilibriumOptimizer', 8, 0, 0) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000224s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000057s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.09808349609e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000013s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 1, 1) - 0.000s
- 0.000009s - ('scan_pushout_dot1', 'PushOutDot1', 3, 1, 1) - 0.000s
- 0.000009s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 1, 1) - 0.000s
- 0.000008s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 1, 1) - 0.000s
- 0.000196s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000193s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.19209289551e-06
- 0.000189s - ('merge3', 'MergeOptimizer', 51, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000153s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 0, 0) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000129s - ('merge2', 'MergeOptimizer', 22, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000096s - ('specialize_device', 'EquilibriumOptimizer', 17, 0, 0) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000093s - ('merge1', 'MergeOptimizer', 0, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000085s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000082s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000079s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 0, 0) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000079s - ('blas_opt_inplace', 'TopoOptimizer', 34, 0, 0) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000077s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 0, 0) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000077s - ('useless', 'TopoOptimizer', 3, 1, 1) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.78813934326e-05
- loop time 2.88486480713e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_useless_switch - 0
- -0.000s - 0 - 1 - local_useless_elemwise_comparison - 0
- -0.000s - 0 - 1 - local_useless_elemwise - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000072s - ('ShapeOpt', 'ShapeOptimizer', 2, 1, 1) - 0.000s
- 0.000059s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- 0.000055s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 0, 0) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000054s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000051s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- 0.000048s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 0, 0) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000047s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000047s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 0, 0) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.81198120117e-05
- loop time 7.15255737305e-06
- callback_time 0.0
- 0.000046s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000041s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 0, 0) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.19209289551e-05
- loop time 0.0
- callback_time 0.0
- 0.000040s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 0, 0) - 0.000s
- 0.000038s - ('merge1.2', 'MergeOptimizer', 7, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000034s - ('AbstractConvCheck', 'TopoOptimizer', 18, 0, 0) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000034s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 0, 0) - 0.000s
- 0.000033s - ('cond_make_inplace', 'TopoOptimizer', 47, 0, 0) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000032s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 0, 0) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000031s - ('c_blas_destructive', 'TopoOptimizer', 37, 0, 0) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000031s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 0, 0) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000030s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 0, 0) - 0.000s
- 0.000030s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 0, 0) - 0.000s
- 0.000029s - ('random_make_inplace', 'TopoOptimizer', 49, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('local_destructive', 'TopoOptimizer', 48, 0, 0) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000029s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 0, 0) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000028s - ('make_ger_destructive', 'TopoOptimizer', 41, 0, 0) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000028s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.19888305664e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000025s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 0, 0) - 0.000s
- 0.000021s - ('merge1.1', 'MergeOptimizer', 4, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000004s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 0, 0) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
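The percentages in the "Function profiling" header appear to be each timing's share of the total `Function.__call__` time. A minimal sketch that checks this, using the three timings copied from the `sb/utils.py:27` profile above; the helper name `share_of_call_time` is hypothetical and not part of Theano:

```python
def share_of_call_time(part_s, total_s):
    """Return part_s as a percentage of total_s."""
    return 100.0 * part_s / total_s

# Values copied from the sb/utils.py:27 profile header.
total_call = 1.621246e-05  # Time in 1 calls to Function.__call__
fn_call = 5.960464e-06     # Time in Function.fn.__call__
thunks = 4.053116e-06      # Time in thunks

fn_pct = share_of_call_time(fn_call, total_call)
thunk_pct = share_of_call_time(thunks, total_call)

# Time not spent inside fn.__call__ (Python-side call overhead).
overhead_s = total_call - fn_call

print("fn.__call__ share: %.3f%%" % fn_pct)
print("thunk share:       %.3f%%" % thunk_pct)
print("call overhead:     %.3es" % overhead_s)
```

The computed shares should match the 36.765% and 25.000% printed in the log to within rounding. At these microsecond scales the figures mostly reflect timer resolution, not real work done by the function.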
- Function profiling
- ==================
- Message: /Users/Ramana/projects/SBRNN/sb/utils.py:56
- Time in 1 calls to Function.__call__: 1.502037e-05s
- Time in Function.fn.__call__: 5.960464e-06s (39.683%)
- Time in thunks: 3.099442e-06s (20.635%)
- Total compile time: 2.807999e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 9.526968e-03s
- Theano validate time: 8.201599e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 5.979538e-04s
- Import time 0.000000e+00s
- Node make_thunk time 4.749298e-04s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.820s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 3.10e-06s C 1 1 theano.compile.ops.DeepCopyOp
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 3.10e-06s C 1 1 DeepCopyOp
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 3.10e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.009s for 1/0 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.002808s - ('canonicalize', 'EquilibriumOptimizer', 6, 1, 0) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.002s for 2 passes
- nb nodes (start, end, max) 1 0 1
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.001s
- time in cleanup optimizers 0.000s
- 0 - 0.002s 4 (0.001s in global opts, 0.000s io_toposort) - 1 nodes - ('topo_constant_folding', 1) ('local_upcast_elemwise_constant_inputs', 1) ('local_dimshuffle_lift', 1) ('MergeOptimizer', 1)
- 1 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- times - times applied - nb node created - name:
- 0.001s - 1 - 3 - local_upcast_elemwise_constant_inputs
- 0.001s - 1 - 0 - topo_constant_folding
- 0.000s - 1 - 1 - local_dimshuffle_lift
- 0.000s - 1 - 1 - MergeOptimizer
- 0.000s - in 85 optimization that were not used (display only those with a runtime > 0)
- 0.000s - local_fill_sink
- 0.000s - local_useless_elemwise
- 0.000s - local_func_inv
- 0.000s - local_track_shape_i
- 0.000s - local_merge_switch_same_cond
- 0.000s - local_fill_cut
- 0.000s - local_cast_cast
- 0.000s - local_useless_switch
- 0.000s - local_expm1
- 0.000s - local_useless_elemwise_comparison
- 0.000s - local_lift_transpose_through_dot
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 0, 2)
- init io_toposort 3.09944152832e-05
- loop time 0.000473022460938
- callback_time 0.000208616256714
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000602s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s
- 0.000592s - ('gpuarray_opt', 'SeqOptimizer', 16, 0, 0) - 0.000s
- SeqOptimizer gpuarray_opt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000336s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000070s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000058s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 0, 0) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000007s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 0, 0) - 0.000s
- 0.000566s - ('elemwise_fusion', 'SeqOptimizer', 19, 0, 0) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000195s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000186s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.19209289551e-06
- 0.000460s - ('BlasOpt', 'SeqOptimizer', 12, 0, 0) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000144s - ('gemm_optimizer', 'GemmOptimizer', 1, 0, 0) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 6.91413879395e-06
- validate_time 0.0
- callback_time 0.0
- 0.000097s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000031s - ('use_c_blas', 'TopoOptimizer', 4, 0, 0) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000030s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 0, 0) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('use_scipy_ger', 'TopoOptimizer', 5, 0, 0) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 0, 0) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000395s - ('specialize', 'EquilibriumOptimizer', 13, 0, 0) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000281s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 0, 0) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.09672546387e-05
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000276s - ('stabilize', 'EquilibriumOptimizer', 8, 0, 0) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000272s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000073s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.88486480713e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000018s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 1, 1) - 0.000s
- 0.000011s - ('scan_pushout_dot1', 'PushOutDot1', 3, 1, 1) - 0.000s
- 0.000011s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 1, 1) - 0.000s
- 0.000010s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 1, 1) - 0.000s
- 0.000261s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 0, 0) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.19888305664e-06
- loop time 0.0
- callback_time 0.0
- 0.000234s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.90734863281e-06
- 0.000221s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.90734863281e-06
- 0.000188s - ('merge2', 'MergeOptimizer', 22, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000181s - ('merge3', 'MergeOptimizer', 51, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000137s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 0, 0) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000117s - ('merge1', 'MergeOptimizer', 0, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000091s - ('blas_opt_inplace', 'TopoOptimizer', 34, 0, 0) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000090s - ('useless', 'TopoOptimizer', 3, 1, 1) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.21729278564e-05
- loop time 3.31401824951e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_useless_elemwise_comparison - 0
- -0.000s - 0 - 1 - local_useless_switch - 0
- -0.000s - 0 - 1 - local_useless_elemwise - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000090s - ('ShapeOpt', 'ShapeOptimizer', 2, 1, 1) - 0.000s
- 0.000086s - ('specialize_device', 'EquilibriumOptimizer', 17, 0, 0) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000076s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000075s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000075s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 0, 0) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000059s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- 0.000053s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.00271606445e-05
- loop time 8.10623168945e-06
- callback_time 0.0
- 0.000053s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- 0.000051s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 0, 0) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000051s - ('merge1.2', 'MergeOptimizer', 7, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000050s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000050s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 0, 0) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000049s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 0, 0) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000049s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000049s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000049s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000048s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 0, 0) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.38282775879e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000041s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 0, 0) - 0.000s
- 0.000038s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 0, 0) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.82148742676e-06
- loop time 0.0
- callback_time 0.0
- 0.000035s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 0, 0) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000031s - ('cond_make_inplace', 'TopoOptimizer', 47, 0, 0) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000031s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 0, 0) - 0.000s
- 0.000030s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 0, 0) - 0.000s
- 0.000029s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 0, 0) - 0.000s
- 0.000029s - ('make_ger_destructive', 'TopoOptimizer', 41, 0, 0) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('c_blas_destructive', 'TopoOptimizer', 37, 0, 0) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000029s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 0, 0) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('random_make_inplace', 'TopoOptimizer', 49, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('AbstractConvCheck', 'TopoOptimizer', 18, 0, 0) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000028s - ('local_destructive', 'TopoOptimizer', 48, 0, 0) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000027s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000024s - ('merge1.1', 'MergeOptimizer', 4, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000023s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 0, 0) - 0.000s
- 0.000005s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 0, 0) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
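Note: the optimizer-profile records above follow the layout given in the log's own header line, `time - (name, class, index, nodes before, nodes after) - validate time`. A minimal sketch for pulling those records out of a saved copy of this output is below, assuming the lines look exactly as they do in this paste. The names `RECORD`, `parse_record` and `slowest` are hypothetical helpers written for illustration; they are not part of Theano.

```python
import ast
import re

# Matches records such as:
#   0.000602s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s
# A leading "- " list marker (as in this paste) is tolerated because we search.
RECORD = re.compile(r"(-?\d+\.\d+)s - (\(.*\)) - (-?\d+\.\d+)s\s*$")


def parse_record(line):
    """Return a dict for one optimizer record, or None for any other line."""
    match = RECORD.search(line)
    if match is None:
        return None
    try:
        # The middle field is printed as a Python tuple literal.
        fields = ast.literal_eval(match.group(2))
    except (ValueError, SyntaxError):
        return None
    if not (isinstance(fields, tuple) and len(fields) == 5):
        return None
    name, cls, index, nodes_before, nodes_after = fields
    return {
        "time": float(match.group(1)),
        "name": name,
        "class": cls,
        "index": index,
        "nodes_before": nodes_before,
        "nodes_after": nodes_after,
        "validate_time": float(match.group(3)),
    }


def slowest(lines, n=3):
    """Return the n records with the largest reported time, slowest first."""
    records = [r for r in map(parse_record, lines) if r is not None]
    return sorted(records, key=lambda r: r["time"], reverse=True)[:n]


if __name__ == "__main__":
    sample = [
        "- 0.000602s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s",
        "- 0.000592s - ('gpuarray_opt', 'SeqOptimizer', 16, 0, 0) - 0.000s",
        "- SeqOptimizer gpuarray_opt time 0.000s for 0/0 nodes before/after optimization",
        "- 0.002627s - ('canonicalize', 'EquilibriumOptimizer', 6, 1, 0) - 0.000s",
    ]
    for rec in slowest(sample):
        print(rec["time"], rec["name"], rec["class"])
```

Sub-optimizer records (for example `gpuarray_local_optimizations` under `gpuarray_opt`) appear to be nested inside their parent's time, and the indentation that would show the nesting is lost in this paste, so summing the `time` field across all records would likely double count.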
- Function profiling
- ==================
- Message: /Users/Ramana/projects/SBRNN/sb/utils.py:27
- Time in 1 calls to Function.__call__: 1.287460e-05s
- Time in Function.fn.__call__: 5.006790e-06s (38.889%)
- Time in thunks: 1.907349e-06s (14.815%)
- Total compile time: 2.661610e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 9.035826e-03s
- Theano validate time: 8.153915e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 5.738735e-04s
- Import time 0.000000e+00s
- Node make_thunk time 4.589558e-04s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.860s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 1.91e-06s C 1 1 theano.compile.ops.DeepCopyOp
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 1.91e-06s C 1 1 DeepCopyOp
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 1.91e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.009s for 1/0 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.002627s - ('canonicalize', 'EquilibriumOptimizer', 6, 1, 0) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.002s for 2 passes
- nb nodes (start, end, max) 1 0 1
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.001s
- time in cleanup optimizers 0.000s
- 0 - 0.002s 4 (0.001s in global opts, 0.000s io_toposort) - 1 nodes - ('topo_constant_folding', 1) ('local_upcast_elemwise_constant_inputs', 1) ('local_dimshuffle_lift', 1) ('MergeOptimizer', 1)
- 1 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- times - times applied - nb node created - name:
- 0.001s - 1 - 0 - topo_constant_folding
- 0.001s - 1 - 3 - local_upcast_elemwise_constant_inputs
- 0.000s - 1 - 1 - MergeOptimizer
- 0.000s - 1 - 1 - local_dimshuffle_lift
- 0.000s - in 85 optimization that were not used (display only those with a runtime > 0)
- 0.000s - local_useless_elemwise
- 0.000s - local_fill_sink
- 0.000s - local_func_inv
- 0.000s - local_cast_cast
- 0.000s - local_track_shape_i
- 0.000s - local_merge_switch_same_cond
- 0.000s - local_fill_cut
- 0.000s - local_expm1
- 0.000s - local_useless_switch
- 0.000s - local_useless_elemwise_comparison
- 0.000s - local_lift_transpose_through_dot
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 0, 2)
- init io_toposort 3.48091125488e-05
- loop time 0.000529050827026
- callback_time 0.000237703323364
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000537s - ('gpuarray_opt', 'SeqOptimizer', 16, 0, 0) - 0.000s
- SeqOptimizer gpuarray_opt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000301s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000067s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000044s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 0, 0) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000007s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 0, 0) - 0.000s
- 0.000476s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 0, 0) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.31130218506e-05
- loop time 1.19209289551e-06
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000474s - ('elemwise_fusion', 'SeqOptimizer', 19, 0, 0) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000180s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000174s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000467s - ('BlasOpt', 'SeqOptimizer', 12, 0, 0) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000150s - ('gemm_optimizer', 'GemmOptimizer', 1, 0, 0) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 8.10623168945e-06
- validate_time 0.0
- callback_time 0.0
- 0.000090s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000042s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 0, 0) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.78813934326e-05
- loop time 0.0
- callback_time 0.0
- 0.000037s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 0, 0) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.62124633789e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000029s - ('use_c_blas', 'TopoOptimizer', 4, 0, 0) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000026s - ('use_scipy_ger', 'TopoOptimizer', 5, 0, 0) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000455s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s
- 0.000364s - ('specialize', 'EquilibriumOptimizer', 13, 0, 0) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000341s - ('stabilize', 'EquilibriumOptimizer', 8, 0, 0) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- 0.000232s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 0.0
- 0.000198s - ('merge3', 'MergeOptimizer', 51, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000197s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000055s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.09808349609e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000013s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 1, 1) - 0.000s
- 0.000008s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 1, 1) - 0.000s
- 0.000008s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 1, 1) - 0.000s
- 0.000008s - ('scan_pushout_dot1', 'PushOutDot1', 3, 1, 1) - 0.000s
- 0.000179s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 0.0
- 0.000148s - ('blas_opt_inplace', 'TopoOptimizer', 34, 0, 0) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000130s - ('merge2', 'MergeOptimizer', 22, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000120s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 0, 0) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000116s - ('merge1', 'MergeOptimizer', 0, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000113s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.4066696167e-05
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000087s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 0, 0) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000085s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000084s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 0, 0) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000077s - ('specialize_device', 'EquilibriumOptimizer', 17, 0, 0) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000076s - ('useless', 'TopoOptimizer', 3, 1, 1) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.78813934326e-05
- loop time 2.90870666504e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_useless_elemwise_comparison - 0
- -0.000s - 0 - 1 - local_useless_switch - 0
- -0.000s - 0 - 1 - local_useless_elemwise - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000076s - ('ShapeOpt', 'ShapeOptimizer', 2, 1, 1) - 0.000s
- 0.000053s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000049s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 0, 0) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000049s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000048s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 0, 0) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000047s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000047s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 0, 0) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('c_blas_destructive', 'TopoOptimizer', 37, 0, 0) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000044s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.59740447998e-05
- loop time 5.96046447754e-06
- callback_time 0.0
- 0.000044s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 0, 0) - 0.000s
- 0.000044s - ('merge1.2', 'MergeOptimizer', 7, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000040s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 0, 0) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.21593475342e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000037s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 0, 0) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- 0.000034s - ('cond_make_inplace', 'TopoOptimizer', 47, 0, 0) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000034s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 0, 0) - 0.000s
- 0.000034s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 0, 0) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000033s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 0, 0) - 0.000s
- 0.000033s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 0, 0) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000032s - ('local_destructive', 'TopoOptimizer', 48, 0, 0) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000032s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 0, 0) - 0.000s
- 0.000031s - ('random_make_inplace', 'TopoOptimizer', 49, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000031s - ('make_ger_destructive', 'TopoOptimizer', 41, 0, 0) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000030s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('AbstractConvCheck', 'TopoOptimizer', 18, 0, 0) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000025s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 0, 0) - 0.000s
- 0.000020s - ('merge1.1', 'MergeOptimizer', 4, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000004s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 0, 0) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: /Users/Ramana/projects/SBRNN/sb/utils.py:56
- Time in 1 calls to Function.__call__: 1.502037e-05s
- Time in Function.fn.__call__: 5.960464e-06s (39.683%)
- Time in thunks: 4.053116e-06s (26.984%)
- Total compile time: 1.754310e-01s
- Number of Apply nodes: 1
- Theano Optimizer time: 1.580000e-01s
- Theano validate time: 7.796288e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 6.060600e-04s
- Import time 0.000000e+00s
- Node make_thunk time 4.839897e-04s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.866s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 4.05e-06s C 1 1 theano.compile.ops.DeepCopyOp
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 4.05e-06s C 1 1 DeepCopyOp
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 4.05e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.158s for 1/0 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.149964s - ('BlasOpt', 'SeqOptimizer', 12, 0, 0) - 0.000s
- SeqOptimizer BlasOpt time 0.150s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.149595s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000132s - ('gemm_optimizer', 'GemmOptimizer', 1, 0, 0) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 8.10623168945e-06
- validate_time 0.0
- callback_time 0.0
- 0.000042s - ('use_c_blas', 'TopoOptimizer', 4, 0, 0) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.00135803223e-05
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000030s - ('use_scipy_ger', 'TopoOptimizer', 5, 0, 0) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000027s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 0, 0) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000027s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 0, 0) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.002318s - ('canonicalize', 'EquilibriumOptimizer', 6, 1, 0) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.002s for 2 passes
- nb nodes (start, end, max) 1 0 1
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.002s 4 (0.000s in global opts, 0.000s io_toposort) - 1 nodes - ('topo_constant_folding', 1) ('local_upcast_elemwise_constant_inputs', 1) ('local_dimshuffle_lift', 1) ('MergeOptimizer', 1)
- 1 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- times - times applied - nb node created - name:
- 0.001s - 1 - 3 - local_upcast_elemwise_constant_inputs
- 0.000s - 1 - 0 - topo_constant_folding
- 0.000s - 1 - 1 - local_dimshuffle_lift
- 0.000s - 1 - 1 - MergeOptimizer
- 0.000s - in 85 optimization that were not used (display only those with a runtime > 0)
- 0.000s - local_fill_sink
- 0.000s - local_useless_elemwise
- 0.000s - local_func_inv
- 0.000s - local_track_shape_i
- 0.000s - local_merge_switch_same_cond
- 0.000s - local_fill_cut
- 0.000s - local_cast_cast
- 0.000s - local_useless_elemwise_comparison
- 0.000s - local_lift_transpose_through_dot
- 0.000s - local_expm1
- 0.000s - local_useless_switch
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 0, 2)
- init io_toposort 2.59876251221e-05
- loop time 0.00040602684021
- callback_time 0.000176191329956
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000641s - ('gpuarray_opt', 'SeqOptimizer', 16, 0, 0) - 0.000s
- SeqOptimizer gpuarray_opt time 0.001s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000343s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000088s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000067s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 0, 0) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000009s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 0, 0) - 0.000s
- 0.000533s - ('elemwise_fusion', 'SeqOptimizer', 19, 0, 0) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000201s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000199s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000514s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s
- 0.000436s - ('specialize', 'EquilibriumOptimizer', 13, 0, 0) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000254s - ('stabilize', 'EquilibriumOptimizer', 8, 0, 0) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000245s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 0, 0) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.19888305664e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000208s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000057s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.09808349609e-05
- loop time 5.00679016113e-06
- callback_time 0.0
- 0.000013s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 1, 1) - 0.000s
- 0.000009s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 1, 1) - 0.000s
- 0.000008s - ('scan_pushout_dot1', 'PushOutDot1', 3, 1, 1) - 0.000s
- 0.000008s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 1, 1) - 0.000s
- 0.000194s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000193s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 0.0
- 0.000188s - ('merge3', 'MergeOptimizer', 51, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000136s - ('merge2', 'MergeOptimizer', 22, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000136s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 0, 0) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000096s - ('merge1', 'MergeOptimizer', 0, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000094s - ('specialize_device', 'EquilibriumOptimizer', 17, 0, 0) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000080s - ('blas_opt_inplace', 'TopoOptimizer', 34, 0, 0) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000079s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000079s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000079s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 0, 0) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000078s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 0, 0) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000076s - ('useless', 'TopoOptimizer', 3, 1, 1) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.69277191162e-05
- loop time 3.00407409668e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_useless_elemwise_comparison - 0
- -0.000s - 0 - 1 - local_useless_switch - 0
- -0.000s - 0 - 1 - local_useless_elemwise - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000072s - ('ShapeOpt', 'ShapeOptimizer', 2, 1, 1) - 0.000s
- 0.000056s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000049s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000048s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 0, 0) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000047s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000047s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 0, 0) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000047s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 0, 0) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.09672546387e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000046s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 0, 0) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000045s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.62124633789e-05
- loop time 6.91413879395e-06
- callback_time 0.0
- 0.000041s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 0, 0) - 0.000s
- 0.000040s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 0, 0) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.28746032715e-05
- loop time 0.0
- callback_time 0.0
- 0.000034s - ('merge1.2', 'MergeOptimizer', 7, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000033s - ('AbstractConvCheck', 'TopoOptimizer', 18, 0, 0) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000032s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 0, 0) - 0.000s
- 0.000031s - ('cond_make_inplace', 'TopoOptimizer', 47, 0, 0) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000031s - ('c_blas_destructive', 'TopoOptimizer', 37, 0, 0) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000030s - ('local_destructive', 'TopoOptimizer', 48, 0, 0) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000030s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 0, 0) - 0.000s
- 0.000030s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 0, 0) - 0.000s
- 0.000030s - ('make_ger_destructive', 'TopoOptimizer', 41, 0, 0) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 0, 0) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('random_make_inplace', 'TopoOptimizer', 49, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000026s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 0, 0) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000024s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 0, 0) - 0.000s
- 0.000020s - ('merge1.1', 'MergeOptimizer', 4, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000005s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 0, 0) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: /Users/Ramana/projects/SBRNN/sb/utils.py:27
- Time in 1 calls to Function.__call__: 1.311302e-05s
- Time in Function.fn.__call__: 4.053116e-06s (30.909%)
- Time in thunks: 2.145767e-06s (16.364%)
- Total compile time: 2.555609e-02s
- Number of Apply nodes: 1
- Theano Optimizer time: 8.413076e-03s
- Theano validate time: 7.486343e-05s
- Theano Linker time (includes C, CUDA code generation/compiling): 5.640984e-04s
- Import time 0.000000e+00s
- Node make_thunk time 4.489422e-04s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.907s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 100.0% 100.0% 0.000s 2.15e-06s C 1 1 theano.compile.ops.DeepCopyOp
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 100.0% 100.0% 0.000s 2.15e-06s C 1 1 DeepCopyOp
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 100.0% 100.0% 0.000s 2.15e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 0.008s for 1/0 nodes before/after optimization
- 0.001s for callback
- 0.000s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.002406s - ('canonicalize', 'EquilibriumOptimizer', 6, 1, 0) - 0.000s
- EquilibriumOptimizer canonicalize
- time 0.002s for 2 passes
- nb nodes (start, end, max) 1 0 1
- time io_toposort 0.000s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.001s
- time in cleanup optimizers 0.000s
- 0 - 0.002s 4 (0.000s in global opts, 0.000s io_toposort) - 1 nodes - ('topo_constant_folding', 1) ('local_upcast_elemwise_constant_inputs', 1) ('local_dimshuffle_lift', 1) ('MergeOptimizer', 1)
- 1 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- times - times applied - nb node created - name:
- 0.001s - 1 - 3 - local_upcast_elemwise_constant_inputs
- 0.001s - 1 - 0 - topo_constant_folding
- 0.000s - 1 - 1 - local_dimshuffle_lift
- 0.000s - 1 - 1 - MergeOptimizer
- 0.000s - in 85 optimization that were not used (display only those with a runtime > 0)
- 0.000s - local_fill_sink
- 0.000s - local_useless_elemwise
- 0.000s - local_func_inv
- 0.000s - local_track_shape_i
- 0.000s - local_fill_cut
- 0.000s - local_merge_switch_same_cond
- 0.000s - local_cast_cast
- 0.000s - local_useless_elemwise_comparison
- 0.000s - local_expm1
- 0.000s - local_useless_switch
- 0.000s - local_lift_transpose_through_dot
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 0, 2)
- init io_toposort 2.69412994385e-05
- loop time 0.000415802001953
- callback_time 0.00018310546875
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000558s - ('gpuarray_opt', 'SeqOptimizer', 16, 0, 0) - 0.000s
- SeqOptimizer gpuarray_opt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000309s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000071s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000049s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 0, 0) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.000s
- Total time taken by local optimizers 0.000s
- 0.000008s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 0, 0) - 0.000s
- 0.000496s - ('elemwise_fusion', 'SeqOptimizer', 19, 0, 0) - 0.000s
- SeqOptimizer elemwise_fusion time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000190s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 1.90734863281e-06
- 0.000184s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000471s - ('add_destroy_handler', 'AddDestroyHandler', 23, 0, 0) - 0.000s
- 0.000442s - ('BlasOpt', 'SeqOptimizer', 12, 0, 0) - 0.000s
- SeqOptimizer BlasOpt time 0.000s for 0/0 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000139s - ('gemm_optimizer', 'GemmOptimizer', 1, 0, 0) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0
- time_factor_can 0
- time_factor_list 0
- time_toposort 6.91413879395e-06
- validate_time 0.0
- callback_time 0.0
- 0.000094s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 0, 0) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000030s - ('use_c_blas', 'TopoOptimizer', 4, 0, 0) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000029s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 0, 0) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 0, 0) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000027s - ('use_scipy_ger', 'TopoOptimizer', 5, 0, 0) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000393s - ('stabilize', 'EquilibriumOptimizer', 8, 0, 0) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- 0.000374s - ('specialize', 'EquilibriumOptimizer', 13, 0, 0) - 0.000s
- EquilibriumOptimizer specialize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000254s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 0, 0) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000218s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 1, 1) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.000s for 1 passes
- nb nodes (start, end, max) 1 1 1
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 1 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.000s for 1/1 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.000061s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 1, 1) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 2.40802764893e-05
- loop time 4.05311584473e-06
- callback_time 0.0
- 0.000015s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 1, 1) - 0.000s
- 0.000009s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 1, 1) - 0.000s
- 0.000008s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 1, 1) - 0.000s
- 0.000008s - ('scan_pushout_dot1', 'PushOutDot1', 3, 1, 1) - 0.000s
- 0.000185s - ('merge3', 'MergeOptimizer', 51, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000181s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000180s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 0, 0) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 9.53674316406e-07
- 0.000127s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 0, 0) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000120s - ('merge2', 'MergeOptimizer', 22, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000106s - ('merge1', 'MergeOptimizer', 0, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000083s - ('specialize_device', 'EquilibriumOptimizer', 17, 0, 0) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.000s for 1 passes
- nb nodes (start, end, max) 0 0 0
- time io_toposort 0.000s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.000s 0 (0.000s in global opts, 0.000s io_toposort) - 0 nodes -
- 0.000081s - ('ShapeOpt', 'ShapeOptimizer', 2, 1, 1) - 0.000s
- 0.000080s - ('blas_opt_inplace', 'TopoOptimizer', 34, 0, 0) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.09672546387e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000080s - ('useless', 'TopoOptimizer', 3, 1, 1) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.90734863281e-05
- loop time 3.09944152832e-05
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 1 - local_useless_switch - 0
- -0.000s - 0 - 1 - local_useless_elemwise_comparison - 0
- -0.000s - 0 - 1 - local_useless_elemwise - 0
- 0.000s - in 16 optimization that were not used (display those with runtime greater than 0)
- 0.000077s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 0, 0) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000077s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.000077s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 0, 0) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000076s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 0, 0) - 0.000s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.000052s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 8.10623168945e-06
- loop time 0.0
- callback_time 0.0
- 0.000048s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 0, 0) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 0.0
- callback_time 0.0
- 0.000047s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 1, 1) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (1, 1, 0)
- init io_toposort 1.71661376953e-05
- loop time 7.86781311035e-06
- callback_time 0.0
- 0.000046s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 0, 0) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000046s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 0, 0) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000045s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 0, 0) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000045s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000045s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 0, 0) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000040s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 0, 0) - 0.000s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 1.19209289551e-05
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000038s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 0, 0) - 0.000s
- 0.000036s - ('merge1.2', 'MergeOptimizer', 7, 0, 0) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000035s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 0, 0) - 0.000s
- 0.000033s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 0, 0) - 0.000s
- 0.000032s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 0, 0) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.86781311035e-06
- loop time 1.19209289551e-06
- callback_time 0.0
- 0.000031s - ('local_destructive', 'TopoOptimizer', 48, 0, 0) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 0.0
- callback_time 0.0
- 0.000031s - ('cond_make_inplace', 'TopoOptimizer', 47, 0, 0) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000031s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 0, 0) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 9.05990600586e-06
- loop time 0.0
- callback_time 0.0
- 0.000030s - ('c_blas_destructive', 'TopoOptimizer', 37, 0, 0) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.000029s - ('make_ger_destructive', 'TopoOptimizer', 41, 0, 0) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 5.96046447754e-06
- loop time 0.0
- callback_time 0.0
- 0.000029s - ('AbstractConvCheck', 'TopoOptimizer', 18, 0, 0) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 7.15255737305e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000028s - ('random_make_inplace', 'TopoOptimizer', 49, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000028s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 0, 0) - 0.000s
- 0.000028s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 0, 0) - 0.000s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.19888305664e-06
- loop time 9.53674316406e-07
- callback_time 0.0
- 0.000027s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 0, 0) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (0, 0, 0)
- init io_toposort 6.91413879395e-06
- loop time 0.0
- callback_time 0.0
- 0.000023s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 0, 0) - 0.000s
- 0.000021s - ('merge1.1', 'MergeOptimizer', 4, 1, 1) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000006s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 0, 0) - 0.000s
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:327
- Time in 0 calls to Function.__call__: 0.000000e+00s
- Total compile time: 4.351297e+02s
- Number of Apply nodes: 2957
- Theano Optimizer time: 6.785364e+01s
- Theano validate time: 9.809670e+00s
- Theano Linker time (includes C, CUDA code generation/compiling): 3.640262e+02s
- Import time 1.506580e+00s
- Node make_thunk time 3.635044e+02s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.913s
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 67.853s for 10599/2957 nodes before/after optimization
- 22.259s for callback
- 9.810s for fgraph.validate()
- callbacks_time
- <theano.gof.destroyhandler.DestroyHandler object at 0x1291a3f10> , 7.663418293
- <theano.tensor.opt.ShapeFeature object at 0x121562590> , 4.24110126495
- Updater{canonicalize} , 3.05698680878
- <theano.compile.function_module.Supervisor instance at 0x1227d3f38> , 2.56386876106
- <theano.gof.opt.MergeFeature object at 0x1227e7190> , 1.74654364586
- Updater{canonicalize} , 1.31303954124
- Updater{gpuarray_local_optimizations} , 0.261435270309
- Updater{gpuarray_cut_transfers} , 0.261384248734
- Updater{canonicalize} , 0.194210767746
- Updater{specialize} , 0.130860090256
- <theano.gof.toolbox.ReplaceValidate object at 0x12087ee50> , 0.11890411377
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x122195850> , 0.0774388313293
- Updater{canonicalize} , 0.0659112930298
- <theano.gof.opt.ChangeTracker instance at 0x11f7d8f38> , 0.0626258850098
- Updater{canonicalize} , 0.0376682281494
- Updater{canonicalize} , 0.0349521636963
- Updater{gpuarray_local_optimizations} , 0.0294797420502
- Updater{specialize} , 0.026261806488
- <theano.gof.opt.ChangeTracker instance at 0x125a3e1b8> , 0.012256860733
- Updater{local_elemwise_alloc} , 0.00888752937317
- <theano.gof.opt.ChangeTracker instance at 0x124d8a098> , 0.00651884078979
- Updater{pre_local_IncSubtensor_serialize} , 0.00562787055969
- Updater{specialize} , 0.00509214401245
- Updater{dimshuffle_as_view} , 0.00395131111145
- Updater{topo_constant_folding} , 0.00292682647705
- Updater{useless} , 0.00235295295715
- <theano.gof.opt.ChangeTracker instance at 0x127399f38> , 0.00211262702942
- Updater{local_inplace_setsubtensor} , 0.0013701915741
- Updater{local_dnna_conv_inplace} , 0.00122761726379
- Updater{specialize} , 0.00121068954468
- Updater{constant_folding_for_scan2} , 0.000903367996216
- Updater{stabilize} , 0.000802278518677
- Updater{specialize} , 0.000703573226929
- Updater{topo_constant_folding} , 0.000659465789795
- <theano.gof.opt.ChangeTracker instance at 0x124bccfc8> , 0.000591039657593
- Updater{specialize} , 0.000572443008423
- Updater{GemmOptimizer} , 0.000359296798706
- Updater{local_dot_to_dot22} , 0.000351905822754
- Updater{topo_constant_folding} , 0.000331401824951
- <theano.gof.opt.ChangeTracker instance at 0x1226b5560> , 0.000254154205322
- Updater{random_make_inplace_mrg} , 0.000191688537598
- Updater{topo_constant_folding} , 0.000139713287354
- Updater{InplaceGpuaBlasOpt} , 9.91821289062e-05
- Updater{topo_constant_folding} , 6.60419464111e-05
- Updater{specialize} , 4.02927398682e-05
- Updater{specialize} , 2.93254852295e-05
- Updater{specialize} , 2.76565551758e-05
- Updater{specialize} , 2.55107879639e-05
- Updater{local_dot22_to_dot22scalar} , 2.121925354e-05
- Updater{topo_constant_folding} , 1.52587890625e-05
- Updater{topo_constant_folding} , 1.43051147461e-05
- Updater{topo_constant_folding} , 1.38282775879e-05
- Updater{topo_constant_folding} , 1.28746032715e-05
- Updater{topo_constant_folding} , 8.82148742676e-06
- Updater{topo_constant_folding} , 6.67572021484e-06
- Updater{gpuarray_local_optimizations} , 5.72204589844e-06
- time - (name, class, index, nodes before, nodes after) - validate time
- 19.936074s - ('canonicalize', 'EquilibriumOptimizer', 6, 7639, 4559) - 0.143s
- EquilibriumOptimizer canonicalize
- time 19.936s for 7 passes
- nb nodes (start, end, max) 7639 4559 7639
- time io_toposort 1.026s
- time in local optimizers 12.701s
- time in global optimizers 0.000s
- time in final optimizers 1.199s
- time in cleanup optimizers 4.571s
- 0 - 10.519s 4693 (0.420s in global opts, 0.381s io_toposort) - 7635 nodes - ('MergeOptimizer', 1723) ('local_useless_fill', 643) ('local_mul_canonizer', 358) ('local_fill_sink', 315) ('local_neg_to_mul', 306) ...
- 1 - 4.410s 1690 (0.136s in global opts, 0.426s io_toposort) - 6297 nodes - ('MergeOptimizer', 630) ('local_dimshuffle_lift', 216) ('local_mul_canonizer', 206) ('local_fill_sink', 203) ('local_upcast_elemwise_constant_inputs', 131) ...
- 2 - 1.501s 497 (0.055s in global opts, 0.045s io_toposort) - 4794 nodes - ('MergeOptimizer', 142) ('local_fill_sink', 114) ('local_useless_fill', 57) ('local_zero_div', 57) ('local_sum_prod_div_dimshuffle', 56) ...
- 3 - 0.844s 123 (0.049s in global opts, 0.043s io_toposort) - 4568 nodes - ('MergeOptimizer', 60) ('local_dimshuffle_lift', 56) ('local_useless_fill', 3) ('local_mul_zero', 3) ('topo_constant_folding', 1)
- 4 - 0.762s 67 (0.048s in global opts, 0.041s io_toposort) - 4568 nodes - ('MergeOptimizer', 32) ('local_sum_prod_div_dimshuffle', 28) ('local_zero_div', 3) ('local_fill_sink', 3) ('topo_constant_folding', 1)
- 5 - 1.199s 56 (0.441s in global opts, 0.047s io_toposort) - 4559 nodes - ('local_dimshuffle_lift', 28) ('MergeOptimizer', 28)
- 6 - 0.702s 0 (0.051s in global opts, 0.044s io_toposort) - 4559 nodes -
- times - times applied - nb node created - name:
- 4.571s - 2615 - 17 - MergeOptimizer
- 2.217s - 565 - 1522 - local_mul_canonizer
- 2.124s - 186 - 975 - local_greedy_distributor
- 1.672s - 635 - 790 - local_fill_sink
- 1.520s - 544 - 1738 - local_dimshuffle_lift
- 1.199s - 5 - 0 - topo_constant_folding
- 0.804s - 260 - 1317 - local_mul_zero
- 0.773s - 251 - 401 - local_add_canonizer
- 0.376s - 799 - 0 - local_useless_fill
- 0.335s - 137 - 411 - local_upcast_elemwise_constant_inputs
- 0.330s - 306 - 606 - local_neg_to_mul
- 0.327s - 168 - 504 - local_sum_prod_div_dimshuffle
- 0.169s - 134 - 0 - local_cut_gpu_transfers
- 0.159s - 3 - 3 - local_useless_elemwise
- 0.143s - 12 - 24 - local_reshape_to_dimshuffle
- 0.141s - 14 - 28 - local_subtensor_merge
- 0.138s - 146 - 449 - local_shape_to_shape_i
- 0.135s - 90 - 180 - local_zero_div
- 0.117s - 36 - 108 - local_mul_switch_sink
- 0.088s - 33 - 99 - local_div_switch_sink
- 0.073s - 16 - 9 - local_useless_switch
- 0.065s - 9 - 0 - local_join_1
- 0.049s - 9 - 9 - local_useless_dimshuffle_in_reshape
- 0.047s - 81 - 32 - local_subtensor_make_vector
- 0.035s - 31 - 62 - local_inv_canon
- 0.020s - 19 - 0 - local_pow_canonicalize
- 0.014s - 10 - 20 - local_subtensor_lift
- 0.010s - 12 - 0 - local_intdiv_by_one
- 0.820s - in 61 optimization that were not used (display only those with a runtime > 0)
- 0.179s - local_func_inv
- 0.113s - local_one_minus_erf2
- 0.093s - local_merge_switch_same_cond
- 0.089s - local_useless_elemwise_comparison
- 0.062s - local_track_shape_i
- 0.053s - local_fill_cut
- 0.044s - local_expm1
- 0.039s - local_cast_cast
- 0.032s - local_IncSubtensor_serialize
- 0.030s - local_one_minus_erf
- 0.015s - local_useless_subtensor
- 0.010s - local_sum_prod_all_to_none
- 0.007s - local_lift_transpose_through_dot
- 0.007s - local_op_of_op
- 0.006s - local_useless_reduce
- 0.006s - local_useless_slice
- 0.005s - local_sumsqr2dot
- 0.004s - local_reduce_join
- 0.004s - local_dimshuffle_no_inplace_at_canonicalize
- 0.004s - f
- 0.004s - local_subtensor_remove_broadcastable_index
- 0.002s - local_0_dot_x
- 0.001s - local_abs_lift
- 0.001s - local_incsubtensor_of_zeros
- 0.001s - local_useless_reshape
- 0.001s - local_subtensor_of_dot
- 0.001s - local_subtensor_of_alloc
- 0.001s - local_reshape_lift
- 0.001s - local_useless_inc_subtensor
- 0.001s - local_canonicalize_alloc
- 0.000s - local_useless_inc_subtensor_alloc
- 0.000s - local_useless_alloc
- 0.000s - local_setsubtensor_of_constants
- 0.000s - local_merge_alloc
- 0.000s - local_scalar_tensor_scalar
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (6389, 6305, 84)
- init io_toposort 0.0622019767761
- loop time 0.357506990433
- callback_time 0.245550394058
- MergeOptimizer
- nb fail= 0 merged= 4140 constant= 1491
- time replace=3.41 validate=0.05 callback=3.05
- callbacks_time
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x122195850> , 0.00784587860107
- <theano.gof.opt.ChangeTracker instance at 0x11f7d8f38> , 0.0107350349426
- <theano.compile.function_module.Supervisor instance at 0x1227d3f38> , 0.0117847919464
- <theano.gof.toolbox.ReplaceValidate object at 0x12087ee50> , 0.0135762691498
- <theano.gof.opt.MergeFeature object at 0x1227e7190> , 0.25992846489
- <theano.tensor.opt.ShapeFeature object at 0x121562590> , 0.968868017197
- Updater{canonicalize} , 1.68857598305
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4829, 4794, 35)
- init io_toposort 0.0512568950653
- loop time 0.083508014679
- callback_time 0.0425992012024
- MergeOptimizer
- nb fail= 0 merged= 1652 constant= 351
- time replace=0.88 validate=0.02 callback=0.73
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4570, 4568, 2)
- init io_toposort 0.0442609786987
- loop time 0.0108721256256
- callback_time 0.004061460495
- MergeOptimizer
- nb fail= 0 merged= 350 constant= 86
- time replace=0.18 validate=0.00 callback=0.14
- Iter 3
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4569, 4568, 1)
- init io_toposort 0.0423350334167
- loop time 0.00659894943237
- callback_time 0.000518798828125
- MergeOptimizer
- nb fail= 0 merged= 62 constant= 4
- time replace=0.04 validate=0.00 callback=0.04
- Iter 4
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4563, 4562, 1)
- init io_toposort 0.0421988964081
- loop time 0.00592303276062
- callback_time 0.000275611877441
- MergeOptimizer
- nb fail= 0 merged= 37 constant= 4
- time replace=0.02 validate=0.00 callback=0.02
- Iter 5
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4559, 4559, 0)
- init io_toposort 0.435209989548
- loop time 0.00607204437256
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 28 constant= 0
- time replace=0.02 validate=0.00 callback=0.02
- Iter 6
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4559, 4559, 0)
- init io_toposort 0.0452370643616
- loop time 0.0053391456604
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 14.695019s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 2957, 2957) - 2.875s
- 6.134235s - ('gpuarray_opt', 'SeqOptimizer', 16, 4633, 5796) - 0.029s
- SeqOptimizer gpuarray_opt time 6.134s for 4633/5796 nodes before/after optimization
- 1.806s for callback
- 0.029s for fgraph.validate()
- callbacks_time
- <theano.tensor.opt.ShapeFeature object at 0x121562590> , 0.800983190536
- <theano.gof.opt.MergeFeature object at 0x1227e7190> , 0.281701803207
- Updater{gpuarray_local_optimizations} , 0.261435270309
- Updater{gpuarray_cut_transfers} , 0.261384248734
- Updater{gpuarray_local_optimizations} , 0.0294797420502
- <theano.gof.toolbox.ReplaceValidate object at 0x12087ee50> , 0.0172119140625
- <theano.gof.opt.ChangeTracker instance at 0x125a3e1b8> , 0.012256860733
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x122195850> , 0.0115115642548
- <theano.compile.function_module.Supervisor instance at 0x1227d3f38> , 0.00647950172424
- <theano.gof.opt.ChangeTracker instance at 0x127399f38> , 0.00211262702942
- Updater{gpuarray_local_optimizations} , 5.72204589844e-06
- 3.286718s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 5910, 7186) - 0.018s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 3.286s for 4 passes
- nb nodes (start, end, max) 5910 7186 7755
- time io_toposort 0.280s
- time in local optimizers 2.832s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 2.157s 1135 (0.000s in global opts, 0.055s io_toposort) - 5910 nodes - ('constant_folding', 747) ('local_gpua_dimshuffle', 135) ('local_gpua_elemwise', 73) ('local_gpua_subtensor', 48) ('local_abstractconv_gw_cudnn', 43) ...
- 1 - 0.729s 319 (0.000s in global opts, 0.079s io_toposort) - 7755 nodes - ('constant_folding', 271) ('local_dnn_convw_output_merge', 30) ('local_dnn_convi_output_merge', 9) ('local_gpualloc_memset_0', 9)
- 2 - 0.201s 1 (0.000s in global opts, 0.074s io_toposort) - 7187 nodes - ('constant_folding', 1)
- 3 - 0.199s 0 (0.000s in global opts, 0.072s io_toposort) - 7186 nodes -
- times - times applied - nb node created - name:
- 1.373s - 1019 - 0 - constant_folding
- 0.243s - 30 - 60 - local_dnn_convw_output_merge
- 0.190s - 135 - 405 - local_gpua_dimshuffle
- 0.173s - 36 - 684 - local_abstractconv_gi_cudnn
- 0.149s - 43 - 774 - local_abstractconv_gw_cudnn
- 0.137s - 9 - 18 - local_dnn_convi_output_merge
- 0.128s - 29 - 734 - local_abstractconv_cudnn
- 0.087s - 73 - 243 - local_gpua_elemwise
- 0.048s - 48 - 144 - local_gpua_subtensor
- 0.036s - 23 - 23 - local_gpu_elemwise_careduce
- 0.029s - 1 - 3 - local_gpua_gemm_output_merge
- 0.010s - 9 - 9 - local_gpualloc_memset_0
- 0.230s - in 58 optimization that were not used (display only those with a runtime > 0)
- 0.031s - local_track_shape_i
- 0.024s - local_gpua_gemm_alpha_merge
- 0.020s - local_dnn_conv_output_merge
- 0.020s - local_gpua_gemmbatch_output_merge
- 0.019s - local_dnn_conv_alpha_merge
- 0.019s - local_gemm16_output_merge
- 0.019s - local_gpua_gemmbatch_alpha_merge
- 0.019s - local_gemm16_alpha_merge
- 0.019s - local_dnn_convi_alpha_merge
- 0.019s - local_dnn_convw_alpha_merge
- 0.013s - local_log_softmax_dnn
- 0.003s - local_gpua_assert
- 0.002s - local_gpua_shape
- 0.001s - local_gpu_contiguous_gpu_contiguous
- 0.001s - local_gpua_abstractconv2d
- 2.210998s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 4633, 5910) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.415s
- Total time taken by local optimizers 0.340s
- times - times applied - Node created - name:
- 0.268s - 199 - 199 - local_gpua_careduce
- 0.054s - 2631 - 3852 - local_gpua_elemwise
- 0.006s - 162 - 162 - local_gpua_assert_graph
- 0.005s - 108 - 108 - local_gpua_lift_abstractconv2d_graph
- 0.003s - 24 - 24 - local_gpua_inc_subtensor
- 0.003s - 226 - 226 - local_gpua_dimshuffle
- 0.001s - 15 - 30 - local_gpua_mrg_graph
- 0.000s - 1 - 8 - local_gpua_dot22scalar
- 0.000s - 22 - 22 - local_gpua_subtensor_graph
- 0.000s - 50 - 50 - local_gpua_reshape
- 0.000s - 36 - 36 - local_gpua_dot22
- 0.000s - 10 - 19 - local_gpua_alloc
- 0.000s - 9 - 27 - local_gpua_gemm
- 0.000s - 1 - 1 - local_gpua_crossentropysoftmaxargmax1hotwithbias
- 0.000s - 1 - 2 - local_gpua_crossentropysoftmax1hotwithbiasdx
- 0.000s - in 1 optimization that were not used (display only those with a runtime > 0)
- 0.636133s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 7186, 5796) - 0.011s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.636s for 2 passes
- nb nodes (start, end, max) 7186 5796 7186
- time io_toposort 0.141s
- time in local optimizers 0.455s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.549s 927 (0.000s in global opts, 0.077s io_toposort) - 7186 nodes - ('local_cut_gpu_transfers', 927)
- 1 - 0.087s 0 (0.000s in global opts, 0.064s io_toposort) - 5796 nodes -
- times - times applied - nb node created - name:
- 0.437s - 927 - 0 - local_cut_gpu_transfers
- 0.018s - in 1 optimization that were not used (display only those with a runtime > 0)
- 0.018s - constant_folding
- 0.000100s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 5910, 5910) - 0.000s
- 5.952148s - ('specialize', 'EquilibriumOptimizer', 13, 5202, 4633) - 0.016s
- EquilibriumOptimizer specialize
- time 5.952s for 11 passes
- nb nodes (start, end, max) 5202 4633 5337
- time io_toposort 0.851s
- time in local optimizers 3.097s
- time in global optimizers 0.582s
- time in final optimizers 1.026s
- time in cleanup optimizers 0.000s
- 0 - 0.868s 520 (0.168s in global opts, 0.048s io_toposort) - 5202 nodes - ('local_reduce_broadcastable', 140) ('local_pow_specialize', 93) ('local_shape_to_shape_i', 72) ('local_add_specialize', 63) ('local_mul_specialize', 52) ...
- 1 - 1.194s 529 (0.193s in global opts, 0.418s io_toposort) - 5337 nodes - ('local_fill_to_alloc', 246) ('local_dimshuffle_lift', 168) ('local_subtensor_make_vector', 72) ('local_useless_elemwise', 30) ('local_mul_specialize', 7) ...
- 2 - 0.428s 65 (0.093s in global opts, 0.043s io_toposort) - 4686 nodes - ('local_remove_useless_assert', 30) ('local_mul_specialize', 28) ('local_func_inv', 4) ('local_elemwise_alloc', 3)
- 3 - 0.386s 3 (0.092s in global opts, 0.044s io_toposort) - 4645 nodes - ('local_neg_div_neg', 3)
- 4 - 0.377s 4 (0.096s in global opts, 0.042s io_toposort) - 4642 nodes - ('local_zero_div', 3) ('topo_constant_folding', 1)
- 5 - 0.387s 4 (0.104s in global opts, 0.043s io_toposort) - 4642 nodes - ('local_elemwise_alloc', 3) ('topo_constant_folding', 1)
- 6 - 0.761s 4 (0.475s in global opts, 0.042s io_toposort) - 4639 nodes - ('local_zero_div', 3) ('topo_constant_folding', 1)
- 7 - 0.413s 4 (0.105s in global opts, 0.043s io_toposort) - 4639 nodes - ('local_elemwise_alloc', 3) ('topo_constant_folding', 1)
- 8 - 0.377s 3 (0.093s in global opts, 0.043s io_toposort) - 4636 nodes - ('local_add_specialize', 3)
- 9 - 0.386s 6 (0.093s in global opts, 0.041s io_toposort) - 4639 nodes - ('local_fill_to_alloc', 6)
- 10 - 0.375s 0 (0.095s in global opts, 0.042s io_toposort) - 4633 nodes -
- times - times applied - nb node created - name:
- 1.026s - 6 - 0 - topo_constant_folding
- 0.582s - 1 - 1 - crossentropy_to_crossentropy_with_softmax_with_bias
- 0.412s - 66 - 255 - local_add_specialize
- 0.231s - 6 - 0 - local_func_inv
- 0.225s - 168 - 112 - local_dimshuffle_lift
- 0.220s - 87 - 78 - local_mul_specialize
- 0.207s - 9 - 15 - local_elemwise_alloc
- 0.198s - 30 - 30 - local_useless_elemwise
- 0.108s - 19 - 19 - local_div_to_inv
- 0.098s - 39 - 30 - local_remove_useless_assert
- 0.076s - 140 - 140 - local_reduce_broadcastable
- 0.075s - 31 - 121 - local_sum_prod_mul_by_scalar
- 0.070s - 7 - 0 - local_useless_switch
- 0.068s - 93 - 93 - local_pow_specialize
- 0.047s - 6 - 12 - local_zero_div
- 0.043s - 252 - 3 - local_fill_to_alloc
- 0.042s - 21 - 21 - local_mul_to_sqr
- 0.039s - 72 - 72 - local_shape_to_shape_i
- 0.039s - 83 - 0 - local_subtensor_make_vector
- 0.007s - 3 - 3 - local_neg_div_neg
- 0.001s - 1 - 1 - local_softmax_with_bias
- 0.001s - 1 - 1 - local_softmax_grad_to_crossentropy_with_softmax_grad
- 0.001s - 1 - 1 - local_useless_crossentropy_softmax_1hot_with_bias_dx_alloc
- 0.888s - in 52 optimization that were not used (display only those with a runtime > 0)
- 0.148s - local_one_minus_erf2
- 0.134s - local_abs_merge
- 0.111s - local_useless_elemwise_comparison
- 0.064s - local_track_shape_i
- 0.061s - local_mul_switch_sink
- 0.049s - local_expm1
- 0.048s - local_elemwise_sub_zeros
- 0.047s - local_logsoftmax
- 0.046s - local_cast_cast
- 0.042s - local_one_minus_erf
- 0.039s - local_alloc_unary
- 0.036s - local_grad_log_erfc_neg
- 0.015s - local_useless_subtensor
- 0.009s - local_sum_prod_div_dimshuffle
- 0.008s - local_log1p
- 0.007s - local_useless_slice
- 0.004s - local_sumsqr2dot
- 0.004s - local_opt_alloc
- 0.003s - local_subtensor_remove_broadcastable_index
- 0.002s - local_neg_neg
- 0.001s - local_useless_inc_subtensor
- 0.001s - local_log_add
- 0.001s - local_subtensor_merge
- 0.001s - local_subtensor_of_alloc
- 0.001s - local_subtensor_of_dot
- 0.001s - local_log_erfc
- 0.001s - local_useless_inc_subtensor_alloc
- 0.001s - local_advanced_indexing_crossentropy_onehot
- 0.001s - local_canonicalize_alloc
- 0.000s - local_useless_alloc
- 0.000s - local_merge_alloc
- 0.000s - local_scalar_tensor_scalar
- 0.000s - local_logsoftmax_grad
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (5411, 5338, 73)
- init io_toposort 0.0502741336823
- loop time 0.0653800964355
- callback_time 0.0122356414795
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4714, 4686, 28)
- init io_toposort 0.0429389476776
- loop time 0.0369338989258
- callback_time 0.0113785266876
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4645, 4645, 0)
- init io_toposort 0.0422580242157
- loop time 0.00536894798279
- callback_time 0.0
- Iter 3
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4642, 4642, 0)
- init io_toposort 0.0418980121613
- loop time 0.00537490844727
- callback_time 0.0
- Iter 4
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4645, 4642, 3)
- init io_toposort 0.044823884964
- loop time 0.0064218044281
- callback_time 0.000365972518921
- Iter 5
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4642, 4639, 3)
- init io_toposort 0.0490889549255
- loop time 0.00785398483276
- callback_time 0.000781059265137
- Iter 6
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4642, 4639, 3)
- init io_toposort 0.42392206192
- loop time 0.00623798370361
- callback_time 0.000385522842407
- Iter 7
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4639, 4636, 3)
- init io_toposort 0.0444281101227
- loop time 0.00804901123047
- callback_time 0.000782251358032
- Iter 8
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4639, 4639, 0)
- init io_toposort 0.0420498847961
- loop time 0.00549507141113
- callback_time 0.0
- Iter 9
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4633, 4633, 0)
- init io_toposort 0.0423140525818
- loop time 0.00554203987122
- callback_time 0.0
- Iter 10
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4633, 4633, 0)
- init io_toposort 0.0434467792511
- loop time 0.00532102584839
- callback_time 0.0
- 3.769982s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 5272, 3758) - 0.006s
- FusionOptimizer
- nb_iter 3
- nb_replacement 571
- nb_inconsistency_replace 0
- validate_time 0.00642895698547
- callback_time 0.202159881592
- time_toposort 0.672288179398
- 2.975995s - ('merge3', 'MergeOptimizer', 51, 2957, 2957) - 2.962s
- MergeOptimizer
- nb fail= 0 merged= 47 constant= 47
- time replace=2.98 validate=2.96 callback=2.97
- callbacks_time
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x122195850> , 0.000165700912476
- <theano.gof.toolbox.ReplaceValidate object at 0x12087ee50> , 0.000275135040283
- <theano.gof.opt.MergeFeature object at 0x1227e7190> , 0.00250196456909
- <theano.tensor.opt.ShapeFeature object at 0x121562590> , 0.00316381454468
- <theano.compile.function_module.Supervisor instance at 0x1227d3f38> , 0.807505607605
- <theano.gof.destroyhandler.DestroyHandler object at 0x1291a3f10> , 2.15578103065
- 2.688769s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 2956, 2957) - 2.480s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (2956, 2957, 108)
- init io_toposort 0.0287601947784
- loop time 2.65987706184
- callback_time 2.56827378273
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 1.944101s - ('elemwise_fusion', 'SeqOptimizer', 19, 5796, 5272) - 0.004s
- SeqOptimizer elemwise_fusion time 1.944s for 5796/5272 nodes before/after optimization
- 0.120s for callback
- 0.004s for fgraph.validate()
- 1.477741s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 5632, 5272) - 0.001s
- FusionOptimizer
- nb_iter 3
- nb_replacement 101
- nb_inconsistency_replace 0
- validate_time 0.00115561485291
- callback_time 0.0301287174225
- time_toposort 0.670944929123
- 0.466092s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 5796, 5632) - 0.003s
- FusionOptimizer
- nb_iter 3
- nb_replacement 259
- nb_inconsistency_replace 0
- validate_time 0.00277924537659
- callback_time 0.0899620056152
- time_toposort 0.185888051987
- 1.419394s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 5403, 5205) - 0.003s
- EquilibriumOptimizer scan_eqopt2
- time 1.419s for 2 passes
- nb nodes (start, end, max) 5403 5205 5403
- time io_toposort 0.097s
- time in local optimizers 0.000s
- time in global optimizers 1.302s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.982s 1 (0.922s in global opts, 0.049s io_toposort) - 5205 nodes - ('constant_folding', 1)
- 1 - 0.437s 0 (0.381s in global opts, 0.048s io_toposort) - 5205 nodes -
- times - times applied - nb node created - name:
- 0.263s - 1 - 0 - constant_folding
- 1.039s - in 6 optimization that were not used (display only those with a runtime > 0)
- 0.495s - remove_constants_and_unused_inputs_scan
- 0.114s - scan_merge_inouts
- 0.110s - remove_constants_and_unused_inputs_scan
- 0.109s - <theano.scan_module.scan_opt.ScanMerge object at 0x10f69f750>
- 0.108s - remove_constants_and_unused_inputs_scan
- 0.102s - <theano.scan_module.scan_opt.ScanSaveMem object at 0x10f69fa10>
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (5403, 5205, 198)
- init io_toposort 0.0526330471039
- loop time 0.149075984955
- callback_time 0.048490524292
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0490438938141
- loop time 0.00537800788879
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.43702507019
- loop time 0.00532221794128
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0544281005859
- loop time 0.00595307350159
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0488278865814
- loop time 0.00527000427246
- callback_time 0.0
- Iter 1
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0551071166992
- loop time 0.00627899169922
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0485699176788
- loop time 0.00670099258423
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0476229190826
- loop time 0.00529885292053
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0482048988342
- loop time 0.00564789772034
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (5205, 5205, 0)
- init io_toposort 0.0479300022125
- loop time 0.00541400909424
- callback_time 0.0
- 1.065656s - ('BlasOpt', 'SeqOptimizer', 12, 5205, 5202) - 0.001s
- SeqOptimizer BlasOpt time 1.065s for 5205/5202 nodes before/after optimization
- 0.012s for callback
- 0.001s for fgraph.validate()
- 0.433675s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 5205, 5205) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (5205, 5205, 45)
- init io_toposort 0.406867027283
- loop time 0.0267560482025
- callback_time 0.00478005409241
- 0.384683s - ('gemm_optimizer', 'GemmOptimizer', 1, 5205, 5201) - 0.000s
- GemmOptimizer
- nb_iter 2
- nb_replacement 9
- nb_replacement_didn_t_remove 2
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0.112722396851
- time_factor_can 3.38554382324e-05
- time_factor_list 0.00463604927063
- time_toposort 0.0978970527649
- validate_time 0.000133037567139
- callback_time 0.00729250907898
- 0.068988s - ('use_c_blas', 'TopoOptimizer', 4, 5202, 5202) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (5202, 5202, 0)
- init io_toposort 0.0484161376953
- loop time 0.0205109119415
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.063015s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 5201, 5202) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (5201, 5202, 1)
- init io_toposort 0.0481488704681
- loop time 0.0148110389709
- callback_time 0.000154972076416
- 0.059894s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 5202, 5202) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.060s for 1 passes
- nb nodes (start, end, max) 5202 5202 5202
- time io_toposort 0.047s
- time in local optimizers 0.003s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.060s 0 (0.000s in global opts, 0.047s io_toposort) - 5202 nodes -
- 0.055211s - ('use_scipy_ger', 'TopoOptimizer', 5, 5202, 5202) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (5202, 5202, 0)
- init io_toposort 0.0487020015717
- loop time 0.00645112991333
- callback_time 0.0
- 0.870026s - ('stabilize', 'EquilibriumOptimizer', 8, 4559, 4622) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.870s for 2 passes
- nb nodes (start, end, max) 4559 4622 4622
- time io_toposort 0.086s
- time in local optimizers 0.553s
- time in global optimizers 0.095s
- time in final optimizers 0.100s
- time in cleanup optimizers 0.000s
- 0 - 0.444s 20 (0.097s in global opts, 0.043s io_toposort) - 4559 nodes - ('local_fill_to_alloc', 11) ('local_log1p', 9)
- 1 - 0.426s 0 (0.098s in global opts, 0.043s io_toposort) - 4622 nodes -
- times - times applied - nb node created - name:
- 0.012s - 11 - 66 - local_fill_to_alloc
- 0.008s - 9 - 18 - local_log1p
- 0.729s - in 40 optimization that were not used (display only those with a runtime > 0)
- 0.381s - local_greedy_distributor
- 0.100s - topo_constant_folding
- 0.095s - crossentropy_to_crossentropy_with_softmax_with_bias
- 0.052s - local_sigm_times_exp
- 0.028s - local_one_minus_erf2
- 0.024s - local_exp_over_1_plus_exp
- 0.019s - local_useless_elemwise_comparison
- 0.010s - local_expm1
- 0.008s - local_grad_log_erfc_neg
- 0.007s - local_one_minus_erf
- 0.001s - Elemwise{log,no_inplace}(sigmoid(x)) -> Elemwise{neg,no_inplace}(softplus(Elemwise{neg,no_inplace}(x)))
- 0.001s - Elemwise{log,no_inplace}(Elemwise{sub,no_inplace}(y subject to <function _is_1 at 0x10efc5758>, sigmoid(x))) -> Elemwise{neg,no_inplace}(softplus(x))
- 0.001s - local_0_dot_x
- 0.000s - local_incsubtensor_of_zeros
- 0.000s - local_useless_reshape
- 0.000s - local_canonicalize_alloc
- 0.000s - local_subtensor_of_dot
- 0.000s - local_log_add
- 0.000s - Elemwise{log1p,no_inplace}(Elemwise{exp,no_inplace}(x)) -> softplus(x)
- 0.000s - local_reshape_lift
- 0.000s - local_log_erfc
- 0.000s - Elemwise{log1p,no_inplace}(Elemwise{neg,no_inplace}(sigmoid(x))) -> Elemwise{neg,no_inplace}(softplus(x))
- 0.000s - local_useless_alloc
- 0.000s - local_useless_inc_subtensor_alloc
- 0.000s - local_merge_alloc
- 0.000s - local_setsubtensor_of_constants
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4622, 4622, 0)
- init io_toposort 0.043016910553
- loop time 0.00549912452698
- callback_time 0.0
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4622, 4622, 0)
- init io_toposort 0.0462830066681
- loop time 0.00541090965271
- callback_time 0.0
- 0.837172s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 7753, 7753) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.837s for 1 passes
- nb nodes (start, end, max) 7753 7753 7753
- time io_toposort 0.085s
- time in local optimizers 0.000s
- time in global optimizers 0.740s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.837s 0 (0.740s in global opts, 0.085s io_toposort) - 7753 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.740s for 7753/7753 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.396074s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 7753, 7753) - 0.000s
- 0.091526s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 7753, 7753) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (7753, 7753, 0)
- init io_toposort 0.0833880901337
- loop time 0.0080738067627
- callback_time 0.0
- 0.086859s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 7753, 7753) - 0.000s
- 0.084704s - ('scan_pushout_dot1', 'PushOutDot1', 3, 7753, 7753) - 0.000s
- 0.080556s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 7753, 7753) - 0.000s
- 0.783922s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 2957, 2957) - 0.732s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (2957, 2957, 15)
- init io_toposort 0.0296950340271
- loop time 0.754165887833
- callback_time 0.741482257843
- 0.660968s - ('merge1', 'MergeOptimizer', 0, 10599, 7753) - 0.035s
- MergeOptimizer
- nb fail= 0 merged= 4133 constant= 1287
- time replace=0.45 validate=0.03 callback=0.21
- 0.554039s - ('ShapeOpt', 'ShapeOptimizer', 2, 7753, 7753) - 0.000s
- 0.524648s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 2956, 2956) - 0.091s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (2956, 2956, 299)
- init io_toposort 0.0304479598999
- loop time 0.494140148163
- callback_time 0.354798316956
- 0.368721s - ('merge2', 'MergeOptimizer', 22, 3758, 2956) - 0.038s
- MergeOptimizer
- nb fail= 0 merged= 4273 constant= 1638
- time replace=0.37 validate=0.04 callback=0.23
- 0.330622s - ('add_destroy_handler', 'AddDestroyHandler', 23, 2956, 2956) - 0.000s
- 0.310139s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 2956, 2956) - 0.247s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (2956, 2956, 18)
- init io_toposort 0.0290389060974
- loop time 0.280987024307
- callback_time 0.256593227386
- 0.272342s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 4622, 5403) - 0.002s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (4622, 5403, 132)
- init io_toposort 0.0435831546783
- loop time 0.22869181633
- callback_time 0.080276966095
- 0.265057s - ('useless', 'TopoOptimizer', 3, 7753, 7645) - 0.002s
- TopoOptimizer useless
- nb_node (start, end, changed) (7753, 7645, 166)
- init io_toposort 0.0862708091736
- loop time 0.17871594429
- callback_time 0.0392537117004
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 9 - 9 - local_join_1 - 0
- -0.000s - 0 - 8 - local_merge_alloc - 0
- -0.000s - 0 - 8 - local_useless_alloc - 0
- -0.000s - 0 - 9 - local_join_make_vector - 0
- -0.000s - 0 - 24 - local_useless_inc_subtensor_alloc - 0
- -0.000s - 0 - 9 - local_join_empty - 0
- -0.000s - 0 - 24 - local_useless_inc_subtensor - 0
- -0.000s - 0 - 50 - local_useless_reshape - 0
- -0.000s - 0 - 323 - local_subtensor_of_alloc - 0
- -0.001s - 6 - 323 - local_subtensor_make_vector - 0
- -0.001s - 1 - 102 - local_useless_fill - 0
- -0.001s - 0 - 358 - local_useless_reduce - 0
- -0.002s - 0 - 323 - local_useless_slice - 0
- -0.004s - 6 - 6 - local_useless_split - 60
- -0.004s - 0 - 4958 - local_useless_switch - 0
- -0.010s - 0 - 4958 - local_useless_elemwise_comparison - 0
- -0.024s - 159 - 4958 - local_useless_elemwise - 0
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.199589s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 2956, 2956) - 0.143s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (2956, 2956, 10)
- init io_toposort 0.0286979675293
- loop time 0.170758008957
- callback_time 0.147303819656
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.118040s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 7639, 7639) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (7639, 7639, 9)
- init io_toposort 0.0757310390472
- loop time 0.0422530174255
- callback_time 0.00764083862305
- 0.100773s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 4633, 4633) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.101s for 1 passes
- nb nodes (start, end, max) 4633 4633 4633
- time io_toposort 0.042s
- time in local optimizers 0.001s
- time in global optimizers 0.000s
- time in final optimizers 0.050s
- time in cleanup optimizers 0.000s
- 0 - 0.101s 0 (0.050s in global opts, 0.042s io_toposort) - 4633 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (4633, 4633, 0)
- init io_toposort 0.0444288253784
- loop time 0.00537204742432
- callback_time 0.0
- 0.072703s - ('specialize_device', 'EquilibriumOptimizer', 17, 5796, 5796) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.073s for 1 passes
- nb nodes (start, end, max) 5796 5796 5796
- time io_toposort 0.062s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.073s 0 (0.000s in global opts, 0.062s io_toposort) - 5796 nodes -
- 0.068079s - ('AbstractConvCheck', 'TopoOptimizer', 18, 5796, 5796) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (5796, 5796, 0)
- init io_toposort 0.0607590675354
- loop time 0.00725507736206
- callback_time 0.0
- 0.062603s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 5272, 5272) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 0.0588040351868
- 0.054372s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 2957, 2957) - 0.000s
- 0.054163s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 2957, 2957) - 0.000s
- 0.052785s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 4622, 4622) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (4622, 4622, 0)
- init io_toposort 0.0429711341858
- loop time 0.00975203514099
- callback_time 0.0
- 0.049093s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 4633, 4633) - 0.000s
- 0.046055s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 2956, 2956) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0284509658813
- loop time 0.0174732208252
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.046001s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 2956, 2956) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0287458896637
- loop time 0.0171229839325
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.045713s - ('blas_opt_inplace', 'TopoOptimizer', 34, 2956, 2956) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0284130573273
- loop time 0.0171570777893
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.044733s - ('c_blas_destructive', 'TopoOptimizer', 37, 2956, 2956) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0276620388031
- loop time 0.0170109272003
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.033650s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 2956, 2956) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0301179885864
- loop time 0.00347709655762
- callback_time 0.0
- 0.032851s - ('make_ger_destructive', 'TopoOptimizer', 41, 2957, 2957) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (2957, 2957, 0)
- init io_toposort 0.0286540985107
- loop time 0.00413203239441
- callback_time 0.0
- 0.032754s - ('cond_make_inplace', 'TopoOptimizer', 47, 2957, 2957) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (2957, 2957, 0)
- init io_toposort 0.0294511318207
- loop time 0.00323390960693
- callback_time 0.0
- 0.032468s - ('local_destructive', 'TopoOptimizer', 48, 2957, 2957) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (2957, 2957, 0)
- init io_toposort 0.0291059017181
- loop time 0.00328993797302
- callback_time 0.0
- 0.032315s - ('random_make_inplace', 'TopoOptimizer', 49, 2957, 2957) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (2957, 2957, 0)
- init io_toposort 0.0288689136505
- loop time 0.00338506698608
- callback_time 0.0
- 0.031910s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 2956, 2956) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.028608083725
- loop time 0.00320410728455
- callback_time 0.0
- 0.031852s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 2956, 2956) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0288248062134
- loop time 0.00291800498962
- callback_time 0.0
- 0.031726s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 2956, 2956) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0286209583282
- loop time 0.00300693511963
- callback_time 0.0
- 0.031622s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 2957, 2957) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (2957, 2957, 0)
- init io_toposort 0.0288269519806
- loop time 0.00267696380615
- callback_time 0.0
- 0.031494s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 2956, 2956) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0281920433044
- loop time 0.00320482254028
- callback_time 0.0
- 0.031309s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 2957, 2957) - 0.000s
- 0.031218s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 2956, 2956) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0281801223755
- loop time 0.00294089317322
- callback_time 0.0
- 0.031101s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 2956, 2956) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0280029773712
- loop time 0.00299906730652
- callback_time 0.0
- 0.031038s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 2956, 2956) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (2956, 2956, 0)
- init io_toposort 0.0279738903046
- loop time 0.00296902656555
- callback_time 0.0
- 0.029802s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 2957, 2957) - 0.000s
- 0.002216s - ('merge1.1', 'MergeOptimizer', 4, 7645, 7639) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 24 constant= 18
- time replace=0.00 validate=0.00 callback=0.00
- 0.000065s - ('merge1.2', 'MergeOptimizer', 7, 4559, 4559) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:341
- Time in 0 calls to Function.__call__: 0.000000e+00s
- Total compile time: 2.954339e+01s
- Number of Apply nodes: 313
- Theano Optimizer time: 2.045334e+00s
- Theano validate time: 2.595465e-01s
- Theano Linker time (includes C, CUDA code generation/compiling): 2.744651e+01s
- Import time 9.034133e-02s
- Node make_thunk time 2.742749e+01s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.958s
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 2.045s for 385/313 nodes before/after optimization
- 0.470s for callback
- 0.260s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 0.348002s - ('elemwise_fusion', 'SeqOptimizer', 19, 574, 420) - 0.001s
- SeqOptimizer elemwise_fusion time 0.348s for 574/420 nodes before/after optimization
- 0.011s for callback
- 0.001s for fgraph.validate()
- 0.327701s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 561, 420) - 0.000s
- FusionOptimizer
- nb_iter 3
- nb_replacement 37
- nb_inconsistency_replace 0
- validate_time 0.000482320785522
- callback_time 0.0088849067688
- time_toposort 0.0130350589752
- 0.020121s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 574, 561) - 0.000s
- FusionOptimizer
- nb_iter 2
- nb_replacement 23
- nb_inconsistency_replace 0
- validate_time 0.000242233276367
- callback_time 0.00241637229919
- time_toposort 0.00995802879333
- 0.292821s - ('gpuarray_opt', 'SeqOptimizer', 16, 339, 574) - 0.001s
- SeqOptimizer gpuarray_opt time 0.293s for 339/574 nodes before/after optimization
- 0.097s for callback
- 0.001s for fgraph.validate()
- 0.168130s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 313, 603) - 0.001s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.168s for 3 passes
- nb nodes (start, end, max) 313 603 638
- time io_toposort 0.016s
- time in local optimizers 0.142s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.111s 55 (0.000s in global opts, 0.003s io_toposort) - 313 nodes - ('constant_folding', 28) ('local_gpua_elemwise', 13) ('local_abstractconv_cudnn', 13) ('local_gpua_dimshuffle', 1)
- 1 - 0.042s 35 (0.000s in global opts, 0.008s io_toposort) - 638 nodes - ('constant_folding', 35)
- 2 - 0.015s 0 (0.000s in global opts, 0.006s io_toposort) - 603 nodes -
- times - times applied - nb node created - name:
- 0.059s - 63 - 0 - constant_folding
- 0.056s - 13 - 328 - local_abstractconv_cudnn
- 0.013s - 13 - 49 - local_gpua_elemwise
- 0.001s - 1 - 3 - local_gpua_dimshuffle
- 0.014s - in 66 optimization that were not used (display only those with a runtime > 0)
- 0.002s - local_track_shape_i
- 0.001s - local_dnn_conv_output_merge
- 0.001s - local_gpua_gemm_alpha_merge
- 0.001s - local_gpua_gemm_output_merge
- 0.001s - local_dnn_conv_alpha_merge
- 0.001s - local_gpua_gemmbatch_alpha_merge
- 0.001s - local_dnn_convw_alpha_merge
- 0.001s - local_gpua_gemmbatch_output_merge
- 0.001s - local_gemm16_output_merge
- 0.001s - local_dnn_convi_alpha_merge
- 0.001s - local_gemm16_alpha_merge
- 0.001s - local_dnn_convw_output_merge
- 0.001s - local_dnn_convi_output_merge
- 0.001s - local_log_softmax_dnn
- 0.000s - local_gpua_shape
- 0.000s - local_gpu_contiguous_gpu_contiguous
- 0.000s - local_gpua_abstractconv2d
- 0.000s - local_gpua_assert
- 0.000s - local_gpu_elemwise_careduce
- 0.106716s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 339, 313) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.004s
- Total time taken by local optimizers 0.012s
- times - times applied - Node created - name:
- 0.006s - 3 - 3 - local_gpua_careduce
- 0.003s - 120 - 154 - local_gpua_elemwise
- 0.001s - 13 - 13 - local_gpua_lift_abstractconv2d_graph
- 0.001s - 50 - 50 - local_gpua_dimshuffle
- 0.000s - 3 - 6 - local_gpua_mrg_graph
- 0.000s - 8 - 8 - local_gpua_subtensor_graph
- 0.000s - 3 - 3 - local_gpua_assert_graph
- 0.000s - 7 - 7 - local_gpua_reshape
- 0.000s - 4 - 4 - local_gpua_dot22
- 0.000s - 1 - 1 - local_gpua_crossentropysoftmaxargmax1hotwithbias
- 0.000s - in 1 optimization that were not used (display only those with a runtime > 0)
- 0.017739s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 603, 574) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.018s for 2 passes
- nb nodes (start, end, max) 603 574 603
- time io_toposort 0.011s
- time in local optimizers 0.004s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.011s 15 (0.000s in global opts, 0.006s io_toposort) - 603 nodes - ('local_cut_gpu_transfers', 15)
- 1 - 0.007s 0 (0.000s in global opts, 0.005s io_toposort) - 574 nodes -
- times - times applied - nb node created - name:
- 0.003s - 15 - 0 - local_cut_gpu_transfers
- 0.001s - in 1 optimization that were not used (display only those with a runtime > 0)
- 0.001s - constant_folding
- 0.000043s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 313, 313) - 0.000s
- 0.271405s - ('canonicalize', 'EquilibriumOptimizer', 6, 330, 340) - 0.002s
- EquilibriumOptimizer canonicalize
- time 0.271s for 4 passes
- nb nodes (start, end, max) 330 340 360
- time io_toposort 0.015s
- time in local optimizers 0.199s
- time in global optimizers 0.000s
- time in final optimizers 0.030s
- time in cleanup optimizers 0.014s
- 0 - 0.137s 83 (0.016s in global opts, 0.004s io_toposort) - 330 nodes - ('MergeOptimizer', 31) ('local_add_canonizer', 10) ('local_mul_canonizer', 10) ('local_shape_to_shape_i', 6) ('local_subtensor_make_vector', 5) ...
- 1 - 0.064s 46 (0.007s in global opts, 0.004s io_toposort) - 360 nodes - ('MergeOptimizer', 14) ('local_dimshuffle_lift', 7) ('local_subtensor_make_vector', 7) ('local_mul_canonizer', 6) ('local_upcast_elemwise_constant_inputs', 6) ...
- 2 - 0.036s 2 (0.004s in global opts, 0.004s io_toposort) - 340 nodes - ('MergeOptimizer', 1) ('local_mul_canonizer', 1)
- 3 - 0.034s 0 (0.004s in global opts, 0.003s io_toposort) - 340 nodes -
- times - times applied - nb node created - name:
- 0.045s - 5 - 10 - local_subtensor_merge
- 0.030s - 2 - 0 - topo_constant_folding
- 0.026s - 17 - 23 - local_mul_canonizer
- 0.022s - 11 - 22 - local_add_canonizer
- 0.014s - 46 - 1 - MergeOptimizer
- 0.013s - 3 - 6 - local_reshape_to_dimshuffle
- 0.011s - 9 - 27 - local_upcast_elemwise_constant_inputs
- 0.007s - 11 - 18 - local_dimshuffle_lift
- 0.006s - 6 - 58 - local_shape_to_shape_i
- 0.003s - 12 - 0 - local_subtensor_make_vector
- 0.002s - 3 - 0 - local_intdiv_by_one
- 0.002s - 1 - 0 - local_useless_switch
- 0.001s - 2 - 4 - local_subtensor_lift
- 0.001s - 2 - 0 - local_useless_fill
- 0.000s - 1 - 1 - local_neg_to_mul
- 0.061s - in 74 optimization that were not used (display only those with a runtime > 0)
- 0.016s - local_greedy_distributor
- 0.006s - local_mul_zero
- 0.006s - local_fill_sink
- 0.005s - local_func_inv
- 0.004s - local_useless_elemwise
- 0.003s - local_one_minus_erf2
- 0.003s - local_useless_elemwise_comparison
- 0.003s - local_merge_switch_same_cond
- 0.002s - local_one_minus_erf
- 0.002s - local_track_shape_i
- 0.001s - local_fill_cut
- 0.001s - local_mul_switch_sink
- 0.001s - local_useless_subtensor
- 0.001s - local_expm1
- 0.001s - local_cast_cast
- 0.001s - local_IncSubtensor_serialize
- 0.001s - local_useless_slice
- 0.001s - local_cut_gpu_transfers
- 0.000s - f
- 0.000s - local_zero_div
- 0.000s - local_abs_lift
- 0.000s - local_subtensor_remove_broadcastable_index
- 0.000s - local_lift_transpose_through_dot
- 0.000s - local_div_switch_sink
- 0.000s - local_dimshuffle_no_inplace_at_canonicalize
- 0.000s - local_pow_canonicalize
- 0.000s - local_0_dot_x
- 0.000s - local_canonicalize_alloc
- 0.000s - local_useless_dimshuffle_in_reshape
- 0.000s - local_useless_reshape
- 0.000s - local_reshape_lift
- 0.000s - local_sum_prod_div_dimshuffle
- 0.000s - local_subtensor_of_alloc
- 0.000s - local_subtensor_of_dot
- 0.000s - local_useless_alloc
- 0.000s - local_reduce_join
- 0.000s - local_op_of_op
- 0.000s - local_useless_reduce
- 0.000s - local_merge_alloc
- 0.000s - local_sum_prod_all_to_none
- 0.000s - local_sumsqr2dot
- 0.000s - local_scalar_tensor_scalar
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (375, 363, 12)
- init io_toposort 0.00319600105286
- loop time 0.0124809741974
- callback_time 0.00716590881348
- MergeOptimizer
- nb fail= 0 merged= 74 constant= 50
- time replace=0.01 validate=0.00 callback=0.01
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (343, 340, 3)
- init io_toposort 0.00289988517761
- loop time 0.00371909141541
- callback_time 0.00157237052917
- MergeOptimizer
- nb fail= 0 merged= 21 constant= 9
- time replace=0.00 validate=0.00 callback=0.00
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00361585617065
- loop time 0.000491142272949
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 1
- time replace=0.00 validate=0.00 callback=0.00
- Iter 3
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00327706336975
- loop time 0.00036096572876
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.232062s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 313, 313) - 0.137s
- 0.221204s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 420, 327) - 0.000s
- FusionOptimizer
- nb_iter 3
- nb_replacement 36
- nb_inconsistency_replace 0
- validate_time 0.000401258468628
- callback_time 0.00900220870972
- time_toposort 0.0129661560059
- 0.144045s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 313, 313) - 0.068s
- 0.079509s - ('specialize', 'EquilibriumOptimizer', 13, 343, 339) - 0.000s
- EquilibriumOptimizer specialize
- time 0.079s for 3 passes
- nb nodes (start, end, max) 343 339 343
- time io_toposort 0.010s
- time in local optimizers 0.040s
- time in global optimizers 0.012s
- time in final optimizers 0.009s
- time in cleanup optimizers 0.000s
- 0 - 0.029s 11 (0.006s in global opts, 0.003s io_toposort) - 343 nodes - ('local_div_to_inv', 6) ('local_mul_specialize', 3) ('local_softmax_with_bias', 1) ('local_argmax_pushdown', 1)
- 1 - 0.025s 1 (0.010s in global opts, 0.003s io_toposort) - 339 nodes - ('crossentropy_to_crossentropy_with_softmax_with_bias', 1)
- 2 - 0.025s 0 (0.006s in global opts, 0.004s io_toposort) - 339 nodes -
- times - times applied - nb node created - name:
- 0.012s - 1 - 1 - crossentropy_to_crossentropy_with_softmax_with_bias
- 0.003s - 3 - 0 - local_mul_specialize
- 0.003s - 6 - 6 - local_div_to_inv
- 0.001s - 1 - 1 - local_argmax_pushdown
- 0.000s - 1 - 1 - local_softmax_with_bias
- 0.043s - in 70 optimization that were not used (display only those with a runtime > 0)
- 0.009s - topo_constant_folding
- 0.005s - local_add_specialize
- 0.003s - local_func_inv
- 0.003s - local_useless_elemwise
- 0.003s - local_elemwise_alloc
- 0.003s - local_one_minus_erf2
- 0.002s - local_one_minus_erf
- 0.002s - local_useless_elemwise_comparison
- 0.001s - local_track_shape_i
- 0.001s - local_abs_merge
- 0.001s - local_mul_switch_sink
- 0.001s - local_useless_switch
- 0.001s - local_elemwise_sub_zeros
- 0.001s - local_expm1
- 0.001s - local_useless_subtensor
- 0.001s - local_logsoftmax
- 0.001s - local_cast_cast
- 0.001s - local_alloc_unary
- 0.000s - local_dimshuffle_lift
- 0.000s - local_mul_to_sqr
- 0.000s - local_remove_useless_assert
- 0.000s - local_useless_slice
- 0.000s - local_pow_specialize
- 0.000s - local_subtensor_remove_broadcastable_index
- 0.000s - local_sum_prod_mul_by_scalar
- 0.000s - local_zero_div
- 0.000s - local_sum_prod_div_dimshuffle
- 0.000s - local_subtensor_make_vector
- 0.000s - local_grad_log_erfc_neg
- 0.000s - local_reduce_broadcastable
- 0.000s - local_subtensor_merge
- 0.000s - local_subtensor_of_dot
- 0.000s - local_subtensor_of_alloc
- 0.000s - local_sumsqr2dot
- 0.000s - local_opt_alloc
- 0.000s - local_scalar_tensor_scalar
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.0026798248291
- loop time 0.000288963317871
- callback_time 0.0
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (339, 339, 0)
- init io_toposort 0.00300598144531
- loop time 0.000308036804199
- callback_time 0.0
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (339, 339, 0)
- init io_toposort 0.00263500213623
- loop time 0.000291109085083
- callback_time 0.0
- 0.058268s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 346, 343) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.058s for 2 passes
- nb nodes (start, end, max) 346 343 346
- time io_toposort 0.005s
- time in local optimizers 0.000s
- time in global optimizers 0.052s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.031s 1 (0.027s in global opts, 0.003s io_toposort) - 343 nodes - ('constant_folding', 1)
- 1 - 0.028s 0 (0.024s in global opts, 0.003s io_toposort) - 343 nodes -
- times - times applied - nb node created - name:
- 0.009s - 1 - 0 - constant_folding
- 0.043s - in 6 optimization that were not used (display only those with a runtime > 0)
- 0.008s - <theano.scan_module.scan_opt.ScanMerge object at 0x10f69f750>
- 0.008s - remove_constants_and_unused_inputs_scan
- 0.007s - scan_merge_inouts
- 0.007s - remove_constants_and_unused_inputs_scan
- 0.006s - remove_constants_and_unused_inputs_scan
- 0.006s - <theano.scan_module.scan_opt.ScanSaveMem object at 0x10f69fa10>
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (346, 343, 3)
- init io_toposort 0.00319814682007
- loop time 0.00264716148376
- callback_time 0.000752687454224
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00274419784546
- loop time 0.000353097915649
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00378894805908
- loop time 0.000379085540771
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.0033597946167
- loop time 0.000405073165894
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00262212753296
- loop time 0.000285148620605
- callback_time 0.0
- Iter 1
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.0027289390564
- loop time 0.000293016433716
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00342202186584
- loop time 0.000372171401978
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00345611572266
- loop time 0.000340938568115
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00299596786499
- loop time 0.000335931777954
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.0031099319458
- loop time 0.000379085540771
- callback_time 0.0
- 0.054531s - ('ShapeOpt', 'ShapeOptimizer', 2, 330, 330) - 0.000s
- 0.040722s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 313, 313) - 0.005s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (313, 313, 51)
- init io_toposort 0.00292992591858
- loop time 0.0377469062805
- callback_time 0.0264596939087
- 0.038850s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 313, 313) - 0.025s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (313, 313, 13)
- init io_toposort 0.00262403488159
- loop time 0.0361239910126
- callback_time 0.0301699638367
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.024305s - ('BlasOpt', 'SeqOptimizer', 12, 343, 343) - 0.000s
- SeqOptimizer BlasOpt time 0.024s for 343/343 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.004798s - ('gemm_optimizer', 'GemmOptimizer', 1, 343, 343) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0.000886917114258
- time_factor_can 0
- time_factor_list 0
- time_toposort 0.00261306762695
- validate_time 0.0
- callback_time 0.0
- 0.004567s - ('use_c_blas', 'TopoOptimizer', 4, 343, 343) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00274109840393
- loop time 0.00178909301758
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004229s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 343, 343) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (343, 343, 4)
- init io_toposort 0.00279211997986
- loop time 0.00139617919922
- callback_time 0.000430583953857
- 0.003866s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 343, 343) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.004s for 1 passes
- nb nodes (start, end, max) 343 343 343
- time io_toposort 0.003s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.004s 0 (0.000s in global opts, 0.003s io_toposort) - 343 nodes -
- 0.003740s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 343, 343) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00274991989136
- loop time 0.000944852828979
- callback_time 0.0
- 0.002956s - ('use_scipy_ger', 'TopoOptimizer', 5, 343, 343) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00257515907288
- loop time 0.000352144241333
- callback_time 0.0
- 0.021485s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 313, 313) - 0.015s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (313, 313, 3)
- init io_toposort 0.00355195999146
- loop time 0.0178859233856
- callback_time 0.0168223381042
- 0.021211s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 330, 330) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.021s for 1 passes
- nb nodes (start, end, max) 330 330 330
- time io_toposort 0.003s
- time in local optimizers 0.000s
- time in global optimizers 0.018s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.021s 0 (0.018s in global opts, 0.003s io_toposort) - 330 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.018s for 330/330 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.004939s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 330, 330) - 0.000s
- 0.004040s - ('scan_pushout_dot1', 'PushOutDot1', 3, 330, 330) - 0.000s
- 0.003186s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 330, 330) - 0.000s
- 0.002830s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 330, 330) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (330, 330, 0)
- init io_toposort 0.00245404243469
- loop time 0.000336885452271
- callback_time 0.0
- 0.002699s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 330, 330) - 0.000s
- 0.018380s - ('stabilize', 'EquilibriumOptimizer', 8, 340, 340) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.018s for 1 passes
- nb nodes (start, end, max) 340 340 340
- time io_toposort 0.003s
- time in local optimizers 0.008s
- time in global optimizers 0.003s
- time in final optimizers 0.003s
- time in cleanup optimizers 0.000s
- 0 - 0.018s 0 (0.006s in global opts, 0.003s io_toposort) - 340 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00290107727051
- loop time 0.000303030014038
- callback_time 0.0
- 0.014653s - ('add_destroy_handler', 'AddDestroyHandler', 23, 313, 313) - 0.000s
- 0.014369s - ('merge2', 'MergeOptimizer', 22, 327, 313) - 0.002s
- MergeOptimizer
- nb fail= 0 merged= 192 constant= 144
- time replace=0.01 validate=0.00 callback=0.01
- 0.010946s - ('merge1', 'MergeOptimizer', 0, 385, 330) - 0.001s
- MergeOptimizer
- nb fail= 0 merged= 109 constant= 54
- time replace=0.01 validate=0.00 callback=0.00
- 0.008074s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 339, 339) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.008s for 1 passes
- nb nodes (start, end, max) 339 339 339
- time io_toposort 0.003s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.005s
- time in cleanup optimizers 0.000s
- 0 - 0.008s 0 (0.005s in global opts, 0.003s io_toposort) - 339 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (339, 339, 0)
- init io_toposort 0.00420188903809
- loop time 0.000434875488281
- callback_time 0.0
- 0.007642s - ('useless', 'TopoOptimizer', 3, 330, 330) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (330, 330, 0)
- init io_toposort 0.00319695472717
- loop time 0.00438714027405
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 3 - local_merge_alloc - 0
- -0.000s - 0 - 3 - local_useless_alloc - 0
- -0.000s - 0 - 3 - local_useless_reduce - 0
- -0.000s - 0 - 7 - local_useless_reshape - 0
- -0.000s - 0 - 22 - local_subtensor_of_alloc - 0
- -0.000s - 0 - 22 - local_subtensor_make_vector - 0
- -0.000s - 0 - 137 - local_useless_switch - 0
- -0.000s - 0 - 22 - local_useless_slice - 0
- -0.000s - 0 - 137 - local_useless_elemwise_comparison - 0
- -0.001s - 0 - 137 - local_useless_elemwise - 0
- 0.000s - in 9 optimization that were not used (display those with runtime greater than 0)
- 0.007514s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 420, 420) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 0.00684809684753
- 0.006168s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 313, 313) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.0039050579071
- loop time 0.00214195251465
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.006007s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 340, 346) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (340, 346, 3)
- init io_toposort 0.00270009040833
- loop time 0.00326800346375
- callback_time 0.000940084457397
- 0.005795s - ('specialize_device', 'EquilibriumOptimizer', 17, 574, 574) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.006s for 1 passes
- nb nodes (start, end, max) 574 574 574
- time io_toposort 0.005s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.006s 0 (0.000s in global opts, 0.005s io_toposort) - 574 nodes -
- 0.005605s - ('AbstractConvCheck', 'TopoOptimizer', 18, 574, 574) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (574, 574, 0)
- init io_toposort 0.00474500656128
- loop time 0.000811100006104
- callback_time 0.0
- 0.005336s - ('c_blas_destructive', 'TopoOptimizer', 37, 313, 313) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00340914726257
- loop time 0.00188398361206
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.005236s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 313, 313) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00278186798096
- loop time 0.00232291221619
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.005101s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 313, 313) - 0.000s
- 0.004823s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 313, 313) - 0.000s
- 0.004742s - ('blas_opt_inplace', 'TopoOptimizer', 34, 313, 313) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00268507003784
- loop time 0.00196695327759
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004588s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 313, 313) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00413489341736
- loop time 0.000356912612915
- callback_time 0.0
- 0.004424s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 313, 313) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00266098976135
- loop time 0.00166201591492
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004422s - ('random_make_inplace', 'TopoOptimizer', 49, 313, 313) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00393199920654
- loop time 0.000427007675171
- callback_time 0.0
- 0.004200s - ('make_ger_destructive', 'TopoOptimizer', 41, 313, 313) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00368809700012
- loop time 0.000468015670776
- callback_time 0.0
- 0.004082s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 313, 313) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00373101234436
- loop time 0.000315189361572
- callback_time 0.0
- 0.003584s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 313, 313) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00316691398621
- loop time 0.000317811965942
- callback_time 0.0
- 0.003561s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 330, 330) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (330, 330, 0)
- init io_toposort 0.00282597541809
- loop time 0.000701904296875
- callback_time 0.0
- 0.003460s - ('local_destructive', 'TopoOptimizer', 48, 313, 313) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00305008888245
- loop time 0.000365972518921
- callback_time 0.0
- 0.003311s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 340, 340) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00266814231873
- loop time 0.000611066818237
- callback_time 0.0
- 0.003242s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 313, 313) - 0.000s
- 0.003147s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 313, 313) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00282001495361
- loop time 0.000253915786743
- callback_time 0.0
- 0.003041s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 313, 313) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00271677970886
- loop time 0.00026798248291
- callback_time 0.0
- 0.003027s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00271415710449
- loop time 0.000258922576904
- callback_time 0.0
- 0.002971s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00265693664551
- loop time 0.000259160995483
- callback_time 0.0
- 0.002969s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00266313552856
- loop time 0.000251054763794
- callback_time 0.0
- 0.002954s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 313, 313) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00263500213623
- loop time 0.000256061553955
- callback_time 0.0
- 0.002952s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00263714790344
- loop time 0.000253200531006
- callback_time 0.0
- 0.002938s - ('cond_make_inplace', 'TopoOptimizer', 47, 313, 313) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00264000892639
- loop time 0.000258207321167
- callback_time 0.0
- 0.002860s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 339, 339) - 0.000s
- 0.000331s - ('merge3', 'MergeOptimizer', 51, 313, 313) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000049s - ('merge1.2', 'MergeOptimizer', 7, 340, 340) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000028s - ('merge1.1', 'MergeOptimizer', 4, 330, 330) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
- Function profiling
- ==================
- Message: sb/convnet/sb_resnet.py:349
- Time in 0 calls to Function.__call__: 0.000000e+00s
- Total compile time: 3.226146e+00s
- Number of Apply nodes: 313
- Theano Optimizer time: 2.992740e+00s
- Theano validate time: 2.675121e-01s
- Theano Linker time (includes C, CUDA code generation/compiling): 1.833398e-01s
- Import time 0.000000e+00s
- Node make_thunk time 1.669791e-01s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 477.967s
- Optimizer Profile
- -----------------
- SeqOptimizer OPT_FAST_RUN time 2.992s for 385/313 nodes before/after optimization
- 0.486s for callback
- 0.268s for fgraph.validate()
- time - (name, class, index, nodes before, nodes after) - validate time
- 1.219774s - ('canonicalize', 'EquilibriumOptimizer', 6, 330, 340) - 0.003s
- EquilibriumOptimizer canonicalize
- time 1.219s for 4 passes
- nb nodes (start, end, max) 330 340 360
- time io_toposort 0.011s
- time in local optimizers 1.152s
- time in global optimizers 0.000s
- time in final optimizers 0.030s
- time in cleanup optimizers 0.013s
- 0 - 0.138s 83 (0.015s in global opts, 0.003s io_toposort) - 330 nodes - ('MergeOptimizer', 31) ('local_add_canonizer', 10) ('local_mul_canonizer', 10) ('local_shape_to_shape_i', 6) ('local_subtensor_make_vector', 5) ...
- 1 - 1.012s 46 (0.007s in global opts, 0.003s io_toposort) - 360 nodes - ('MergeOptimizer', 14) ('local_dimshuffle_lift', 7) ('local_subtensor_make_vector', 7) ('local_mul_canonizer', 6) ('local_upcast_elemwise_constant_inputs', 6) ...
- 2 - 0.036s 2 (0.004s in global opts, 0.003s io_toposort) - 340 nodes - ('MergeOptimizer', 1) ('local_mul_canonizer', 1)
- 3 - 0.033s 0 (0.004s in global opts, 0.003s io_toposort) - 340 nodes -
- times - times applied - nb node created - name:
- 0.963s - 3 - 6 - local_reshape_to_dimshuffle
- 0.044s - 5 - 10 - local_subtensor_merge
- 0.030s - 2 - 0 - topo_constant_folding
- 0.026s - 17 - 23 - local_mul_canonizer
- 0.022s - 11 - 22 - local_add_canonizer
- 0.013s - 46 - 1 - MergeOptimizer
- 0.011s - 9 - 27 - local_upcast_elemwise_constant_inputs
- 0.007s - 6 - 58 - local_shape_to_shape_i
- 0.007s - 11 - 18 - local_dimshuffle_lift
- 0.003s - 12 - 0 - local_subtensor_make_vector
- 0.002s - 3 - 0 - local_intdiv_by_one
- 0.002s - 1 - 0 - local_useless_switch
- 0.001s - 2 - 4 - local_subtensor_lift
- 0.001s - 2 - 0 - local_useless_fill
- 0.000s - 1 - 1 - local_neg_to_mul
- 0.064s - in 74 optimization that were not used (display only those with a runtime > 0)
- 0.016s - local_greedy_distributor
- 0.007s - local_mul_zero
- 0.006s - local_fill_sink
- 0.005s - local_func_inv
- 0.004s - local_useless_elemwise
- 0.003s - local_one_minus_erf2
- 0.003s - local_useless_elemwise_comparison
- 0.003s - local_merge_switch_same_cond
- 0.002s - local_one_minus_erf
- 0.002s - local_track_shape_i
- 0.001s - local_fill_cut
- 0.001s - local_mul_switch_sink
- 0.001s - local_useless_subtensor
- 0.001s - local_expm1
- 0.001s - local_cast_cast
- 0.001s - local_IncSubtensor_serialize
- 0.001s - local_cut_gpu_transfers
- 0.001s - local_useless_slice
- 0.000s - local_zero_div
- 0.000s - f
- 0.000s - local_lift_transpose_through_dot
- 0.000s - local_abs_lift
- 0.000s - local_subtensor_remove_broadcastable_index
- 0.000s - local_dimshuffle_no_inplace_at_canonicalize
- 0.000s - local_div_switch_sink
- 0.000s - local_pow_canonicalize
- 0.000s - local_0_dot_x
- 0.000s - local_canonicalize_alloc
- 0.000s - local_useless_reshape
- 0.000s - local_reshape_lift
- 0.000s - local_useless_dimshuffle_in_reshape
- 0.000s - local_subtensor_of_alloc
- 0.000s - local_subtensor_of_dot
- 0.000s - local_sum_prod_div_dimshuffle
- 0.000s - local_useless_alloc
- 0.000s - local_scalar_tensor_scalar
- 0.000s - local_op_of_op
- 0.000s - local_merge_alloc
- 0.000s - local_sum_prod_all_to_none
- 0.000s - local_useless_reduce
- 0.000s - local_sumsqr2dot
- 0.000s - local_reduce_join
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (375, 363, 12)
- init io_toposort 0.00306487083435
- loop time 0.012228012085
- callback_time 0.00723528862
- MergeOptimizer
- nb fail= 0 merged= 74 constant= 50
- time replace=0.01 validate=0.00 callback=0.01
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (343, 340, 3)
- init io_toposort 0.00357699394226
- loop time 0.00341296195984
- callback_time 0.00148725509644
- MergeOptimizer
- nb fail= 0 merged= 21 constant= 9
- time replace=0.00 validate=0.00 callback=0.00
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00325512886047
- loop time 0.000423908233643
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 1 constant= 1
- time replace=0.00 validate=0.00 callback=0.00
- Iter 3
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00319790840149
- loop time 0.000303983688354
- callback_time 0.0
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.340164s - ('elemwise_fusion', 'SeqOptimizer', 19, 574, 420) - 0.001s
- SeqOptimizer elemwise_fusion time 0.340s for 574/420 nodes before/after optimization
- 0.011s for callback
- 0.001s for fgraph.validate()
- 0.316551s - ('composite_elemwise_fusion', 'FusionOptimizer', 1, 561, 420) - 0.000s
- FusionOptimizer
- nb_iter 3
- nb_replacement 37
- nb_inconsistency_replace 0
- validate_time 0.00044584274292
- callback_time 0.00821328163147
- time_toposort 0.0146968364716
- 0.023436s - ('local_add_mul_fusion', 'FusionOptimizer', 0, 574, 561) - 0.000s
- FusionOptimizer
- nb_iter 2
- nb_replacement 23
- nb_inconsistency_replace 0
- validate_time 0.000319242477417
- callback_time 0.00327849388123
- time_toposort 0.0101230144501
- 0.290148s - ('gpuarray_opt', 'SeqOptimizer', 16, 339, 574) - 0.001s
- SeqOptimizer gpuarray_opt time 0.290s for 339/574 nodes before/after optimization
- 0.100s for callback
- 0.001s for fgraph.validate()
- 0.164832s - ('gpuarray_local_optimizations', 'EquilibriumOptimizer', 2, 313, 603) - 0.001s
- EquilibriumOptimizer gpuarray_local_optimizations
- time 0.164s for 3 passes
- nb nodes (start, end, max) 313 603 638
- time io_toposort 0.016s
- time in local optimizers 0.138s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.108s 55 (0.000s in global opts, 0.003s io_toposort) - 313 nodes - ('constant_folding', 28) ('local_gpua_elemwise', 13) ('local_abstractconv_cudnn', 13) ('local_gpua_dimshuffle', 1)
- 1 - 0.037s 35 (0.000s in global opts, 0.006s io_toposort) - 638 nodes - ('constant_folding', 35)
- 2 - 0.020s 0 (0.000s in global opts, 0.008s io_toposort) - 603 nodes -
- times - times applied - nb node created - name:
- 0.054s - 13 - 328 - local_abstractconv_cudnn
- 0.054s - 63 - 0 - constant_folding
- 0.014s - 13 - 49 - local_gpua_elemwise
- 0.001s - 1 - 3 - local_gpua_dimshuffle
- 0.015s - in 66 optimization that were not used (display only those with a runtime > 0)
- 0.002s - local_track_shape_i
- 0.001s - local_gpua_gemm_alpha_merge
- 0.001s - local_dnn_conv_output_merge
- 0.001s - local_gpua_gemm_output_merge
- 0.001s - local_gpua_gemmbatch_alpha_merge
- 0.001s - local_gemm16_alpha_merge
- 0.001s - local_dnn_conv_alpha_merge
- 0.001s - local_dnn_convi_alpha_merge
- 0.001s - local_dnn_convw_alpha_merge
- 0.001s - local_dnn_convw_output_merge
- 0.001s - local_gpua_gemmbatch_output_merge
- 0.001s - local_dnn_convi_output_merge
- 0.001s - local_gemm16_output_merge
- 0.001s - local_log_softmax_dnn
- 0.000s - local_gpua_shape
- 0.000s - local_gpu_contiguous_gpu_contiguous
- 0.000s - local_gpua_abstractconv2d
- 0.000s - local_gpua_assert
- 0.000s - local_gpu_elemwise_careduce
- 0.105904s - ('gpuarray_graph_optimization', 'GraphToGPU', 0, 339, 313) - 0.000s
- GraphToGPUOptimizer gpuarray_graph_optimization
- time io_toposort 0.004s
- Total time taken by local optimizers 0.010s
- times - times applied - Node created - name:
- 0.006s - 3 - 3 - local_gpua_careduce
- 0.003s - 120 - 154 - local_gpua_elemwise
- 0.001s - 13 - 13 - local_gpua_lift_abstractconv2d_graph
- 0.001s - 50 - 50 - local_gpua_dimshuffle
- 0.000s - 3 - 6 - local_gpua_mrg_graph
- 0.000s - 8 - 8 - local_gpua_subtensor_graph
- 0.000s - 3 - 3 - local_gpua_assert_graph
- 0.000s - 7 - 7 - local_gpua_reshape
- 0.000s - 4 - 4 - local_gpua_dot22
- 0.000s - 1 - 1 - local_gpua_crossentropysoftmaxargmax1hotwithbias
- 0.000s - in 1 optimization that were not used (display only those with a runtime > 0)
- 0.019174s - ('gpuarray_cut_transfers', 'EquilibriumOptimizer', 3, 603, 574) - 0.000s
- EquilibriumOptimizer gpuarray_cut_transfers
- time 0.019s for 2 passes
- nb nodes (start, end, max) 603 574 603
- time io_toposort 0.012s
- time in local optimizers 0.004s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.012s 15 (0.000s in global opts, 0.007s io_toposort) - 603 nodes - ('local_cut_gpu_transfers', 15)
- 1 - 0.007s 0 (0.000s in global opts, 0.005s io_toposort) - 574 nodes -
- times - times applied - nb node created - name:
- 0.003s - 15 - 0 - local_cut_gpu_transfers
- 0.001s - in 1 optimization that were not used (display only those with a runtime > 0)
- 0.001s - constant_folding
- 0.000047s - ('InputToGpuArrayOptimizer', 'InputToGpuOptimizer', 1, 313, 313) - 0.000s
- 0.230584s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 45, 313, 313) - 0.137s
- 0.229299s - ('gpua_elemwise_fusion', 'FusionOptimizer', 21, 420, 327) - 0.000s
- FusionOptimizer
- nb_iter 3
- nb_replacement 36
- nb_inconsistency_replace 0
- validate_time 0.000416994094849
- callback_time 0.00961184501648
- time_toposort 0.0111560821533
- 0.156492s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 43, 313, 313) - 0.076s
- 0.071392s - ('specialize', 'EquilibriumOptimizer', 13, 343, 339) - 0.000s
- EquilibriumOptimizer specialize
- time 0.071s for 3 passes
- nb nodes (start, end, max) 343 339 343
- time io_toposort 0.008s
- time in local optimizers 0.035s
- time in global optimizers 0.012s
- time in final optimizers 0.009s
- time in cleanup optimizers 0.000s
- 0 - 0.024s 11 (0.006s in global opts, 0.003s io_toposort) - 343 nodes - ('local_div_to_inv', 6) ('local_mul_specialize', 3) ('local_softmax_with_bias', 1) ('local_argmax_pushdown', 1)
- 1 - 0.025s 1 (0.010s in global opts, 0.003s io_toposort) - 339 nodes - ('crossentropy_to_crossentropy_with_softmax_with_bias', 1)
- 2 - 0.022s 0 (0.006s in global opts, 0.003s io_toposort) - 339 nodes -
- times - times applied - nb node created - name:
- 0.012s - 1 - 1 - crossentropy_to_crossentropy_with_softmax_with_bias
- 0.003s - 3 - 0 - local_mul_specialize
- 0.002s - 6 - 6 - local_div_to_inv
- 0.001s - 1 - 1 - local_argmax_pushdown
- 0.000s - 1 - 1 - local_softmax_with_bias
- 0.039s - in 70 optimization that were not used (display only those with a runtime > 0)
- 0.009s - topo_constant_folding
- 0.005s - local_add_specialize
- 0.003s - local_func_inv
- 0.003s - local_elemwise_alloc
- 0.003s - local_useless_elemwise
- 0.002s - local_one_minus_erf2
- 0.002s - local_one_minus_erf
- 0.002s - local_useless_elemwise_comparison
- 0.001s - local_track_shape_i
- 0.001s - local_abs_merge
- 0.001s - local_mul_switch_sink
- 0.001s - local_useless_switch
- 0.001s - local_expm1
- 0.001s - local_elemwise_sub_zeros
- 0.001s - local_logsoftmax
- 0.001s - local_useless_subtensor
- 0.001s - local_cast_cast
- 0.001s - local_dimshuffle_lift
- 0.001s - local_alloc_unary
- 0.000s - local_mul_to_sqr
- 0.000s - local_remove_useless_assert
- 0.000s - local_useless_slice
- 0.000s - local_pow_specialize
- 0.000s - local_subtensor_remove_broadcastable_index
- 0.000s - local_subtensor_make_vector
- 0.000s - local_zero_div
- 0.000s - local_sum_prod_mul_by_scalar
- 0.000s - local_subtensor_merge
- 0.000s - local_sum_prod_div_dimshuffle
- 0.000s - local_subtensor_of_alloc
- 0.000s - local_grad_log_erfc_neg
- 0.000s - local_subtensor_of_dot
- 0.000s - local_reduce_broadcastable
- 0.000s - local_sumsqr2dot
- 0.000s - local_opt_alloc
- 0.000s - local_scalar_tensor_scalar
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00276398658752
- loop time 0.000319004058838
- callback_time 0.0
- Iter 1
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (339, 339, 0)
- init io_toposort 0.00275802612305
- loop time 0.000292062759399
- callback_time 0.0
- Iter 2
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (339, 339, 0)
- init io_toposort 0.00278401374817
- loop time 0.000298976898193
- callback_time 0.0
- 0.055384s - ('scan_eqopt2', 'EquilibriumOptimizer', 11, 346, 343) - 0.000s
- EquilibriumOptimizer scan_eqopt2
- time 0.055s for 2 passes
- nb nodes (start, end, max) 346 343 346
- time io_toposort 0.006s
- time in local optimizers 0.000s
- time in global optimizers 0.048s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.031s 1 (0.027s in global opts, 0.003s io_toposort) - 343 nodes - ('constant_folding', 1)
- 1 - 0.024s 0 (0.021s in global opts, 0.003s io_toposort) - 343 nodes -
- times - times applied - nb node created - name:
- 0.009s - 1 - 0 - constant_folding
- 0.039s - in 6 optimization that were not used (display only those with a runtime > 0)
- 0.007s - scan_merge_inouts
- 0.007s - <theano.scan_module.scan_opt.ScanSaveMem object at 0x10f69fa10>
- 0.007s - remove_constants_and_unused_inputs_scan
- 0.006s - remove_constants_and_unused_inputs_scan
- 0.006s - remove_constants_and_unused_inputs_scan
- 0.006s - <theano.scan_module.scan_opt.ScanMerge object at 0x10f69f750>
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (346, 343, 3)
- init io_toposort 0.00302219390869
- loop time 0.00299000740051
- callback_time 0.000831604003906
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.0029149055481
- loop time 0.000319004058838
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00260496139526
- loop time 0.000286817550659
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.0032901763916
- loop time 0.000686883926392
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00342392921448
- loop time 0.000342845916748
- callback_time 0.0
- Iter 1
- TopoOptimizer constant_folding_for_scan2
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00266885757446
- loop time 0.000299215316772
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs1
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.002690076828
- loop time 0.000297069549561
- callback_time 0.0
- TopoOptimizer scanop_remove_constants_and_unused_inputs2
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00262594223022
- loop time 0.000285863876343
- callback_time 0.0
- TopoOptimizer scanOp_merge_inouts
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00301003456116
- loop time 0.000369071960449
- callback_time 0.0
- TopoOptimizer scanOp_remove_constants_and_unused_inputs3
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00259804725647
- loop time 0.000284910202026
- callback_time 0.0
- 0.050506s - ('ShapeOpt', 'ShapeOptimizer', 2, 330, 330) - 0.000s
- 0.041654s - ('dimshuffle_as_view', 'TopoOptimizer', 24, 313, 313) - 0.005s
- TopoOptimizer dimshuffle_as_view
- nb_node (start, end, changed) (313, 313, 51)
- init io_toposort 0.00297808647156
- loop time 0.0386300086975
- callback_time 0.0270938873291
- 0.039686s - ('local_dnna_conv_inplace', 'TopoOptimizer', 39, 313, 313) - 0.026s
- TopoOptimizer local_dnna_conv_inplace
- nb_node (start, end, changed) (313, 313, 13)
- init io_toposort 0.00298881530762
- loop time 0.0365929603577
- callback_time 0.0307540893555
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.004s - 36 - 72 - local_dnn_convgi_inplace - 36
- -0.004s - 43 - 86 - local_dnn_convgw_inplace - 43
- -0.005s - 55 - 110 - local_dnn_conv_inplace - 56
- 0.000s - in 0 optimization that were not used (display those with runtime greater than 0)
- 0.024131s - ('BlasOpt', 'SeqOptimizer', 12, 343, 343) - 0.000s
- SeqOptimizer BlasOpt time 0.024s for 343/343 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.004802s - ('gemm_optimizer', 'GemmOptimizer', 1, 343, 343) - 0.000s
- GemmOptimizer
- nb_iter 1
- nb_replacement 0
- nb_replacement_didn_t_remove 0
- nb_inconsistency_make 0
- nb_inconsistency_replace 0
- time_canonicalize 0.000934362411499
- time_factor_can 0
- time_factor_list 0
- time_toposort 0.00260996818542
- validate_time 0.0
- callback_time 0.0
- 0.004657s - ('use_c_blas', 'TopoOptimizer', 4, 343, 343) - 0.000s
- TopoOptimizer use_c_blas
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00270009040833
- loop time 0.00190901756287
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004111s - ('local_dot_to_dot22', 'TopoOptimizer', 0, 343, 343) - 0.000s
- TopoOptimizer local_dot_to_dot22
- nb_node (start, end, changed) (343, 343, 4)
- init io_toposort 0.00284814834595
- loop time 0.0012309551239
- callback_time 0.000388622283936
- 0.003664s - ('local_gemm_to_gemv', 'EquilibriumOptimizer', 3, 343, 343) - 0.000s
- EquilibriumOptimizer local_gemm_to_gemv
- time 0.004s for 1 passes
- nb nodes (start, end, max) 343 343 343
- time io_toposort 0.003s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.004s 0 (0.000s in global opts, 0.003s io_toposort) - 343 nodes -
- 0.003414s - ('local_dot22_to_dot22scalar', 'TopoOptimizer', 2, 343, 343) - 0.000s
- TopoOptimizer local_dot22_to_dot22scalar
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00264620780945
- loop time 0.000739097595215
- callback_time 0.0
- 0.003338s - ('use_scipy_ger', 'TopoOptimizer', 5, 343, 343) - 0.000s
- TopoOptimizer scipy_blas
- nb_node (start, end, changed) (343, 343, 0)
- init io_toposort 0.00294184684753
- loop time 0.000363111495972
- callback_time 0.0
- 0.021946s - ('stabilize', 'EquilibriumOptimizer', 8, 340, 340) - 0.000s
- EquilibriumOptimizer stabilize
- time 0.022s for 1 passes
- nb nodes (start, end, max) 340 340 340
- time io_toposort 0.004s
- time in local optimizers 0.008s
- time in global optimizers 0.006s
- time in final optimizers 0.003s
- time in cleanup optimizers 0.000s
- 0 - 0.022s 0 (0.009s in global opts, 0.004s io_toposort) - 340 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.00267601013184
- loop time 0.000298976898193
- callback_time 0.0
- 0.020503s - ('mrg_random_make_inplace', 'TopoOptimizer', 50, 313, 313) - 0.015s
- TopoOptimizer random_make_inplace_mrg
- nb_node (start, end, changed) (313, 313, 3)
- init io_toposort 0.00290107727051
- loop time 0.0175631046295
- callback_time 0.0165855884552
- 0.018693s - ('merge2', 'MergeOptimizer', 22, 327, 313) - 0.002s
- MergeOptimizer
- nb fail= 0 merged= 192 constant= 144
- time replace=0.02 validate=0.00 callback=0.01
- 0.017984s - ('scan_eqopt1', 'EquilibriumOptimizer', 1, 330, 330) - 0.000s
- EquilibriumOptimizer scan_eqopt1
- time 0.018s for 1 passes
- nb nodes (start, end, max) 330 330 330
- time io_toposort 0.003s
- time in local optimizers 0.000s
- time in global optimizers 0.015s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.018s 0 (0.015s in global opts, 0.003s io_toposort) - 330 nodes -
- Global, final and clean up optimizers
- Iter 0
- SeqOptimizer all_pushout_opt time 0.015s for 330/330 nodes before/after optimization
- 0.000s for callback
- 0.000s for fgraph.validate()
- 0.003534s - ('remove_constants_and_unused_inputs_scan', 'TopoOptimizer', 0, 330, 330) - 0.000s
- TopoOptimizer scanOp_remove_constants_and_unused_inputs0
- nb_node (start, end, changed) (330, 330, 0)
- init io_toposort 0.00314712524414
- loop time 0.000341892242432
- callback_time 0.0
- 0.003078s - ('scanOp_pushout_nonseqs_ops', 'PushOutNonSeqScan', 1, 330, 330) - 0.000s
- 0.002669s - ('scanOp_pushout_seqs_ops', 'PushOutSeqScan', 2, 330, 330) - 0.000s
- 0.002653s - ('scanOp_pushout_output', 'PushOutScanOutput', 4, 330, 330) - 0.000s
- 0.002631s - ('scan_pushout_dot1', 'PushOutDot1', 3, 330, 330) - 0.000s
- 0.014251s - ('add_destroy_handler', 'AddDestroyHandler', 23, 313, 313) - 0.000s
- 0.010826s - ('merge1', 'MergeOptimizer', 0, 385, 330) - 0.001s
- MergeOptimizer
- nb fail= 0 merged= 109 constant= 54
- time replace=0.01 validate=0.00 callback=0.00
- 0.007993s - ('local_elemwise_alloc', 'TopoOptimizer', 10, 340, 346) - 0.000s
- TopoOptimizer local_elemwise_alloc
- nb_node (start, end, changed) (340, 346, 3)
- init io_toposort 0.00397491455078
- loop time 0.0039701461792
- callback_time 0.00116753578186
- 0.006738s - ('useless', 'TopoOptimizer', 3, 330, 330) - 0.000s
- TopoOptimizer useless
- nb_node (start, end, changed) (330, 330, 0)
- init io_toposort 0.00299000740051
- loop time 0.00368285179138
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.000s - 0 - 3 - local_merge_alloc - 0
- -0.000s - 0 - 3 - local_useless_reduce - 0
- -0.000s - 0 - 3 - local_useless_alloc - 0
- -0.000s - 0 - 7 - local_useless_reshape - 0
- -0.000s - 0 - 22 - local_subtensor_of_alloc - 0
- -0.000s - 0 - 22 - local_subtensor_make_vector - 0
- -0.000s - 0 - 137 - local_useless_switch - 0
- -0.000s - 0 - 22 - local_useless_slice - 0
- -0.000s - 0 - 137 - local_useless_elemwise_comparison - 0
- -0.001s - 0 - 137 - local_useless_elemwise - 0
- 0.000s - in 9 optimization that were not used (display those with runtime greater than 0)
- 0.006556s - ('uncanonicalize', 'EquilibriumOptimizer', 15, 339, 339) - 0.000s
- EquilibriumOptimizer uncanonicalize
- time 0.006s for 1 passes
- nb nodes (start, end, max) 339 339 339
- time io_toposort 0.003s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.003s
- time in cleanup optimizers 0.000s
- 0 - 0.006s 0 (0.003s in global opts, 0.003s io_toposort) - 339 nodes -
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (339, 339, 0)
- init io_toposort 0.00266122817993
- loop time 0.000303983688354
- callback_time 0.0
- 0.005904s - ('local_inplace_sparse_block_outer', 'TopoOptimizer', 31, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparse_block_outer
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.0051109790802
- loop time 0.000679016113281
- callback_time 0.0
- 0.005808s - ('specialize_device', 'EquilibriumOptimizer', 17, 574, 574) - 0.000s
- EquilibriumOptimizer specialize_device
- time 0.006s for 1 passes
- nb nodes (start, end, max) 574 574 574
- time io_toposort 0.005s
- time in local optimizers 0.000s
- time in global optimizers 0.000s
- time in final optimizers 0.000s
- time in cleanup optimizers 0.000s
- 0 - 0.006s 0 (0.000s in global opts, 0.005s io_toposort) - 574 nodes -
- 0.005567s - ('scanOp_make_inplace', 'ScanInplaceOptimizer', 46, 313, 313) - 0.000s
- 0.005320s - ('AbstractConvCheck', 'TopoOptimizer', 18, 574, 574) - 0.000s
- TopoOptimizer AbstractConvCheck
- nb_node (start, end, changed) (574, 574, 0)
- init io_toposort 0.00467395782471
- loop time 0.000613927841187
- callback_time 0.0
- 0.005308s - ('blas_opt_inplace', 'TopoOptimizer', 34, 313, 313) - 0.000s
- TopoOptimizer InplaceBlasOpt
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00336408615112
- loop time 0.00183486938477
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004798s - ('cond_make_inplace', 'TopoOptimizer', 47, 313, 313) - 0.000s
- TopoOptimizer cond_make_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.0043408870697
- loop time 0.000392913818359
- callback_time 0.0
- 0.004715s - ('local_advincsub1_gpua_inplace', 'TopoOptimizer', 25, 313, 313) - 0.000s
- TopoOptimizer local_advincsub1_gpua_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00419616699219
- loop time 0.000470161437988
- callback_time 0.0
- 0.004631s - ('local_dnn_conv_inplace', 'TopoOptimizer', 38, 313, 313) - 0.000s
- TopoOptimizer local_dnn_conv_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00288701057434
- loop time 0.00163793563843
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004615s - ('InplaceGpuBlasOpt', 'TopoOptimizer', 35, 313, 313) - 0.000s
- TopoOptimizer InplaceGpuBlasOpt
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00274896621704
- loop time 0.00176906585693
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004609s - ('local_inplace_sparseblockgemv', 'TopoOptimizer', 32, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparseblockgemv
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.0040340423584
- loop time 0.000416040420532
- callback_time 0.0
- 0.004513s - ('gpu_elemwise_fusion', 'FusionOptimizer', 20, 420, 420) - 0.000s
- FusionOptimizer
- nb_iter 1
- nb_replacement 0
- nb_inconsistency_replace 0
- validate_time 0.0
- callback_time 0.0
- time_toposort 0.00397276878357
- 0.004498s - ('c_blas_destructive', 'TopoOptimizer', 37, 313, 313) - 0.000s
- TopoOptimizer c_blas_destructive
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00261497497559
- loop time 0.00184512138367
- callback_time 0.0
- LocalOptGroup
- ---------------------
- --- The Optimizer wasn't successful ---
- 0.004486s - ('gpua_scanOp_make_inplace', 'ScanInplaceOptimizer', 44, 313, 313) - 0.000s
- 0.004417s - ('gpuablas_opt_inplace', 'TopoOptimizer', 36, 313, 313) - 0.000s
- TopoOptimizer InplaceGpuaBlasOpt
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00274109840393
- loop time 0.00157904624939
- callback_time 0.0
- LocalOptGroup
- ---------------------
- time taken - times applied - times tried - name - node_created:
- -0.001s - 10 - 20 - local_inplace_gpuagemm - 10
- 0.000s - in 2 optimization that were not used (display those with runtime greater than 0)
- 0.004336s - ('local_fill_to_alloc', 'TopoOptimizer', 9, 340, 340) - 0.000s
- TopoOptimizer local_fill_to_alloc
- nb_node (start, end, changed) (340, 340, 0)
- init io_toposort 0.0031270980835
- loop time 0.00114893913269
- callback_time 0.0
- 0.004204s - ('local_destructive', 'TopoOptimizer', 48, 313, 313) - 0.000s
- TopoOptimizer CURAND_destructive
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00382399559021
- loop time 0.000334978103638
- callback_time 0.0
- 0.004054s - ('local_inplace_gpu_sparse_block_outer', 'TopoOptimizer', 27, 313, 313) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_outer
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00367403030396
- loop time 0.000312089920044
- callback_time 0.0
- 0.003904s - ('local_inplace_gpu_sparse_block_gemv', 'TopoOptimizer', 26, 313, 313) - 0.000s
- TopoOptimizer local_inplace_gpu_sparse_block_gemv
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00352501869202
- loop time 0.000298976898193
- callback_time 0.0
- 0.003778s - ('random_make_inplace', 'TopoOptimizer', 49, 313, 313) - 0.000s
- TopoOptimizer random_make_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00336599349976
- loop time 0.000365018844604
- callback_time 0.0
- 0.003698s - ('local_inplace_sparseblockouter', 'TopoOptimizer', 33, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparseblockouter
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00329613685608
- loop time 0.000308990478516
- callback_time 0.0
- 0.003278s - ('local_IncSubtensor_serialize', 'TopoOptimizer', 5, 330, 330) - 0.000s
- TopoOptimizer pre_local_IncSubtensor_serialize
- nb_node (start, end, changed) (330, 330, 0)
- init io_toposort 0.00257587432861
- loop time 0.000670909881592
- callback_time 0.0
- 0.003210s - ('make_ger_destructive', 'TopoOptimizer', 41, 313, 313) - 0.000s
- TopoOptimizer make_scipy_blas_destructive
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00282907485962
- loop time 0.000349044799805
- callback_time 0.0
- 0.003062s - ('local_gemm16_inplace', 'TopoOptimizer', 40, 313, 313) - 0.000s
- TopoOptimizer local_gemm16_inplace
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00276303291321
- loop time 0.000236988067627
- callback_time 0.0
- 0.003057s - ('local_inplace_incsubtensor1', 'TopoOptimizer', 28, 313, 313) - 0.000s
- TopoOptimizer local_inplace_incsubtensor1
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00274205207825
- loop time 0.00025200843811
- callback_time 0.0
- 0.002935s - ('local_inplace_sparse_block_gemv', 'TopoOptimizer', 30, 313, 313) - 0.000s
- TopoOptimizer local_inplace_sparse_block_gemv
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00262904167175
- loop time 0.000252962112427
- callback_time 0.0
- 0.002929s - ('local_inplace_setsubtensor', 'TopoOptimizer', 29, 313, 313) - 0.000s
- TopoOptimizer local_inplace_setsubtensor
- nb_node (start, end, changed) (313, 313, 0)
- init io_toposort 0.00262403488159
- loop time 0.00025200843811
- callback_time 0.0
- 0.002910s - ('crossentropy_to_crossentropy_with_softmax', 'FromFunctionOptimizer', 14, 339, 339) - 0.000s
- 0.002738s - ('inplace_elemwise_optimizer', 'FromFunctionOptimizer', 42, 313, 313) - 0.000s
- 0.000303s - ('merge3', 'MergeOptimizer', 51, 313, 313) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000046s - ('merge1.2', 'MergeOptimizer', 7, 340, 340) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- 0.000027s - ('merge1.1', 'MergeOptimizer', 4, 330, 330) - 0.000s
- MergeOptimizer
- nb fail= 0 merged= 0 constant= 0
- time replace=0.00 validate=0.00 callback=0.00
- Here are tips to potentially make your code run faster
- (if you think of new ones, suggest them on the mailing list).
- Test them first, as they are not guaranteed to always provide a speedup.
- Sorry, no tip for today.
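The per-optimizer timing rows in the optimizer profile above (time, times applied, nodes created, name) can be pulled out and ranked with a short script. What follows is only a sketch, assuming the rows keep the layout shown in this paste, including the leading "- " added by the paste rendering. The helper name `parse_opt_rows` and the `sample` text are illustrative and not part of Theano.

```python
import re

# Matches rows such as "- 0.963s - 3 - 6 - local_reshape_to_dimshuffle".
# The leading "- " is optional so the same pattern also works on raw output.
ROW = re.compile(r"^\s*(?:-\s+)?(\d+\.\d+)s - (\d+) - (\d+) - (\S+)\s*$")


def parse_opt_rows(text):
    """Return (seconds, times_applied, nodes_created, name) tuples, slowest first.

    Lines that do not have exactly this four-field layout (headers, the
    "not used" rows, LocalOptGroup rows) are skipped.
    """
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append(
                (float(m.group(1)), int(m.group(2)), int(m.group(3)), m.group(4))
            )
    return sorted(rows, key=lambda r: r[0], reverse=True)


# A few rows copied from the canonicalize section of the profile above.
sample = """\
- times - times applied - nb node created - name:
- 0.044s - 5 - 10 - local_subtensor_merge
- 0.963s - 3 - 6 - local_reshape_to_dimshuffle
- 0.030s - 2 - 0 - topo_constant_folding
- 0.064s - in 74 optimization that were not used (display only those with a runtime > 0)
- 0.016s - local_greedy_distributor
"""

for seconds, applied, created, name in parse_opt_rows(sample):
    print(f"{seconds:7.3f}s  applied={applied:<3d} created={created:<3d} {name}")
```

On the canonicalize rows this puts `local_reshape_to_dimshuffle` (0.963s for 3 applications) at the top, well ahead of every other local optimizer in that pass.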
- Function profiling
- ==================
- Message: Sum of all(14) printed profiles at exit excluding Scan op profile.
- Time in 11 calls to Function.__call__: 1.068115e-03s
- Time in Function.fn.__call__: 9.267330e-04s (86.763%)
- Time in thunks: 8.780956e-04s (82.210%)
- Total compile time: 4.684355e+02s
- Number of Apply nodes: 1
- Theano Optimizer time: 7.320596e+01s
- Theano validate time: 1.033737e+01s
- Theano Linker time (includes C, CUDA code generation/compiling): 3.916804e+02s
- Import time 1.612234e+00s
- Node make_thunk time 3.911215e+02s
- Time in all call to theano.grad() 2.656322e+00s
- Time since theano import 478.217s
- Class
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
- 94.8% 94.8% 0.001s 4.16e-04s C 2 2 theano.gpuarray.basic_ops.HostFromGpu
- 2.3% 97.1% 0.000s 3.38e-06s C 6 6 theano.compile.ops.DeepCopyOp
- 2.3% 99.3% 0.000s 9.89e-06s C 2 2 theano.gpuarray.subtensor.GpuSubtensor
- 0.7% 100.0% 0.000s 1.99e-06s C 3 3 theano.compile.ops.Shape_i
- ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
- Ops
- ---
- <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
- 94.8% 94.8% 0.001s 4.16e-04s C 2 2 HostFromGpu(gpuarray)
- 2.3% 97.1% 0.000s 3.38e-06s C 6 6 DeepCopyOp
- 2.3% 99.3% 0.000s 9.89e-06s C 2 2 GpuSubtensor{:int64:}
- 0.7% 100.0% 0.000s 1.99e-06s C 3 3 Shape_i{0}
- ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
- Apply
- ------
- <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
- 88.1% 88.1% 0.001s 7.74e-04s 1 1 HostFromGpu(gpuarray)(GpuSubtensor{:int64:}.0)
- 6.6% 94.8% 0.000s 5.82e-05s 1 1 HostFromGpu(gpuarray)(GpuSubtensor{:int64:}.0)
- 1.8% 96.6% 0.000s 1.60e-05s 1 0 GpuSubtensor{:int64:}(<GpuArrayType<None>(float32, (False, False, False, False))>, Constant{128})
- 0.6% 97.1% 0.000s 5.01e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- 0.5% 97.6% 0.000s 4.05e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- 0.5% 98.1% 0.000s 4.05e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- 0.4% 98.5% 0.000s 3.81e-06s 1 0 GpuSubtensor{:int64:}(<GpuArrayType<None>(float32, (False,))>, Constant{128})
- 0.4% 98.9% 0.000s 3.10e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- 0.2% 99.1% 0.000s 2.15e-06s 1 0 Shape_i{0}(<GpuArrayType<None>(float32, (False,))>)
- 0.2% 99.3% 0.000s 2.15e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- 0.2% 99.6% 0.000s 1.91e-06s 1 0 Shape_i{0}(<GpuArrayType<None>(float32, (False,))>)
- 0.2% 99.8% 0.000s 1.91e-06s 1 0 Shape_i{0}(<GpuArrayType<None>(float32, (False,))>)
- 0.2% 100.0% 0.000s 1.91e-06s 1 0 DeepCopyOp(TensorConstant{-0.577215671539})
- ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
- Optimizer Profile
- -----------------
- SeqOptimizer time 73.202s for -1/-1 nodes before/after optimization
- 23.222s for callback
- 10.337s for fgraph.validate()
- callbacks_time
- <theano.gof.destroyhandler.DestroyHandler object at 0x1291a3f10> , 7.663418293
- <theano.tensor.opt.ShapeFeature object at 0x121562590> , 4.24110126495
- Updater{canonicalize} , 3.05698680878
- <theano.compile.function_module.Supervisor instance at 0x1227d3f38> , 2.56386876106
- <theano.gof.opt.MergeFeature object at 0x1227e7190> , 1.74654364586
- Updater{canonicalize} , 1.31303954124
- Updater{gpuarray_local_optimizations} , 0.261435270309
- Updater{gpuarray_cut_transfers} , 0.261384248734
- Updater{canonicalize} , 0.194210767746
- <theano.gof.destroyhandler.DestroyHandler object at 0x130d80590> , 0.18234872818
- <theano.gof.destroyhandler.DestroyHandler object at 0x12feccdd0> , 0.177083730698
- Updater{specialize} , 0.130860090256
- <theano.gof.toolbox.ReplaceValidate object at 0x12087ee50> , 0.11890411377
- <theano.tensor.opt.ShapeFeature object at 0x1306d5510> , 0.118015050888
- <theano.tensor.opt.ShapeFeature object at 0x12e307fd0> , 0.111938238144
- <theano.compile.function_module.Supervisor instance at 0x130748f38> , 0.0925514698029
- <theano.compile.function_module.Supervisor instance at 0x12e3986c8> , 0.0896117687225
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x122195850> , 0.0774388313293
- Updater{canonicalize} , 0.0659112930298
- <theano.gof.opt.ChangeTracker instance at 0x11f7d8f38> , 0.0626258850098
- <theano.gof.opt.MergeFeature object at 0x12e376410> , 0.0525453090668
- <theano.gof.opt.MergeFeature object at 0x130723d10> , 0.0524771213531
- Updater{canonicalize} , 0.0376682281494
- Updater{canonicalize} , 0.0349521636963
- Updater{gpuarray_local_optimizations} , 0.0294797420502
- Updater{specialize} , 0.026261806488
- <theano.gof.opt.ChangeTracker instance at 0x125a3e1b8> , 0.012256860733
- Updater{local_elemwise_alloc} , 0.00888752937317
- <theano.gof.opt.ChangeTracker instance at 0x124d8a098> , 0.00651884078979
- Updater{pre_local_IncSubtensor_serialize} , 0.00562787055969
- Updater{specialize} , 0.00509214401245
- <theano.gof.toolbox.ReplaceValidate object at 0x130665490> , 0.00423550605774
- <theano.gof.toolbox.ReplaceValidate object at 0x12aea9610> , 0.00402593612671
- Updater{dimshuffle_as_view} , 0.00395131111145
- Updater{topo_constant_folding} , 0.00292682647705
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x130723c10> , 0.00266361236572
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x12e376310> , 0.00266003608704
- Updater{useless} , 0.00235295295715
- <theano.gof.opt.ChangeTracker instance at 0x127399f38> , 0.00211262702942
- Updater{canonicalize} , 0.00179743766785
- Updater{canonicalize} , 0.00169730186462
- Updater{canonicalize} , 0.00153088569641
- Updater{canonicalize} , 0.00149822235107
- Updater{local_inplace_setsubtensor} , 0.0013701915741
- <theano.gof.opt.ChangeTracker instance at 0x12f961908> , 0.00130605697632
- <theano.gof.opt.ChangeTracker instance at 0x1308fe518> , 0.00124955177307
- <theano.gof.opt.ChangeTracker instance at 0x130782518> , 0.00123405456543
- Updater{local_dnna_conv_inplace} , 0.00122761726379
- Updater{specialize} , 0.00121068954468
- <theano.gof.opt.ChangeTracker instance at 0x12e7055f0> , 0.00116181373596
- Updater{gpuarray_local_optimizations} , 0.00108671188354
- Updater{gpuarray_local_optimizations} , 0.00101017951965
- Updater{gpuarray_cut_transfers} , 0.000955104827881
- Updater{constant_folding_for_scan2} , 0.000903367996216
- Updater{gpuarray_cut_transfers} , 0.000853061676025
- Updater{stabilize} , 0.000802278518677
- Updater{specialize} , 0.000703573226929
- Updater{topo_constant_folding} , 0.000659465789795
- <theano.gof.opt.ChangeTracker instance at 0x124bccfc8> , 0.000591039657593
- Updater{specialize} , 0.000572443008423
- <theano.tensor.opt.ShapeFeature object at 0x11b82cf10> , 0.000552177429199
- <theano.tensor.opt.ShapeFeature object at 0x11b873150> , 0.000530242919922
- Updater{dimshuffle_as_view} , 0.000411033630371
- Updater{dimshuffle_as_view} , 0.000401735305786
- Updater{GemmOptimizer} , 0.000359296798706
- Updater{local_dot_to_dot22} , 0.000351905822754
- Updater{topo_constant_folding} , 0.000331401824951
- <theano.gof.opt.MergeFeature object at 0x11c2d6cd0> , 0.000263452529907
- <theano.gof.opt.MergeFeature object at 0x11bf9ae10> , 0.000262498855591
- <theano.gof.opt.ChangeTracker instance at 0x1226b5560> , 0.000254154205322
- <theano.gof.opt.MergeFeature object at 0x11c315290> , 0.000249147415161
- <theano.gof.opt.MergeFeature object at 0x11bf8db90> , 0.000229120254517
- <theano.gof.opt.MergeFeature object at 0x11c787510> , 0.000221729278564
- <theano.gof.opt.MergeFeature object at 0x11c805b50> , 0.000217199325562
- Updater{random_make_inplace_mrg} , 0.000191688537598
- <theano.tensor.opt.ShapeFeature object at 0x11b747410> , 0.000166416168213
- Updater{gpuarray_local_optimizations} , 0.000160932540894
- Updater{gpuarray_local_optimizations} , 0.000154256820679
- <theano.tensor.opt.ShapeFeature object at 0x11c2d6210> , 0.000149726867676
- <theano.tensor.opt.ShapeFeature object at 0x11b7b4110> , 0.000149488449097
- <theano.tensor.opt.ShapeFeature object at 0x11bfcd150> , 0.000147342681885
- Updater{topo_constant_folding} , 0.000140428543091
- Updater{topo_constant_folding} , 0.000139713287354
- Updater{topo_constant_folding} , 0.000137805938721
- <theano.tensor.opt.ShapeFeature object at 0x11c315550> , 0.000135898590088
- Updater{local_dnna_conv_inplace} , 0.000133037567139
- <theano.tensor.opt.ShapeFeature object at 0x11b7ed110> , 0.00013279914856
- <theano.tensor.opt.ShapeFeature object at 0x11bf8d3d0> , 0.00013279914856
- Updater{local_dnna_conv_inplace} , 0.00013279914856
- <theano.tensor.opt.ShapeFeature object at 0x11c835d50> , 0.000131607055664
- <theano.tensor.opt.ShapeFeature object at 0x11bfcd4d0> , 0.000126838684082
- Updater{local_elemwise_alloc} , 0.000117063522339
- <theano.gof.destroyhandler.DestroyHandler object at 0x11bf6eed0> , 0.000104904174805
- Updater{local_elemwise_alloc} , 0.00010085105896
- Updater{InplaceGpuaBlasOpt} , 9.91821289062e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11b747c90> , 9.58442687988e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11b842050> , 9.48905944824e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11c2b9e90> , 7.89165496826e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11bfcda10> , 7.72476196289e-05
- Updater{specialize} , 7.67707824707e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11c835bd0> , 7.39097595215e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11b82c650> , 6.69956207275e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11c315690> , 6.67572021484e-05
- Updater{topo_constant_folding} , 6.60419464111e-05
- <theano.gof.opt.MergeFeature object at 0x11b72fe10> , 6.55651092529e-05
- <theano.gof.opt.ChangeTracker instance at 0x12f7506c8> , 6.48498535156e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11b842790> , 6.27040863037e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11b7b4990> , 6.27040863037e-05
- Updater{specialize} , 6.24656677246e-05
- <theano.gof.destroyhandler.DestroyHandler object at 0x11b7ed990> , 6.103515625e-05
- <theano.gof.opt.MergeFeature object at 0x11b79dd10> , 5.98430633545e-05
- <theano.gof.opt.MergeFeature object at 0x11b7d6d10> , 5.26905059814e-05
- <theano.gof.opt.ChangeTracker instance at 0x12fd22830> , 5.22136688232e-05
- Updater{specialize} , 4.02927398682e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11bf8de10> , 3.95774841309e-05
- Updater{random_make_inplace_mrg} , 3.67164611816e-05
- Updater{canonicalize} , 3.60012054443e-05
- Updater{random_make_inplace_mrg} , 3.40938568115e-05
- <theano.gof.opt.ChangeTracker instance at 0x130a69170> , 3.38554382324e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11c670310> , 3.24249267578e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11bf9ad10> , 3.17096710205e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11c824a50> , 3.14712524414e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11c2e45d0> , 3.12328338623e-05
- <theano.gof.opt.ChangeTracker instance at 0x12fad9290> , 3.09944152832e-05
- Updater{local_dot_to_dot22} , 3.09944152832e-05
- Updater{canonicalize} , 3.0517578125e-05
- <theano.gof.opt.ChangeTracker instance at 0x11bfd3950> , 3.00407409668e-05
- Updater{local_dot_to_dot22} , 2.98023223877e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11b79db90> , 2.93254852295e-05
- Updater{specialize} , 2.93254852295e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11bf8da50> , 2.93254852295e-05
- Updater{canonicalize} , 2.93254852295e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11b633b50> , 2.86102294922e-05
- <theano.gof.opt.ChangeTracker instance at 0x11c2f63b0> , 2.83718109131e-05
- Updater{topo_constant_folding} , 2.8133392334e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11c315150> , 2.76565551758e-05
- Updater{specialize} , 2.76565551758e-05
- Updater{canonicalize} , 2.67028808594e-05
- Updater{canonicalize} , 2.64644622803e-05
- Updater{topo_constant_folding} , 2.55107879639e-05
- Updater{specialize} , 2.55107879639e-05
- Updater{canonicalize} , 2.45571136475e-05
- <theano.gof.opt.ChangeTracker instance at 0x11bfa3518> , 2.43186950684e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11b7d6b50> , 2.40802764893e-05
- <theano.gof.opt.MergeFeature object at 0x11b859d50> , 2.31266021729e-05
- <theano.gof.opt.ChangeTracker instance at 0x11c867b48> , 2.31266021729e-05
- <theano.gof.opt.MergeFeature object at 0x11b82cb50> , 2.24113464355e-05
- <theano.gof.opt.ChangeTracker instance at 0x11c848908> , 2.14576721191e-05
- Updater{local_dot22_to_dot22scalar} , 2.121925354e-05
- <theano.gof.opt.ChangeTracker instance at 0x11c31bc68> , 2.02655792236e-05
- <theano.compile.function_module.Supervisor instance at 0x11c3177e8> , 2.00271606445e-05
- <theano.compile.function_module.Supervisor instance at 0x11bfc8320> , 1.97887420654e-05
- <theano.compile.function_module.Supervisor instance at 0x11c2e1ab8> , 1.93119049072e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11bf9acd0> , 1.93119049072e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11c805a90> , 1.90734863281e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11c2d6fd0> , 1.8835067749e-05
- <theano.compile.function_module.Supervisor instance at 0x11c846680> , 1.83582305908e-05
- <theano.compile.function_module.Supervisor instance at 0x11c847710> , 1.74045562744e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11c315110> , 1.74045562744e-05
- Updater{constant_folding_for_scan2} , 1.71661376953e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11b72fcd0> , 1.71661376953e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11c787d90> , 1.6450881958e-05
- <theano.compile.function_module.Supervisor instance at 0x11bfa1c68> , 1.6450881958e-05
- Updater{topo_constant_folding} , 1.52587890625e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11b7d6cd0> , 1.45435333252e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11b859b90> , 1.43051147461e-05
- Updater{topo_constant_folding} , 1.43051147461e-05
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11b79dcd0> , 1.4066696167e-05
- Updater{topo_constant_folding} , 1.38282775879e-05
- <theano.compile.function_module.Supervisor instance at 0x11b794fc8> , 1.31130218506e-05
- Updater{topo_constant_folding} , 1.28746032715e-05
- <theano.gof.toolbox.ReplaceValidate object at 0x11b82c990> , 1.21593475342e-05
- Updater{constant_folding_for_scan2} , 1.21593475342e-05
- <theano.gof.opt.ChangeTracker instance at 0x12f9d21b8> , 1.21593475342e-05
- <theano.compile.function_module.Supervisor instance at 0x11b73bef0> , 1.21593475342e-05
- Updater{canonicalize} , 1.07288360596e-05
- Updater{topo_constant_folding} , 1.07288360596e-05
- <theano.gof.opt.ChangeTracker instance at 0x11b7b29e0> , 1.07288360596e-05
- Updater{canonicalize} , 1.02519989014e-05
- Updater{canonicalize} , 1.02519989014e-05
- <theano.compile.function_module.Supervisor instance at 0x11b7e0128> , 1.00135803223e-05
- Updater{topo_constant_folding} , 9.77516174316e-06
- Updater{topo_constant_folding} , 9.29832458496e-06
- <theano.gof.opt.ChangeTracker instance at 0x11b7f0ab8> , 9.05990600586e-06
- Updater{canonicalize} , 9.05990600586e-06
- <theano.gof.opt.ChangeTracker instance at 0x11b75a5a8> , 9.05990600586e-06
- Updater{topo_constant_folding} , 9.05990600586e-06
- Updater{topo_constant_folding} , 8.82148742676e-06
- Updater{canonicalize} , 8.10623168945e-06
- <theano.gof.opt.ChangeTracker instance at 0x12f745e18> , 8.10623168945e-06
- Updater{topo_constant_folding} , 8.10623168945e-06
- Updater{canonicalize} , 7.86781311035e-06
- Updater{canonicalize} , 7.15255737305e-06
- Updater{topo_constant_folding} , 6.91413879395e-06
- Updater{topo_constant_folding} , 6.67572021484e-06
- Updater{canonicalize} , 6.67572021484e-06
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11b859d10> , 6.19888305664e-06
- <theano.gof.toolbox.PreserveVariableAttributes object at 0x11b82cb10> , 5.96046447754e-06
- Updater{gpuarray_local_optimizations} , 5.72204589844e-06
- <theano.compile.function_module.Supervisor instance at 0x11b82ac20> , 4.05311584473e-06
- <theano.compile.function_module.Supervisor instance at 0x11b86f4d0> , 4.05311584473e-06
- time - (name, class, index, nodes before, nodes after) - validate time
- 21.451680s - ('canonicalize', 'EquilibriumOptimizer', 43, 3590, 3590)
- EquilibriumOptimizer
- time 21.446s for 7 passes
- nb nodes (start, end, max) 7639 4559 7639
- time io_toposort 1.053s
- time in local optimizers 14.061s
- time in global optimizers 0.000s
- time in final optimizers 1.266s
- time in cleanup optimizers 4.601s
- 0 - 10.811s 4886 (0.457s in global opts, 0.388s io_toposort) - 8311 nodes - ('MergeOptimizer', 1791) ('local_useless_fill', 647) ('local_mul_canonizer', 378) ('local_fill_sink', 315) ('local_neg_to_mul', 308) ...
- 1 - 5.488s 1785 (0.150s in global opts, 0.434s io_toposort) - 7026 nodes - ('MergeOptimizer', 658) ('local_dimshuffle_lift', 230) ('local_mul_canonizer', 218) ('local_fill_sink', 203) ('local_upcast_elemwise_constant_inputs', 143) ...
- 2 - 1.574s 501 (0.064s in global opts, 0.051s io_toposort) - 5477 nodes - ('MergeOptimizer', 144) ('local_fill_sink', 114) ('local_useless_fill', 57) ('local_zero_div', 57) ('local_sum_prod_div_dimshuffle', 56) ...
- 3 - 0.911s 123 (0.056s in global opts, 0.049s io_toposort) - 5248 nodes - ('MergeOptimizer', 60) ('local_dimshuffle_lift', 56) ('local_useless_fill', 3) ('local_mul_zero', 3) ('topo_constant_folding', 1)
- 4 - 0.762s 67 (0.048s in global opts, 0.041s io_toposort) - 4568 nodes - ('MergeOptimizer', 32) ('local_sum_prod_div_dimshuffle', 28) ('local_zero_div', 3) ('local_fill_sink', 3) ('topo_constant_folding', 1)
- 5 - 1.199s 56 (0.441s in global opts, 0.047s io_toposort) - 4559 nodes - ('local_dimshuffle_lift', 28) ('MergeOptimizer', 28)
- 6 - 0.702s 0 (0.051s in global opts, 0.044s io_toposort) - 4559 nodes -
- times - times applied - nb node created - name:
- 4.601s - 2713 - 25 - MergeOptimizer
- 2.269s - 599 - 1568 - local_mul_canonizer
- 2.156s - 186 - 975 - local_greedy_distributor
- 1.684s - 635 - 790 - local_fill_sink
- 1.536s - 572 - 1780 - local_dimshuffle_lift
- 1.266s - 15 - 0 - topo_constant_folding
- 1.118s - 18 - 36 - local_reshape_to_dimshuffle
- 0.817s - 260 - 1317 - local_mul_zero
- 0.817s - 273 - 445 - local_add_canonizer
- 0.378s - 803 - 0 - local_useless_fill
- 0.360s - 161 - 483 - local_upcast_elemwise_constant_inputs
- 0.330s - 308 - 608 - local_neg_to_mul
- 0.327s - 168 - 504 - local_sum_prod_div_dimshuffle
- 0.230s - 24 - 48 - local_subtensor_merge
- 0.171s - 134 - 0 - local_cut_gpu_transfers
- 0.168s - 3 - 3 - local_useless_elemwise
- 0.152s - 161 - 571 - local_shape_to_shape_i
- 0.136s - 90 - 180 - local_zero_div
- 0.120s - 36 - 108 - local_mul_switch_sink
- 0.089s - 33 - 99 - local_div_switch_sink
- 0.077s - 18 - 9 - local_useless_switch
- 0.065s - 9 - 0 - local_join_1
- 0.053s - 108 - 32 - local_subtensor_make_vector
- 0.049s - 9 - 9 - local_useless_dimshuffle_in_reshape
- 0.035s - 31 - 62 - local_inv_canon
- 0.021s - 19 - 0 - local_pow_canonicalize
- 0.017s - 14 - 28 - local_subtensor_lift
- 0.015s - 18 - 0 - local_intdiv_by_one
- 0.874s - in 61 optimization that were not used (display only those with a runtime > 0)
- 0.189s - local_func_inv
- 0.119s - local_one_minus_erf2
- 0.099s - local_merge_switch_same_cond
- 0.094s - local_useless_elemwise_comparison
- 0.067s - local_track_shape_i
- 0.056s - local_fill_cut
- 0.046s - local_expm1
- 0.041s - local_cast_cast
- 0.034s - local_one_minus_erf
- 0.034s - local_IncSubtensor_serialize
- 0.018s - local_useless_subtensor
- 0.010s - local_sum_prod_all_to_none
- 0.008s - local_lift_transpose_through_dot
- 0.007s - local_useless_slice
- 0.007s - local_op_of_op
- 0.006s - local_useless_reduce
- 0.005s - local_sumsqr2dot
- 0.005s - local_dimshuffle_no_inplace_at_canonicalize
- 0.005s - f
- 0.005s - local_reduce_join
- 0.004s - local_subtensor_remove_broadcastable_index
- 0.003s - local_0_dot_x
- 0.002s - local_abs_lift
- 0.001s - local_useless_reshape
- 0.001s - local_incsubtensor_of_zeros
- 0.001s - local_subtensor_of_dot
- 0.001s - local_subtensor_of_alloc
- 0.001s - local_reshape_lift
- 0.001s - local_canonicalize_alloc
- 0.001s - local_useless_inc_subtensor
- 0.000s - local_useless_inc_subtensor_alloc
- 0.000s - local_useless_alloc
- 0.000s - local_merge_alloc
- 0.000s - local_setsubtensor_of_constants
- 0.000s - local_scalar_tensor_scalar
- Global, final and clean up optimizers
- Iter 0
- TopoOptimizer topo_constant_folding
- nb_node (start, end, changed) (2, 0, 2)
- init io_toposort 3.48091125488e-05
- loop time 0.000529050827026
- callback_time 0.000237703323364
- Error in atexit._run_exitfuncs:
- Traceback (most recent call last):
- File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
- func(*targs, **kargs)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/compile/profiling.py", line 103, in _atexit_print_fn
- n_apply_to_print=config.profiling.n_apply)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/compile/profiling.py", line 1256, in summary
- self.optimizer_profile[1])
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/gof/opt.py", line 337, in print_profile
- level=level + 1)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/gof/opt.py", line 2588, in print_profile
- o.print_profile(stream, prof, level + 2)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/gof/opt.py", line 892, in print_profile
- callback_time, callbacks_time, nb_merged, nb_constant) = prof
- ValueError: too many values to unpack
- Error in sys.exitfunc:
- Traceback (most recent call last):
- File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
- func(*targs, **kargs)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/compile/profiling.py", line 103, in _atexit_print_fn
- n_apply_to_print=config.profiling.n_apply)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/compile/profiling.py", line 1256, in summary
- self.optimizer_profile[1])
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/gof/opt.py", line 337, in print_profile
- level=level + 1)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/gof/opt.py", line 2588, in print_profile
- o.print_profile(stream, prof, level + 2)
- File "/Users/Ramana/projects/macvnev/lib/python2.7/site-packages/theano/gof/opt.py", line 892, in print_profile
- callback_time, callbacks_time, nb_merged, nb_constant) = prof
- ValueError: too many values to unpack
- Segmentation fault: 11