

1. You didn't tune the model using AutoTVM. In this case, the tiling factor x (the x in NCHW[x]c) is selected based on the default TOPI schedule. Since most default schedules use the same x, there is almost no layout transform overhead, but the x might not be optimal, of course.

2. You tuned the model using AutoTVM. In this case, x is selected based on the best schedule of each tuning task (e.g., conv2d) from the tuning log. As you can imagine, it's possible that the first conv2d is NCHW8c and the second conv2d is NCHW4c. In this case, a layout transform is inserted when building the model. Since the process is (tuning conv2ds) -> (insert layout transform), the layout transform latency won't be included in either conv2d's latency during the tuning process.

3. You tuned the model using AutoTVM followed by the graph tuner. The graph tuner is based on 1) at most 20 candidates from each conv2d's tuning log (each candidate has a different x), and 2) the benchmarked layout transform latency. For example, when determining the x of conv2, the graph tuner uses dynamic programming: when Latency(transform(a, b) + conv2(x=b)) is better than Latency(conv2(x=a)), a layout transform will be inserted.

For (3), you could refer to this paper for details:

I am not using AutoTVM, I am using AutoScheduler, and I could see the change in layout well before starting the tuning. I ran the script tune_network_x86.py from the tutorial section, and in the extract task output I could see that each task is using a different layout, i.e. each input is tiled with a different tile size. Here is the sample output of what extract task prints when layout is specified as NCHW:

placeholder = PLACEHOLDER
conv2d_NCHWc(n, oc_chunk, oh, ow, oc_block) += (placeholder*placeholder)

As you can see, it uses the conv2d_NCHWc operator instead of conv2d. This makes sense because x86 is expected to use a tiled data layout, but I could not understand how the tiling factor is selected here. Also, the tiling factor is different for each layer.
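To make the tiling factor concrete, here is a minimal pure-Python sketch of what packing an NCHW tensor into NCHW4c or NCHW8c does. It does not use TVM at all: the helper name nchw_to_nchwc and the toy tensor are made up for illustration, and it assumes the channel count is divisible by the tiling factor.

```python
def nchw_to_nchwc(data, x):
    """Repack a nested-list NCHW tensor into NCHW[x]c with tiling factor x.

    The output has shape (N, C // x, H, W, x): channel c lands in
    chunk c // x, at position c % x inside the innermost block.
    (Hypothetical helper for illustration, not a TVM function.)
    """
    n_size, c_size = len(data), len(data[0])
    h_size, w_size = len(data[0][0]), len(data[0][0][0])
    if c_size % x != 0:
        raise ValueError("channel count must be divisible by the tiling factor")
    return [
        [
            [
                [[data[n][chunk * x + inner][h][w] for inner in range(x)]
                 for w in range(w_size)]
                for h in range(h_size)
            ]
            for chunk in range(c_size // x)
        ]
        for n in range(n_size)
    ]


# Toy tensor with N=1, C=8, H=1, W=2; each value encodes its own (c, w) as c*10 + w.
tensor = [[[[c * 10 + w for w in range(2)]] for c in range(8)]]

packed4 = nchw_to_nchwc(tensor, 4)  # NCHW4c: shape (1, 2, 1, 2, 4)
packed8 = nchw_to_nchwc(tensor, 8)  # NCHW8c: shape (1, 1, 1, 2, 8)

print(packed4[0][1][0][1])  # channels 4..7 at w=1 -> [41, 51, 61, 71]
print(packed8[0][0][0][0])  # channels 0..7 at w=0
```

The same data ends up in a different memory arrangement for each x, which is why two adjacent layers with different tiling factors need a layout transform between them.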

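The trade-off behind inserting a layout transform (pay for the transform only when the faster conv2d more than covers it) can be sketched as a small dynamic program. This is a simplified illustration for a linear chain of conv2d layers, not the graph tuner's actual implementation: the function name choose_tiling_factors and all latency numbers are invented, and a real tuner works from measured latencies on a general graph.

```python
def choose_tiling_factors(conv_latency, transform_latency):
    """Pick one tiling factor per layer of a linear conv2d chain.

    conv_latency[i][x]      -- assumed latency of layer i when it uses factor x
    transform_latency(a, b) -- assumed cost of converting NCHW[a]c to NCHW[b]c
    Returns (total_latency, [factor chosen for each layer]).
    """
    # best[x] = (cheapest total latency of the chain so far if the
    #            latest layer uses factor x, the factors chosen so far)
    best = {x: (lat, [x]) for x, lat in conv_latency[0].items()}
    for layer in conv_latency[1:]:
        new_best = {}
        for b, conv_cost in layer.items():
            # Try every factor a for the previous layer; a transform
            # is paid for whenever a differs from b.
            cost, path = min(
                (prev_cost + transform_latency(a, b) + conv_cost, prev_path)
                for a, (prev_cost, prev_path) in best.items()
            )
            new_best[b] = (cost, path + [b])
        best = new_best
    return min(best.values())


# Invented latencies for three conv2d layers with candidates x=4 and x=8.
conv_latency = [
    {4: 10, 8: 7},
    {4: 5, 8: 9},
    {4: 6, 8: 6},
]

cheap = choose_tiling_factors(conv_latency, lambda a, b: 0 if a == b else 1)
costly = choose_tiling_factors(conv_latency, lambda a, b: 0 if a == b else 5)

print(cheap)   # (19, [8, 4, 4]): one transform between layer 1 and layer 2 pays off
print(costly)  # (21, [4, 4, 4]): transforms are too expensive, so every layer keeps x=4
```

With a cheap transform the first layer keeps its individually best factor and a transform is inserted; with an expensive transform the chain settles on a single factor even though the first layer alone would be faster with x=8.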