Intel® Optimization for TensorFlow* and Intel® Extension for TensorFlow* Cheat Sheet

Get started with Intel® Optimization for TensorFlow* and Intel® Extension for TensorFlow* using the following commands.

Intel® Optimization for TensorFlow*: A Public Release from Google

Features and optimizations for TensorFlow* on Intel hardware are frequently upstreamed and included in stock TensorFlow* releases. As of TensorFlow* v2.9, Intel® oneAPI Deep Neural Network Library (oneDNN) optimizations are enabled by default.
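
The oneDNN optimizations can be toggled with the TF_ENABLE_ONEDNN_OPTS environment variable. As a minimal check (assuming TensorFlow* v2.9 or later), set it before the first TensorFlow import and look for the informational oneDNN notice that stock builds print at import:

import os
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"  # "0" disables the oneDNN optimizations
import tensorflow as tf  # stock builds log a oneDNN notice at import when it is enabled
print(tf.__version__)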

For more information, see TensorFlow.

Basic Installation Using PyPI*
  pip install tensorflow

Basic Installation Using Anaconda*
  conda install -c conda-forge tensorflow

Import TensorFlow
  import tensorflow as tf

Capture a Verbose Log (Command Prompt)
  export ONEDNN_VERBOSE=1

Parallelize Execution (in the Code)
  tf.config.threading.set_intra_op_parallelism_threads(<number of physical cores per socket>)
  tf.config.threading.set_inter_op_parallelism_threads(<number of sockets>)
  tf.config.set_soft_device_placement(True)
  # Tune the intra-op and inter-op settings to the workload

Parallelize Execution (Command Prompt)
  export TF_NUM_INTRAOP_THREADS=<number of physical cores per socket>
  export TF_NUM_INTEROP_THREADS=<number of sockets>
  # Tune the intra-op and inter-op settings to the workload

Non-Uniform Memory Access (NUMA)
  numactl --cpunodebind=N --membind=N python <script>

Enable Keras Mixed Precision with BF16
  from tensorflow.keras import mixed_precision
  mixed_precision.set_global_policy('mixed_bfloat16')
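
Putting these rows together, here is a minimal sketch for a hypothetical two-socket machine with 28 physical cores per socket (adjust both numbers to your topology):

import tensorflow as tf
from tensorflow.keras import mixed_precision

# Threading: intra-op = physical cores per socket, inter-op = number of sockets
tf.config.threading.set_intra_op_parallelism_threads(28)
tf.config.threading.set_inter_op_parallelism_threads(2)
tf.config.set_soft_device_placement(True)

# BF16 mixed precision for Keras layers
mixed_precision.set_global_policy('mixed_bfloat16')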



Intel® Optimization for TensorFlow*: A Public Release from Intel

In addition to the performance tuning options listed under the Google public release, the Intel public release offers OpenMP* optimizations for further performance enhancements.

For additional installation methods, see the Intel® Optimization for TensorFlow* Installation Guide.

For more information about performance, see Maximize TensorFlow* Performance on CPU and Getting Started with Mixed Precision Support in oneDNN Bfloat16.

Basic Installation Using PyPI*
  pip install intel-tensorflow

Basic Installation Using Anaconda*
  conda install tensorflow      # Linux/macOS
  conda install tensorflow-mkl  # Windows

Import TensorFlow
  import tensorflow as tf

Capture a Verbose Log (Command Prompt)
  export ONEDNN_VERBOSE=1

Parallelize Execution (in the Code)
  tf.config.threading.set_intra_op_parallelism_threads(<number of physical cores per socket>)
  tf.config.threading.set_inter_op_parallelism_threads(<number of sockets>)
  tf.config.set_soft_device_placement(True)
  # Tune the intra-op and inter-op settings to the workload

Parallelize Execution (Command Prompt)
  export TF_NUM_INTRAOP_THREADS=<number of physical cores per socket>
  export TF_NUM_INTEROP_THREADS=<number of sockets>
  # Tune the intra-op and inter-op settings to the workload

Non-Uniform Memory Access (NUMA)
  numactl --cpunodebind=N --membind=N python <script>

Enable Keras Mixed Precision with BF16
  from tensorflow.keras import mixed_precision
  mixed_precision.set_global_policy('mixed_bfloat16')

Set the Maximum Number of Threads (Command Prompt)
  export OMP_NUM_THREADS=<number of physical cores per socket>

Bind OpenMP Threads to Physical Processing Units
  export KMP_AFFINITY=granularity=fine,compact,1,0

Set a Wait Time (ms) After a Parallel Region Completes Before Threads Sleep
  export KMP_BLOCKTIME=<time>
  # Recommended: 0 for CNN, 1 for non-CNN (verify empirically)

Print the OpenMP Runtime Library's Environment Variables During Execution
  export KMP_SETTINGS=TRUE
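
The OpenMP* variables in the table above can also be set from Python, provided they are set before TensorFlow* is imported (the OpenMP runtime reads them when it loads). A sketch, again assuming 28 physical cores per socket:

import os
os.environ["OMP_NUM_THREADS"] = "28"                         # physical cores per socket
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"  # pin threads to physical units
os.environ["KMP_BLOCKTIME"] = "0"                            # 0 for CNN, 1 for non-CNN
os.environ["KMP_SETTINGS"] = "TRUE"                          # print the settings at startup
import tensorflow as tf                                      # import after the variables are set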



Intel® Extension for TensorFlow*

This extension provides the most up-to-date features and optimizations on Intel hardware, supporting both Intel CPUs and Intel GPUs; most of these features are eventually upstreamed to stock TensorFlow* releases. While many optimizations apply by default with no additional setup, Intel® Extension for TensorFlow* also offers further tuning options and custom operations to boost performance even more.
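
As a quick sanity check after installation, a minimal sketch (assuming a GPU installation; Intel GPUs are exposed to TensorFlow* under the XPU device type):

import tensorflow as tf
import intel_extension_for_tensorflow as itex

print(itex.__version__)                        # extension version string
print(tf.config.list_physical_devices('XPU'))  # Intel GPUs appear as XPU devices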

For additional installation methods, see the Intel® Extension for TensorFlow* Installation Guide.

For more information, see Intel® Extension for TensorFlow*.

Basic Installation Using PyPI*
  pip install --upgrade intel-extension-for-tensorflow[gpu]  # install for GPU
  pip install --upgrade intel-extension-for-tensorflow[cpu]  # install for CPU [experimental]

Import Intel® Extension for TensorFlow*
  import intel_extension_for_tensorflow as itex

Get the Current XPU Backend Type
  itex.get_backend()

Set a Specific Backend Type (in the Code); GPU Is the Default
  itex.set_backend('GPU')  # or 'CPU'

Set a Specific Backend Type (Command Prompt); GPU Is the Default
  export ITEX_XPU_BACKEND="GPU"  # or "CPU"

Advanced Automatic Mixed Precision (in the Code): A Basic Configuration That Improves Inference Speed and Reduces Memory Consumption
  auto_mixed_precision_options = itex.AutoMixedPrecisionOptions()
  auto_mixed_precision_options.data_type = itex.BFLOAT16  # or itex.FLOAT16

  graph_options = itex.GraphOptions(auto_mixed_precision_options=auto_mixed_precision_options)
  graph_options.auto_mixed_precision = itex.ON

  config = itex.ConfigProto(graph_options=graph_options)
  itex.set_config(config)

Advanced Automatic Mixed Precision (Command Prompt): A Basic Configuration That Improves Inference Speed and Reduces Memory Consumption
  export ITEX_AUTO_MIXED_PRECISION=1
  export ITEX_AUTO_MIXED_PRECISION_DATA_TYPE="BFLOAT16"  # or "FLOAT16"

Customized AdamW Optimizer (in the Code)
  itex.ops.AdamWithWeightDecayOptimizer(
    weight_decay_rate=0.001, learning_rate=0.001, beta_1=0.9, beta_2=0.999,
    epsilon=1e-07, name='Adam',
    exclude_from_weight_decay=["LayerNorm", "layer_norm", "bias"], **kwargs
  )

Customized Layer Normalization (in the Code)
  itex.ops.LayerNormalization(
    axis=-1, epsilon=0.001, center=True, scale=True,
    beta_initializer='zeros', gamma_initializer='ones',
    beta_regularizer=None, gamma_regularizer=None, beta_constraint=None,
    gamma_constraint=None, **kwargs
  )

Customized GELU (in the Code)
  itex.ops.gelu(
    features, approximate=False, name=None
  )

Customized LSTM (in the Code)
  itex.ops.ItexLSTM(
    200, activation='tanh',
    recurrent_activation='sigmoid',
    use_bias=True,
    kernel_initializer='glorot_uniform',
    recurrent_initializer='orthogonal',
    bias_initializer='zeros', **kwargs
  )
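
As a usage sketch, the custom operations above drop into a standard Keras model; the layer sizes here are illustrative only:

import tensorflow as tf
import intel_extension_for_tensorflow as itex

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, input_shape=(64,)),
    itex.ops.LayerNormalization(),                       # ITEX custom layer normalization
    tf.keras.layers.Lambda(lambda x: itex.ops.gelu(x)),  # ITEX custom GELU as an activation
    tf.keras.layers.Dense(10),
])
model.compile(optimizer='adam', loss='mse')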


For more information and support, or to report any issues, see:


Intel® Extension for TensorFlow* Issues on GitHub*

TensorFlow* Issues on GitHub*

Intel® AI Analytics Toolkit Forum


Sign up and try this extension for free using Intel® Developer Cloud for oneAPI.