TensorFlow* is a widely used machine learning framework in the deep learning arena, demanding efficient use of computational resources. To take full advantage of Intel® architecture and extract maximum performance, the TensorFlow framework has been optimized using Intel® oneAPI Deep Neural Network Library (oneDNN) primitives. This sample demonstrates how to train an example neural network and shows how Intel-optimized TensorFlow enables oneDNN calls by default.
| Optimized for | Description |
| --- | --- |
| OS | Linux* Ubuntu* 18.04 and later, Windows* 10 |
| Hardware | Intel® Xeon® Scalable processor family or newer |
| Software | Intel® AI Analytics Toolkit |
| What you will learn | How to get started using Intel® Optimization for TensorFlow* |
| Time to complete | 10 minutes |
This sample code shows how to get started with Intel® Optimization for TensorFlow*. It implements an example neural network with one convolution layer and one ReLU layer. Developers can quickly build and train a TensorFlow neural network using simple Python code. In addition, by controlling a built-in environment variable, the sample explicitly shows how Intel® oneDNN primitives are called and how they perform during neural network training.
Intel-optimized TensorFlow is available as part of the Intel® AI Analytics Toolkit. For more information on the optimizations and performance data, see the blog post TensorFlow* Optimizations on Modern Intel® Architecture.
Please export the environment variable `ONEDNN_VERBOSE=1` to display the trace of deep learning primitives during execution.
- The training data is generated by `np.random`.
- The neural network with one convolution layer and one ReLU layer is created using `tf.nn.conv2d` and `tf.nn.relu`.
- The TF session is initialized by `tf.global_variables_initializer`.
- The training is implemented via the for-loop below (an end-to-end sketch follows this list):
```python
for epoch in range(0, EPOCHNUM):
    for step in range(0, BS_TRAIN):
        x_batch = x_data[step*N:(step+1)*N, :, :, :]
        y_batch = y_data[step*N:(step+1)*N, :, :, :]
        s.run(train, feed_dict={x: x_batch, y: y_batch})
```
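Putting these pieces together, the network construction can be sketched as follows. This is a minimal illustrative reconstruction, not the sample script itself: the tensor shapes, filter size, loss function, and learning rate are assumptions, and the names `EPOCHNUM`, `BS_TRAIN`, and `N` simply mirror the loop above.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style session API, as used by the sample

tf.disable_v2_behavior()

# Illustrative sizes (assumptions, not taken from the script)
N, BS_TRAIN, EPOCHNUM = 4, 10, 5
x_data = np.random.rand(N * BS_TRAIN, 28, 28, 4).astype(np.float32)
y_data = np.random.rand(N * BS_TRAIN, 28, 28, 10).astype(np.float32)

x = tf.placeholder(tf.float32, shape=[None, 28, 28, 4])
y = tf.placeholder(tf.float32, shape=[None, 28, 28, 10])

# One convolution layer followed by one ReLU layer
W = tf.Variable(tf.random_normal([3, 3, 4, 10]))
b = tf.Variable(tf.zeros([10]))
conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")
relu = tf.nn.relu(conv + b)

loss = tf.losses.mean_squared_error(y, relu)
train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as s:
    s.run(tf.global_variables_initializer())
    for epoch in range(0, EPOCHNUM):
        for step in range(0, BS_TRAIN):
            x_batch = x_data[step*N:(step+1)*N, :, :, :]
            y_batch = y_data[step*N:(step+1)*N, :, :, :]
            _, l = s.run([train, loss], feed_dict={x: x_batch, y: y_batch})
        print(epoch, l)  # one loss value per epoch, as in the sample output below
```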
Note: For convenience, the line `os.environ["ONEDNN_VERBOSE"] = "1"` has been added to the body of the script as an alternative way to set this variable.
Runtime settings for `ONEDNN_VERBOSE`, `KMP_AFFINITY`, and inter-/intra-op threads are set within the script. You can read more about these settings in this dedicated document: Maximize TensorFlow Performance on CPU: Considerations and Recommendations for Inference Workloads.
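As a rough sketch of how such runtime settings are typically applied in Python (the exact values used by the sample script may differ; the thread counts below are placeholders):

```python
import os

# Environment variables must be set before TensorFlow (and oneDNN) initializes.
os.environ["ONEDNN_VERBOSE"] = "1"                           # print the oneDNN primitive trace
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"  # pin OpenMP threads to cores

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

config = tf.ConfigProto()
config.intra_op_parallelism_threads = 4  # threads available to a single op (e.g., physical cores)
config.inter_op_parallelism_threads = 1  # independent ops that may run concurrently
s = tf.Session(config=config)
```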
Code samples are licensed under the MIT license. See License.txt for details.
Third-party program licenses can be found here: third-party-programs.txt
These instructions demonstrate how to build and run a sample on a machine where you have installed the Intel AI Analytics Toolkit. If you would like to try a sample without installing a toolkit, see Running Samples in DevCloud.
TensorFlow is ready for use once you finish the Intel AI Analytics Toolkit installation. You can refer to the oneAPI main page for toolkit installation and the Toolkit Getting Started Guide for Linux for post-installation steps and scripts.
Note: If you have not already done so, set up your CLI environment by sourcing the `setvars` script located in the root of your oneAPI installation.

Linux Sudo: `. /opt/intel/oneapi/setvars.sh`

Linux User: `. ~/intel/oneapi/setvars.sh`

Windows: `C:\Program Files (x86)\Intel\oneAPI\setvars.bat`
For more information on environment variables, see Use the setvars Script for Linux or macOS, or Windows.
Activate the conda environment:

```
conda activate tensorflow
```

Please replace `~/intel/oneapi` with your oneAPI installation path.
By default, the Intel AI Analytics Toolkit is installed in the `/opt/intel/oneapi` folder, which requires root privileges to manage. If you would like to bypass using root access to manage your conda environment, you can clone your desired conda environment using the following command:
```
conda create --name user_tensorflow --clone tensorflow
```

Then activate your conda environment with the following command:

```
conda activate user_tensorflow
```
To run the program on Linux*, type the following commands in a terminal with Python installed:

- Navigate to the directory with the TensorFlow sample:

  ```
  cd ~/oneAPI-samples/AI-and-Analytics/Getting-Started-Samples/IntelTensorFlow_GettingStarted
  ```

- Run the sample:

  ```
  python TensorFlow_HelloWorld.py
  ```
Upon successful execution, the script prints output similar to the following:
```
0 0.4147554
1 0.3561021
2 0.33979267
3 0.33283564
4 0.32920069
[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]
```
If you export `ONEDNN_VERBOSE=1` on the command line, the oneDNN runtime verbose trace should look similar to the output shown below:
Linux:

```
export ONEDNN_VERBOSE=1
```

Windows:

```
set ONEDNN_VERBOSE=1
```

Note: historical names for this environment variable include `DNNL_VERBOSE` and `MKLDNN_VERBOSE`.
Then run the sample again:
```
python TensorFlow_HelloWorld.py
```
You will see the verbose output:
```
2022-04-24 16:56:02.497963: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
onednn_verbose,info,oneDNN v2.5.0 (commit N/A)
onednn_verbose,info,cpu,runtime:OpenMP
onednn_verbose,info,cpu,isa:Intel AVX-512 with Intel DL Boost
onednn_verbose,info,gpu,runtime:none
onednn_verbose,info,prim_template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,exec,cpu,reorder,jit:uni,undef,src_f32::blocked:cdba:f dst_f32:p:blocked:Acdb16a:f,,,10x4x3x3,0.00195312
onednn_verbose,exec,cpu,convolution,brgconv:avx512_core,forward_training,src_f32::blocked:acdb:f wei_f32:p:blocked:Acdb16a:f bia_f32::blocked:a:f dst_f32::blocked:acdb:f,attr-post-ops:eltwise_relu ,alg:convolution_direct,mb,4.96411
onednn_verbose,exec,cpu,convolution,jit:avx512_common,backward_weights,src_f32::blocked:acdb:f wei_f32:p:blocked:Acdb16a:f bia_undef::undef::f dst_f32::blocked:acdb:f,,alg:convolution_direct,mb,0.567871
...
```
Please see the oneDNN Developer's Guide for more details on the verbose log.
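A long trace can be hard to scan. As an illustration only, the hypothetical helper below (not part of the sample) sums the reported execution times per primitive, assuming the v2.5 field layout shown in the `prim_template` line above:

```python
# summarize_verbose.py (hypothetical helper, not part of the sample)
# Usage: python TensorFlow_HelloWorld.py | python summarize_verbose.py
import sys
from collections import defaultdict

totals = defaultdict(float)
for line in sys.stdin:
    fields = line.strip().split(",")
    # exec records look like: onednn_verbose,exec,<engine>,<primitive>,...,<time in ms>
    if len(fields) > 4 and fields[0] == "onednn_verbose" and fields[1] == "exec":
        try:
            totals[fields[3]] += float(fields[-1])
        except ValueError:
            pass  # skip malformed or truncated lines

# Print primitives sorted by total execution time, largest first
for primitive, ms in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{primitive:15s} {ms:10.3f} ms")
```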
Please refer to using samples in DevCloud for general usage instructions.
- Navigate to the directory with the TensorFlow sample:

  ```
  cd ~/oneAPI-samples/AI-and-Analytics/Getting-Started-Samples/IntelTensorFlow_GettingStarted
  ```

- Submit the "TensorFlow_HelloWorld" workload on the selected node with the run script:

  ```
  ./q ./run.sh
  ```

The `run.sh` script contains all the instructions needed to run the "TensorFlow_HelloWorld" workload.
If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. Learn more.
You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, and browse and download samples.
The basic steps to build and run a sample using VS Code include:
- Download a sample using the extension Code Sample Browser for Intel oneAPI Toolkits.
- Configure the oneAPI environment with the extension Environment Configurator for Intel oneAPI Toolkits.
- Open a Terminal in VS Code (Terminal > New Terminal).
- Run the sample in the VS Code terminal using the instructions below.
- (Linux only) Debug your GPU application with GDB for Intel® oneAPI toolkits using the Generate Launch Configurations extension.
To learn more about the extensions, see Using Visual Studio Code with Intel® oneAPI Toolkits.
After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample.