1.3. Performance Results

AN 856: K-Mean Clustering with the Intel® FPGA SDK for OpenCL™

ID 683395

Date 6/12/2018

Version current

We compared the performance of this implementation on an FPGA with an optimized k-mean implementation on a CPU. The same data set was used for both the FPGA and CPU runs.

The FPGA used for performance comparison was an Intel® Arria® 10 GX FPGA Development Kit. The FPGA was programmed with Intel® FPGA SDK for OpenCL™ Version 17.1 Update 1. During testing, the FPGA had an f_MAX of 320 MHz.

The CPU used for performance comparison was an Intel® Xeon® E5-2680 (24 cores, no hyperthreading).

The following table shows the time to converge on acceptable clusters for various data sizes.

Data Size (bytes)	FPGA time, initialization method 1 (ms)	FPGA time, initialization method 2 (ms)	CPU time (ms)
512	0.028	0.016	0.065
1024	0.042	0.032	0.573
2048	0.051	0.037	0.627
4096	0.089	0.039	0.804
8192	0.105	0.044	0.919
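
To make the comparison easier to read, the speedup of the FPGA over the CPU can be computed directly from the table. The following is a small sketch that only divides the published CPU time by each FPGA time. The names `RESULTS` and `speedups` are illustrative and do not come from the application note.

```python
# Times in milliseconds, copied from the performance table.
# Each entry: data size -> (FPGA method 1, FPGA method 2, CPU).
RESULTS = {
    512: (0.028, 0.016, 0.065),
    1024: (0.042, 0.032, 0.573),
    2048: (0.051, 0.037, 0.627),
    4096: (0.089, 0.039, 0.804),
    8192: (0.105, 0.044, 0.919),
}


def speedups(results):
    """Return {size: (CPU time / FPGA method 1 time, CPU time / FPGA method 2 time)}."""
    return {
        size: (cpu / method1, cpu / method2)
        for size, (method1, method2, cpu) in results.items()
    }


if __name__ == "__main__":
    for size, (s1, s2) in speedups(RESULTS).items():
        print(f"{size:>5}  method 1: {s1:5.1f}x  method 2: {s2:5.1f}x")
```

For the largest data size (8192), this works out to roughly 8.8x with initialization method 1 and roughly 20.9x with initialization method 2.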

In this experiment, the number of clusters was set to 10.

Each data point has 2 floating-point features, and inputs of different sizes (512 to 8192) were used to compare the performance of the FPGA and the CPU.

For the FPGA runs, we tried two initialization methods. In the first method, we used the first k data points as the initial centroids of the clusters. In the second method, we chose the initial centroids randomly. With randomly chosen initial centroids, the algorithm required fewer iterations and therefore converged in less time.
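
The two initialization methods can be sketched in plain Python as follows. This is not the OpenCL kernel from the application note, only a minimal host-side illustration. It makes several assumptions: "randomly" is taken to mean picking k distinct data points at random, convergence is taken to mean that the centroids stop changing, and the function names (`init_first_k`, `init_random`, `kmeans`) are hypothetical.

```python
import random


def init_first_k(points, k):
    """Method 1: use the first k data points as the initial centroids."""
    return [tuple(p) for p in points[:k]]


def init_random(points, k, seed=None):
    """Method 2 (assumed form): pick k distinct data points at random."""
    rng = random.Random(seed)
    return [tuple(p) for p in rng.sample(points, k)]


def kmeans(points, centroids, max_iters=100):
    """Standard k-means iteration on 2-feature points.

    Returns (final centroids, number of iterations run).
    Stops when an update leaves every centroid unchanged (assumed criterion).
    """
    centroids = list(centroids)
    for iteration in range(1, max_iters + 1):
        sums = [[0.0, 0.0] for _ in centroids]
        counts = [0] * len(centroids)
        # Assignment step: add each point to its nearest centroid's running sum.
        for x, y in points:
            nearest = min(
                range(len(centroids)),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
            sums[nearest][0] += x
            sums[nearest][1] += y
            counts[nearest] += 1
        # Update step: move each centroid to the mean of its assigned points.
        # A centroid with no assigned points keeps its previous position.
        updated = [
            (sums[c][0] / counts[c], sums[c][1] / counts[c]) if counts[c] else centroids[c]
            for c in range(len(centroids))
        ]
        if updated == centroids:
            return updated, iteration
        centroids = updated
    return centroids, max_iters
```

Calling `kmeans(points, init_first_k(points, 10))` and `kmeans(points, init_random(points, 10))` on the same data and comparing the returned iteration counts is one way to observe how the choice of initial centroids affects convergence.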
