c++ - Get nearest centroid using Thrust library? (K-Means) -


i finished computing distances , stored in thrust vector, instance, have 2 centroids , 5 datapoints , way computed distances each centroid computed distances 5 datapoints first , stored in array , later other centroid in 1d array in distances, this:

for (int = 0; < centroids.size(); ++i) {     computedistance(data, distances, centroids[i], ndatapoints, ndimensions); } 

resulting in vector 1d, instance:

distancesvalues = {10, 15, 20, 12, 10, 5, 17, 22, 8, 7}  datapointsindex = {1, 2,  3,   4,  5,  1,  2,  3, 4, 5} 

where first 5 values represent centroid 1 , other 5 values centroid 2.

what know if there thrust function in can store count in array of minimum values each centroid?

comparing values of each index, result should be:

counts = {2, 3} 

where:

countofcentroid 1 = 2        countofcentroid 2 = 3 

here 1 possible approach:

  1. create additional centroid index vector:

    distancesvalues = {10, 15, 20, 12, 10, 5, 17, 22,  8, 7} datapointsindex = {1,   2,  3,  4,  5, 1,  2,  3,  4, 5} centroidindex   = {1,   1,  1,  1,  1, 2,  2,  2,  2, 2} 
  2. now sort_by_key, using datapointsindex keys, , other 2 vectors zipped values. has effect rearranging 3 vectors datapointsindex has indices grouped together:

    datapointsindex = {1, 1, 2, 2, 3, 3, 4, 4, 5, 5}  

    (and other 2 vectors correspondingly rearranged).

  3. now reduce_by_key. if choose thrust::minimum operator, reduction chooses minimum value in group (rather summing values in group). reduce_by_key means reduction of type done on each consecutive group of keys. once again use datapointsindex our key vector, , other 2 vectors zipped our value vector. of output of reduce_by_key don't care about, except results vector emanates centroidindex vector. counting centroid indices in results vector, can desired output.

here worked example:

$ cat t428.cu #include <thrust/host_vector.h> #include <thrust/device_vector.h> #include <thrust/sort.h> #include <thrust/reduce.h> #include <thrust/copy.h> #include <thrust/iterator/zip_iterator.h> #include <thrust/iterator/discard_iterator.h> #include <stdio.h> #define num_points 5 #define num_centroid 2 #define dsize (num_points*num_centroid)  int main(){    int distancesvalues[dsize] = {10, 15, 20, 12, 10, 5, 17, 22, 8, 7};   int datapointsindex[dsize] = {1, 2,  3,   4,  5,  1,  2,  3, 4, 5};   int centroidindex[dsize]   = {1, 1, 1, 1, 1, 2, 2, 2, 2, 2};    thrust::device_vector<int> dv(distancesvalues, distancesvalues + dsize);   thrust::device_vector<int> di(datapointsindex, datapointsindex + dsize);   thrust::device_vector<int> ci(centroidindex, centroidindex + dsize);   thrust::device_vector<int> ra(num_points);   thrust::device_vector<int> rb(num_points);    thrust::sort_by_key(di.begin(), di.end(), thrust::make_zip_iterator(thrust::make_tuple(dv.begin(), ci.begin())));   thrust::reduce_by_key(di.begin(), di.end(), thrust::make_zip_iterator(thrust::make_tuple(dv.begin(), ci.begin())), thrust::make_discard_iterator(), thrust::make_zip_iterator(thrust::make_tuple(ra.begin(), rb.begin())), thrust::equal_to<int>(), thrust::minimum<thrust::tuple<int, int> >());   printf("countofcentroid 1 = %d\n", thrust::count(rb.begin(), rb.end(), 1));   printf("countofcentroid 2 = %d\n", thrust::count(rb.begin(), rb.end(), 2));    return 0; }  $ nvcc -arch=sm_20 -o t428 t428.cu $ ./t428 countofcentroid 1 = 2 countofcentroid 2 = 3 $ 

as eric points out in answer here (your question duplicate of one), sort_by_key unnecessary. reordering of data follows regular pattern, don't need harness complexity of sort, , can therefore re-order data clever use of iterators. under circumstances, may possible entire operation (approximately) single call reduce_by_key.


Comments

Popular posts from this blog

commonjs - How to write a typescript definition file for a node module that exports a function? -

openid - Okta: Failed to get authorization code through API call -

thorough guide for profiling racket code -