python - Saving order of splitting with a vector of index


I want to split my data into train and test sets, together with a vector that contains the image names (it serves as my index, for reference).

name_images has shape of (2440,) 

My data:

data has shape of (2440, 3072)
labels has shape of (2440,)

    from sklearn.cross_validation import train_test_split
    x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.3)

But I also want to split name_images into name_images_train and name_images_test so that they respect the split of data and labels.

I tried:

    x_train, x_test, y_train, y_test, name_images_train, name_images_test = train_test_split(
        data, labels, name_images, test_size=0.3)

It doesn't preserve the order. Any suggestions? Thank you.

Edit 1:

    x_train, x_test, y_train, y_test = train_test_split(
        data, labels, test_size=0.3, random_state=42)
    name_images_train, name_images_test = train_test_split(
        name_images, test_size=0.3, random_state=42)

Edit 1 doesn't preserve the order either.

There are multiple ways to accomplish this.

The most straightforward is to use the random_state parameter of train_test_split. The documentation states:

random_state : int or RandomState
    Pseudo-random number generator state used for random sampling.

When you fix random_state, the indices generated for splitting the arrays into train and test are exactly the same each time.
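A quick way to convince yourself of this is the sketch below. It uses small synthetic arrays as stand-ins for the real ones (the sizes, the `img_XX` names, and the helper array `row_ids` are made up for illustration) and imports from `sklearn.model_selection`, which is where `train_test_split` lives in current scikit-learn releases.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the arrays in the question (smaller, same layout).
rng = np.random.RandomState(0)
data = rng.rand(20, 5)
name_images = np.array(["img_%02d" % i for i in range(20)])
row_ids = np.arange(20)  # hypothetical helper: original row number of each sample

# Two separate calls with the same test_size and the same random_state.
x_train, x_test, ids_train, ids_test = train_test_split(
    data, row_ids, test_size=0.3, random_state=42)
names_train, names_test = train_test_split(
    name_images, test_size=0.3, random_state=42)

# Both calls selected the same rows, in the same (shuffled) order.
same_train = np.array_equal(name_images[ids_train], names_train)
same_test = np.array_equal(name_images[ids_test], names_test)
print(same_train, same_test)
```

Note that the rows come back shuffled: what stays fixed is the correspondence between the arrays, not the original ordering of the samples.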

So change your code to:

    x_train, x_test, y_train, y_test, name_images_train, name_images_test = train_test_split(
        data, labels, name_images, test_size=0.3, random_state=42)
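As a rough check of the single-call version, the sketch below splits three toy arrays at once and verifies that each name still points at its own data row and label. The array contents and the `img_N` naming scheme are assumptions made for the example. The last two lines show one possible way to get the training rows back into their original order, in case that is what "preserve order" means here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy arrays: row i of data belongs to label i and name "img_i".
data = np.arange(40).reshape(10, 4)
labels = np.arange(10) % 2
name_images = np.array(["img_%d" % i for i in range(10)])

(x_train, x_test,
 y_train, y_test,
 name_images_train, name_images_test) = train_test_split(
    data, labels, name_images, test_size=0.3, random_state=42)

# Recover the original row number from each training name and check that
# the data rows and labels at the same positions came from that row.
train_rows = np.array([int(n.split("_")[1]) for n in name_images_train])
aligned = (np.array_equal(data[train_rows], x_train)
           and np.array_equal(labels[train_rows], y_train))
print(aligned)

# Optional: reorder the training split back into original row order.
order = np.argsort(train_rows)
x_train_sorted = x_train[order]
```

The same `order` array can be applied to `y_train` and `name_images_train` so that all three stay aligned after sorting.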

For a better understanding of random_state, see the answer here:

