python - Preserving the order of a split with an index vector
I want to split my data into train and test sets, together with a vector that contains the image names (it serves as my index, or reference).
name_images has shape (2440,). My data:
data has shape (2440, 3072) and labels has shape (2440,).

    from sklearn.cross_validation import train_test_split
    x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.3)

But I also want to split name_images into name_images_train and name_images_test so that it follows the same split as data and labels.
I tried:

    x_train, x_test, y_train, y_test, name_images_train, name_images_test = train_test_split(data, labels, name_images, test_size=0.3)

but it doesn't preserve the order. Any suggestions? Thank you.
Edit 1:

    x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.3, random_state=42)
    name_images_train, name_images_test = train_test_split(name_images, test_size=0.3, random_state=42)

Edit 1 doesn't preserve the order either.
There are multiple ways to accomplish this.
The most straightforward is to use the random_state parameter of train_test_split. The documentation states:

    random_state : int or RandomState
        Pseudo-random number generator state used for random sampling.

When you fix random_state, the indices generated for splitting the arrays into train and test sets are exactly the same each time.
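As a quick check of that claim, here is a minimal sketch on synthetic data (the array below is a made-up stand-in, not the data from the question). It assumes a recent scikit-learn, where train_test_split is imported from sklearn.model_selection rather than sklearn.cross_validation.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the sample positions 0..9 (not the question's data).
indices = np.arange(10)

# Two independent calls with the same random_state and the same number of samples.
train_a, test_a = train_test_split(indices, test_size=0.3, random_state=42)
train_b, test_b = train_test_split(indices, test_size=0.3, random_state=42)

# Both calls should select the same positions, in the same order.
same_split = np.array_equal(train_a, train_b) and np.array_equal(test_a, test_b)
print(same_split)
```

Without random_state, each call draws a fresh shuffle, so two separate calls would generally pick different rows.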
So change your code to:

    x_train, x_test, y_train, y_test, name_images_train, name_images_test = train_test_split(data, labels, name_images, test_size=0.3, random_state=42)

For more background on random_state, see the answer here:
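To see that a single call keeps the three arrays aligned row by row, the following sketch builds small synthetic arrays in which every row, label, and name encodes its original position. The shapes and the img_<i>.png naming are assumptions made for illustration only, and the import assumes a recent scikit-learn (sklearn.model_selection).

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_samples = 20  # small hypothetical size; the question uses 2440

# Every row of data is filled with its own original position i.
data = np.repeat(np.arange(n_samples)[:, None], 4, axis=1)
# Labels and names are also derived from i, so alignment can be verified later.
labels = np.arange(n_samples) % 2
name_images = np.array([f"img_{i}.png" for i in range(n_samples)])

# One call splits all three arrays with the same shuffled indices.
(x_train, x_test,
 y_train, y_test,
 name_images_train, name_images_test) = train_test_split(
    data, labels, name_images, test_size=0.3, random_state=42)


def is_aligned(x, y, names):
    """Return True if each row, label and name refer to the same original position."""
    return all(
        name == f"img_{int(row[0])}.png" and label == int(row[0]) % 2
        for row, label, name in zip(x, y, names)
    )


train_aligned = is_aligned(x_train, y_train, name_images_train)
test_aligned = is_aligned(x_test, y_test, name_images_test)
print(train_aligned, test_aligned)
```

The rows come back shuffled relative to the input, but the k-th entry of name_images_train still belongs to the k-th row of x_train and the k-th entry of y_train, which is usually what matters when the names serve as a reference.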