sklearn split

 
from sklearn.model_selection import train_test_split

# Split both the features and the labels into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=73)


shuffle=True

 
train_test_split(
    *arrays,
    test_size=None,
    train_size=None,
    random_state=None,
    shuffle=True,
    stratify=None,
)
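
The effect of shuffle, random_state and stratify can be checked on a small example. This is only a sketch: the toy arrays below are made up for illustration and are not from the notes above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 10 samples with one feature, balanced binary labels.
X = np.arange(10).reshape(10, 1)
y = np.array([0, 1] * 5)

# shuffle=False keeps the original order: the last 20% becomes the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)
ordered_test = X_test.ravel().tolist()  # the last two samples

# With shuffle=True (the default), a fixed random_state makes the split repeatable.
split_a = train_test_split(X, y, test_size=0.2, random_state=73)
split_b = train_test_split(X, y, test_size=0.2, random_state=73)
same_split = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))

# stratify=y keeps the class proportions of y in both subsets.
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.2, random_state=73, stratify=y
)
strat_counts = np.bincount(y_test_strat, minlength=2).tolist()

print(ordered_test, same_split, strat_counts)
```

Note that stratify requires shuffle=True, so it cannot be combined with shuffle=False.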

test_size

 
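If it is a float, it is the fraction of the dataset used as the test set (between 0.0 and 1.0); 10% to 20% is a common choice.
If it is an integer, e.g. test_size=56, exactly 56 samples are used as the test set.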
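
A quick way to see the two forms of test_size side by side. This is a sketch with made-up data: the 448-sample array is chosen only so that a 12.5% split also yields 56 test samples.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 448 samples with two features each.
X = np.arange(896).reshape(448, 2)
y = np.arange(448)

# Float: a fraction of the dataset. 0.125 * 448 = 56 test samples.
_, X_test_frac, _, _ = train_test_split(X, y, test_size=0.125, random_state=73)

# Integer: an absolute number of test samples.
_, X_test_abs, _, _ = train_test_split(X, y, test_size=56, random_state=73)

n_frac = len(X_test_frac)
n_abs = len(X_test_abs)
print(n_frac, n_abs)
```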
    
References