Titanic Dataset

Notebook from the Titanic - Machine Learning from Disaster challenge on Kaggle.


Titanic - Machine Learning from Disaster

Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.
In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender, socio-economic class, etc.).

Submission File Format

You should submit a CSV file with exactly 418 entries plus a header row. Your submission will show an error if you have extra columns (beyond PassengerId and Survived) or extra rows.

The submission should include exactly two columns:
- PassengerId (rows may be in any order)
- Survived (contains binary predictions: 1 for survived, 0 for deceased)
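
The rules above are easy to check before uploading. The following is a minimal sketch using only the standard library; the function name `validate_submission` is hypothetical, and accepting the two columns in either order is an assumption, since the rules only say that rows may be in any order.

```python
import csv
import io

EXPECTED_ROWS = 418                              # stated in the submission rules
EXPECTED_COLUMNS = ["PassengerId", "Survived"]   # the only two columns allowed


def validate_submission(csv_text: str) -> list[str]:
    """Return a list of problems found in a submission; an empty list means none were found."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    # Assumption: column order does not matter, only the set of column names.
    if sorted(header) != sorted(EXPECTED_COLUMNS):
        problems.append(f"unexpected columns: {header}")
        return problems  # the row checks below rely on the two expected columns
    rows = list(reader)
    if len(rows) != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} rows, found {len(rows)}")
    if any(row["Survived"] not in ("0", "1") for row in rows):
        problems.append("Survived must contain only 0 or 1")
    if len({row["PassengerId"] for row in rows}) != len(rows):
        problems.append("duplicate PassengerId values")
    return problems


# Illustrative input: IDs 892..1309 match the index range of X_pred shown in this notebook.
good = "PassengerId,Survived\n" + "".join(f"{i},0\n" for i in range(892, 1310))
print(validate_submission(good))  # []
```

Running a check like this on the file written at the end of the notebook would catch a wrong row count or a stray index column before Kaggle rejects the upload.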

         Survived      Pclass         Age       SibSp       Parch        Fare
count  891.000000  891.000000  714.000000  891.000000  891.000000  891.000000
mean     0.383838    2.308642   29.699118    0.523008    0.381594   32.204208
std      0.486592    0.836071   14.526497    1.102743    0.806057   49.693429
min      0.000000    1.000000    0.420000    0.000000    0.000000    0.000000
25%      0.000000    2.000000   20.125000    0.000000    0.000000    7.910400
50%      0.000000    3.000000   28.000000    0.000000    0.000000   14.454200
75%      1.000000    3.000000   38.000000    1.000000    0.000000   31.000000
max      1.000000    3.000000   80.000000    8.000000    6.000000  512.329200

train.info()
X_pred.info()

out[4]

<class 'pandas.core.frame.DataFrame'>
Index: 891 entries, 1 to 891
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Survived 891 non-null int64
1 Pclass 891 non-null int64
2 Name 891 non-null object
3 Sex 891 non-null object
4 Age 714 non-null float64
5 SibSp 891 non-null int64
6 Parch 891 non-null int64
7 Ticket 891 non-null object
8 Fare 891 non-null float64
9 Cabin 204 non-null object
10 Embarked 889 non-null object
dtypes: float64(2), int64(4), object(5)
memory usage: 83.5+ KB
<class 'pandas.core.frame.DataFrame'>
Index: 418 entries, 892 to 1309
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Pclass 418 non-null int64
1 Name 418 non-null object
2 Sex 418 non-null object
3 Age 332 non-null float64
4 SibSp 418 non-null int64
5 Parch 418 non-null int64
6 Ticket 418 non-null object
7 Fare 417 non-null float64
8 Cabin 91 non-null object
9 Embarked 418 non-null object
dtypes: float64(2), int64(3), object(5)
memory usage: 35.9+ KB

np.abs(train.count()-train.count().max())

out[5]

Survived      0
Pclass        0
Name          0
Sex           0
Age         177
SibSp         0
Parch         0
Ticket        0
Fare          0
Cabin       687
Embarked      2
dtype: int64

The Age, Cabin, and Embarked columns have missing (NaN) values that need to be handled.
The Sex column needs to be encoded as a number (1 for male); the pipeline further down does this with OrdinalEncoder.
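
The count-difference expression used above works as long as at least one column is complete. A more direct idiom is `isna().sum()`, sketched here on a small made-up frame (the rows are invented for illustration and are not passenger records).

```python
import numpy as np
import pandas as pd

# Made-up rows that only mimic the shape of the training data.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": [np.nan, "C85", np.nan, np.nan],
    "Embarked": ["S", "C", np.nan, "S"],
    "Sex": ["male", "female", "female", "male"],
})

missing = toy.isna().sum()                      # missing values per column
by_difference = (toy.count() - len(toy)).abs()  # same idea as the notebook, using the row count
print(missing[missing > 0])                     # only the columns that need attention
```

Both approaches give the same counts; `isna().sum()` does not depend on any column being fully populated.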

train.describe(include=[np.object_])

out[7]

                           Name   Sex  Ticket    Cabin Embarked
count                       891   891     891      204      889
unique                      891     2     681      147        3
top     Braund, Mr. Owen Harris  male  347082  B96 B98        S
freq                          1   577       7        4      644

Drop the Cabin, Name, and Ticket columns, since they have so many unique values that an embedding doesn't seem like it would help.
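
One way to make "so many unique values" concrete is the ratio of unique values to non-null values. The sketch below uses the counts from the table above; the 0.5 cutoff is a hypothetical threshold chosen only for illustration.

```python
# (count, unique) pairs copied from the describe() output above.
stats = {
    "Name": (891, 891),
    "Sex": (891, 2),
    "Ticket": (891, 681),
    "Cabin": (204, 147),
    "Embarked": (889, 3),
}

THRESHOLD = 0.5  # hypothetical cutoff, not a value used in the notebook

# Share of non-null values that are distinct; close to 1.0 means nearly every row is unique.
ratios = {col: unique / count for col, (count, unique) in stats.items()}
to_drop = [col for col, ratio in ratios.items() if ratio > THRESHOLD]

for col, ratio in ratios.items():
    print(f"{col:<9}{ratio:.3f}")
print(to_drop)  # ['Name', 'Ticket', 'Cabin']
```

Name, Ticket, and Cabin all sit above 0.7, while Sex and Embarked are far below 0.01, which matches the decision to drop the first three and encode the other two.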

unique_embarked = train["Embarked"].unique()
unique_sex = train["Sex"].unique()
print("Unique Embarked: ",unique_embarked)
print("Unique Sex: ",unique_sex)

out[9]

Unique Embarked: ['S' 'C' 'Q' nan]
Unique Sex: ['male' 'female']

Plot bar charts of how many passengers survived and died for each port of embarkation

from collections import Counter
embarked_survived = train[["Embarked","Survived"]]
embarked_survived = embarked_survived.dropna()
embarked_counts = Counter(embarked_survived[embarked_survived["Survived"]==1]["Embarked"])
neg_embarked_counts = Counter(embarked_survived[embarked_survived["Survived"]==0]["Embarked"])
embarked_counts_df = pd.DataFrame.from_dict(embarked_counts, orient='index')
neg_embarked_counts_df = pd.DataFrame.from_dict(neg_embarked_counts,orient="index")
fig, axes = plt.subplots(nrows=1, ncols=2)
embarked_counts_df.sort_index().plot(kind="bar",title="Survived",ax=axes[0],ylim=[0,450],legend=None)
neg_embarked_counts_df.sort_index().plot(kind="bar",title="Died",ax=axes[1],ylim=[0,450],legend=None)
fig.tight_layout(w_pad=50)

out[11]

C:\Users\fmb20\AppData\Local\Temp\ipykernel_11672\2076951307.py:11: UserWarning: Tight layout not applied. tight_layout cannot make axes width small enough to accommodate all axes decorations
fig.tight_layout(w_pad=50)

Among passengers who embarked at "S", roughly twice as many died as survived. These are raw counts, so they do not control for other factors such as class or sex.
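
Raw counts are hard to compare when the ports have very different passenger totals, so a per-port survival rate can be easier to read. This is a small sketch that reuses `Counter`; the records are made up for illustration and are not the real data.

```python
from collections import Counter

# Made-up (Embarked, Survived) pairs for illustration; not real passenger records.
records = [
    ("S", 0), ("S", 0), ("S", 1),
    ("C", 1), ("C", 0), ("C", 1),
    ("Q", 0), ("Q", 1),
    (None, 1),  # a row with a missing port
]

# Mirror the dropna() step: skip rows without a port.
known = [(port, survived) for port, survived in records if port is not None]

totals = Counter(port for port, _ in known)
survivors = Counter(port for port, survived in known if survived == 1)

# Fraction of passengers from each port who survived.
survival_rate = {port: survivors[port] / totals[port] for port in sorted(totals)}
print(survival_rate)
```

With the real frame, the same numbers could be obtained by grouping on Embarked and taking the mean of Survived.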

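from sklearn.compose import ColumnTransformer
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler
from sklearn.svm import SVC

y = train["Survived"]
X = train.drop(columns=["Survived"])
np.random.seed(42)
rand_num = np.random.randint(0,100)
# Hold out 20% of the labelled rows
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=rand_num)

# Step 1: drop the high-cardinality columns and encode Sex
# (OrdinalEncoder sorts the categories, so female is 0 and male is 1)
drop_encode = ColumnTransformer(transformers=[
    ('drop_cols','drop',['Cabin','Name','Ticket']),
    ('ord',OrdinalEncoder(),['Sex'])
],remainder="passthrough",sparse_threshold=0)
# Output order: Sex is 0, Pclass is 1, Age is 2, SibSp is 3, Parch is 4, Fare is 5, Embarked is 6
# Step 2: fill missing values. Each KNNImputer is given a single column, so it has no other
# features to measure distance on and most likely falls back to the column mean.
impute = ColumnTransformer(transformers=[
    ('imp_age',KNNImputer(),[2]),
    ('imp_fare',KNNImputer(),[5]),
    ('imp_embarked',SimpleImputer(strategy="most_frequent"),[-1])
],remainder="passthrough",sparse_threshold=0)
# Output order: Age is 0, Fare is 1, Embarked is 2, Sex is 3, Pclass is 4, SibSp is 5, Parch is 6
# Step 3: standardize the numeric columns and one-hot encode Embarked; Sex passes through unchanged
scale_encode = ColumnTransformer(transformers=[
    ('scale',StandardScaler(),[0,1,4,5,6]),
    ('ohe',OneHotEncoder(),[2])
],remainder="passthrough",sparse_threshold=0)

preperation_pipeline = Pipeline(steps=[
    ("initial",drop_encode),
    ("impute",impute),
    ("scale",scale_encode),
    ("predict",SVC(gamma="auto",C=1,random_state=rand_num))
])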

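from sklearn.model_selection import RandomizedSearchCV, cross_val_score

# Notes from earlier experiments:
# RandomForestClassifier(n_estimators=100,criterion="gini"); search result for the Random Forest Classifier: {'predict__n_estimators': 10, 'predict__max_features': 'log2', 'predict__criterion': 'entropy'}
# LinearSVC(max_iter=1000,dual=False): score about 75
# SVC(gamma="auto",C=1,random_state=rand_num): score about 77
# preperation_pipeline.fit(X_train,y_train)

param_dist_svc = {
    "predict__C": np.linspace(0.9,1,20)
}
# Kept from the random forest experiments; not used by the search below
param_dist_rfclf = {
    "predict__n_estimators": [10,100,200],
    "predict__criterion": ["gini","entropy","log_loss"],
    "predict__max_features": ["sqrt","log2",None]
}
clf = RandomizedSearchCV(preperation_pipeline,param_dist_svc,verbose=3)

# NOTE: the search is fit on the 20% hold-out split (X_test), which is what produced the log below.
# Fitting on (X_train, y_train) and scoring on (X_test, y_test) would be the conventional arrangement.
search = clf.fit(X_test,y_test)
print(search.best_params_)

# cross_val_score clones `search`, so the whole randomized search is re-run inside each of the 5 folds
print("Average of Cross Val Score: ",np.average(cross_val_score(search,X_train,y_train)))

# Predict the unlabelled competition rows with the model refit by `search` (trained on X_test only)
predictions = search.predict(X_pred)
passenger_ids = X_pred.index

# Write the two-column submission file: PassengerId, Survived
submission = pd.DataFrame({"PassengerId": passenger_ids,"Survived": predictions})
submission = submission.set_index("PassengerId")
submission.to_csv(get_submission_path())
print("Submissions Submitted")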

out[13]

Fitting 5 folds for each of 10 candidates, totalling 50 fits
[CV 1/5] END ....................predict__C=0.9;, score=0.833 total time= 0.0s
[CV 2/5] END ....................predict__C=0.9;, score=0.778 total time= 0.0s
[CV 3/5] END ....................predict__C=0.9;, score=0.806 total time= 0.0s
[CV 4/5] END ....................predict__C=0.9;, score=0.722 total time= 0.0s
[CV 5/5] END ....................predict__C=0.9;, score=0.800 total time= 0.0s
[CV 1/5] END ......predict__C=0.968421052631579;, score=0.806 total time= 0.0s
[CV 2/5] END ......predict__C=0.968421052631579;, score=0.778 total time= 0.0s
[CV 3/5] END ......predict__C=0.968421052631579;, score=0.806 total time= 0.0s
[CV 4/5] END ......predict__C=0.968421052631579;, score=0.722 total time= 0.0s
[CV 5/5] END ......predict__C=0.968421052631579;, score=0.829 total time= 0.0s
[CV 1/5] END .....predict__C=0.9421052631578948;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9421052631578948;, score=0.778 total time= 0.0s
[CV 3/5] END .....predict__C=0.9421052631578948;, score=0.806 total time= 0.0s
[CV 4/5] END .....predict__C=0.9421052631578948;, score=0.722 total time= 0.0s
[CV 5/5] END .....predict__C=0.9421052631578948;, score=0.829 total time= 0.0s
[CV 1/5] END .....predict__C=0.9052631578947369;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9052631578947369;, score=0.778 total time= 0.0s
[CV 3/5] END .....predict__C=0.9052631578947369;, score=0.806 total time= 0.0s
[CV 4/5] END .....predict__C=0.9052631578947369;, score=0.722 total time= 0.0s
[CV 5/5] END .....predict__C=0.9052631578947369;, score=0.800 total time= 0.0s
[CV 1/5] END .....predict__C=0.9789473684210527;, score=0.806 total time= 0.0s
[CV 2/5] END .....predict__C=0.9789473684210527;, score=0.778 total time= 0.0s
[CV 3/5] END .....predict__C=0.9789473684210527;, score=0.806 total time= 0.0s
[CV 4/5] END .....predict__C=0.9789473684210527;, score=0.722 total time= 0.0s
[CV 5/5] END .....predict__C=0.9789473684210527;, score=0.829 total time= 0.0s
[CV 1/5] END .....predict__C=0.9263157894736842;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9263157894736842;, score=0.778 total time= 0.0s
[CV 3/5] END .....predict__C=0.9263157894736842;, score=0.806 total time= 0.0s
[CV 4/5] END .....predict__C=0.9263157894736842;, score=0.722 total time= 0.0s
[CV 5/5] END .....predict__C=0.9263157894736842;, score=0.829 total time= 0.0s
[CV 1/5] END ....................predict__C=1.0;, score=0.806 total time= 0.0s
[CV 2/5] END ....................predict__C=1.0;, score=0.778 total time= 0.0s
[CV 3/5] END ....................predict__C=1.0;, score=0.806 total time= 0.0s
[CV 4/5] END ....................predict__C=1.0;, score=0.722 total time= 0.0s
[CV 5/5] END ....................predict__C=1.0;, score=0.829 total time= 0.0s
[CV 1/5] END .....predict__C=0.9578947368421052;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9578947368421052;, score=0.778 total time= 0.0s
[CV 3/5] END .....predict__C=0.9578947368421052;, score=0.806 total time= 0.0s
[CV 4/5] END .....predict__C=0.9578947368421052;, score=0.722 total time= 0.0s
[CV 5/5] END .....predict__C=0.9578947368421052;, score=0.829 total time= 0.0s
[CV 1/5] END .....predict__C=0.9157894736842106;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9157894736842106;, score=0.778 total time= 0.0s
[CV 3/5] END .....predict__C=0.9157894736842106;, score=0.806 total time= 0.0s
[CV 4/5] END .....predict__C=0.9157894736842106;, score=0.722 total time= 0.0s
[CV 5/5] END .....predict__C=0.9157894736842106;, score=0.829 total time= 0.0s
[CV 1/5] END .....predict__C=0.9210526315789473;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9210526315789473;, score=0.778 total time= 0.0s
[CV 3/5] END .....predict__C=0.9210526315789473;, score=0.806 total time= 0.0s
[CV 4/5] END .....predict__C=0.9210526315789473;, score=0.722 total time= 0.0s
[CV 5/5] END .....predict__C=0.9210526315789473;, score=0.829 total time= 0.0s
{'predict__C': 0.9421052631578948}
Fitting 5 folds for each of 10 candidates, totalling 50 fits
[CV 1/5] END ....................predict__C=1.0;, score=0.798 total time= 0.0s
[CV 2/5] END ....................predict__C=1.0;, score=0.851 total time= 0.0s
[CV 3/5] END ....................predict__C=1.0;, score=0.789 total time= 0.0s
[CV 4/5] END ....................predict__C=1.0;, score=0.851 total time= 0.0s
[CV 5/5] END ....................predict__C=1.0;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9842105263157894;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9842105263157894;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9842105263157894;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9842105263157894;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9842105263157894;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9789473684210527;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9789473684210527;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9789473684210527;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9789473684210527;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9789473684210527;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9263157894736842;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9263157894736842;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9263157894736842;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9263157894736842;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9263157894736842;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9210526315789473;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9210526315789473;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9210526315789473;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9210526315789473;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9210526315789473;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9631578947368421;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9631578947368421;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9631578947368421;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9631578947368421;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9631578947368421;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9736842105263158;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9736842105263158;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9736842105263158;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9736842105263158;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9736842105263158;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9368421052631579;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9368421052631579;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9368421052631579;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9368421052631579;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9368421052631579;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9157894736842106;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9157894736842106;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9157894736842106;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9157894736842106;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9157894736842106;, score=0.823 total time= 0.0s
[CV 1/5] END .....predict__C=0.9315789473684211;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9315789473684211;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9315789473684211;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9315789473684211;, score=0.851 total time= 0.0s
[CV 5/5] END .....predict__C=0.9315789473684211;, score=0.823 total time= 0.0s
Fitting 5 folds for each of 10 candidates, totalling 50 fits
[CV 1/5] END .....predict__C=0.9526315789473684;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9526315789473684;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9526315789473684;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9526315789473684;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9526315789473684;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9368421052631579;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9368421052631579;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9368421052631579;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9368421052631579;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9368421052631579;, score=0.832 total time= 0.0s
[CV 1/5] END ....................predict__C=0.9;, score=0.825 total time= 0.0s
[CV 2/5] END ....................predict__C=0.9;, score=0.877 total time= 0.0s
[CV 3/5] END ....................predict__C=0.9;, score=0.816 total time= 0.0s
[CV 4/5] END ....................predict__C=0.9;, score=0.833 total time= 0.0s
[CV 5/5] END ....................predict__C=0.9;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9105263157894737;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9105263157894737;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9105263157894737;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9105263157894737;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9105263157894737;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9263157894736842;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9263157894736842;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9263157894736842;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9263157894736842;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9263157894736842;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9947368421052631;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9947368421052631;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9947368421052631;, score=0.833 total time= 0.0s
[CV 4/5] END .....predict__C=0.9947368421052631;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9947368421052631;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9842105263157894;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9842105263157894;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9842105263157894;, score=0.833 total time= 0.0s
[CV 4/5] END .....predict__C=0.9842105263157894;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9842105263157894;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9631578947368421;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9631578947368421;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9631578947368421;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9631578947368421;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9631578947368421;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9210526315789473;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9210526315789473;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9210526315789473;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9210526315789473;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9210526315789473;, score=0.832 total time= 0.0s
[CV 1/5] END .....predict__C=0.9736842105263158;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9736842105263158;, score=0.877 total time= 0.0s
[CV 3/5] END .....predict__C=0.9736842105263158;, score=0.825 total time= 0.0s
[CV 4/5] END .....predict__C=0.9736842105263158;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9736842105263158;, score=0.832 total time= 0.0s
Fitting 5 folds for each of 10 candidates, totalling 50 fits
[CV 1/5] END ....................predict__C=0.9;, score=0.807 total time= 0.0s
[CV 2/5] END ....................predict__C=0.9;, score=0.833 total time= 0.0s
[CV 3/5] END ....................predict__C=0.9;, score=0.807 total time= 0.0s
[CV 4/5] END ....................predict__C=0.9;, score=0.833 total time= 0.0s
[CV 5/5] END ....................predict__C=0.9;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9947368421052631;, score=0.789 total time= 0.0s
[CV 2/5] END .....predict__C=0.9947368421052631;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9947368421052631;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9947368421052631;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9947368421052631;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9315789473684211;, score=0.807 total time= 0.0s
[CV 2/5] END .....predict__C=0.9315789473684211;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9315789473684211;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9315789473684211;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9315789473684211;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9421052631578948;, score=0.807 total time= 0.0s
[CV 2/5] END .....predict__C=0.9421052631578948;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9421052631578948;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9421052631578948;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9421052631578948;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9894736842105263;, score=0.789 total time= 0.0s
[CV 2/5] END .....predict__C=0.9894736842105263;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9894736842105263;, score=0.816 total time= 0.0s
[CV 4/5] END .....predict__C=0.9894736842105263;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9894736842105263;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9578947368421052;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9578947368421052;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9578947368421052;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9578947368421052;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9578947368421052;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9210526315789473;, score=0.807 total time= 0.0s
[CV 2/5] END .....predict__C=0.9210526315789473;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9210526315789473;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9210526315789473;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9210526315789473;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9473684210526316;, score=0.807 total time= 0.0s
[CV 2/5] END .....predict__C=0.9473684210526316;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9473684210526316;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9473684210526316;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9473684210526316;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9631578947368421;, score=0.789 total time= 0.0s
[CV 2/5] END .....predict__C=0.9631578947368421;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9631578947368421;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9631578947368421;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9631578947368421;, score=0.798 total time= 0.0s
[CV 1/5] END .....predict__C=0.9526315789473684;, score=0.798 total time= 0.0s
[CV 2/5] END .....predict__C=0.9526315789473684;, score=0.833 total time= 0.0s
[CV 3/5] END .....predict__C=0.9526315789473684;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9526315789473684;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9526315789473684;, score=0.798 total time= 0.0s
Fitting 5 folds for each of 10 candidates, totalling 50 fits
[CV 1/5] END .....predict__C=0.9263157894736842;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9263157894736842;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9263157894736842;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9263157894736842;, score=0.816 total time= 0.0s
[CV 5/5] END .....predict__C=0.9263157894736842;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9105263157894737;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9105263157894737;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9105263157894737;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9105263157894737;, score=0.816 total time= 0.0s
[CV 5/5] END .....predict__C=0.9105263157894737;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9526315789473684;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9526315789473684;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9526315789473684;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9526315789473684;, score=0.816 total time= 0.0s
[CV 5/5] END .....predict__C=0.9526315789473684;, score=0.816 total time= 0.0s
[CV 1/5] END ....................predict__C=1.0;, score=0.825 total time= 0.0s
[CV 2/5] END ....................predict__C=1.0;, score=0.842 total time= 0.0s
[CV 3/5] END ....................predict__C=1.0;, score=0.789 total time= 0.0s
[CV 4/5] END ....................predict__C=1.0;, score=0.807 total time= 0.0s
[CV 5/5] END ....................predict__C=1.0;, score=0.816 total time= 0.0s
[CV 1/5] END ....................predict__C=0.9;, score=0.833 total time= 0.0s
[CV 2/5] END ....................predict__C=0.9;, score=0.842 total time= 0.0s
[CV 3/5] END ....................predict__C=0.9;, score=0.789 total time= 0.0s
[CV 4/5] END ....................predict__C=0.9;, score=0.816 total time= 0.0s
[CV 5/5] END ....................predict__C=0.9;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9631578947368421;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9631578947368421;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9631578947368421;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9631578947368421;, score=0.816 total time= 0.0s
[CV 5/5] END .....predict__C=0.9631578947368421;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9473684210526316;, score=0.842 total time= 0.0s
[CV 2/5] END .....predict__C=0.9473684210526316;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9473684210526316;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9473684210526316;, score=0.816 total time= 0.0s
[CV 5/5] END .....predict__C=0.9473684210526316;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9842105263157894;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9842105263157894;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9842105263157894;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9842105263157894;, score=0.807 total time= 0.0s
[CV 5/5] END .....predict__C=0.9842105263157894;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9052631578947369;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9052631578947369;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9052631578947369;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9052631578947369;, score=0.816 total time= 0.0s
[CV 5/5] END .....predict__C=0.9052631578947369;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9210526315789473;, score=0.833 total time= 0.0s
[CV 2/5] END .....predict__C=0.9210526315789473;, score=0.842 total time= 0.0s
[CV 3/5] END .....predict__C=0.9210526315789473;, score=0.789 total time= 0.0s
[CV 4/5] END .....predict__C=0.9210526315789473;, score=0.816 total time= 0.0s
[CV 5/5] END .....predict__C=0.9210526315789473;, score=0.816 total time= 0.0s
Fitting 5 folds for each of 10 candidates, totalling 50 fits
[CV 1/5] END .....predict__C=0.9157894736842106;, score=0.816 total time= 0.0s
[CV 2/5] END .....predict__C=0.9157894736842106;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9157894736842106;, score=0.798 total time= 0.0s
[CV 4/5] END .....predict__C=0.9157894736842106;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9157894736842106;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9052631578947369;, score=0.816 total time= 0.0s
[CV 2/5] END .....predict__C=0.9052631578947369;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9052631578947369;, score=0.798 total time= 0.0s
[CV 4/5] END .....predict__C=0.9052631578947369;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9052631578947369;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9210526315789473;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9210526315789473;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9210526315789473;, score=0.798 total time= 0.0s
[CV 4/5] END .....predict__C=0.9210526315789473;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9210526315789473;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9263157894736842;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9263157894736842;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9263157894736842;, score=0.798 total time= 0.0s
[CV 4/5] END .....predict__C=0.9263157894736842;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9263157894736842;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9789473684210527;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9789473684210527;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9789473684210527;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9789473684210527;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9789473684210527;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9842105263157894;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9842105263157894;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9842105263157894;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9842105263157894;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9842105263157894;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9473684210526316;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9473684210526316;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9473684210526316;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9473684210526316;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9473684210526316;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9526315789473684;, score=0.825 total time= 0.0s
[CV 2/5] END .....predict__C=0.9526315789473684;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9526315789473684;, score=0.807 total time= 0.0s
[CV 4/5] END .....predict__C=0.9526315789473684;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9526315789473684;, score=0.816 total time= 0.0s
[CV 1/5] END ......predict__C=0.968421052631579;, score=0.825 total time= 0.0s
[CV 2/5] END ......predict__C=0.968421052631579;, score=0.851 total time= 0.0s
[CV 3/5] END ......predict__C=0.968421052631579;, score=0.807 total time= 0.0s
[CV 4/5] END ......predict__C=0.968421052631579;, score=0.833 total time= 0.0s
[CV 5/5] END ......predict__C=0.968421052631579;, score=0.816 total time= 0.0s
[CV 1/5] END .....predict__C=0.9105263157894737;, score=0.816 total time= 0.0s
[CV 2/5] END .....predict__C=0.9105263157894737;, score=0.851 total time= 0.0s
[CV 3/5] END .....predict__C=0.9105263157894737;, score=0.798 total time= 0.0s
[CV 4/5] END .....predict__C=0.9105263157894737;, score=0.833 total time= 0.0s
[CV 5/5] END .....predict__C=0.9105263157894737;, score=0.816 total time= 0.0s
Average of Cross Val Score: 0.8286516300600809
Submissions Submitted
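
The cell above fits the randomized search on X_test and then cross-validates on X_train. A more conventional arrangement tunes on the training split and scores once on the hold-out split. The sketch below illustrates that arrangement on synthetic data; it is not the notebook's method. The random frame, the made-up label rule, selecting columns by name, the median imputer, and the log-spaced range for C are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the training frame: random values, NOT real passenger data.
rng = np.random.default_rng(0)
n = 300
toy = pd.DataFrame({
    "Pclass": rng.integers(1, 4, n),
    "Sex": rng.choice(["male", "female"], n),
    "Age": rng.normal(30, 14, n).clip(1, 80),
    "Fare": rng.exponential(30, n),
    "Embarked": rng.choice(["S", "C", "Q"], n),
})
toy.loc[rng.choice(n, 40, replace=False), "Age"] = np.nan  # inject some missing ages
# Made-up label rule so that there is something to learn.
toy_y = ((toy["Sex"] == "female") | (toy["Pclass"] == 1)).astype(int)

numeric = ["Pclass", "Age", "Fare"]
categorical = ["Sex", "Embarked"]  # no missing values here; the real Embarked column would need an imputer

# Columns are selected by name, so no positional bookkeeping is needed between steps.
prep = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([("prep", prep), ("predict", SVC(gamma="auto"))])

X_tr, X_te, y_tr, y_te = train_test_split(
    toy, toy_y, test_size=0.2, random_state=0, stratify=toy_y)

# A log-spaced range for C is an assumption; the notebook searched 0.9 to 1.0.
search = RandomizedSearchCV(model, {"predict__C": np.logspace(-2, 2, 20)},
                            n_iter=10, cv=5, random_state=0)
search.fit(X_tr, y_tr)                        # tune on the training split only
holdout_accuracy = search.score(X_te, y_te)   # hold-out rows were never seen during tuning
print(search.best_params_, round(holdout_accuracy, 3))
```

Because the hold-out rows play no part in tuning, `holdout_accuracy` is a less optimistic estimate than a cross-validation score computed on the same rows the search was fit on.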

Conclusion - 77%

After looking at higher-scoring notebooks, it is clear that I did not do nearly enough feature engineering on this dataset. Check out this Jupyter Notebook to see how people scored better and why. A good score is about 4% higher than mine.
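
To give a sense of what feature engineering could look like here, the sketch below derives two candidate features from columns that the pipeline either dropped or passed through unchanged. Both `extract_title` and `family_size` are hypothetical helpers written for this illustration; they are not taken from the notebook linked above, and nothing here shows that they would improve the score.

```python
import re

# Text between the comma and the first period, e.g. "Mr" in "Braund, Mr. Owen Harris".
TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def extract_title(name: str) -> str:
    """Return the honorific from a 'Surname, Title. Given names' string, or 'Unknown'."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else "Unknown"


def family_size(sibsp: int, parch: int) -> int:
    """Siblings/spouses plus parents/children aboard, plus the passenger."""
    return sibsp + parch + 1


print(extract_title("Braund, Mr. Owen Harris"))  # Mr
print(family_size(1, 0))                         # 2
```

A title keeps some of the information in Name without the one-unique-value-per-row problem that motivated dropping the column.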
