Understanding Feedback Mechanisms in Machine Learning Jupyter Notebooks
Online Appendix
Abstract
This website hosts the online appendix for the research paper "Understanding Feedback Mechanisms in Machine Learning Jupyter Notebooks", which has been submitted to the Journal of Empirical Software Engineering.
(RQ2) How is explicit feedback from assert statements used to validate ML code written in Jupyter notebooks?
Data Shape Check (\(N = 26\))
Key | Code |
---|---|
A5 | assert y_valid.shape == (1132,) |
A17 | assert X.shape[1] == 13, 'Did you drop/lose some columns in X? Did you properly load and split the data?' |
A29 | assert len(test_y_preds) == len(test_y), 'Unexpected number of predictions.' |
A31 | assert img.shape == (112, 92) |
A76 | assert len(encoding['token_type_ids']) == max_seq_length |
A84 | assert red.get_shape().as_list()[1:] == [224, 224, 1] |
A90 | assert len(X_train) == 2000 |
A93 | assert temp_embed.shape[0] == stride |
Data Validation Check (\(N = 14\))
Key | Code |
---|---|
A41 | assert np.all(np.unique(X['smoke'].values) == np.array([0, 1])) |
A44 | assert np.all(np.unique(X['smoke'].values) == np.array([0, 1])) |
A46 | assert np.isclose(stdev_norm, 1.0, atol=1e-16) |
A52 | assert grouped_users['user_id'].nunique() == user_engagement['user_id'].nunique() |
A65 | assert np.all(y <= nb_classes) |
A73 | assert df['clf'].value_counts()[1] == len(df[df['quality'] >= 7]) |
Model Performance Check (\(N = 11\))
Key | Code |
---|---|
A7 | assert len(neighbours_1) == 20, "Neighbors don't match!" |
A15 | assert np.allclose(verify('images/camera_1.jpg', 'bertrand', database, FRmodel), (0.54364836, True)) |
A19 | assert np.allclose(linear_model.coef_, [[1.57104472, 0.92521608]]), 'The model parameters you learned seem incorrect!' |
A38 | assert 0.75 < auc(fpr, tpr) < 0.85 |
A58 | assert np.isclose(accuracy, 0.9666666666666667) |
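
The entries above use two styles of performance assertion: comparison against an exact expected value within a tolerance (A58) and a range check (A38). A minimal standard-library sketch of both styles follows. The helper names `assert_metric_close` and `assert_metric_in_range` and the sample values are illustrative assumptions, not code from the studied notebooks.

```python
import math


def assert_metric_close(value, expected, rel_tol=1e-9, abs_tol=0.0):
    """Fail unless `value` matches `expected` within the given tolerances."""
    assert math.isclose(value, expected, rel_tol=rel_tol, abs_tol=abs_tol), (
        f"metric {value!r} differs from expected {expected!r}"
    )


def assert_metric_in_range(value, low, high):
    """Fail unless low < value < high (exclusive bounds, as in A38)."""
    assert low < value < high, f"metric {value!r} outside ({low}, {high})"


# Hypothetical accuracy: 29 correct predictions out of 30.
accuracy = 29 / 30
assert_metric_close(accuracy, 0.9666666666666667)

# Hypothetical AUC value checked against the bounds used in A38.
assert_metric_in_range(0.81, 0.75, 0.85)
```

A range check tolerates run-to-run variation in training, whereas an exact-value check only passes when the computation is reproducible.
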
Existence Check (\(N = 8\))
Key | Code |
---|---|
A23 | assert np.all(orders.groupby('user_id').days_since_prior_order.tail(1).notnull()) |
A42 | assert not lab_s.isnull().values.any() |
A43 | assert len(data) != 0, 'cannot divide by zero' |
A50 | assert not np.any(np.isnan(X)) |
A51 | assert data.target.notnull().all() |
A63 | assert X.isnull().sum().sum() == 0 |
A79 | assert not processed_data_df.isna().any().any() |
A86 | assert p0 in poi_info.index |
Resource Check (\(N = 7\))
Key | Code |
---|---|
A10 | assert le_path.is_file(), f"Label encoder file not found at {le_path}. Make sure 'label_encoder.pkl' exists in the lightning_logs directory." |
A14 | assert self.model is not None, 'Model is not loaded, load it by calling .load_model()' |
A18 | assert pd.__version__.rpartition('.')[0] == '1.0', f"Unexpected pandas version: expected 1.0, got {pd.__version__.rpartition('.')[0]}" |
A37 | assert svm.fit_status_ == 0, 'Forgot to train the SVM!' |
A60 | assert f2.gca().has_data() |
A67 | assert pm.__version__ == '3.9.2' |
A74 | assert os.path.exists(image_dir) |
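
A18 compares only the part of the version string that precedes the last dot. The sketch below shows how `str.rpartition` produces that prefix. It uses made-up version strings rather than a real pandas import, and the helper name `major_minor` is a hypothetical chosen for illustration.

```python
def major_minor(version):
    """Return everything before the last '.' in a version string."""
    return version.rpartition('.')[0]


# A three-part version keeps its major.minor prefix.
assert major_minor('1.0.5') == '1.0'
assert major_minor('1.0.5') != '1.1'

# Caveat: with a single dot, only the major component remains.
assert major_minor('1.0') == '1'
```

A check of this form therefore accepts any patch release of the expected minor version, while an exact comparison such as A67 accepts a single release only.
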
Type Check (\(N = 5\))
Key | Code |
---|---|
A2 | assert isinstance(X_trn, torch.FloatTensor), 'Features should be float32!' |
A35 | assert isinstance(column_transformer, ColumnTransformer), "Input isn't a ColumnTransformer" |
A40 | assert isinstance(model_3, sklearn.ensemble.RandomForestClassifier) |
A81 | assert is_all_ints(filled_df[r]) is True |
A88 | assert isinstance(betas, np.ndarray) |
Mathematical Property Check (\(N = 4\))
Key | Code |
---|---|
A3 | assert (xH - wH) % self.stride == 0 |
A25 | assert test_output.std() < 0.15, "Don't use batchnorm here" |
A56 | assert np.allclose(e_v_states[:, -1], np.ones_like(e_v_states[:, -1])) |
A64 | assert np.allclose(T, T.T) |
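
A3 has the form of a divisibility precondition for a strided sliding window. The surrounding notebook code is not reproduced here, so the following is only a sketch under that assumption: if a window of size `window_size` moves over an input of size `input_size` in steps of `stride`, the window ends flush with the input exactly when the difference is divisible by the stride. The function name and parameters are hypothetical.

```python
def num_window_positions(input_size, window_size, stride):
    """Count the positions of a window slid over an input with a given stride."""
    assert input_size >= window_size, "window larger than input"
    # Same property as A3: the last window must end exactly at the input edge.
    assert (input_size - window_size) % stride == 0, (
        f"({input_size} - {window_size}) is not divisible by stride {stride}"
    )
    return (input_size - window_size) // stride + 1


# Windows start at offsets 0, 2, 4 and 6.
positions = num_window_positions(input_size=10, window_size=4, stride=2)
```
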
Batch Size Check (\(N = 3\))
Key | Code |
---|---|
A21 | assert x.size(0) % batch_size == 0, f'the first dimension of input tensor ({x.size(0)}) should be divisible by batch_size ({batch_size})' |
A28 | assert image_size % patch_size_small == 0, 'Image dimensions must be divisible by the patch size.' |
A70 | assert n_img > batch_size |
Network Architecture Check (\(N = 3\))
Key | Code |
---|---|
A11 | assert self.encoder_conv_01[0].weight.size() == self.vgg16.features[2].weight.size() |
A62 | assert self.encoder_conv_01[0].weight.size() == self.vgg16.features[2].weight.size() |
A75 | assert reg in ['none', 'l2'] |
Data Leakage Check (\(N = 1\))
Key | Code |
---|---|
A33 | assert len(set(tr_df.PetID.unique()).intersection(valid_df.PetID.unique())) == 0 |
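
The leakage check in A33 reduces to asserting that the identifier sets of two splits are disjoint. A minimal sketch of that pattern follows, using plain lists so that it runs without pandas. The helper name `assert_disjoint_ids` and the sample identifiers are assumptions made for illustration.

```python
def assert_disjoint_ids(train_ids, valid_ids):
    """Fail if any identifier occurs in both the training and validation split."""
    overlap = set(train_ids).intersection(valid_ids)
    assert len(overlap) == 0, (
        f"{len(overlap)} id(s) appear in both splits, e.g. {list(overlap)[:5]}"
    )


# Hypothetical identifiers for two splits with no shared entries.
train_ids = ['a1', 'a2', 'a3']
valid_ids = ['b1', 'b2']
assert_disjoint_ids(train_ids, valid_ids)
```

Because the helper accepts any iterable, a DataFrame column such as `tr_df.PetID` could be passed in place of the lists.
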
(RQ3) How is implicit feedback from print statements and last cell statements used when writing ML code in Jupyter notebooks?
Model Performance Check (\(N = 33\))
Key | Code |
---|---|
P3 | print('The mean accuracy with 10 fold cross validation is: %s ' % round(scores * 100, 2), '%') |
P6 | print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test, pred))) |
P18 | print('The Accuracy is:', accuracy_score(y_test, y_pred)) |
P50 | print('Classification Report: SVM (validation data)') |
P54 | print('Intercept value:', lm.intercept_) |
L3 | skplt.metrics.plot_confusion_matrix(Y_val, Vote.predict(X_val), normalize=True, figsize=(10, 10)) |
L52 | spot_check_recs(classifier, 910) |
Data Distribution (\(N = 7\))
Key | Code |
---|---|
L2 | _ = sns.catplot(x='category_id', y='likes', data=train, height=5, aspect=1.5) |
L9 | sns.kdeplot(data=data.loc[data['Survived'] == 0].Age, label='Died', shade=True) |
L14 | pd.pivot_table(train, index='Survived', values=['Age', 'SibSp', 'Parch', 'Fare']) |
L25 | sns.countplot(house_pred['OverallQual']) |
L48 | x_train.describe() |
Resource Check (\(N = 7\))
Key | Code |
---|---|
P68 | print('GPU is available') |
P71 | print('Hub version: ', hub.__version__) |
P82 | print('Running on TPU ', tpu.master()) |
P86 | print('Cuda is available') |
P107 | print('Model loaded') |
L64 | full_table.head(-5) |
L66 | prostate_cancer_df.shape |
Spot Check (\(N = 5\))
Key | Code |
---|---|
L60 | X_pca.head() |
P64 | print(np.max(cur[:, :, 1])) |
P114 | print(onehot_encoded) |
Model Training Check (\(N = 4\))
Key | Code |
---|---|
L8 | autoencoder.fit(x=X_train, y=X_train, epochs=15, validation_data=[X_test, X_test], callbacks=[keras_utils.TqdmProgressCallback()], verbose=0) |
L31 | adaBoost.fit(X_train, y_train) |
L42 | m_r.best_params_ |
Missing Value Check (\(N = 3\))
Key | Code |
---|---|
P74 | print(train_df.isnull().sum()) |
L12 | sns.heatmap(test_df.isnull(), yticklabels=False, cbar=False, cmap='viridis') |
L36 | test.isna().sum().unique() |
Shape Check (\(N = 3\))
Key | Code |
---|---|
P4 | print('no.of examples in test data : ', len(test_data)) |
P32 | print('Training set shape : ', x_train.shape) |
P117 | print('Y_train.shape: ', Y_train.shape) |
Data Relationship Check (\(N = 2\))
Key | Code |
---|---|
L6 | b = sns.relplot(x='SIZE', y='Cash', hue='CLARITY', alpha=0.9, palette='muted', height=8, data=raw_data) |
L10 | sns.regplot(x='X4 number of convenience stores', y='Y house price of unit area', data=data) |
Type Check (\(N = 2\))
Key | Code |
---|---|
P43 | print('data type:', images.dtype) |
L71 | type(Y) |
Execution Time Check (\(N = 1\))
Key | Code |
---|---|
P66 | print('Total Run Time:') |
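
P66 shows only the label of the message. As a sketch of how such a label is commonly paired with a measured duration, the example below times a stand-in workload with `time.perf_counter`. The workload and variable names are assumptions and are not taken from the notebook behind P66.

```python
import time

start = time.perf_counter()

# Stand-in for a long-running cell such as model training.
total = sum(i * i for i in range(100_000))

elapsed = time.perf_counter() - start
print(f'Total Run Time: {elapsed:.3f} s')
```
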
Network Architecture Check (\(N = 1\))
Key | Code |
---|---|
P92 | print(MyNetwork) |