Assignment 2: Due 16 Sept 2013

2. Experiment with different types of binning on the glass data (installed with WEKA). Try equal width and equal depth binning. Try various numbers of bins. Use OneR on the original data and on your binned data. Compare the results.

# Equal-frequency (equal-depth) binning in WEKA, number of bins = 10, classified with OneR.
# Setting ignore class to true or false makes no difference.

=== Run information ===

Scheme:       weka.classifiers.rules.OneR -B 6
Relation:     Glass-weka.filters.unsupervised.attribute.Discretize-F-B10-M-1.0-Rfirst-last
Instances:    214
Attributes:   10
              RI Na Mg Al Si K Ca Ba Fe Type
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

Al:
    '(-inf-0.85]'     -> build wind float
    '(0.85-1.145]'    -> build wind float
    '(1.145-1.225]'   -> build wind float
    '(1.225-1.285]'   -> build wind float
    '(1.285-1.355]'   -> build wind float
    '(1.355-1.475]'   -> build wind non-float
    '(1.475-1.555]'   -> build wind non-float
    '(1.555-1.725]'   -> build wind non-float
    '(1.725-2.07]'    -> headlamps
    '(2.07-inf)'      -> headlamps
(123/214 instances correct)

Time taken to build model: 0 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         121               56.5421 %
Incorrectly Classified Instances        93               43.4579 %
Kappa statistic                          0.3829
Mean absolute error                      0.1242
Root mean squared error                  0.3524
Relative absolute error                 58.6353 %
Root relative squared error            108.5738 %
Total Number of Instances              214

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               0.8       0.313     0.554       0.8      0.655       0.744      build wind float
               0.513     0.21      0.574       0.513    0.542       0.652      build wind non-float
               0         0         0           0        0           0.5        vehic wind float
               0         0         0           0        0           ?          vehic wind non-float
               0         0         0           0        0           0.5        containers
               0         0         0           0        0           0.5        tableware
               0.897     0.103     0.578       0.897    0.703       0.897      headlamps
Weighted Avg.  0.565     0.191     0.463       0.565    0.502       0.687

=== Confusion Matrix ===

  a  b  c  d  e  f  g   <-- classified as
 56 14  0  0  0  0  0 |  a = build wind float
 28 39  0  0  0  0  9 |  b = build wind non-float
 11  5  0  0  0  0  1 |  c = vehic wind float
  0  0  0  0  0  0  0 |  d = vehic wind non-float
  0  6  0  0  0  0  7 |  e = containers
  3  4  0  0  0  0  2 |  f = tableware
  3  0  0  0  0  0 26 |  g = headlamps

# Equal-frequency binning with the rule-of-thumb number of bins:
# bins = log2(214) = 7.74, so use 8. ignore class = false.

=== Run information ===

Scheme:       weka.classifiers.rules.OneR -B 6
Relation:     Glass-weka.filters.unsupervised.attribute.Discretize-F-B8-M-1.0-Rfirst-last
Instances:    214
Attributes:   10
              RI Na Mg Al Si K Ca Ba Fe Type
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

Al:
    '(-inf-0.905]'    -> build wind float
    '(0.905-1.185]'   -> build wind float
    '(1.185-1.275]'   -> build wind float
    '(1.275-1.355]'   -> build wind float
    '(1.355-1.5]'     -> build wind non-float
    '(1.5-1.625]'     -> build wind non-float
    '(1.625-1.985]'   -> build wind non-float
    '(1.985-inf)'     -> headlamps
(122/214 instances correct)

Time taken to build model: 0 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         122               57.0093 %
Incorrectly Classified Instances        92               42.9907 %
Kappa statistic                          0.3735
Mean absolute error                      0.1228
Root mean squared error                  0.3505
Relative absolute error                 58.0048 %
Root relative squared error            107.9885 %
Total Number of Instances              214

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               0.829     0.319     0.558       0.829    0.667       0.755      build wind float
               0.592     0.275     0.542       0.592    0.566       0.658      build wind non-float
               0         0         0           0        0           0.5        vehic wind float
               0         0         0           0        0           ?          vehic wind non-float
               0         0         0           0        0           0.5        containers
               0         0         0           0        0           0.5        tableware
               0.655     0.043     0.704       0.655    0.679       0.806      headlamps
Weighted Avg.  0.57      0.208     0.47        0.57     0.511       0.681

=== Confusion Matrix ===

  a  b  c  d  e  f  g   <-- classified as
 58 12  0  0  0  0  0 |  a = build wind float
 28 45  0  0  0  0  3 |  b = build wind non-float
 12  5  0  0  0  0  0 |  c = vehic wind float
  0  0  0  0  0  0  0 |  d = vehic wind non-float
  0  9  0  0  0  0  4 |  e = containers
  3  5  0  0  0  0  1 |  f = tableware
  3  7  0  0  0  0 19 |  g = headlamps

# Equal-width binning, number of bins = 8, ignore class = false,
# find optimal number of bins = false.

=== Run information ===

Scheme:       weka.classifiers.rules.OneR -B 6
Relation:     Glass-weka.filters.unsupervised.attribute.Discretize-B8-M-1.0-Rfirst-last
Instances:    214
Attributes:   10
              RI Na Mg Al Si K Ca Ba Fe Type
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

Al:
    '(-inf-0.69125]'      -> build wind float
    '(0.69125-1.0925]'    -> build wind float
    '(1.0925-1.49375]'    -> build wind float
    '(1.49375-1.895]'     -> build wind non-float
    '(1.895-2.29625]'     -> headlamps
    '(2.29625-2.6975]'    -> headlamps
    '(2.6975-3.09875]'    -> headlamps
    '(3.09875-inf)'       -> containers
(115/214 instances correct)

Time taken to build model: 0 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         110               51.4019 %
Incorrectly Classified Instances       104               48.5981 %
Kappa statistic                          0.2979
Mean absolute error                      0.1389
Root mean squared error                  0.3726
Relative absolute error                 65.5707 %
Root relative squared error            114.8155 %
Total Number of Instances              214

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               0.914     0.451     0.496       0.914    0.643       0.731      build wind float
               0.355     0.21      0.482       0.355    0.409       0.573      build wind non-float
               0         0         0           0        0           0.5        vehic wind float
               0         0         0           0        0           ?          vehic wind non-float
               0         0.01      0           0        0           0.495      containers
               0         0         0           0        0           0.5        tableware
               0.655     0.043     0.704       0.655    0.679       0.806      headlamps
Weighted Avg.  0.514     0.229     0.429       0.514    0.448       0.643

=== Confusion Matrix ===

  a  b  c  d  e  f  g   <-- classified as
 64  6  0  0  0  0  0 |  a = build wind float
 45 27  0  0  0  0  4 |  b = build wind non-float
 12  5  0  0  0  0  0 |  c = vehic wind float
  0  0  0  0  0  0  0 |  d = vehic wind non-float
  2  8  0  0  0  0  3 |  e = containers
  3  5  0  0  0  0  1 |  f = tableware
  3  5  0  0  2  0 19 |  g = headlamps

# Same settings as the previous run, but with the number of bins optimized for each attribute.
# Setting ignore class to true or false makes no difference.

=== Run information ===

Scheme:       weka.classifiers.rules.OneR -B 6
Relation:     Glass-weka.filters.unsupervised.attribute.Discretize-O-B8-M-1.0-Rfirst-last
Instances:    214
Attributes:   10
              RI Na Mg Al Si K Ca Ba Fe Type
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

Mg:
    '(-inf-0.56125]'      -> headlamps
    '(0.56125-1.1225]'    -> build wind non-float
    '(1.1225-1.68375]'    -> build wind non-float
    '(1.68375-2.245]'     -> containers
    '(2.245-2.80625]'     -> build wind non-float
    '(2.80625-3.3675]'    -> build wind non-float
    '(3.3675-3.92875]'    -> build wind float
    '(3.92875-inf)'       -> build wind non-float
(107/214 instances correct)

Time taken to build model: 0 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances          96               44.8598 %
Incorrectly Classified Instances       118               55.1402 %
Kappa statistic                          0.2231
Mean absolute error                      0.1575
Root mean squared error                  0.3969
Relative absolute error                 74.3975 %
Root relative squared error            122.2995 %
Total Number of Instances              214

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               0.9       0.507     0.463       0.9      0.612       0.697      build wind float
               0.211     0.138     0.457       0.211    0.288       0.536      build wind non-float
               0         0         0           0        0           0.5        vehic wind float
               0         0         0           0        0           ?          vehic wind non-float
               0         0.03      0           0        0           0.485      containers
               0         0.01      0           0        0           0.495      tableware
               0.586     0.097     0.486       0.586    0.531       0.744      headlamps
Weighted Avg.  0.449     0.23      0.38        0.449    0.374       0.609

=== Confusion Matrix ===

  a  b  c  d  e  f  g   <-- classified as
 63  7  0  0  0  0  0 |  a = build wind float
 50 16  0  0  1  1  8 |  b = build wind non-float
 17  0  0  0  0  0  0 |  c = vehic wind float
  0  0  0  0  0  0  0 |  d = vehic wind non-float
  0  5  0  0  0  1  7 |  e = containers
  0  4  0  0  2  0  3 |  f = tableware
  6  3  0  0  3  0 17 |  g = headlamps

# Original data without binning.

=== Run information ===

Scheme:       weka.classifiers.rules.OneR -B 6
Relation:     Glass
Instances:    214
Attributes:   10
              RI Na Mg Al Si K Ca Ba Fe Type
Test mode:    10-fold cross-validation

=== Classifier model (full training set) ===

Al:
    < 0.905                 -> build wind float
    < 1.1150000000000002    -> build wind non-float
    < 1.2149999999999999    -> build wind float
    < 1.2650000000000001    -> build wind non-float
    < 1.42                  -> build wind float
    < 1.815                 -> build wind non-float
    < 2.95                  -> headlamps
    >= 2.95                 -> containers
(135/214 instances correct)

Time taken to build model: 0.01 seconds

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         124               57.9439 %
Incorrectly Classified Instances        90               42.0561 %
Kappa statistic                          0.3946
Mean absolute error                      0.1202
Root mean squared error                  0.3466
Relative absolute error                 56.7438 %
Root relative squared error            106.8083 %
Total Number of Instances              214

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
               0.786     0.299     0.561       0.786    0.655       0.744      build wind float
               0.579     0.268     0.543       0.579    0.561       0.655      build wind non-float
               0         0         0           0        0           0.5        vehic wind float
               0         0         0           0        0           ?          vehic wind non-float
               0.231     0         1           0.231    0.375       0.615      containers
               0         0         0           0        0           0.5        tableware
               0.759     0.054     0.688       0.759    0.721       0.852      headlamps
Weighted Avg.  0.579     0.2       0.53        0.579    0.534       0.69

=== Confusion Matrix ===

  a  b  c  d  e  f  g   <-- classified as
 55 15  0  0  0  0  0 |  a = build wind float
 26 44  0  0  0  0  6 |  b = build wind non-float
 12  5  0  0  0  0  0 |  c = vehic wind float
  0  0  0  0  0  0  0 |  d = vehic wind non-float
  0  7  0  0  3  0  3 |  e = containers
  3  5  0  0  0  0  1 |  f = tableware
  2  5  0  0  0  0 22 |  g = headlamps
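
# How the two binning schemes choose their cut points (sketch)

The runs above compare equal-width and equal-frequency (equal-depth) binning. The Python sketch below illustrates how each scheme might pick its cut points. It is not WEKA's Discretize code: the function names are hypothetical, the equal-frequency version ignores ties, and the Al range of 0.29 to 3.5 is an assumption inferred from the printed equal-width cut points (each bin is 0.40125 wide).

```python
from typing import List


def equal_width_cuts(lo: float, hi: float, bins: int) -> List[float]:
    """Interior cut points that split [lo, hi] into `bins` intervals of equal width."""
    width = (hi - lo) / bins
    return [lo + width * i for i in range(1, bins)]


def equal_frequency_cuts(values: List[float], bins: int) -> List[float]:
    """Interior cut points that put roughly len(values) / bins values in each bin.

    Each cut sits midway between the two sorted neighbours that straddle a bin
    boundary. Simplification: tied values are not treated specially.
    """
    if len(values) < bins:
        raise ValueError("need at least as many values as bins")
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, bins):
        k = round(i * n / bins)  # how many values fall at or below this cut
        cuts.append((ordered[k - 1] + ordered[k]) / 2)
    return cuts


# Assumed Al range (inferred from the equal-width run with 8 bins).
al_cuts = equal_width_cuts(0.29, 3.5, 8)
print("equal-width Al cuts:", [round(c, 5) for c in al_cuts])

# Hypothetical skewed toy data, used only to contrast the two schemes.
toy = [1, 2, 3, 4, 10, 20, 30, 40]
toy_width_cuts = equal_width_cuts(min(toy), max(toy), 4)
toy_freq_cuts = equal_frequency_cuts(toy, 4)
print("toy equal-width cuts:    ", toy_width_cuts)
print("toy equal-frequency cuts:", toy_freq_cuts)
```

On skewed data, equal-width bins leave most values in the first interval while equal-frequency bins keep the counts balanced. That may be one reason the equal-frequency runs above score closer to the unbinned data.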
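
# Comparing the five runs (sketch)

The assignment asks for a comparison of the results. One way to put the five runs side by side is to recompute accuracy and Cohen's kappa from the confusion matrices printed above. The Python sketch below does this. The matrices are copied from the WEKA output; the function name and the run labels are my own, not WEKA's.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]


def accuracy_and_kappa(cm: Matrix) -> Tuple[float, float]:
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are actual classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(len(cm))) / total
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return observed, (observed - expected) / (1 - expected)


EMPTY = [0, 0, 0, 0, 0, 0, 0]  # class d (vehic wind non-float) has no instances

RUNS: Dict[str, Matrix] = {
    "original data": [
        [55, 15, 0, 0, 0, 0, 0], [26, 44, 0, 0, 0, 0, 6], [12, 5, 0, 0, 0, 0, 0],
        EMPTY, [0, 7, 0, 0, 3, 0, 3], [3, 5, 0, 0, 0, 0, 1], [2, 5, 0, 0, 0, 0, 22]],
    "equal frequency, 10 bins": [
        [56, 14, 0, 0, 0, 0, 0], [28, 39, 0, 0, 0, 0, 9], [11, 5, 0, 0, 0, 0, 1],
        EMPTY, [0, 6, 0, 0, 0, 0, 7], [3, 4, 0, 0, 0, 0, 2], [3, 0, 0, 0, 0, 0, 26]],
    "equal frequency, 8 bins": [
        [58, 12, 0, 0, 0, 0, 0], [28, 45, 0, 0, 0, 0, 3], [12, 5, 0, 0, 0, 0, 0],
        EMPTY, [0, 9, 0, 0, 0, 0, 4], [3, 5, 0, 0, 0, 0, 1], [3, 7, 0, 0, 0, 0, 19]],
    "equal width, 8 bins": [
        [64, 6, 0, 0, 0, 0, 0], [45, 27, 0, 0, 0, 0, 4], [12, 5, 0, 0, 0, 0, 0],
        EMPTY, [2, 8, 0, 0, 0, 0, 3], [3, 5, 0, 0, 0, 0, 1], [3, 5, 0, 0, 2, 0, 19]],
    "equal width, optimized bins": [
        [63, 7, 0, 0, 0, 0, 0], [50, 16, 0, 0, 1, 1, 8], [17, 0, 0, 0, 0, 0, 0],
        EMPTY, [0, 5, 0, 0, 0, 1, 7], [0, 4, 0, 0, 2, 0, 3], [6, 3, 0, 0, 3, 0, 17]],
}

results = {name: accuracy_and_kappa(cm) for name, cm in RUNS.items()}

# Print the runs from most to least accurate.
for name, (acc, kappa) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:<30} accuracy={acc:.4%}  kappa={kappa:.4f}")
```

By the cross-validation figures reported above, the original data does best (57.94 %), followed by equal frequency with 8 bins (57.01 %) and with 10 bins (56.54 %). Equal width with 8 bins reaches 51.40 %, and equal width with optimized bin counts reaches 44.86 %.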