Comparing Evolutionary-Inspired Algorithms and Neural Networks on Breast Cancer Prediction

Kevin Lopez Sepulveda

📋 Agenda

  • Introduction to the problem
  • Project Goal
  • Random Forest
  • MLP Neural Network
  • Methods
  • Results
  • Conclusion

🩺 Why Predicting Breast Cancer Matters

  • Early detection increases survival rates
  • Machine learning can support diagnostic tools
  • Automation helps reduce human error

🌲 How Random Forest Works

  • An ensemble of decision trees
  • Trains multiple trees on bootstrapped samples, considering a random subset of features at each split
  • Final prediction is based on majority voting
  • Helps reduce overfitting
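
The bullets above can be sketched as bagging by hand. This is a minimal illustration, not the code behind the slides: it assumes scikit-learn's bundled copy of the dataset, an 80/20 split, and 25 trees, and the helper names `fit_bagged_trees` and `predict_majority` are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def fit_bagged_trees(X, y, n_trees=25, seed=0):
    """Fit each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap row indices
        tree = DecisionTreeClassifier(
            max_features="sqrt",  # assumption: random feature subset per split
            random_state=int(rng.integers(0, 2**31 - 1)),
        )
        trees.append(tree.fit(X[idx], y[idx]))
    return trees


def predict_majority(trees, X):
    """Combine per-tree predictions by majority vote (binary labels 0/1)."""
    votes = np.stack([tree.predict(X) for tree in trees])  # (n_trees, n_samples)
    return (votes.mean(axis=0) >= 0.5).astype(int)


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
trees = fit_bagged_trees(X_train, y_train)
y_pred = predict_majority(trees, X_test)
print(f"Bagged-tree accuracy: {(y_pred == y_test).mean():.4f}")
```

An odd number of trees avoids tied votes. Because each tree sees a different resample, their individual errors tend to differ, which is why the vote is usually more stable than any single tree.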

🧠 How MLP Neural Network Works

  • MLP = Multi-layer Perceptron
  • Consists of input layer, hidden layers, output layer
  • Learns via backpropagation and gradient descent
  • Effective at capturing complex patterns
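
To make the layers and the learning rule concrete, here is a rough NumPy sketch of a one-hidden-layer MLP trained with backpropagation and full-batch gradient descent. Everything in it is an assumption chosen for illustration (the XOR toy data, 8 tanh hidden units, learning rate 0.5, and the hypothetical function name `train_mlp`); it is not the network used for the results in this deck.

```python
import numpy as np


def train_mlp(X, y, hidden=8, lr=0.5, steps=2000, seed=0):
    """Train a one-hidden-layer MLP with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, size=(X.shape[1], hidden))
    b1 = np.zeros((1, hidden))
    W2 = rng.normal(0.0, 1.0, size=(hidden, 1))
    b2 = np.zeros((1, 1))
    eps = 1e-12  # avoids log(0) in the loss
    losses = []
    for _ in range(steps):
        # Forward pass: input -> tanh hidden layer -> sigmoid output.
        h = np.tanh(X @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        # Binary cross-entropy loss, averaged over the batch.
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        losses.append(float(loss))
        # Backward pass: chain rule from the loss back to each parameter.
        dz2 = (p - y) / len(X)             # gradient at the output pre-activation
        dW2 = h.T @ dz2
        db2 = dz2.sum(axis=0, keepdims=True)
        dz1 = (dz2 @ W2.T) * (1.0 - h**2)  # tanh'(z) = 1 - tanh(z)^2
        dW1 = X.T @ dz1
        db1 = dz1.sum(axis=0, keepdims=True)
        # Gradient descent update.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return (W1, b1, W2, b2), losses


# XOR: a pattern that no single linear boundary can separate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
params, losses = train_mlp(X, y)
print(f"loss at step 0: {losses[0]:.4f}, loss at last step: {losses[-1]:.4f}")
```

The hidden layer is what lets the model bend its decision boundary; without it the network reduces to logistic regression.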

🔧 Methods

  • Dataset: Breast Cancer Wisconsin Diagnostic Data
  • Preprocessing: normalization, handling missing values
  • Algorithms: Random Forest, MLP
  • Evaluation metrics: Accuracy, Precision, Recall, F1 Score
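
One way the steps above might be wired together with scikit-learn is sketched below. The specifics are assumptions, not taken from the slides: scikit-learn's bundled copy of the dataset (which has no missing values, so the imputer is effectively a no-op), an 80/20 split with `random_state=42`, `StandardScaler` standing in for the normalization step, and the hyperparameters shown. Scores from this sketch may differ from the ones reported in this deck.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's copy of the Wisconsin Diagnostic dataset.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Preprocessing lives inside each pipeline so it is fit on training data only.
models = {
    "Random Forest": make_pipeline(
        SimpleImputer(strategy="median"),
        RandomForestClassifier(n_estimators=100, random_state=42),
    ),
    "Neural Network": make_pipeline(
        SimpleImputer(strategy="median"),
        StandardScaler(),  # MLPs are sensitive to feature scale
        MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=42),
    ),
}

accuracies = {}
for name, model in models.items():
    print(f"Training {name} Classifier...")
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    accuracies[name] = accuracy_score(y_test, y_pred)
    print(f"--- {name} Classification Report ---")
    print(classification_report(y_test, y_pred))

print("--- Accuracy Summary ---")
for name, acc in accuracies.items():
    print(f"{name} Accuracy: {acc:.4f}")
```

Tree-based models do not need scaled inputs, so the scaler is applied only in the MLP pipeline.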

🐍 Running Models and Showing Results

  • Python output: the console log below comes from running both models and comparing their performance on the breast cancer dataset.

Training Random Forest Classifier (baseline for evolutionary approach)...

Training Neural Network Classifier...

--- Random Forest Classification Report ---
              precision    recall  f1-score   support

           0       0.98      0.93      0.95        43
           1       0.96      0.99      0.97        71

    accuracy                           0.96       114
   macro avg       0.97      0.96      0.96       114
weighted avg       0.97      0.96      0.96       114


--- Neural Network Classification Report ---
              precision    recall  f1-score   support

           0       0.98      0.98      0.98        43
           1       0.99      0.99      0.99        71

    accuracy                           0.98       114
   macro avg       0.98      0.98      0.98       114
weighted avg       0.98      0.98      0.98       114


--- Accuracy Summary ---
Random Forest Accuracy: 0.9649
MLP Accuracy:          0.9825

📊 Results

  • Precision: the fraction of predicted positive cases that are actually positive
  • Recall: the fraction of actual positive cases that are correctly identified
  • F1-Score: the harmonic mean of precision and recall
  • Accuracy: the overall fraction of correct predictions
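
As a worked example of these definitions, the sketch below computes all four metrics from confusion-matrix counts. The counts are inferred, not reported: treating class 1 as the positive class, TP=70, FP=3, FN=1, TN=40 are the values that reproduce the rounded figures in the Random Forest report and its 0.9649 accuracy. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)  # of predicted positives, how many are right
    recall = tp / (tp + fn)     # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Assumed counts, inferred from the Random Forest report (class 1 as positive).
metrics = binary_metrics(tp=70, fp=3, fn=1, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# precision: 0.9589
# recall: 0.9859
# f1: 0.9722
# accuracy: 0.9649
```

Which class counts as "positive" changes precision and recall but not accuracy, so it is worth stating the convention whenever these numbers are quoted.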

✅ Conclusion

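  • Random Forest is the more interpretable model, for example through its feature importances
  • MLP reached the higher test accuracy (0.9825 vs. 0.9649), consistent with it capturing non-linear relationships better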
  • Combining both might improve overall performance
  • Future work: include other algorithms, tune hyperparameters

🙏 Thank You

Kevin Lopez Sepulveda
klopezs@bu.edu