Cross-Validation from Scratch
Difficulty: Medium | Tags: Cross-Validation, Model Evaluation, Machine Learning, Performance Metrics, Data Splitting
Problem:
Implement k-fold cross-validation from scratch and understand why it matters for model evaluation: a single train-test split yields one noisy estimate of performance, while k-fold averages k estimates over different held-out subsets. Compare the cross-validation results with a simple train-test split and visualize performance across the folds.
Examples:
Input: X, y = load_diabetes(return_X_y=True)
cv_results = cross_validation(X, y, LinearRegression())
Output: Mean MSE: 3000.25 ± 500.12
Mean R²: 0.4521 ± 0.0892
Cross-validation on diabetes dataset showing mean and standard deviation of metrics
Input: X = np.random.randn(1000, 10)
y = np.random.randn(1000)
folds = create_folds(X, y, k=10)
Output: 10 folds created
Each fold size ≈ 100
Creating 10 folds with random data, demonstrating equal fold sizes
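One way to sketch `create_folds` for the second example: shuffle the row indices, then split them into k groups. The function name and signature follow the example above; the use of `np.array_split` (which tolerates sizes not divisible by k) is an implementation choice, not mandated by the problem.

```python
import numpy as np

def create_folds(X, y, k=5, seed=0):
    """Shuffle indices, then split them into k nearly equal folds.

    Returns a list of k index arrays; np.array_split handles the
    case where len(X) is not divisible by k.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))  # shuffle before folding
    return np.array_split(idx, k)

X = np.random.randn(1000, 10)
y = np.random.randn(1000)
folds = create_folds(X, y, k=10)
print(len(folds))                 # 10 folds created
print([len(f) for f in folds])    # each fold size = 100
```

Returning index arrays rather than data copies keeps memory usage flat and makes it easy to build the complementary training set for each fold.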
Constraints:
- Must shuffle data before creating folds
- Must handle non-divisible fold sizes
- Must calculate both MSE and R² metrics
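A minimal sketch satisfying all three constraints, using the `cross_validation(X, y, model)` call shape from the first example. The helper names, the `seed` parameter, and the returned dictionary keys are choices of this sketch; MSE and R² are computed by hand rather than imported, in keeping with the from-scratch spirit.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

def create_folds(X, y, k=5, seed=0):
    """Shuffle indices, then split into k folds (sizes may differ by 1)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))   # constraint: shuffle before folding
    return np.array_split(idx, k)   # constraint: handle non-divisible sizes

def cross_validation(X, y, model, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, repeat k times."""
    folds = create_folds(X, y, k, seed)
    mses, r2s = [], []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model.fit(X[train_idx], y[train_idx])
        resid = y[test_idx] - model.predict(X[test_idx])
        mse = np.mean(resid ** 2)                      # constraint: MSE
        ss_tot = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
        r2 = 1.0 - np.sum(resid ** 2) / ss_tot         # constraint: R²
        mses.append(mse)
        r2s.append(r2)
    return {"mse_mean": np.mean(mses), "mse_std": np.std(mses),
            "r2_mean": np.mean(r2s), "r2_std": np.std(r2s)}

X, y = load_diabetes(return_X_y=True)
res = cross_validation(X, y, LinearRegression())
print(f"Mean MSE: {res['mse_mean']:.2f} ± {res['mse_std']:.2f}")
print(f"Mean R²: {res['r2_mean']:.4f} ± {res['r2_std']:.4f}")
```

Because each fold's metrics come from data the model never saw during that fit, the reported mean ± standard deviation reflects both expected performance and its variability, which a single train-test split cannot show.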