Comprehensive, modular, and well-documented machine learning library for educational purposes on the Ethereum blockchain
Artemis is a comprehensive machine learning library written in Solidity, inspired by scikit-learn and modern ML frameworks. It is designed for demonstrating and learning about on-chain implementations of machine learning algorithms.
- Features Overview
- Installation Guide
- Quick Start
- API Documentation
- Examples
- Gas Optimization
- Limitations
- Contributing
- License
Artemis provides complete implementations of various machine learning algorithms with a modular architecture:
- Linear Regression - Linear regression with gradient descent optimization
- Logistic Regression - Binary classification with sigmoid activation
- K-Nearest Neighbors - Instance-based learning for classification and regression
- K-Means Clustering - Clustering with multiple initialization methods
- Neural Network - Multi-layer perceptron with backpropagation
- Dense Layers - Fully connected layers with bias
- Activation Layers - Non-linear transformations
- Optimizers - SGD and Adam optimizers
- Array Utils - Array and vector operations
- Matrix Operations - Basic matrix operations
- Statistics - Statistical functions and data analysis
- Data Preprocessing - Normalization and scaling
- Activation Functions - ReLU, Sigmoid, Softmax
- Loss Functions - MSE, MAE, Cross Entropy
- Optimizers - SGD, Adam with momentum
- Interfaces - Standardized interfaces for extensibility
- Foundry - Ethereum development toolkit
- Solidity ^0.8.0
```
Artemis/
├── src/
│   └── Artemis/
│       ├── activation/   # Activation functions (ReLU, Sigmoid, Softmax)
│       ├── examples/     # Complete usage examples
│       ├── interfaces/   # Core interfaces (IModel, IActivation, etc.)
│       ├── loss/         # Loss functions (MSE, MAE, CrossEntropy)
│       ├── math/         # Mathematical utilities
│       ├── models/       # ML models (supervised & unsupervised)
│       ├── neural/       # Neural network components
│       └── utils/        # Utility functions
├── test/                 # Test files
├── script/               # Deployment scripts
└── foundry.toml          # Foundry configuration
```
- Clone the repository or add it as a submodule:

```bash
git clone https://github.com/rezacrown/Artemis.git
cd Artemis
```

- Install dependencies:

```bash
forge install
```

- Build the project:

```bash
forge build
```

- Run tests:

```bash
forge test
```

Import the desired models:
```solidity
import "Artemis/src/Artemis/models/supervised/LinearRegression.sol";
import "Artemis/src/Artemis/neural/NeuralNetwork.sol";
import "Artemis/src/Artemis/utils/DataPreprocessor.sol";
```

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import "../Artemis/models/supervised/LinearRegression.sol";
import "../Artemis/utils/DataPreprocessor.sol";

contract HousePricePredictor {
    LinearRegression public model;

    constructor() {
        // Initialize the model with learning rate 0.01 and regularization 0.001
        model = new LinearRegression(1e16, 1e15);
    }

    function trainModel() public returns (bool success) {
        uint256[][] memory features = getTrainingFeatures();
        uint256[] memory labels = getTrainingLabels();

        // Preprocess the data
        (uint256[][] memory processedFeatures, uint256[] memory processedLabels, ) =
            DataPreprocessor.scaleFeatures(features, 0);

        // Train the model for 100 epochs
        (success, ) = model.train(processedFeatures, processedLabels, 100, 1e16);
        return success;
    }

    function predictPrice(uint256 area, uint256 bedrooms) public view returns (uint256 price) {
        uint256[] memory features = new uint256[](2);
        features[0] = area * 1e18;     // Convert to fixed-point
        features[1] = bedrooms * 1e18;
        return model.predict(features);
    }
}
```

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

import "../Artemis/neural/NeuralNetwork.sol";
import "../Artemis/neural/layers/DenseLayer.sol";
import "../Artemis/activation/Sigmoid.sol";
import "../Artemis/loss/CrossEntropyLoss.sol";
import "../Artemis/neural/optimizers/AdamOptimizer.sol";

contract XORClassifier {
    uint256 private constant ONE = 1e18;

    NeuralNetwork public network;
    Sigmoid public sigmoid;
    CrossEntropyLoss public lossFunction;
    AdamOptimizer public optimizer;

    constructor() {
        sigmoid = new Sigmoid();
        lossFunction = new CrossEntropyLoss();
        optimizer = new AdamOptimizer();
        network = new NeuralNetwork(2, 1, lossFunction, optimizer);

        // Build the network architecture
        network.addDenseLayer(4, true, sigmoid); // Hidden layer with 4 neurons
        network.addDenseLayer(1, true, sigmoid); // Output layer
    }

    function trainXOR() public returns (bool success) {
        uint256[][] memory features = getXORFeatures();
        uint256[] memory labels = getXORLabels();
        (success, ) = network.train(features, labels, 500, 1e17);
        return success;
    }

    function getXORFeatures() public pure returns (uint256[][] memory) {
        uint256[][] memory features = new uint256[][](4);

        features[0] = new uint256[](2); // [0,0] -> 0
        features[0][0] = 0;
        features[0][1] = 0;

        features[1] = new uint256[](2); // [0,1] -> 1
        features[1][0] = 0;
        features[1][1] = ONE;

        features[2] = new uint256[](2); // [1,0] -> 1
        features[2][0] = ONE;
        features[2][1] = 0;

        features[3] = new uint256[](2); // [1,1] -> 0
        features[3][0] = ONE;
        features[3][1] = ONE;

        return features;
    }

    function getXORLabels() public pure returns (uint256[] memory) {
        uint256[] memory labels = new uint256[](4);
        labels[0] = 0;   // [0,0] -> 0
        labels[1] = ONE; // [0,1] -> 1
        labels[2] = ONE; // [1,0] -> 1
        labels[3] = 0;   // [1,1] -> 0
        return labels;
    }
}
```

```solidity
interface IModel {
    function train(uint256[][] calldata features, uint256[] calldata labels, uint256 epochs, uint256 learningRate)
        external returns (bool success, uint256 finalLoss);

    function predict(uint256[] calldata features) external view returns (uint256 prediction);

    function evaluate(uint256[][] calldata features, uint256[] calldata labels)
        external view returns (uint256 accuracy, uint256 loss);

    function getTrainingStatus() external view returns (bool isTrained, uint256 epochs, uint256 loss);

    function getParameters() external view returns (uint256[] memory parameters);

    function setParameters(uint256[] calldata parameters) external returns (bool success);
}
```

```solidity
interface IActivation {
    function activate(uint256 input) external pure returns (uint256 output);

    function derivative(uint256 input) external pure returns (uint256 slope);

    function getActivationInfo() external pure
        returns (string memory name, string memory version, string memory activationType);
}
```

```solidity
interface ILossFunction {
    function calculateLoss(uint256 prediction, uint256 target) external pure returns (uint256 loss);

    function calculateGradient(uint256 prediction, uint256 target) external pure returns (int256 gradient);

    function getLossFunctionInfo() external pure
        returns (string memory name, string memory version, bool differentiable);
}
```

```solidity
interface IOptimizer {
    function updateParameters(uint256[] calldata parameters, int256[] calldata gradients, uint256 learningRate)
        external returns (uint256[] memory updatedParameters, uint256 updateMagnitude);

    function setLearningRate(uint256 newLearningRate) external returns (bool success);

    function getOptimizerInfo() external view
        returns (string memory name, string memory version, string memory optimizerType);
}
```

Use Cases: Price prediction, trend analysis, continuous value prediction
Parameters:
- learningRate: learning rate for gradient descent (recommended: 0.01-0.001)
- regularization: L2 regularization strength (recommended: 0.001-0.0001)
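For intuition, here is an off-chain Python sketch of the same training loop — batch gradient descent with an L2 penalty — using plain floats. Only the parameter names (learningRate, regularization) are carried over from the docs; the exact on-chain update rule is an assumption here.

```python
# Off-chain sketch of gradient-descent linear regression with L2
# regularization. Illustrative only; the on-chain model uses 1e18
# fixed-point arithmetic instead of floats.

def train_linear_regression(xs, ys, learning_rate=0.05, regularization=0.001, epochs=2000):
    """Fit y ~ w*x + b by batch gradient descent with an L2 penalty on w."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of mean squared error plus the L2 term on the weight
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n + regularization * w
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data: y = 2x exactly, so the fit should recover w ≈ 2, b ≈ 0
w, b = train_linear_regression([0, 1, 2, 3, 4], [0, 2, 4, 6, 8])
```

The small bias away from w = 2 comes from the regularization term, which is exactly the accuracy-versus-overfitting trade-off the parameter controls.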
Example:
```solidity
LinearRegression model = new LinearRegression(1e16, 1e15); // 0.01 learning rate, 0.001 regularization
```

Use Cases: Binary classification, probability estimation
Parameters:
- learningRate: learning rate for the optimization
- regularization: L2 regularization to prevent overfitting
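The sigmoid decision rule behind logistic regression can be sketched off-chain as follows; the float implementation and the toy data are illustrative assumptions, not the library's code.

```python
import math

# Off-chain sketch of binary classification with logistic regression:
# P(y=1|x) = sigmoid(w*x + b), fitted by gradient descent with L2 on w.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, learning_rate=0.1, regularization=0.001, epochs=500):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the cross-entropy loss plus the L2 term
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n + regularization * w
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Linearly separable toy data: negatives below 0, positives above
w, b = train_logistic([-3, -2, -1, 1, 2, 3], [0, 0, 0, 1, 1, 1])
```

After training, `sigmoid(w * x + b)` is close to 1 for positive inputs and close to 0 for negative ones; the regularization term keeps w from growing without bound on separable data.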
Use Cases: Classification and regression, pattern recognition
Configuration:
- kValue: number of neighbors (recommended: 3-10)
- distanceMetric: Euclidean, Manhattan, or Minkowski
- weightingStrategy: uniform or distance-based weighting
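The configuration above maps onto a very small algorithm. Here is an off-chain Python sketch with Euclidean distance and uniform weighting (one combination of the options listed; the dataset is made up for illustration):

```python
from collections import Counter

# Off-chain sketch of k-nearest-neighbors classification:
# find the k closest training points, then take a majority vote.

def knn_predict(train_X, train_y, query, k=3):
    # Squared Euclidean distance is enough for ranking neighbors
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Two obvious groups of points
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = ["low", "low", "low", "high", "high", "high"]
```

Because the model stores the whole training set and scans it at prediction time, on-chain KNN inference cost grows linearly with the dataset, which matches the O(n×features) entry in the gas table below.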
Use Cases: Customer segmentation, data grouping, anomaly detection
Parameters:
- numClusters: number of clusters (K)
- maxIterations: maximum number of training iterations
- initializationMethod: Random, K-means++, or Manual
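The iteration that `maxIterations` bounds is Lloyd's algorithm: alternate between assigning points to their nearest centroid and moving each centroid to its cluster mean. An off-chain Python sketch using the "Manual" initialization option (centroids passed in explicitly; data invented for illustration):

```python
# Off-chain sketch of K-Means (Lloyd's algorithm) with manual initialization.

def kmeans(points, centroids, max_iterations=10):
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            [sum(dim) / len(c) for dim in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [[1, 1], [1, 2], [2, 1], [9, 9], [9, 10], [10, 9]]
centroids, clusters = kmeans(points, centroids=[[0, 0], [10, 10]])
```

Each pass over all points costs O(k×n×features), which is why the gas table below multiplies that by the iteration count.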
Use Cases: Complex pattern recognition, non-linear relationships
Architecture:
- Multiple dense layers with activation functions
- Support for various optimizers (SGD, Adam)
- Configurable loss functions
Dataset: Sample house data with features (area, bedrooms) and target (price)
Features:
- Area in square feet
- Number of bedrooms
Target: House price in USD
Expected Results:
- R-squared > 0.85 on the training dataset
- Accurate price predictions for houses with similar features
Code:

```solidity
// See the full file at: src/Artemis/examples/TrainLinearRegression.sol
contract TrainLinearRegression {
    function runDemo() public returns (string memory results) {
        // Complete training, evaluation, and prediction workflow
        return "Demo completed successfully";
    }
}
```

Dataset: XOR gate truth table
Features: [input1, input2] (0 or 1)
Target: XOR output (0 or 1)
Architecture:
- Input Layer: 2 nodes
- Hidden Layer: 4 nodes with Sigmoid activation
- Output Layer: 1 node dengan Sigmoid activation
Expected Results:
- Accuracy > 95% on the XOR patterns
- Correct classification for all input combinations
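As a sanity check on why this architecture suffices: XOR is not linearly separable, but a sigmoid network with a hidden layer can represent it. The off-chain Python sketch below uses hand-picked, purely illustrative weights (and only two hidden units) to show the four patterns being separated; the trained on-chain network would arrive at different weights.

```python
import math

# Hand-set sigmoid network computing XOR: hidden units approximate
# OR and NAND, and the output unit ANDs them together.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xor_net(x1, x2):
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # ~ OR gate
    h2 = sigmoid(-20 * x1 - 20 * x2 + 30)   # ~ NAND gate
    return sigmoid(20 * h1 + 20 * h2 - 30)  # ~ AND of the two

outputs = [round(xor_net(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
# outputs == [0, 1, 1, 0], the XOR truth table
```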
- Batch Processing
```solidity
// Use batch operations to reduce the transaction count
function trainBatch(uint256[][] memory features, uint256[] memory labels, uint256 batchSize) public {
    for (uint256 i = 0; i < features.length; i += batchSize) {
        // Process one batch
    }
}
```

- Storage Optimization
  - Use `memory` rather than `storage` where possible
  - Pack multiple variables into a single storage slot
  - Use fixed-point arithmetic for efficiency
- Model Complexity Management
- Choose a model that matches the complexity of the problem
- Use regularization to prevent overfitting
- Consider the trade-off between accuracy and gas costs
- Inference Optimization
- Cache predictions when possible
- Use simplified models in production
- Implement model compression techniques
| Operation | Gas Cost | Complexity |
|---|---|---|
| Linear Regression Training (100 epochs) | ~500K gas | O(n×features×epochs) |
| Neural Network Forward Pass | ~50K gas | O(layers×neurons) |
| KNN Prediction | ~100K gas | O(n×features) |
| K-Means Clustering (10 iterations) | ~1M gas | O(k×n×features×iterations) |
- Computational Limits
  - Gas limits constrain the complexity of computations
  - Large datasets are impractical for on-chain processing
  - Iterative algorithms require careful gas management
- Numerical Precision
  - Fixed-point arithmetic with 1e18 precision
  - Potential for overflow/underflow
  - Limited numerical stability for complex operations
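The 1e18 convention can be illustrated off-chain: a real number r is stored as the integer round(r × 1e18), so multiplication and division must rescale, which is exactly where overflow and precision loss creep in. A minimal Python sketch (Python's unbounded ints stand in for uint256):

```python
# 1e18 fixed-point arithmetic: the value 1.0 is stored as 10**18.

ONE = 10**18

def fp_mul(a, b):
    """(a/1e18) * (b/1e18), rescaled back to 1e18 fixed point."""
    return a * b // ONE  # the intermediate a*b is what can overflow 256 bits on-chain

def fp_div(a, b):
    """(a/1e18) / (b/1e18), result in 1e18 fixed point."""
    return a * ONE // b  # scale up first, or the quotient loses all precision

half = ONE // 2
```

Note that `fp_mul` truncates toward zero, so repeated operations accumulate rounding error — one source of the "limited numerical stability" mentioned above.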
- Storage Costs
  - Model parameters require storage space
  - Training data is impractical to store on-chain
  - Consider off-chain storage with on-chain verification
- Recommended Use Cases
  - ✅ Small to medium datasets
  - ✅ Simple to moderately complex models
  - ✅ Batch processing with reasonable sizes
  - ✅ Educational and demonstration purposes
- Scenarios to Avoid
  - ❌ Large-scale deep learning
  - ❌ Real-time continuous training
  - ❌ Very large datasets
  - ❌ High-frequency model updates
| Problem Type | Recommended Model | Use Case | Gas Efficiency |
|---|---|---|---|
| Regression | Linear Regression | Price prediction, trend analysis | ⭐⭐⭐⭐⭐ |
| Binary Classification | Logistic Regression | Yes/No classification | ⭐⭐⭐⭐ |
| Multi-class Classification | K-Nearest Neighbors | Pattern recognition | ⭐⭐⭐ |
| Clustering | K-Means | Customer segmentation | ⭐⭐ |
| Complex Patterns | Neural Network | Non-linear relationships | ⭐ |
Issue: Model loss does not decrease during training
Solutions:
- Lower the learning rate (0.01 → 0.001)
- Normalize features using DataPreprocessor
- Check data quality and remove outliers
- Increase the number of training epochs
Issue: High training accuracy but low test accuracy
Solutions:
- Use regularization (L2 for Linear/Logistic Regression)
- Reduce model complexity
- Use more training data
- Implement early stopping
Issue: Training or inference is too expensive
Solutions:
- Use batch processing for large datasets
- Optimize the model architecture (fewer layers/neurons)
- Cache predictions for identical inputs
- Consider off-chain computation with on-chain verification
Issue: Overflow/underflow errors
Solutions:
- Use fixed-point arithmetic with appropriate precision
- Normalize input data to the range [0, 1]
- Implement gradient clipping for neural networks
- Use numerically stable activation functions (ReLU over Sigmoid)
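One common way to implement the gradient-clipping suggestion above is clipping by global L2 norm, sketched here off-chain in Python (illustrative, not the library's API):

```python
import math

# Gradient clipping by global norm: if the gradient vector's L2 norm
# exceeds max_norm, scale the whole vector down so its norm equals max_norm.

def clip_gradients(grads, max_norm):
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return grads  # already within bounds; leave untouched
    scale = max_norm / norm
    return [g * scale for g in grads]

clipped = clip_gradients([3.0, 4.0], max_norm=1.0)  # norm 5 is scaled down to 1
```

Scaling the whole vector (rather than clipping each component independently) preserves the gradient's direction, only limiting its magnitude.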
- Data Preprocessing:

```solidity
// Always normalize data before training
(uint256[][] memory scaledFeatures, uint256[][] memory scalingParams) =
    DataPreprocessor.scaleFeatures(rawFeatures, 0); // 0 = min-max scaling
```
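Min-max scaling (mode `0` above, per these docs) maps each feature column to [0, 1], which in the 1e18 fixed-point convention means [0, 1e18]. An off-chain Python sketch with invented sample data:

```python
# Min-max scaling of one feature column into 1e18 fixed point:
# value v maps to (v - min) / (max - min), stored as an integer times 1e18.

ONE = 10**18

def min_max_scale(column):
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0 for _ in column]  # constant column: no spread to scale
    return [(v - lo) * ONE // (hi - lo) for v in column]

areas = [800, 1200, 1600, 2000]  # e.g. house areas in square feet
scaled = min_max_scale(areas)    # 800 -> 0, 2000 -> 1e18
```

Scaling before multiplication in fixed point also helps avoid the overflow issues noted in the Limitations section, since intermediate products stay near 1e36 instead of growing with raw feature magnitudes.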
- Hyperparameter Tuning:
  - Learning Rate: start with 0.01 and adjust based on convergence
  - Regularization: use 0.001 to prevent overfitting
  - Batch Size: use batch processing for large datasets
- Model Architecture:
  - Start with a simple model and increase complexity gradually
  - Use activation functions appropriate for the problem type
  - Monitor training progress via events
We welcome contributions from the community! Guidelines for contributing:
- Use Solidity ^0.8.0
- Follow the Solidity Style Guide
- Use NatSpec comments for all public functions
- Write comprehensive tests for all new features
```bash
# Run all tests
forge test

# Run tests with gas reports
forge test --gas-report

# Run a specific test file
forge test --match-path test/LinearRegression.t.sol
```

- Update README.md for new features
- Add examples for demonstration
- Document gas costs and performance characteristics
- Include use cases and best practices
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Artemis is licensed under the MIT License - see the LICENSE file for full details.
MIT License
Copyright (c) 2024 Rizky Reza
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
- Inspired by scikit-learn and modern machine learning frameworks
- Built with the Foundry toolkit
- Uses fixed-point arithmetic for numerical stability
- Designed for educational purposes and blockchain experimentation
Rizky Reza - Bringing Machine Learning to the Blockchain
