Usage Guide

Setting Up Locally

1. Clone the Repository

git clone https://gitlab.com/icos/intelligence/intelligence-module.git
cd intelligence-module

2. Create a Python Environment

conda create -n icos-dev python=3.10
conda activate icos-dev
pip install -r src/requirements.txt

CLI Interaction via ICOS Shell

The Intelligence Layer supports CLI-based interactions through the Export Metrics API, accessible within the ICOS Shell. You can use two main commands:

  • train: Triggers model training with JSON-formatted input.
  • predict: Generates predictions based on the provided data.

These commands map to backend API endpoints detailed in the Backend Table.


📋 Using JupyterHub Inside the AI Support Container

  1. Access the container:

    docker exec -u root -it icos_intelligence_docker /bin/bash
    

  2. Create a user:

    passwd UC1  # Replace UC1 with your desired username
    

  3. Launch JupyterHub:

    jupyterhub -f /path/to/jupyterhub_config.py
    

  4. Log in via the browser using the created credentials.


πŸ” Trustworthy AI ModuleΒΆ

✅ Explainable AI (XAI)

Uses SHAP for model interpretability. SHAP figures are logged to MLflow as run artifacts:

plot_func(shap_data, show=False)
mlflow.log_figure(figure=fig, artifact_file=file_name)

Use consistent MLflow tags to group experiment artifacts.
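
As a minimal sketch of this flow (the synthetic data, stand-in model, and artifact path below are illustrative assumptions, not the Intelligence Layer's actual training code):

import matplotlib.pyplot as plt
import mlflow
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Tiny synthetic example standing in for a trained model and its feature sample.
X_sample = np.random.rand(100, 4)
y = X_sample[:, 0] * 2 + X_sample[:, 1]
model = RandomForestRegressor(n_estimators=20).fit(X_sample, y)

explainer = shap.Explainer(model, X_sample)
shap_values = explainer(X_sample)

with mlflow.start_run():
    # Draw the SHAP summary plot without displaying it, then log it as a run artifact.
    shap.summary_plot(shap_values, X_sample, show=False)
    mlflow.log_figure(figure=plt.gcf(), artifact_file="xai/shap_summary.png")
    # Consistent tags make it easy to group XAI artifacts across runs.
    mlflow.set_tags({"component": "xai", "method": "shap"})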

✅ Prediction Confidence

Each prediction includes confidence scores and intervals for assessing result reliability.
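
How the scores are produced is model-specific; as an illustrative sketch (not the module's actual implementation), a simple interval can be derived from residuals observed on held-out data:

import numpy as np

def prediction_interval(y_pred, residuals, z=1.96):
    # Approximate 95% interval around a point forecast, assuming roughly normal residuals.
    spread = z * np.std(residuals)
    return y_pred - spread, y_pred + spread

# Illustrative usage with one forecast and a few validation residuals.
lower, upper = prediction_interval(68.4, residuals=np.array([-2.1, 1.5, 0.8, -0.9, 2.3]))
print(f"Forecast 68.4, ~95% interval [{lower:.1f}, {upper:.1f}]")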

✅ Model Monitoring

Uses NannyML to monitor performance and detect drift. Retraining can be triggered automatically.
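
A minimal drift-detection sketch with NannyML follows; the synthetic data frames and column name are assumptions, and the Intelligence Layer wires the results into its own retraining logic:

import numpy as np
import pandas as pd
import nannyml as nml

rng = np.random.default_rng(0)

# Reference data from training time vs. recent production data with a shifted distribution.
reference_df = pd.DataFrame({"cpu_utilisation": rng.normal(60, 5, 2000)})
analysis_df = pd.DataFrame({"cpu_utilisation": rng.normal(75, 5, 2000)})

calc = nml.UnivariateDriftCalculator(column_names=["cpu_utilisation"], chunk_size=500)
calc.fit(reference_df)
results = calc.calculate(analysis_df)
print(results.to_df().head())   # per-chunk drift statistics and alerts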

✅ Federated Learning

Implements federated learning via Flower, enabling privacy-preserving distributed training where raw data stays local.
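
A bare-bones Flower client sketch (the toy NumPy "model" and dummy update stand in for the module's real PyTorch training loop; the server address is an assumption):

import flwr as fl
import numpy as np

# Toy local "model": a single weight vector; only weights ever leave the node, never raw data.
local_weights = [np.zeros(4)]

class IntelligenceClient(fl.client.NumPyClient):
    def get_parameters(self, config):
        return local_weights

    def fit(self, parameters, config):
        # Receive the global weights, "train" locally (dummy update), and return the result.
        updated = [w + 0.1 for w in parameters]
        return updated, 100, {}            # 100 = number of local examples used

    def evaluate(self, parameters, config):
        loss = float(np.abs(parameters[0]).sum())   # dummy loss on local data
        return loss, 100, {}

# Connect to a running Flower server.
fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=IntelligenceClient())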


📊 AI Analytics Module

To support scalable, efficient, and interpretable machine learning workflows, the Intelligence Layer offers a robust analytics module designed for both research and production use cases.

The Intelligence Layer provides a flexible and extensible AI analytics pipeline that supports:

  • Univariate & Multivariate Forecasting using LSTM models (a minimal sketch follows this list):

      • Supports forecasting for any number of metrics: a single metric yields univariate forecasting, while multiple metrics are handled as multivariate forecasting with a shared vanilla neural network architecture.

      • Users can define the number of steps ahead (x) for prediction, supporting both short- and long-term forecasts.

      • The main limitations are the volume and quality of the data and the need for models tuned to specific datasets.

  • Experiment Tracking with MLflow:

      • Automatically logs training metrics, loss curves, and other relevant parameters.

      • Helps users visually inspect and compare model performance across multiple runs.

  • Model Compression using quantization and distillation (detailed in the Model Compression section below).
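
To make the forecasting and tracking pieces concrete, here is a minimal sketch of training a small PyTorch LSTM forecaster while logging to MLflow. The model shape, synthetic data, and hyperparameters are illustrative assumptions, not the module's actual implementation.

import mlflow
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """Vanilla LSTM mapping a window of past values to the next value(s)."""
    def __init__(self, n_features=1, hidden_size=64, steps_ahead=1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, steps_ahead)

    def forward(self, x):                  # x: (batch, steps_back, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # last hidden state predicts the steps ahead

# Synthetic data: 12 past steps of a single metric -> 1 step ahead.
x = torch.randn(256, 12, 1)
y = torch.randn(256, 1)

model = LSTMForecaster()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

with mlflow.start_run():
    mlflow.log_params({"hidden_size": 64, "steps_back": 12, "steps_ahead": 1})
    for epoch in range(50):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
        mlflow.log_metric("train_loss", loss.item(), step=epoch)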

🔧 Model Compression

The Intelligence Layer supports PyTorch-based model compression, which can be configured directly in the training request to the /train endpoint. This includes:

{
  "pytorch_model_parameters": {
    "hidden_size": 64,
    "num_epochs": 50,
    "quantize": true,
    "distill": false
  }
}

🔹 Available Compression Options:

  • "quantize": true β†’ Enables dynamic post-training quantization (INT8 conversion using PyTorch). This reduces the model size by up to 3Γ—.

  • "distill": true β†’ Activates distillation using a teacher-student training setup, achieving model size reductions of up to 70Γ—.

  • Both techniques can be combined, yielding a smaller, faster INT8-optimized model with preserved accuracy.
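
For reference, dynamic post-training quantization corresponds to PyTorch's standard quantize_dynamic API; a minimal sketch with an illustrative stand-in model:

import torch
import torch.nn as nn

# Example full-precision model (illustrative); the same call applies to a trained forecaster.
model_fp32 = nn.Sequential(nn.Linear(12, 64), nn.ReLU(), nn.Linear(64, 1))

# Convert the weights of Linear/LSTM layers to INT8; activations are quantized dynamically at runtime.
model_int8 = torch.quantization.quantize_dynamic(
    model_fp32, {nn.Linear, nn.LSTM}, dtype=torch.qint8
)

print(model_int8)   # quantized layers appear as dynamically quantized modules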

📄 Output

  • Trained models (whether quantized or full precision) are stored in the Intelligence Layer model registry.

  • Additional parameters (e.g., hidden_size, num_epochs) can be adjusted to fine-tune training.

These compression techniques improve model efficiency for deployment while preserving transparency and accuracy, making them suitable for both edge and cloud environments.

All compression options are easily configured using the JSON payload shown above.

Local tests

# Example training script with argparse
import argparse
import json

# ModelTrain is the training class provided by the Intelligence Layer code base;
# import it from its location in your checkout.

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Model Trainer')
    parser.add_argument('--steps_back', default=12, type=int)
    parser.add_argument('--model_type', default='XGB', choices=['XGB', 'ARIMA'])
    parser.add_argument('--dataset_name', default='ORANGECPU')
    parser.add_argument('--test_size', default=0.2, type=float)
    # Default hyperparameters for both supported model types; pass a JSON string to override.
    parser.add_argument('--model_parameters', type=json.loads,
        default={"arima_model_parameters": {"p": 5, "d": 1, "q": 0},
                 "xgboost_model_parameters": {"n_estimators": 1000, "max_depth": 7, "eta": 0.1}})
    args = parser.parse_args()

    train = ModelTrain(args)
    results = train.initiate_train()

  • Inference via API:

      • Load trained models from the BentoML store via the service API:

# Example inference script
from inference import predict_cpu_utilisation, PredictFeatures

data = {
    "model_tag": "cpu_utilization_model_xgb:latest",
    "model_type": "XGB",
    "steps_back": 12,
    "input_series": [79.4, 67.9, 71.2, 46.5, 67.3, 65.7, 62.7, 70.5, 73.5, 69.9, 64.1, 61.0]
}
input_data = PredictFeatures(**data)
y_test_pred = predict_cpu_utilisation(input_data)
print(f"Predicted value: {y_test_pred}")

Usage

After running the API service, you can use tools such as curl, Postman, or the Swagger UI to interact with the endpoints.

Example

  • To request model training:

    curl -X 'POST' \
      'http://0.0.0.0:3000/train_metrics_utilisation' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{model_training_parameters}'
    

  • To request model inference:

    curl -X 'POST' \
      'http://0.0.0.0:3000/predict_metrics_utilisation' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{model_inference_parameters}'
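
The same requests can be issued programmatically; a minimal sketch using the requests library, where the payload mirrors the training example shown earlier and its fields are illustrative assumptions (consult the backend API documentation for the exact schema):

import requests

train_payload = {
    "pytorch_model_parameters": {
        "hidden_size": 64,
        "num_epochs": 50,
        "quantize": True,
        "distill": False
    }
}

resp = requests.post(
    "http://0.0.0.0:3000/train_metrics_utilisation",
    json=train_payload,
    headers={"accept": "application/json"},
    timeout=60,
)
print(resp.status_code, resp.json())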