Development Guide¶
This guide provides instructions for developers contributing to the ICOS Intelligence Layer project. It covers local setup, service building, and guidelines for extending components like training, inference, monitoring, and model management.
Project Structure¶
The repository includes the following folders. The goal of this structure is to define a scalable framework and an agnostic API, enabling efficient integration across different machine learning libraries. The models directory, in particular, supports parallel development for multiple backends in a consistent way. You can add any other library by following the structure defined in `{LIBRARY}`:
├── apis/ # OpenAPI specifications
├── bentos/ # BentoML bundle definitions
├── dataset/ # Datasets and preprocessing logic
├── env/ # Docker and environment setup files
├── notebooks/ # Jupyter notebooks for testing or demos
├── oasis/ # Core logic of the Intelligence Layer
│ ├── tai/ # Trustworthy AI: Explainability, monitoring
│ ├── analytics/ # Metrics computation and training utilities
│ ├── processing/ # Data pipelines and utilities
│ └── models/ # Model architecture and training scripts
│ ├── management/ # Model registry logic
│ └── {LIBRARY}/ # Deep learning models (e.g., PyTorch, TensorFlow)
│ ├── arch/ # Architectures like LSTM, RNN
│ ├── train.py
│ └── predict.py
├── offloading/ # Federation and remote training logic
├── test/ # Unit and integration tests
├── LICENSE, README.md, .gitignore, __init__.py
Files and Folder Structure¶
Current structure, based on the previous skeleton:
├── .gitignore
├── bentofile.yaml
├── LICENSE
├── README.md
├── apis
│ ├── openapi.md
│ └── openapi.yaml
├── bentos
│ ├── analytics-45oaqrdztss4sn3j_arm.bento
│ └── analytics-62wndsd7ecmzswar_latest_x86.bento
├── dataset
│ ├── cpu_dataset_500_samples.csv
│ └── node_3_utilisation_sample_dataset.csv
├── env
│   └── docker
│       ├── Dockerfile
│       └── analytics_docker.img
├── oasis
│ ├── analytics
│ │ ├── dataframes.py
│ │ ├── lstm_model.py
│ │ ├── metrics.py
│ │ └── model_metrics.py
│ ├── api_service.py
│ ├── api_service_configs.json
│ ├── api_train.py
│ ├── bentofile.yaml
│ ├── clean_dockers.sh
│ ├── configuration.yaml
│ ├── models
│ │ ├── arima
│ │ │ └── arima_compiler.py
│ │ ├── management
│ │ │ └── registry.py
│ │ ├── pytorch
│ │ │ └── pytorch_compiler.py
│ │ └── xgboost
│ │ └── xgboost_compiler.py
│ ├── processing
│ │ ├── process.py
│ │ └── utils.py
│ ├── requirements.txt
│ └── tai
│ ├── model_explainability.py
│ └── monitoring.py
├── offloading
│ ├── Dockerfile
│ ├── docker-compose.yml
│ └── requirements.txt
├── test
│   └── api
│       ├── misc_services.py
│       ├── model_inference.py
│       └── model_training.py
- `.gitignore`: Specifies files and folders to be ignored by Git. Subdirectories may include their own `.gitignore` as needed.
- `bentofile.yaml`: BentoML configuration file used for building and serving models.
- `LICENSE` & `README.md`: Standard project metadata and documentation.
- `apis/`: Contains OpenAPI specifications:
    - `openapi.yaml`: Main API definition (endpoints, methods, auth).
    - `openapi.md`: Markdown version or supplementary description.
- `bentos/`: Stores serialized BentoML bundles (ready-to-deploy models).
- `dataset/`: Contains sample datasets used for training and testing: `cpu_dataset_500_samples.csv`, `node_3_utilisation_sample_dataset.csv`.
- `env/`: Environment and Docker-related files:
    - `docker/`: Includes `Dockerfile` and pre-built images like `analytics_docker.img`.
- `oasis/`: Core logic and model-serving source code:
    - `analytics/`: Includes models, metrics computation, and dataframe helpers.
    - `api_service.py`: Main BentoML service with training/inference endpoints.
    - `api_train.py`: Contains model training orchestration logic.
    - `api_service_configs.json`: Stores configurable runtime parameters.
    - `bentofile.yaml`, `clean_dockers.sh`, `configuration.yaml`: Utilities for building and managing services.
    - `models/`: Modular structure for various ML libraries:
        - `arima/`, `pytorch/`, `xgboost/`: Contain custom model compiler scripts.
        - `management/`: Includes model registry logic (`registry.py`) for HuggingFace.
    - `processing/`: Data preparation and utilities (`process.py`, `utils.py`).
    - `tai/`: Trustworthy AI components (e.g., explainability via SHAP, drift monitoring via NannyML).
    - `requirements.txt`: Dependency list for the `oasis/` module.
- `offloading/`: dataClay support. Includes its own `Dockerfile`, `docker-compose.yml`, and `requirements.txt`.
- `test/`: Unit and integration tests for key components:
    - `api/`: Includes test cases for `misc_services.py`, `model_inference.py`, and `model_training.py`.
Create a Helm Chart¶
Helm Quick Start¶
Run the following command (the chart name used here is illustrative):
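```bash
helm create intelligence-layer  # replace with your own chart name
```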
This generates the basic chart structure:
- `.helmignore`: Ignore patterns
- `Chart.yaml`: Metadata
- `values.yaml`: Default values
- `templates/`: K8s manifests (deployment, service)
- `charts/`: Optional chart dependencies
Customize `values.yaml` and `templates/deployment.yaml` for your use case. Helm simplifies Kubernetes deployments.
Please refer to the Helm suite for intelligence-coordination in the ICOS Controller repository.
Developing New Features¶
Add New Models¶
For the univariate and multivariate LSTM, we used the models presented in [1], as well as the XGBoost model.
- Add the new architecture to `analytics/model.py`
- If the supporting library is new, add the executors in `oasis/models/{LIBRARY}/arch/` (a sketch follows this list)
- Add the `api_train_model.py` logic
- Update `api_train_model.py` and `api_service.py`
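As a minimal sketch of what a new executor might look like, assuming a PyTorch backend under `oasis/models/pytorch/arch/` (the file name, class, and hyperparameters below are hypothetical, not part of the repository):

```python
# Hypothetical sketch: oasis/models/pytorch/arch/gru.py
import torch
import torch.nn as nn

class GRUForecaster(nn.Module):
    """Minimal univariate GRU forecaster, mirroring the existing LSTM architectures."""

    def __init__(self, input_size: int = 1, hidden_size: int = 64, num_layers: int = 1):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_size) -> forecast of the next value
        out, _ = self.gru(x)
        return self.head(out[:, -1, :])
```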
Add Monitoring or Explainability¶
- Add SHAP, NannyML, or other utilities to `oasis/tai/` (a sketch follows this list)
- Update API definitions or CLI commands accordingly
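For instance, a SHAP-based explainability helper could look roughly like this (the function name and arguments are assumptions, not the existing `model_explainability.py` interface):

```python
# Hypothetical helper in the spirit of oasis/tai/model_explainability.py
import shap

def explain(predict_fn, X_background, X_sample):
    # Model-agnostic explainer: given an opaque predict function and
    # background data, SHAP selects a suitable algorithm automatically.
    explainer = shap.Explainer(predict_fn, X_background)
    return explainer(X_sample)
```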
Extend Preprocessing Pipelines¶
- Use `oasis/processing/process.py` for transformations
- Add helpers to `oasis/processing/utils.py` (see the example below)
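As an illustration, here is a time-series windowing helper of the kind that might live in `utils.py` (this exact function is hypothetical):

```python
# Hypothetical helper: split a univariate series into supervised
# (window, next value) pairs for forecasting models.
import numpy as np

def make_windows(series: np.ndarray, window: int):
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y
```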
Register Models¶
Use `oasis/models/management/registry.py` to manage HuggingFace integrations.
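A registry like this typically wraps the `huggingface_hub` client; a minimal sketch follows (the `push_model` wrapper is hypothetical, not the actual `registry.py` interface):

```python
# Hypothetical sketch of a HuggingFace-backed model push.
from huggingface_hub import HfApi

def push_model(local_path: str, repo_id: str, token: str):
    api = HfApi(token=token)
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=local_path.rsplit("/", 1)[-1],  # keep the file name
        repo_id=repo_id,
    )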
Running Tests¶
✅ Ensure all new features are covered by unit or integration tests under `test/` (a minimal example follows).
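For example, a self-contained pytest-style test might look like this (pytest and the inline `rmse` helper are assumptions for illustration; the project's metric implementations live in `oasis/analytics/metrics.py`):

```python
# Hypothetical, self-contained test; rmse is defined inline for the example.
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def test_rmse_is_zero_for_perfect_prediction():
    y = np.array([1.0, 2.0, 3.0])
    assert rmse(y, y) == 0.0
```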
Contribution Guidelines¶
- Use feature branch naming: `feature/your-feature`
- Include docstrings for all public functions and classes
- Add logs or screenshots in merge requests when helpful