Cube4Health - Package and Service

A set of algorithms and software packages has been specifically developed for processing drone (Module 1), health (Module 2), and climate (Module 3) data, enabling visualization and analysis in accordance with our technical–scientific concept, the HARMONIZE Instance. This infrastructure has been consolidated into the Cube4Health Python package, which integrates all modules required for data processing and publication through services such as GeoServer, BDC-STAC, and TiTiler. Building on this foundation, we will develop the Cube4Health Application Programming Interface (API) — a server-side extension that provides the same capabilities through a web service.

The Cube4Health package offers a complete environment for data processing and publication that can be used directly in health care systems. Its main limitation is the need for a scalable server capable of handling computationally intensive workloads. The forthcoming Cube4Health API will address this challenge by providing a lightweight, easy-to-install client application that delegates data-processing tasks to remote servers. While this approach substantially reduces local hardware requirements, it increases the demand for network bandwidth, because large datasets must be transferred between client and server.
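The bandwidth trade-off can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only: the function name `transfer_seconds`, the 5 GB dataset size, and the link speeds are assumptions chosen for the example, not figures from the project, and the estimate ignores protocol overhead and latency.

```python
def transfer_seconds(size_gb: float, bandwidth_mbps: float) -> float:
    """Estimate the time to move `size_gb` gigabytes over a `bandwidth_mbps` link."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    # Decimal units: 1 GB = 1000 MB, and 1 MB = 8 megabits.
    size_megabits = size_gb * 1000 * 8
    return size_megabits / bandwidth_mbps


# Illustrative values only, not measurements from the project.
for mbps in (10, 100, 1000):
    minutes = transfer_seconds(5.0, mbps) / 60
    print(f"5 GB at {mbps} Mbit/s: about {minutes:.1f} min")
```

Under these assumptions, the same upload that is negligible on a fast institutional link can take over an hour on a slow connection, which is the cost the client-server model trades for lower local hardware requirements.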

Cube4Health package

Cube4Health is a Python package, developed with Python 3.10, that consolidates previously independent algorithms into a unified, modular framework. It provides a set of scripts for processing and publishing health, climate, and drone imagery data, with a particular focus on the Brazilian context. The package follows the protocols defined within each module, so data from heterogeneous sources can be integrated and analyzed, provided they comply with these protocols. This structure supports efficient data handling and facilitates the extraction of essential information for public health surveillance, climate-change assessment, and other analyses of environmental variables that affect population well-being.

The figure below illustrates the architecture of Cube4Health, which is composed of modules dedicated to different stages of data processing, each with specific responsibilities within the system.

_images/c4h-package.png

Figure 3 - Architecture of the cube4health package.

Distributed as a Python package, Cube4Health acts as a wrapper for its core modules, including EDDPR, EHIPR, ECLIMPR, and EDPU, which are installed as dependencies. Its integrated functional layer manages these modules internally, enabling users to configure and process their own data locally. This modular architecture supports scalable and customizable pipelines while retaining the flexibility needed to integrate with external systems and custom data sources. The project is hosted on GitHub (figure below) at https://github.com/Harmonize-Brazil/Cube4Health, promoting open access, collaborative contributions, and scientific reproducibility.

_images/c4h-git.png

Figure 4 - Screenshot from the GitHub repository of the cube4health package.
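Because the core modules are installed as dependencies, a quick way to confirm that an environment is complete is to check whether each one can be imported. The following is a minimal sketch: the lowercase import names are hypothetical (the text names the modules but not their import identifiers), and `available_modules` is a helper written for this example, not part of the package.

```python
from importlib.util import find_spec

# Hypothetical import names: the text names the modules EDDPR, EHIPR,
# ECLIMPR and EDPU, but not the identifiers used to import them.
CORE_MODULES = ("eddpr", "ehipr", "eclimpr", "edpu")


def available_modules(names=CORE_MODULES):
    """Map each top-level module name to True if it can be imported here."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in available_modules().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

`find_spec` only locates a module without importing it, so the check stays fast even when a module has heavy dependencies.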

Cube4Health API

The Cube4Health API (planned) will extend the functionality of the Cube4Health package to a server-based architecture, fully integrated into the HARMONIZE project framework. Through a lightweight client application, users will be able to send requests containing raw data and receive metadata, JSON files, and processed images ready for publication (Figure below). This service-oriented model enables remote data processing and seamless integration with distributed environments. To ensure robustness, scalability, and automation, the Cube4Health API will be implemented using a workflow orchestrator, such as Apache Airflow, responsible for coordinating tasks, managing dependencies, and monitoring complex data-processing pipelines. This design will provide greater automation, reproducibility, and adaptability across diverse health and environmental data workflows.

_images/c4h-api.png

Figure 5 - Architecture of the cube4health API.
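The role of a workflow orchestrator, coordinating tasks and managing their dependencies, can be illustrated with Python's standard library alone. The sketch below is not Airflow code and not the planned API design: the task names and the `run` helper are hypothetical, and `graphlib.TopologicalSorter` stands in for the dependency resolution that an orchestrator performs.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key runs only after the tasks in its set.
PIPELINE = {
    "validate_upload": set(),
    "process_drone": {"validate_upload"},
    "process_health": {"validate_upload"},
    "process_climate": {"validate_upload"},
    "publish": {"process_drone", "process_health", "process_climate"},
}


def run(pipeline, handlers):
    """Call each task's handler in dependency order and return that order."""
    order = list(TopologicalSorter(pipeline).static_order())
    for task in order:
        handlers[task]()
    return order


# Stand-in handlers that only record that they were called.
log = []
handlers = {name: (lambda n=name: log.append(n)) for name in PIPELINE}
order = run(PIPELINE, handlers)
print(order)
```

An orchestrator such as Apache Airflow adds scheduling, retries, and monitoring on top of this ordering, which is why the text proposes it rather than hand-written sequencing.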

Cube4Health Documentation

The Cube4Health package includes complete technical documentation built with Sphinx, a framework for creating structured and extensible technical documentation in Python (Figure below). This documentation provides detailed information for developers and researchers, including installation guides and examples of data-processing workflows. Sphinx extracts docstrings automatically from the package source code, which keeps the documentation synchronized with the implementation. The generated files are exported in HTML format and published through GitHub Pages, allowing version control and continuous updates aligned with the development of the package.
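Docstring extraction in Sphinx is enabled through its `autodoc` extension, configured in the project's `conf.py`. The fragment below is a minimal sketch of such a configuration, not the project's actual file: the option values, the use of `napoleon` (which assumes Google- or NumPy-style docstrings), and the theme are assumptions.

```python
# conf.py (sketch): the values below are illustrative assumptions.
project = "Cube4Health"

extensions = [
    "sphinx.ext.autodoc",   # pull documentation from docstrings
    "sphinx.ext.napoleon",  # parse Google/NumPy style docstrings (assumed style)
    "sphinx.ext.viewcode",  # link documented objects to highlighted source
]

# Defaults applied to every autodoc directive.
autodoc_default_options = {
    "members": True,
    "undoc-members": False,
    "show-inheritance": True,
}

html_theme = "alabaster"  # Sphinx's default theme
```

With a configuration like this, an `.. automodule::` directive in a documentation source file pulls in the docstrings of the named module, and `sphinx-build -b html` renders the HTML pages that can then be served by GitHub Pages.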