System Architecture
The HMC FAIR Data Dashboard project is divided into two components, which are maintained in separate sub-repositories:
- The HMC Toolbox for Data Mining collects relevant information about open and FAIR data published by a research organization. It covers the harvesting of literature, the identification of linked datasets, the enrichment of metadata, the determination of F-UJI scores, and the storage of the data in a database.
- The HMC FAIR Data Dashboard is a web interface that lets various target groups interactively explore the collected data.
Best Practices for HMC FAIR Data Dashboard
We adhere to modern software development best practices. These include:
- Version Control (GitLab): We utilize version control systems to track changes, collaborate effectively, and manage code history.
- Separation of Concerns: By dividing functionality into distinct modules, we enhance maintainability and reduce code repetition.
- Partial Views and Utility Functions: We create reusable partial views and utility functions, promoting modularity and code efficiency.
- Single Source of Truth Data: Consistency is maintained by relying on a single authoritative data source.
- Data Reuse for Performance: Whenever possible, we reuse existing data to optimize performance.
- Efficient Queries: We craft queries that balance performance, speed, and resource usage.
- YAML Files for Translations: Using YAML files simplifies translation management and content updates.
- Responsive UI Design: Our user interfaces adapt seamlessly to various devices.
- Using SVG Files: SVG (Scalable Vector Graphics) files are employed wherever possible for scalable, resolution-independent graphics.
- Centralized Environment and Configuration Variables: A single source for environment variables streamlines system setup and configuration.
- Semantic Directory Naming: Well-chosen directory names enhance code maintenance.
- Containerization: Bundling the necessary packages in containers isolates the application from the host operating system and the user's environment.
- Pre-Commit Hooks: We validate code quality using pre-commit hooks.
- Modern open software best practices: We follow current best practices from open software initiatives such as REUSE and the Open Source Security Foundation (OpenSSF).
Directory structures
- about.png
- Banner.png
- hmc_Banner.jpg
- vertical_FAIR.png
- custom-script.js
- favicon.ico
- favicon.png
- style.css
- Apache-2.0.txt
- CC-BY-4.0.txt
- CC0-1.0.txt
- LicenseRef-helmholtz.txt
- LicenseRef-x.txt
- MIT.txt
- OFL-1.1-RFN.txt
- Dockerfile
- schema.sql
- about.py
- assess_my_data.py
- data_in_helmholtz.py
- fair_by_repository.py
- home.py
- about.de.yml
- about.en.yml
- data.de.yml
- data.en.yml
- faq.de.yml
- faq.en.yml
- home.de.yml
- home.en.yml
- my_data.de.yml
- my_data.en.yml
- nav.de.yml
- nav.en.yml
- repo.de.yml
- repo.en.yml
- .env.example
- .gitignore
- .gitlab-ci.yml
- .pre-commit-config.yaml
- app.py
- CHANGELOG.md
- CITATION.cff
- codemeta.json
- dashboard_logo.png
- docker-compose.yml
- Dockerfile
- gunicorn.conf.py
- INSTALLATION.md
- LICENSE
- pyproject.toml
- README.md
- REUSE.toml
- run.sh
Note: some *.licence files may also exist in the repository; they contain the copyright and licence type information.
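The translation files in the listing follow a `<namespace>.<locale>.yml` naming pattern (for example `home.de.yml` and `home.en.yml`). As a minimal sketch, assuming that every namespace is meant to exist in both German and English, the check below groups file names by namespace and reports missing locales. The function names `locales_by_namespace` and `missing_translations` are hypothetical helpers for illustration and are not part of the repository.

```python
from collections import defaultdict

# File names copied from the translation files in the directory listing.
TRANSLATION_FILES = [
    "about.de.yml", "about.en.yml",
    "data.de.yml", "data.en.yml",
    "faq.de.yml", "faq.en.yml",
    "home.de.yml", "home.en.yml",
    "my_data.de.yml", "my_data.en.yml",
    "nav.de.yml", "nav.en.yml",
    "repo.de.yml", "repo.en.yml",
]


def locales_by_namespace(filenames):
    """Group '<namespace>.<locale>.yml' names into {namespace: {locales}}."""
    grouped = defaultdict(set)
    for name in filenames:
        parts = name.split(".")
        # Skip anything that does not match the three-part pattern.
        if len(parts) != 3 or parts[2] != "yml":
            continue
        namespace, locale, _ = parts
        grouped[namespace].add(locale)
    return dict(grouped)


def missing_translations(filenames, required=("de", "en")):
    """Return {namespace: sorted missing locales} for incomplete namespaces."""
    missing = {}
    for namespace, locales in locales_by_namespace(filenames).items():
        gaps = sorted(set(required) - locales)
        if gaps:
            missing[namespace] = gaps
    return missing


if __name__ == "__main__":
    # An empty dict means every namespace has all required locales.
    print(missing_translations(TRANSLATION_FILES))
```

A check of this kind could run in a pre-commit hook or CI job so that a page added in one language is not forgotten in the other.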
Environment variables
The following is an example .env file.
.env
DB_CONTAINER_NAME_OR_ADDRESS=mariadb
DB_NAME=library_db_hmc
DB_USER=library_db_hmc
DB_PASSWORD=1234
SSL_CRT_FILE=ssl/local.crt
SSL_KEY_FILE=ssl/local.key
PORT=8050
MAX_YEAR=2024
OLD_MAX_YEAR=2022
FUJI_HOST=fuji_local_server
FUJI_PORT=80
FUJI_PROTOCOL=http
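The values in a file like this could be read into a typed settings object at start-up. The following is a minimal sketch, not the dashboard's actual code: the `Settings` class and `load_settings` function are hypothetical, and the sketch assumes the variables have already been placed in the process environment (for instance by Docker Compose or by python-dotenv's `load_dotenv()`). The `fuji_base_url` property additionally assumes that `FUJI_HOST` is a bare host name rather than a full URL.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env."""
    db_host: str
    db_name: str
    db_user: str
    db_password: str
    port: int
    max_year: int
    old_max_year: int
    fuji_protocol: str
    fuji_host: str
    fuji_port: int

    @property
    def fuji_base_url(self) -> str:
        # Assumption: FUJI_HOST is a host name, not already a full URL.
        return f"{self.fuji_protocol}://{self.fuji_host}:{self.fuji_port}"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env
    settings = Settings(
        db_host=env["DB_CONTAINER_NAME_OR_ADDRESS"],
        db_name=env["DB_NAME"],
        db_user=env["DB_USER"],
        db_password=env["DB_PASSWORD"],
        port=int(env["PORT"]),
        max_year=int(env["MAX_YEAR"]),
        old_max_year=int(env["OLD_MAX_YEAR"]),
        fuji_protocol=env["FUJI_PROTOCOL"],
        fuji_host=env["FUJI_HOST"],
        fuji_port=int(env["FUJI_PORT"]),
    )
    # OLD_MAX_YEAR is expected to be lower than MAX_YEAR.
    if settings.old_max_year >= settings.max_year:
        raise ValueError("OLD_MAX_YEAR must be lower than MAX_YEAR")
    return settings
```

Passing the mapping explicitly keeps the function easy to test; a missing variable raises `KeyError` and a non-numeric year or port raises `ValueError`, so misconfiguration surfaces at start-up rather than at first use.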
- The variable DB_CONTAINER_NAME_OR_ADDRESS takes one of two values, "localhost" or "mariadb", matching the configuration specified in the docker-compose.yml file.
- The variable DB_NAME holds the name of the database.
- The variable MAX_YEAR holds the most recent year, as an integer, to be considered for the evaluation and presentation of data in the dashboard (e.g. 2024).
- The variable OLD_MAX_YEAR holds an integer value lower than MAX_YEAR (e.g. 2022). On the Welcome page, data points between OLD_MAX_YEAR and MAX_YEAR are connected with dashed lines to distinguish largely finalized data collection (solid lines) from ongoing data collection (dashed lines).
- The variable FUJI_HOST holds either the name of a local F-UJI container or the URL of an online F-UJI service.
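The solid and dashed line behaviour controlled by OLD_MAX_YEAR and MAX_YEAR can be illustrated with a small sketch. The function `split_line_segments` is a hypothetical helper, not taken from the dashboard code; it assumes that the year OLD_MAX_YEAR belongs to both segments so that the solid and dashed lines join without a gap.

```python
def split_line_segments(years, old_max_year, max_year):
    """Split years into (solid, dashed) segments for plotting.

    solid:  years up to and including old_max_year (largely finalized data)
    dashed: years from old_max_year up to max_year (ongoing collection)
    Years after max_year are dropped.
    """
    if old_max_year >= max_year:
        raise ValueError("old_max_year must be lower than max_year")
    ordered = sorted(years)
    solid = [y for y in ordered if y <= old_max_year]
    # The boundary year is repeated so the two line segments connect.
    dashed = [y for y in ordered if old_max_year <= y <= max_year]
    return solid, dashed


if __name__ == "__main__":
    solid, dashed = split_line_segments(range(2019, 2026), 2022, 2024)
    print("solid :", solid)
    print("dashed:", dashed)
```

In a Plotly-based figure, each segment would typically become its own trace, with the second one drawn using a dashed line style; how the dashboard actually builds its traces is not specified here.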
List of used packages
The current version uses the following packages:
Run-time dependencies
- dash>=2.6.2 MIT License (MIT)
- pandas>=1.4.3 BSD License (BSD-3-Clause)
- pymysql>=1.0.2 MIT License (MIT)
- dash-bootstrap-components>=1.2.1 Apache Software License (Apache 2.0)
- gunicorn>=20.1.0 MIT License (MIT)
- python-dotenv>=0.21.0 BSD License (BSD-3-Clause)
- requests>=2.28.2 Apache Software License (Apache 2.0)
- python-i18n>=0.3.9 MIT License (MIT)
- python-i18n[YAML] MIT License (MIT)
Development dependencies
- pre-commit
- pytest-playwright
- playwright