Cluster Administration
Identity Management
Apex's identity management system is provided by FreeIPA on archon-3. The system runs in isolation from the rest of the cluster using LXD. IPA provides bundled identity, authentication, and policy management with LDAP and Kerberos support for Linux-based services and applications. IPA also provides a web portal for identity provisioning and management.
Dex is also deployed in the Kubernetes auth namespace and provides an OAuth/OpenID Connect mechanism for web-based and other modern applications.
We are working on streamlining IPA with Goliath for account generation. At the moment, users need to complete the Apex Trial Request Form to have an account generated.
User Home Directory
User home directories are provisioned on /lustre/ai/home (under the /lustre/ai filesystem). The directory is also automatically mounted at /home as a shared directory for all machines in the cluster. Occasionally, after a server crash or network failure, a machine may need to re-sync its cache with Lustre, which makes /home unavailable in the meantime. Users should refrain from using the machine until /home has been properly re-mounted, because any data written to the local /home before the sync will be shadowed by the Lustre mount.
Administrators should ensure that all Lustre directories (/lustre/ai, /lustre/scratch, /lustre/testfs, /home) are properly mounted when the system restarts.
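A post-restart check of these mounts could be scripted along the following lines. This is only a minimal sketch: the helper name find_unmounted is hypothetical, and os.path.ismount only reports whether a path is a mount point. It does not confirm that the filesystem behind it is Lustre, so treat a clean result as a first sanity check rather than proof.

```python
import os
from typing import Callable, Iterable, List

# Paths taken from the list above.
LUSTRE_MOUNTS = ["/lustre/ai", "/lustre/scratch", "/lustre/testfs", "/home"]


def find_unmounted(
    paths: Iterable[str],
    is_mount: Callable[[str], bool] = os.path.ismount,
) -> List[str]:
    """Return the paths that are not currently mount points.

    `is_mount` is injectable so the check can be exercised without
    touching the real filesystem (hypothetical helper, not part of
    any existing cluster tooling).
    """
    return [path for path in paths if not is_mount(path)]
```

Calling find_unmounted(LUSTRE_MOUNTS) on a node returns an empty list when every directory is mounted. Any path it returns, /home in particular, indicates that the node should not be used until the mount is restored.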
Ansible and DeepOps
Ansible can be used to automate many of the repetitive tasks required to manage the host cluster. Ad hoc Ansible runs should reuse DeepOps/config/inventory as the inventory file to keep node configuration consistent. DeepOps uses the same inventory file as its configuration when provisioning Kubernetes with Kubespray.
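As an illustration of an ad hoc run against the shared inventory, the sketch below assembles an ansible command line using the standard -i (inventory), -m (module), and -a (module arguments) options. The helper names are hypothetical, and the inventory path is assumed to be relative to the directory that contains the DeepOps checkout, so adjust it to the actual location.

```python
import subprocess
from typing import List

# Inventory path from the text; assumed relative to the current directory.
INVENTORY = "DeepOps/config/inventory"


def build_adhoc_command(
    pattern: str,
    module: str,
    args: str = "",
    inventory: str = INVENTORY,
) -> List[str]:
    """Build an `ansible` ad hoc command that always uses the shared inventory."""
    cmd = ["ansible", pattern, "-i", inventory, "-m", module]
    if args:
        cmd += ["-a", args]
    return cmd


def run_adhoc(pattern: str, module: str, args: str = "") -> int:
    """Run the ad hoc command and return its exit code (requires Ansible installed)."""
    return subprocess.run(build_adhoc_command(pattern, module, args)).returncode
```

For example, build_adhoc_command("all", "ping") produces the equivalent of `ansible all -i DeepOps/config/inventory -m ping`. Routing every ad hoc run through one helper is a simple way to avoid accidentally targeting a stale or hand-written inventory.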
Ideally, we should be able to provision and configure bare-metal systems declaratively using Ansible or Crossplane. The admin team is working toward this goal, which will help the long-term operation and maintainability of the cluster. For reference, see metal-stack, Gardener, and KubeKey.