Fleet configuration

Fleets represent cloud accounts connected to a CFKE cluster. When you create a Fleet, CFKE automatically provisions nodes within that cloud account to run your workloads. Currently, CFKE supports AWS, GCP, and Hetzner Cloud for node autoprovisioning.

To create a Fleet, you need to grant CFKE the necessary permissions to provision and manage nodes in your cloud account.

Because we orchestrate critical infrastructure for our customers, security is a top priority at Cloudfleet. We always follow the principle of least privilege and request only the permissions necessary to operate the service. Depending on your setup, these permissions may include launching and terminating instances, as well as managing load balancers.

For infrastructure providers with mature identity and access management, we use passwordless and credential-less authentication. For others, we apply strong encryption and protection to credentials, which are stored securely within your own cluster. In both cases, only your cluster has access to your cloud provider accounts.

For most cloud providers (e.g., AWS and GCP), you don’t need to share hardcoded credentials with us. Instead, you delegate permissions within your cloud account to a role or service account that we manage. This approach eliminates the operational burden of securely storing and rotating static credentials, and allows you to restrict the permissions granted to Cloudfleet to a specific set of actions and resources. This ensures that Cloudfleet cannot access any other resources in your account and can only perform the actions necessary to deliver infrastructure for CFKE.

You can revoke the permissions granted to Cloudfleet at any time, and we will immediately lose access to your account.

Quick start with Terraform

The Cloudfleet Terraform provider is the fastest way to complete the required cloud account setup and create a Fleet. Below is an example Terraform configuration that creates a CFKE cluster and a Fleet spanning AWS, GCP, and Hetzner Cloud. The example sets up the required permissions on GCP with a custom role, and uses the CFKE Fleet Terraform module for AWS to create the IAM roles, trust policies, and VPCs in each region.

terraform {
    required_providers {
        cloudfleet = {
            source = "terraform.cloudfleet.ai/cloudfleet/cloudfleet"
        }
    }
}

variable "cfke_control_plane_region" {
    description = "CFKE control plane region where the cluster is deployed"
    type        = string
    default     = "europe-central-1a"
}

variable "gcp_project" {
    type        = string
    description = "GCP project ID where CFKE nodes will be provisioned"
}

variable "hetzner_api_key" {
    description = "API key for Hetzner Cloud"
    type        = string
    sensitive   = true
}

variable "aws_region" {
    description = "AWS region where CFKE nodes will be provisioned"
    type        = string
    default     = "eu-central-1"
}

variable "aws_profile" {
    description = "AWS profile to use authenticate with AWS"
    type        = string
    default     = "default"
}

variable "hetzner_api_key" {
    description = "API key for Hetzner Cloud"
    type        = string
    sensitive   = true
}

provider "aws" {
    region  = var.aws_region
    profile = var.aws_profile
}

provider "cloudfleet" {
    profile = "default"
}

resource "cloudfleet_cfke_cluster" "cfke_test" {
    name   = "cfke-test"
    region = var.cfke_control_plane_region
    tier   = "basic"
}

resource "google_project_iam_custom_role" "cfke_node_autoprovisioner" {
    project = var.gcp_project
    permissions = [
        // Node management permissions
        "compute.instances.create",
        "compute.instances.delete",
        "compute.instances.get",
        "compute.instances.list",
        "compute.disks.create",
        "compute.subnetworks.use",
        "compute.subnetworks.useExternalIp",
        "compute.instances.setMetadata",
        "compute.instances.setTags",
        "compute.instances.setLabels",

        // Load Balancer management permissions (required for Load Balancing, can be omitted if not used)
        "compute.firewalls.create",
        "compute.firewalls.delete",
        "compute.firewalls.get",
        "compute.firewalls.update",
        "compute.regionHealthChecks.get",
        "compute.regionHealthChecks.create",
        "compute.regionHealthChecks.update",
        "compute.regionHealthChecks.delete",
        "compute.regionHealthChecks.useReadOnly",
        "compute.networkEndpointGroups.get",
        "compute.networkEndpointGroups.delete",
        "compute.networkEndpointGroups.create",
        "compute.networkEndpointGroups.use",
        "compute.networkEndpointGroups.attachNetworkEndpoints",
        "compute.networkEndpointGroups.detachNetworkEndpoints",
        "compute.instances.use",
        "compute.zoneOperations.get",
        "compute.regionBackendServices.get",
        "compute.regionBackendServices.create",
        "compute.regionBackendServices.update",
        "compute.regionBackendServices.delete",
        "compute.forwardingRules.get",
        "compute.forwardingRules.create",
        "compute.forwardingRules.update",
        "compute.forwardingRules.delete",
        "compute.forwardingRules.setLabels",
        "compute.networks.updatePolicy",
        "compute.zones.list"
    ]
    role_id = "cfke.nodeAutoprovisioner"
    title   = "CFKE Node-autoprovisioner"
}

resource "google_project_iam_binding" "gcp_project_binding" {
    project = var.gcp_project
    role    = google_project_iam_custom_role.cfke_node_autoprovisioner.id
    members = [
        "principal://iam.googleapis.com/projects/207152264238/locations/global/workloadIdentityPools/cfke/subject/${cloudfleet_cfke_cluster.hetzner_test.id}"
    ]
}

module "cfke_connected_fleet" {
    source               = "registry.terraform.io/cloudfleetai/cfke-connected-fleet/aws"
    version              = "~> 0.1.0"
    control_plane_region = cloudfleet_cfke_cluster.cfke_test.region
    cluster_id           = cloudfleet_cfke_cluster.cfke_test.id
}

resource "cloudfleet_cfke_fleet" "fleet" {

    depends_on = [
        google_project_iam_binding.gcp_project_binding
    ]

    cluster_id = cloudfleet_cfke_cluster.cfke_test.id
    name       = "cfke-multi-cloud-fleet"

    limits {
        cpu = 24
    }

    hetzner {
        api_key = var.hetzner_api_key
    }

    aws {
        role_arn = module.cfke_connected_fleet.fleet_arn
    }

    gcp {
        project_id = var.gcp_project
    }
}
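
To apply this configuration, use the standard Terraform workflow. Since the Hetzner API key is marked sensitive, you can supply it through Terraform's TF_VAR_ environment variable convention instead of writing it to a file; the values below are placeholders:

export TF_VAR_hetzner_api_key=YOUR_HETZNER_API_TOKEN
export TF_VAR_gcp_project=YOUR_PROJECT_ID
terraform init
terraform apply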

Cloud provider specific setup

AWS

To connect an AWS account with CFKE, you do not need to handle hardcoded credentials. Each CFKE cluster has a unique AWS IAM role managed internally by Cloudfleet. To authorize CFKE in your AWS account, create a role, attach the required permissions, and configure a trust policy allowing CFKE’s internal IAM role to assume it.

Although this method involves creating a few IAM resources and may seem more complex, it is more secure than using hardcoded credentials. Cloudfleet provides a Terraform module to automate the creation of the required IAM role and policies. You can find the CFKE Fleet Terraform module here.

The module also deploys VPCs and subnets in every region supported by CFKE. AWS does not charge for these resources themselves.

The IAM permissions created by the module are tightly scoped to restrict access to specific virtual machines. This is enforced by limiting IAM policies to resources with certain tags. Cloudfleet is permitted to create and delete EC2 instances only if they carry a specific tag. As a result, even though Cloudfleet operates within your AWS account, it cannot use its IAM permissions to create or delete any other resources outside of those explicitly tagged.
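
As an illustration of this pattern, the sketch below shows what a tag-conditioned IAM policy looks like in Terraform. It is not the module's actual policy: the statements, resource name, and tag key here are hypothetical stand-ins.

resource "aws_iam_policy" "cfke_tag_scoped_example" {
    name = "cfke-tag-scoped-example" # hypothetical name
    policy = jsonencode({
        Version = "2012-10-17"
        Statement = [{
            Effect   = "Allow"
            Action   = ["ec2:TerminateInstances"]
            Resource = "*"
            Condition = {
                StringEquals = {
                    # Hypothetical tag key: only instances carrying this tag may be terminated
                    "aws:ResourceTag/cfke-cluster-id" = "CLUSTER_ID"
                }
            }
        }]
    })
}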

Currently, CFKE nodes require public IP addresses to communicate with the control plane, download necessary packages, and interact with other nodes. However, CFKE is designed to work behind NAT, and public IP addresses will not always be required. Cloudfleet is developing a solution to support private subnets and NAT gateways. While nodes have public IP addresses, security groups block inbound external traffic.

  1. Use the module to create the required IAM roles and policies in your AWS account:

    module "cfke_connected_fleet" {
      source               = "registry.terraform.io/cloudfleetai/cfke-connected-fleet/aws"
      # version              = "~> 1.0.0" # (Optional) Specify a version if you want to pin
      control_plane_region = "CONTROL_PLANE_REGION"
      cluster_id           = "CLUSTER_ID"
    }
    

    Replace CONTROL_PLANE_REGION with your CFKE control plane region and CLUSTER_ID with your cluster ID. You can find both values in the CFKE console.

  2. The module outputs fleet_arn. Use this ARN when creating a Fleet for AWS in the CFKE console.
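
    If you apply the module yourself, you can expose the ARN as a Terraform output and read it after terraform apply, for example:

    output "fleet_arn" {
        value = module.cfke_connected_fleet.fleet_arn
    }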

For more details, see the module documentation.

GCP

CFKE uses Workload Identity Federation to access your GCP project without hardcoded credentials. Each CFKE cluster has a unique principal managed internally by Cloudfleet. To authorize CFKE in your GCP project, grant the roles/compute.instanceAdmin.v1 role to this principal.

To provision CFKE nodes in your GCP project:

  1. Ensure your GCP project has a default VPC network with subnet creation mode set to Automatic. This ensures subnets exist in all regions.

    Support for custom VPC networks (standalone or shared) is on the roadmap.

    How to check VPC networks with gcloud CLI:

    gcloud compute networks list --project=YOUR_PROJECT_ID
    

    You should see output similar to this:

    NAME     SUBNET_MODE  BGP_ROUTING_MODE  IPV4_RANGE  GATEWAY_IPV4  INTERNAL_IPV6_RANGE
    default  AUTO         REGIONAL
    

    If you do not see this result, create a new auto-mode VPC:

    gcloud compute networks create default \
      --subnet-mode=auto \
      --project=YOUR_PROJECT_ID
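
    If you manage your GCP project with Terraform instead, a minimal equivalent sketch (the project ID is a placeholder):

    resource "google_compute_network" "default" {
        project                 = "YOUR_PROJECT_ID"
        name                    = "default"
        auto_create_subnetworks = true # auto-mode: subnets are created in every region
    }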
    
  2. Currently, CFKE nodes require public IP addresses to communicate with the control plane, download necessary packages, and interact with other nodes. Some organizations may restrict public IP addresses via the constraints/compute.vmExternalIpAccess policy. Ensure this policy is not set to DENY.

    CFKE is designed to work behind NAT, and public IP addresses will not always be required. Cloudfleet is developing a solution to support private nodes behind Cloud NAT.

    While nodes have public IP addresses, VPC firewall rules block inbound external traffic.
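
    How to inspect the firewall rules in your project with the gcloud CLI:

    gcloud compute firewall-rules list --project=YOUR_PROJECT_ID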

    How to check organization policies with gcloud CLI:

    gcloud org-policies list --project=YOUR_PROJECT_ID
    

    If you have the necessary permissions, you can remove the DENY restriction with the following command:

    gcloud org-policies reset constraints/compute.vmExternalIpAccess \
      --project=YOUR_PROJECT_ID
    

    WARNING: This command re-enables the use of public IP addresses. Proceed according to your security policy.

  3. Grant the roles/compute.instanceAdmin.v1 role to the following principal:

    principal://iam.googleapis.com/projects/207152264238/locations/global/workloadIdentityPools/cfke/subject/CLUSTER_ID
    

    Use the gcloud CLI to apply this:

    gcloud projects add-iam-policy-binding PROJECT_ID \
      --member=principal://iam.googleapis.com/projects/207152264238/locations/global/workloadIdentityPools/cfke/subject/CLUSTER_ID \
      --role=roles/compute.instanceAdmin.v1
    

    Replace PROJECT_ID with your GCP project ID and CLUSTER_ID with your cluster ID.

    If you want to grant more granular permissions instead of granting the built-in role to Cloudfleet, you can create a custom IAM role with the following permissions:

    **Node management permissions:**
    - compute.instances.create
    - compute.instances.delete
    - compute.instances.get
    - compute.instances.list
    - compute.disks.create
    - compute.subnetworks.use
    - compute.subnetworks.useExternalIp
    - compute.instances.setMetadata
    - compute.instances.setTags
    - compute.instances.setLabels
    
    **Load Balancer management permissions (required for Load Balancing, can be omitted if not used):**
    - compute.firewalls.create
    - compute.firewalls.delete
    - compute.firewalls.get
    - compute.firewalls.update
    - compute.regionHealthChecks.get
    - compute.regionHealthChecks.create
    - compute.regionHealthChecks.update
    - compute.regionHealthChecks.delete
    - compute.regionHealthChecks.useReadOnly
    - compute.networkEndpointGroups.get
    - compute.networkEndpointGroups.delete
    - compute.networkEndpointGroups.create
    - compute.networkEndpointGroups.use
    - compute.networkEndpointGroups.attachNetworkEndpoints
    - compute.networkEndpointGroups.detachNetworkEndpoints
    - compute.instances.use
    - compute.zoneOperations.get
    - compute.regionBackendServices.get
    - compute.regionBackendServices.create
    - compute.regionBackendServices.update
    - compute.regionBackendServices.delete
    - compute.forwardingRules.get
    - compute.forwardingRules.create
    - compute.forwardingRules.update
    - compute.forwardingRules.delete
    - compute.forwardingRules.setLabels
    - compute.networks.updatePolicy
    - compute.zones.list
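
    As a sketch, the node management portion of such a role can be created with the gcloud CLI; append the Load Balancer permissions to --permissions if you use load balancing:

    gcloud iam roles create cfke.nodeAutoprovisioner \
      --project=PROJECT_ID \
      --title="CFKE Node-autoprovisioner" \
      --permissions=compute.instances.create,compute.instances.delete,compute.instances.get,compute.instances.list,compute.disks.create,compute.subnetworks.use,compute.subnetworks.useExternalIp,compute.instances.setMetadata,compute.instances.setTags,compute.instances.setLabels

    When binding, reference the custom role as projects/PROJECT_ID/roles/cfke.nodeAutoprovisioner instead of roles/compute.instanceAdmin.v1.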
    

Once the IAM binding is created, provide the GCP Project ID in the Fleet creation wizard in the CFKE console.

Although Cloudfleet does not access any resources not provisioned by CFKE, if your setup requires additional isolation, you can create a separate Google Cloud project for CFKE. This ensures that the role assigned to Cloudfleet has access only to the resources within that project.
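
For example, a dedicated project can be created with the gcloud CLI; the project ID below is a placeholder, and depending on your organization you may also need to pass a parent folder or organization:

gcloud projects create YOUR_CFKE_PROJECT_ID --name="CFKE nodes"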

Hetzner Cloud

  1. Follow the Hetzner Cloud documentation to generate an API token with “Read & Write” permissions.

  2. Enter this token in the CFKE console when creating a Fleet for Hetzner Cloud.

CFKE creates a separate network per region in your Hetzner Cloud account for provisioning nodes. This network is named cfke-CLUSTER_ID-NETWORK_REGION_NAME.
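
If you use the hcloud CLI, you can verify these networks after the first nodes are provisioned (this assumes the CLI is configured with a token for the same project):

hcloud network list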

Support for custom networks is on the roadmap.

As of now, Hetzner Cloud does not support granular permissions for API tokens, meaning it’s not possible to restrict access to specific resources. Although Cloudfleet does not access any resources not provisioned by CFKE, if your setup requires additional isolation, you can create a separate Hetzner Cloud project for CFKE. This ensures that the API token used by Cloudfleet has access only to the resources within that project.