How to deploy StarRocks with IAM enabled on AWS EKS

This article records a practical walkthrough for deploying StarRocks on AWS EKS with IAM enabled, using CloudShell, eksctl, Helm, and IAM service accounts. It starts from creating an EKS cluster and fixing pod scheduling issues by reducing FE and BE resource requests, then shows how to bind IAM roles to StarRocks pods so they can access services such as S3 and Glue, and finally covers applying the updated Helm values, connecting to the cluster, and cleaning up both the IAM service account and the EKS cluster after the test.

These are my notes on deploying StarRocks with IAM enabled.

All commands below were run in AWS CloudShell.

1. Download eksctl

Download from: https://eksctl.io/installation

# for ARM systems, set ARCH to: `arm64`, `armv6` or `armv7`
ARCH=amd64
PLATFORM=$(uname -s)_$ARCH

curl -sLO "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_$PLATFORM.tar.gz"

# (Optional) Verify checksum
curl -sL "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_checksums.txt" | grep $PLATFORM | sha256sum --check

tar -xzf eksctl_$PLATFORM.tar.gz -C /tmp && rm eksctl_$PLATFORM.tar.gz

sudo mv /tmp/eksctl /usr/local/bin

2. Create EKS cluster

I created an EKS cluster named smith-eks.

eksctl create cluster --name smith-eks --region us-west-2

The EKS cluster should be ready after about 10 to 20 minutes.

3. Configure kubectl

aws eks update-kubeconfig --region us-west-2 --name smith-eks

Now kubectl can connect to the EKS cluster.

Check that all nodes are ready:

$ kubectl get nodes
NAME                                           STATUS   ROLES    AGE     VERSION
ip-192-168-31-98.us-west-2.compute.internal    Ready    <none>   5m17s   v1.30.2-eks-1552ad0
ip-192-168-79-119.us-west-2.compute.internal   Ready    <none>   5m25s   v1.30.2-eks-1552ad0

4. Deploy StarRocks with Helm

Install Helm first: https://helm.sh/docs/intro/install/

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
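
Before moving on, it can help to confirm that the tools used in this walkthrough are actually on the PATH. The sketch below is only an illustration: the helper names require_cmd and check_tools are made up for this example, and the tool list in the usage line is an assumption based on the commands that appear in this article.

```shell
# Hypothetical helper: report whether a single command is available on the PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Hypothetical helper: check several tools and keep going even if one is missing.
# Returns non-zero if at least one tool was not found.
check_tools() {
  missing=0
  for tool in "$@"; do
    require_cmd "$tool" || missing=1
  done
  return "$missing"
}
```

Usage: check_tools aws eksctl kubectl helm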

Then deploy StarRocks with Helm: https://docs.starrocks.io/docs/deployment/helm/

helm repo add starrocks https://starrocks.github.io/starrocks-kubernetes-operator
helm repo update
helm search repo starrocks

# check with helm search
$ helm search repo starrocks
NAME                            CHART VERSION   APP VERSION     DESCRIPTION                                       
starrocks/kube-starrocks        1.9.8           3.3-latest      kube-starrocks includes two subcharts, operator...
starrocks/starrocks             1.9.8           3.3-latest      A Helm chart for StarRocks cluster                
starrocks/operator              1.9.8           1.9.8           A Helm chart for StarRocks operator               
starrocks/warehouse             1.9.8           3.3-latest      Warehouse is currently a feature of the StarRoc...

# install StarRocks
helm install starrocks starrocks/kube-starrocks

5. Resolve pods stuck in Pending

The pods will stay in Pending because the nodes lack resources. You can confirm this with kubectl describe pod <pod-name>.

# FE always pending
$ kubectl get pods
NAME                                      READY   STATUS    RESTARTS   AGE
kube-starrocks-fe-0                       0/1     Pending   0          26s
kube-starrocks-operator-d59c86c95-5hhfd   1/1     Running   0          30s

# check with kubectl describe pod <pod-name>
$ kubectl describe pod kube-starrocks-fe-0
# ...
# ...
# ...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  82s   default-scheduler  0/2 nodes are available: 2 Insufficient cpu. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.
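
The scheduler message means that no node has enough unreserved CPU for the FE pod. To compare what the nodes can offer with what the pod asks for, something like the following sketch may help. The helper names show_node_allocatable and show_pod_requests are hypothetical; the commands themselves only use kubectl's custom-columns and jsonpath output formats.

```shell
# Hypothetical helper: list the CPU and memory each node can allocate to pods.
show_node_allocatable() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}

# Hypothetical helper: print the resource requests of every container in one pod.
# Usage: show_pod_requests <pod-name>
show_pod_requests() {
  kubectl get pod "$1" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.requests}{"\n"}{end}'
}
```

Usage: run show_node_allocatable, then show_pod_requests kube-starrocks-fe-0. Allocatable is an upper bound, since other pods on the node also reserve CPU; kubectl describe node <node-name> lists what is already allocated.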

6. Customize values.yaml

We need to customize values.yaml to reduce the FE and BE resource requests.

The default values.yaml can be downloaded from: https://raw.githubusercontent.com/StarRocks/starrocks-kubernetes-operator/main/helm-charts/charts/kube-starrocks/values.yaml

Change the resources.requests fields. I reduced cpu to 1. You have to change both FE and BE.

resources:
  requests:
    cpu: 1
    memory: 4Gi
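
As an alternative to downloading the file from GitHub, helm show values prints the defaults of the chart version in the local repo cache, which avoids a mismatch between the main branch and the chart that was installed. A minimal sketch, assuming the repo alias starrocks added in step 4; the helper names dump_default_values and list_requests are hypothetical.

```shell
# Hypothetical helper: save the chart's default values to a file (default: values.yaml).
dump_default_values() {
  helm show values starrocks/kube-starrocks > "${1:-values.yaml}"
}

# Hypothetical helper: print every "requests:" line plus the two lines after it,
# with line numbers, so the FE and BE requests are easy to locate in the file.
list_requests() {
  grep -n -A2 'requests:' "${1:-values.yaml}"
}
```

Usage: dump_default_values, then list_requests, then edit the reported lines in values.yaml.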

7. IAM binding

We need to bind an IAM role to the pods so that StarRocks can access S3 and Glue.

https://eksctl.io/usage/iamserviceaccounts/?h=eksctl#usage-without-config-files

$ eksctl utils associate-iam-oidc-provider --cluster=smith-eks --region=us-west-2 --approve
2024-09-09 09:15:36 [ℹ]  will create IAM Open ID Connect provider for cluster "smith-eks" in "us-west-2"
2024-09-09 09:15:36 [✔]  created IAM Open ID Connect provider for cluster "smith-eks" in "us-west-2"

# Here we attach only the S3 read-only policy
$ eksctl create iamserviceaccount --region=us-west-2 --cluster=smith-eks --name=sr-service-account --namespace=default --attach-policy-arn=arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess --approve
2024-09-09 09:17:00 [ℹ]  1 iamserviceaccount (default/sr-service-account) was included (based on the include/exclude rules)
2024-09-09 09:17:00 [!]  serviceaccounts that exist in Kubernetes will be excluded, use --override-existing-serviceaccounts to override
2024-09-09 09:17:00 [ℹ]  1 task: { 
    2 sequential sub-tasks: { 
        create IAM role for serviceaccount "default/sr-service-account",
        create serviceaccount "default/sr-service-account",
    } }
2024-09-09 09:17:00 [ℹ]  building iamserviceaccount stack "eksctl-smith-eks-addon-iamserviceaccount-default-sr-service-account"
2024-09-09 09:17:01 [ℹ]  deploying stack "eksctl-smith-eks-addon-iamserviceaccount-default-sr-service-account"
2024-09-09 09:17:01 [ℹ]  waiting for CloudFormation stack "eksctl-smith-eks-addon-iamserviceaccount-default-sr-service-account"
2024-09-09 09:17:31 [ℹ]  waiting for CloudFormation stack "eksctl-smith-eks-addon-iamserviceaccount-default-sr-service-account"
2024-09-09 09:17:31 [ℹ]  created serviceaccount "default/sr-service-account"
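
The link between the Kubernetes service account and the IAM role is stored in the eks.amazonaws.com/role-arn annotation on the service account. To confirm the binding from the command line, a small check along these lines could be used; the helper name sa_role_arn is hypothetical.

```shell
# Hypothetical helper: print the IAM role ARN annotated on a service account.
# Usage: sa_role_arn <service-account> [namespace]   (namespace defaults to "default")
sa_role_arn() {
  kubectl get serviceaccount "$1" --namespace "${2:-default}" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}
```

Usage: sa_role_arn sr-service-account should print an ARN that starts with arn:aws:iam::. Empty output suggests the annotation is missing.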

eksctl creates the IAM role automatically. You can edit the role's policies yourself afterwards.

You can see AmazonS3ReadOnlyAccess already attached to eksctl-smith-eks-addon-iamserviceaccount-defa-Role1-7d8vKLWBBDIs

Then change the serviceAccount field in values.yaml. For both FE and BE, change the value from "" to "sr-service-account".
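
The changes from steps 6 and 7 could also be kept in a small override file instead of an edited copy of the full values.yaml. The fragment below is a sketch under assumptions: the key paths starrocks.starrocksFESpec and starrocks.starrocksBeSpec are my assumption about the layout of the kube-starrocks chart and should be checked against your own values.yaml, and the file name values-override.yaml is hypothetical.

```shell
# Write a small override file; Helm merges it over the chart defaults.
# ASSUMPTION: the key paths below match the layout of the kube-starrocks chart.
# Verify them against the default values.yaml before applying.
cat > values-override.yaml <<'EOF'
starrocks:
  starrocksFESpec:
    serviceAccount: "sr-service-account"
    resources:
      requests:
        cpu: 1
        memory: 4Gi
  starrocksBeSpec:
    serviceAccount: "sr-service-account"
    resources:
      requests:
        cpu: 1
        memory: 4Gi
EOF
```

Such a file would be passed to Helm with -f values-override.yaml in place of -f values.yaml. Editing the full values.yaml as described above works equally well.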

8. Apply new values.yaml

helm upgrade -f values.yaml starrocks starrocks/kube-starrocks

Check that all pods are running.

$ kubectl get pods
NAME                                      READY   STATUS    RESTARTS   AGE
kube-starrocks-be-0                       1/1     Running   0          2m6s
kube-starrocks-fe-0                       1/1     Running   0          3m26s
kube-starrocks-operator-d59c86c95-5hhfd   1/1     Running   0          62m
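
With IAM roles for service accounts, EKS injects the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables into pods that run under an annotated service account. A quick way to check that the StarRocks pods picked up the binding might look like this; the helper name show_irsa_env is hypothetical, and the sketch assumes the container image provides the env command.

```shell
# Hypothetical helper: show the IAM-related environment variables inside a pod.
# Usage: show_irsa_env <pod-name>
show_irsa_env() {
  kubectl exec "$1" -- env | grep -E '^AWS_(ROLE_ARN|WEB_IDENTITY_TOKEN_FILE)='
}
```

Usage: show_irsa_env kube-starrocks-fe-0 and show_irsa_env kube-starrocks-be-0. If neither variable appears, the pod is probably not running under sr-service-account.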

9. Connect to StarRocks

kubectl exec -it kube-starrocks-fe-0 -- /bin/bash

mysql -uroot -h127.0.0.1 -P9030

select * from files(
    "path"="s3://smith-bucket/file.parquet", 
    "format"="parquet", 
    "aws.s3.use_aws_sdk_default_behavior"="true", 
    "aws.s3.region"="us-west-2"
);

If the query succeeds, StarRocks can read S3 through the bound IAM role and is ready to use.

10. Destroy EKS cluster

After testing, destroy the EKS cluster.

Delete the IAM service account first:

$ eksctl delete iamserviceaccount --cluster=smith-eks --name=sr-service-account --region=us-west-2
2024-09-09 13:12:16 [ℹ]  1 iamserviceaccount (default/sr-service-account) was included (based on the include/exclude rules)
2024-09-09 13:12:18 [ℹ]  1 task: { 
    2 sequential sub-tasks: { 
        delete IAM role for serviceaccount "default/sr-service-account" [async],
        delete serviceaccount "default/sr-service-account",
    } }
2024-09-09 13:12:18 [ℹ]  will delete stack "eksctl-smith-eks-addon-iamserviceaccount-default-sr-service-account"
2024-09-09 13:12:18 [ℹ]  deleted serviceaccount "default/sr-service-account"

Delete EKS cluster:

eksctl delete cluster --name=smith-eks --region=us-west-2 

After that, everything is cleaned up.

Original article by Smith. If you repost it, please credit the source: https://www.inlighting.org/archives/starrocks-aws-eks-iam-deploy
