Customizing Spark Operator

To customize the operator, follow these steps:

  1. Compile a Spark distribution with Kubernetes support, as described in the Spark documentation.

  2. Build the Spark Docker images with the docker-image-tool.sh script included in the Spark distribution.

  3. Create a new operator image based on the Spark image from the previous step. To do so, change the base image in the FROM instruction of the operator Dockerfile to your Spark image.

  4. Build and push a multi-arch operator image to your own image registry by running the following command (requires docker buildx):

    make docker-build IMAGE_REGISTRY=docker.io IMAGE_REPOSITORY=kubeflow/spark-operator IMAGE_TAG=latest PLATFORMS=linux/amd64,linux/arm64
    
  5. Deploy the Spark operator Helm chart by specifying your own operator image:

    helm repo add --force-update spark-operator https://kubeflow.github.io/spark-operator
    
    helm install spark-operator spark-operator/spark-operator \
        --namespace spark-operator \
        --create-namespace \
        --set image.registry=docker.io \
        --set image.repository=kubeflow/spark-operator \
        --set image.tag=latest
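
Steps 1 to 3 above can be sketched as a small script. This is a minimal sketch, not the project's official procedure: it assumes a Spark source checkout at `SPARK_SRC`, and the repository `docker.io/myorg`, the tag `custom`, and the helper functions `run` and `spark_image_ref` are hypothetical names chosen for illustration. By default the script only prints the commands it would run; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-3. All values below are placeholders (assumptions).
set -euo pipefail

SPARK_SRC="${SPARK_SRC:-$PWD/spark}"         # assumed: Spark source checkout
SPARK_REPO="${SPARK_REPO:-docker.io/myorg}"  # hypothetical image repository
SPARK_TAG="${SPARK_TAG:-custom}"             # hypothetical image tag
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# docker-image-tool.sh names the JVM image <repo>/spark:<tag>.
spark_image_ref() {
  echo "$1/spark:$2"
}

# Step 1: build a runnable distribution with the Kubernetes profile.
# The result is placed in $SPARK_SRC/dist.
run "$SPARK_SRC/dev/make-distribution.sh" --name custom-spark --tgz -Pkubernetes

# Step 2: build and push the Spark image from that distribution.
# Assumes the SPARK_HOME environment variable is unset or points at dist/.
run "$SPARK_SRC/dist/bin/docker-image-tool.sh" -r "$SPARK_REPO" -t "$SPARK_TAG" build
run "$SPARK_SRC/dist/bin/docker-image-tool.sh" -r "$SPARK_REPO" -t "$SPARK_TAG" push

# Step 3: base image to put in the operator Dockerfile (edit it by hand).
echo "Operator Dockerfile base image: FROM $(spark_image_ref "$SPARK_REPO" "$SPARK_TAG")"
```

The image reference printed in the last line is the value to use as the base image in step 3. Steps 4 and 5 then proceed as shown above, with your own registry, repository, and tag.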
    

Last modified March 12, 2025: docs: customize spark operator (4b3d732)