Blueprints

Run a Python script on Azure with Azure Batch VMs

About this blueprint

Azure Python Task Runner

This flow will execute a Python script on Azure Batch. This requires you to setup Azure Batch, Azure Blob Storage in the same region that you want to run Azure Batch Jobs and Azure credentials.

In order to support inputFiles, namespaceFiles, and outputFiles, the Azure Batch task runner currently relies on resource files and output files which transit through Azure Blob Storage.

Since we don't know the working directory of the container in advance, we always need to explicitly define the working directory and output directory when using the Azure Batch runner, e.g. use cat {{ workingDir }}/myFile.txt rather than cat myFile.txt.

yaml
id: azure_batch_runner
namespace: company.team

variables:
  pool_id: "poolId"
  container_name: "containerName"

tasks:
  - id: scrape_environment_info
    type: io.kestra.plugin.scripts.python.Commands
    containerImage: ghcr.io/kestra-io/pydata:latest
    taskRunner:
      type: io.kestra.plugin.ee.azure.runner.Batch
      account: "{{ secret('AZURE_ACCOUNT') }}"
      accessKey: "{{ secret('AZURE_ACCESS_KEY') }}"
      endpoint: "{{ secret('AZURE_ENDPOINT') }}"
      poolId: "{{ vars.pool_id }}"
      blobStorage:
        containerName: "{{ vars.container_name }}"
        connectionString: "{{ secret('AZURE_CONNECTION_STRING') }}"
    commands:
      - python {{ workingDir }}/main.py
    namespaceFiles:
      enabled: true
    outputFiles:
      - "environment_info.json"
    inputFiles:
      main.py: |
        import platform
        import socket
        import sys
        import json
        from kestra import Kestra

        print("Hello from Azure Batch and kestra!")

        def print_environment_info():
            print(f"Host's network name: {platform.node()}")
            print(f"Python version: {platform.python_version()}")
            print(f"Platform information (instance type): {platform.platform()}")
            print(f"OS/Arch: {sys.platform}/{platform.machine()}")

            env_info = {
                "host": platform.node(),
                "platform": platform.platform(),
                "OS": sys.platform,
                "python_version": platform.python_version(),
            }
            Kestra.outputs(env_info)

            filename = 'environment_info.json'
            with open(filename, 'w') as json_file:
                json.dump(env_info, json_file, indent=4)

        if __name__ == '__main__':
          print_environment_info()

Commands

Batch

New to Kestra?

Use blueprints to kickstart your first workflows.

Get started with Kestra