Configure WebAssembly (WASM) graph definitions for data flow graphs (preview)

Important

WebAssembly (WASM) graph definitions for data flow graphs are in preview. This feature has limitations and is not for production workloads.

See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or not yet released into general availability.

Graph definitions are central to WASM development because they define how your modules connect to processing workflows. Understanding how graph definitions relate to data flow graphs helps you decide what belongs in the YAML file and what belongs in the data flow graph resource that wraps it.

This article focuses on creating and configuring the YAML graph definitions. For information about deploying and testing WASM data flow graphs, see Use WebAssembly with data flow graphs.

Important

Data flow graphs currently support only MQTT, Kafka, and OpenTelemetry endpoints. Other endpoint types, such as Data Lake, Microsoft Fabric OneLake, Azure Data Explorer, and Local Storage, aren't supported. For more information, see Known issues.

Graph definition structure

Graph definitions are written in YAML and validated against a formal JSON schema that checks structure and compatibility. The configuration includes:

  • Module requirements for API and host library version compatibility
  • Module configurations for runtime parameters and operator customization
  • Operations that define processing nodes in your workflow
  • Connections that specify data flow routing between operations
  • Schemas for optional data validation

Basic graph structure

moduleRequirements:
  apiVersion: "0.2.0"
  hostlibVersion: "0.2.0"

operations:
  - operationType: "source"
    name: "data-source"
  - operationType: "map"
    name: "my-operator/map"
    module: "my-operator:1.0.0"
  - operationType: "sink"
    name: "data-sink"

connections:
  - from: { name: "data-source" }
    to: { name: "my-operator/map" }
  - from: { name: "my-operator/map" }
    to: { name: "data-sink" }
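
Every connection must refer to an operation that the graph declares. The following minimal sketch illustrates that rule in Python. It mirrors the basic graph above as plain dictionaries, and the helper name find_unknown_endpoints is hypothetical. It isn't part of any Azure IoT Operations tooling, and the real schema validation may check more than this.

```python
# Hypothetical in-memory form of the basic graph above (plain dicts, no YAML parser).
graph = {
    "operations": [
        {"operationType": "source", "name": "data-source"},
        {"operationType": "map", "name": "my-operator/map", "module": "my-operator:1.0.0"},
        {"operationType": "sink", "name": "data-sink"},
    ],
    "connections": [
        {"from": {"name": "data-source"}, "to": {"name": "my-operator/map"}},
        {"from": {"name": "my-operator/map"}, "to": {"name": "data-sink"}},
    ],
}


def find_unknown_endpoints(graph):
    """Return connection endpoint names that no operation declares."""
    declared = {op["name"] for op in graph["operations"]}
    unknown = []
    for conn in graph["connections"]:
        for side in ("from", "to"):
            name = conn[side]["name"]
            if name not in declared:
                unknown.append(name)
    return unknown


unknown_endpoints = find_unknown_endpoints(graph)
print(unknown_endpoints)
```

A check like this catches typos in operation names before you upload the graph definition to a registry.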

Version compatibility

The moduleRequirements section ensures compatibility using semantic versioning:

moduleRequirements:
  apiVersion: "0.2.0"          # WASI API version for interface compatibility
  hostlibVersion: "0.2.0"     # Host library version providing runtime support
  features:                    # Optional features required by modules
    - name: "wasi-nn"
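
This article doesn't state the exact compatibility rule that the runtime applies to these versions. The following sketch only illustrates a general property of semantic versioning: versions compare numerically field by field, not as strings. The function name parse_version is hypothetical.

```python
def parse_version(text):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = text.split(".")
    return (int(major), int(minor), int(patch))


required = parse_version("0.2.0")
candidate = parse_version("0.10.0")

# Tuples compare field by field, so 0.10.0 sorts after 0.2.0.
numeric_newer = candidate > required

# Plain string comparison gets this wrong because "1" sorts before "2".
string_newer = "0.10.0" > "0.2.0"

print(numeric_newer, string_newer)
```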

Tip

For guidance on enabling in-band ONNX inference with the wasi-nn feature, see Run ONNX inference in WebAssembly data flow graphs.

Example 1: Simple graph definition

The simple graph definition demonstrates a basic three-stage pipeline that converts temperature data from Fahrenheit to Celsius:

moduleRequirements:
  apiVersion: "0.2.0"
  hostlibVersion: "0.2.0"

moduleConfigurations:
  - name: module-temperature/map
    parameters:
      key1:
        name: key2
        description: key2
operations:
  - operationType: "source"
    name: "source"

  - operationType: "map"
    name: "module-temperature/map"
    module: "temperature:1.0.0"

  - operationType: "sink"
    name: "sink"

connections:
  - from:
      name: "source"
    to:
      name: "module-temperature/map"

  - from:
      name: "module-temperature/map"
    to:
      name: "sink"

For step-by-step deployment instructions and testing guidance for this example, see Example 1: Basic deployment with one WASM module.

How the simple graph works

This graph creates a straightforward data processing pipeline:

  1. Source operation: Receives temperature data from the data flow's source endpoint
  2. Map operation: Processes data with the temperature WASM module (temperature:1.0.0)
  3. Sink operation: Sends converted data to the data flow's destination endpoint

The temperature module converts Fahrenheit to Celsius using the standard formula (F - 32) × 5/9 = C.

Input format:

{"temperature": {"value": 100.0, "unit": "F"}}

Output format:

{"temperature": {"value": 37.8, "unit": "C"}}
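
The conversion itself can be sketched in a few lines of Python. This isn't the source of the temperature module. The function name, the pass-through behavior for non-Fahrenheit input, and the rounding to one decimal place are assumptions chosen to match the sample messages above.

```python
def fahrenheit_to_celsius(message):
    """Convert a {"temperature": {"value", "unit"}} message from F to C."""
    reading = message["temperature"]
    if reading["unit"] != "F":
        # Assumption: readings that aren't in Fahrenheit pass through unchanged.
        return message
    celsius = (reading["value"] - 32) * 5 / 9
    # Assumption: round to one decimal place, as in the sample output.
    return {"temperature": {"value": round(celsius, 1), "unit": "C"}}


converted = fahrenheit_to_celsius({"temperature": {"value": 100.0, "unit": "F"}})
print(converted)
```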

Example 2: Complex graph definition

The complex graph definition demonstrates a multi-sensor processing workflow that handles temperature, humidity, and image data:

moduleRequirements:
  apiVersion: "0.2.0"
  hostlibVersion: "0.2.0"

moduleConfigurations:
  - name: module-temperature/map
    parameters:
      key1:
        name: key2
        description: key2
  - name: module-snapshot/branch
    parameters:
      snapshot_topic:
        name: snapshot_topic
        description: Transform app snapshot_topic in snapshot branch's init routine
operations:
  - operationType: "source"
    name: "source"

  - operationType: delay
    name: module-window/delay
    module: window:1.0.0
  - operationType: "map"
    name: "module-format/map"
    module: "format:1.0.0"
  - operationType: map
    name: module-snapshot/map
    module: snapshot:1.0.0
  - operationType: branch
    name: module-snapshot/branch
    module: snapshot:1.0.0
  - operationType: accumulate
    name: module-snapshot/accumulate
    module: snapshot:1.0.0
  - operationType: map
    name: module-temperature/map
    module: temperature:1.0.0
  - operationType: branch
    name: module-temperature/branch
    module: temperature:1.0.0
  - operationType: filter
    name: module-temperature/filter
    module: temperature:1.0.0
  - operationType: accumulate
    name: module-temperature/accumulate
    module: temperature:1.0.0
  - operationType: accumulate
    name: module-humidity/accumulate
    module: humidity:1.0.0
  - operationType: concatenate
    name: concatenate1
    module:
  - operationType: accumulate
    name: module-collection/accumulate
    module: collection:1.0.0
  - operationType: map
    name: module-enrichment/map
    module: enrichment:1.0.0

  - operationType: "sink"
    name: "sink"

connections:
  - from:
      name: source
    to:
      name: module-window/delay

  - from:
      name: module-window/delay
    to:
      name: module-snapshot/branch

  - from:
      name: module-snapshot/branch
      arm: "False"
    to:
      name: module-temperature/branch

  - from:
      name: module-snapshot/branch
      arm: "True"
    to:
      name: module-format/map

  - from:
      name: module-format/map
    to:
      name: module-snapshot/map

  - from:
      name: module-snapshot/map
    to:
      name: module-snapshot/accumulate

  - from:
      name: module-snapshot/accumulate
    to:
      name: concatenate1

  - from:
      name: module-temperature/branch
      arm: "True"
    to:
      name: module-temperature/map

  - from:
      name: module-temperature/branch
      arm: "False"
    to:
      name: module-humidity/accumulate

  - from:
      name: module-humidity/accumulate
    to:
      name: concatenate1

  - from:
      name: module-temperature/map
    to:
      name: module-temperature/filter

  - from:
      name: module-temperature/filter
    to:
      name: module-temperature/accumulate

  - from:
      name: module-temperature/accumulate
    to:
      name: concatenate1

  - from:
      name: concatenate1
    to:
      name: module-collection/accumulate

  - from:
      name: module-collection/accumulate
    to:
      name: module-enrichment/map

  - from:
      name: module-enrichment/map
    to:
      name: sink

For step-by-step deployment instructions and testing guidance for this example, see Example 2: Deploy a complex graph.

How the complex graph works

The complex graph processes three data streams and combines them into enriched sensor analytics:

Diagram showing a complex data flow graph example with multiple modules.

As shown in the diagram, data flows from a single source through multiple processing stages:

  1. Window module: Delays incoming data for time-based processing
  2. Branch operation: Routes data based on content type (sensor data vs. snapshots)
  3. Temperature processing path:
    • Converts Fahrenheit to Celsius
    • Filters invalid readings
    • Calculates statistical summaries over time windows
  4. Humidity processing path:
    • Accumulates humidity measurements with statistical analysis
  5. Image processing path:
    • Formats image data for processing
    • Performs object detection on camera snapshots
  6. Final aggregation:
    • Concatenates all processed data streams
    • Aggregates multi-sensor results
    • Adds metadata and overtemperature alerts

The graph uses specialized modules from the Rust examples:

  • Window module for time-based processing delays
  • Temperature modules for conversion, filtering, and statistical analysis
  • Humidity module for environmental data processing
  • Snapshot modules for image data routing and object detection
  • Format module for preparing image data for processing
  • Collection module for multi-sensor data aggregation
  • Enrichment module for metadata addition and alert generation

Branch operations enable parallel processing of different sensor inputs, allowing the graph to handle multiple data types efficiently within a single workflow.
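
To see how the branch arms split the workflow, you can enumerate every route from source to sink. The following sketch copies the connections from Example 2 into (from, arm, to) triples and walks them depth first. The helper names are hypothetical, and the sketch describes only the graph's topology, not how the runtime schedules operators.

```python
# (from, arm, to) triples copied from the Example 2 connections.
# arm is None for operations that aren't branches.
CONNECTIONS = [
    ("source", None, "module-window/delay"),
    ("module-window/delay", None, "module-snapshot/branch"),
    ("module-snapshot/branch", "False", "module-temperature/branch"),
    ("module-snapshot/branch", "True", "module-format/map"),
    ("module-format/map", None, "module-snapshot/map"),
    ("module-snapshot/map", None, "module-snapshot/accumulate"),
    ("module-snapshot/accumulate", None, "concatenate1"),
    ("module-temperature/branch", "True", "module-temperature/map"),
    ("module-temperature/branch", "False", "module-humidity/accumulate"),
    ("module-humidity/accumulate", None, "concatenate1"),
    ("module-temperature/map", None, "module-temperature/filter"),
    ("module-temperature/filter", None, "module-temperature/accumulate"),
    ("module-temperature/accumulate", None, "concatenate1"),
    ("concatenate1", None, "module-collection/accumulate"),
    ("module-collection/accumulate", None, "module-enrichment/map"),
    ("module-enrichment/map", None, "sink"),
]


def build_adjacency(connections):
    """Map each operation name to its outgoing (arm, target) pairs."""
    adjacency = {}
    for origin, arm, target in connections:
        adjacency.setdefault(origin, []).append((arm, target))
    return adjacency


def enumerate_paths(adjacency, start, end):
    """Return every cycle-free path from start to end, depth first."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end:
            paths.append(path)
            continue
        for _arm, target in adjacency.get(node, []):
            if target not in path:  # guard against accidental cycles
                stack.append(path + [target])
    return paths


paths = enumerate_paths(build_adjacency(CONNECTIONS), "source", "sink")
for path in paths:
    print(" -> ".join(path))
```

The walk finds three routes (snapshot, temperature, and humidity), and all of them pass through concatenate1 before reaching the sink.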

How graph definitions become data flows

Here's how graph definitions and Azure IoT Operations data flow graphs relate:

Your YAML file defines the internal processing logic, with source and sink operations acting as abstract endpoints. This file becomes the graph definition artifact. The modules it references implement the actual processing operators as WASM modules. Both graph definitions and WASM modules are uploaded to a container registry (such as Azure Container Registry) as OCI artifacts.

The Azure Resource Manager or Kubernetes resource "wraps" the graph definition and connects it to real endpoints as the data flow graph resource. During runtime deployment, the data flow engine pulls the artifacts from the registry and deploys them. For endpoint mapping, the abstract source/sink operations in your graph connect to actual MQTT topics, Azure Event Hubs, or other data sources.

This diagram illustrates the relationship between graph definitions, WASM modules, and data flow graphs:

Diagram showing the relationship between graph definitions, WASM modules, and data flow graphs.

Registry deployment

Both graph definitions and WASM modules must be uploaded to a container registry as Open Container Initiative (OCI) artifacts before data flow graphs can reference them:

  • Graph definitions are packaged as OCI artifacts with media type application/vnd.oci.image.config.v1+json
  • WASM modules are packaged as OCI artifacts containing the compiled WebAssembly binary
  • Use semantic versioning (such as my-graph:1.0.0, temperature-converter:2.1.0) for proper dependency management
  • Any OCI-compliant registry works, including Azure Container Registry and Docker Hub
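
References such as my-graph:1.0.0 and temperature:1.0.0 combine an artifact name with a semantic version tag. The following sketch splits a reference into those two parts and rejects tags that aren't in MAJOR.MINOR.PATCH form. The pattern and the function name split_reference are assumptions based on the examples in this article. They aren't the registry's or the runtime's actual parsing rules.

```python
import re

# Assumed shape: "<name>:<MAJOR>.<MINOR>.<PATCH>", as in the examples in this article.
REFERENCE_PATTERN = re.compile(
    r"^(?P<name>[a-z0-9][a-z0-9._/-]*):(?P<version>\d+\.\d+\.\d+)$"
)


def split_reference(reference):
    """Return (name, version) for a reference such as 'temperature:1.0.0'."""
    match = REFERENCE_PATTERN.match(reference)
    if match is None:
        raise ValueError(f"not a name:MAJOR.MINOR.PATCH reference: {reference!r}")
    return match.group("name"), match.group("version")


print(split_reference("temperature-converter:2.1.0"))
```

Pinning to an explicit version rather than a floating tag keeps a graph definition tied to the module build it was tested against.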

The separation enables reusable logic where the same graph definition deploys with different endpoints. It provides environment independence where development, staging, and production use different data sources. It also supports modular deployment where you update endpoint configurations without changing processing logic.

For detailed instructions on uploading graph definitions and WASM modules to registries, see Use WebAssembly with data flow graphs. For complete deployment workflows including registry setup, authentication, and testing, see the examples in that guide.

Module configuration parameters

Graph definitions can specify runtime parameters for WASM operators through module configurations:

moduleConfigurations:
  - name: my-operator/map
    parameters:
      threshold:
        name: temperature_threshold
        description: "Temperature threshold for filtering"
        required: true
      unit:
        name: output_unit
        description: "Output temperature unit"
        required: false

These parameters are passed to your WASM operator's init function at runtime, enabling dynamic configuration without rebuilding modules. For detailed examples of how to access and use these parameters in your Rust and Python code, see Module configuration parameters.

For a complete implementation example, see the branch module, which demonstrates parameter usage for conditional routing logic.
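
Before deployment, it can help to confirm that every parameter marked required has a value. The following sketch mirrors the moduleConfigurations entry above as a dictionary. It assumes that supplied values are keyed by each parameter's name field. That keying, the helper name missing_required, and the sample values are hypothetical and might not match how the runtime passes configuration to the init function.

```python
# Hypothetical in-memory form of the moduleConfigurations entry above.
declared_parameters = {
    "threshold": {"name": "temperature_threshold", "required": True},
    "unit": {"name": "output_unit", "required": False},
}


def missing_required(declared, supplied):
    """Return the names of required parameters that have no supplied value."""
    return sorted(
        spec["name"]
        for spec in declared.values()
        if spec.get("required", False) and spec["name"] not in supplied
    )


# Assumption: values are keyed by the parameter's name field.
print(missing_required(declared_parameters, {"output_unit": "C"}))
print(missing_required(declared_parameters, {"temperature_threshold": "30"}))
```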

Next steps