C#

Using app settings when configuring function bindings

For example, how to manage the queue name of a queue-triggered function using the app settings

Mark the app setting name with percent signs, i.e. %queue_name%, as seen below.

[FunctionName("QueueTriggeredFunc")]
public static void Run(
    [QueueTrigger("%queue_name%")] string queueItem)
{
    //...
}

or using function.json:

{
  "bindings": [
    {
      "name": "order",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "%queue_name%"
    }
  ]
}

– via Microsoft Docs

How to connect to Azure Data Lake Storage Gen 2 using a Service Principal

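Assuming the service principal already exists and has access to the storage account, you need to use an instance of DefaultAzureCredential. It reads the service principal details from the AZURE_TENANT_ID, AZURE_CLIENT_ID and AZURE_CLIENT_SECRET environment variables.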

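// Requires: using Azure.Identity; using Azure.Storage.Files.DataLake;
var credentials = new DefaultAzureCredential();
// The token-credential constructor takes a Uri, not a string
var serviceClient = new DataLakeServiceClient(new Uri("https://<your_storage_account_name>.dfs.core.windows.net/"), credentials);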

– via Microsoft Docs

Retrying failed executions

With retry policies (all triggers)

Use either fixedDelay or exponentialBackoff, as documented here and here.

With maxDequeueCount (for queue triggers)

Use maxDequeueCount as documented here.

System.Net.Sockets.SocketException in function apps

Try setting the environment variable DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER to 0, as documented here.

– via Stack Overflow and GitHub

Configuring a queue-triggered function app to execute messages one at a time

Set batchSize to 1, and newBatchThreshold to 0. Per the docs, “the maximum number of concurrent messages being processed per function is batchSize plus newBatchThreshold. This limit applies separately to each queue-triggered function.”

{
  "extensions": {
    "queues": {
      "batchSize": 1,
      "newBatchThreshold": 0
    }
  },
  ...

To get the same behavior when testing the function locally, put the override shown below in your local.settings.json, as documented here and explained here.

The same approach works when you need to override host.json settings from the function’s Configuration blade.

{
  "Values": {
    "AzureFunctionsJobHost__extensions__queues__batchSize": 1,
    "AzureFunctionsJobHost__extensions__queues__newBatchThreshold": 0
  }
}
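
The naming rule behind these overrides is mechanical: take the path of the setting inside host.json, join the keys with double underscores, and prefix the result with AzureFunctionsJobHost. Here is a minimal Python sketch of that rule, in case you have more than a couple of settings to convert. The helper name host_json_to_app_settings is made up for illustration, and it assumes you pass in only the sections you want to override (not the top-level "version" key).

```python
from typing import Any, Dict

PREFIX = "AzureFunctionsJobHost"


def host_json_to_app_settings(section: Dict[str, Any], prefix: str = PREFIX) -> Dict[str, str]:
    """Flatten a nested host.json fragment into app-setting names joined by '__'."""
    settings: Dict[str, str] = {}
    for key, value in section.items():
        name = f"{prefix}__{key}"
        if isinstance(value, dict):
            # Recurse into nested sections, extending the name with each key.
            settings.update(host_json_to_app_settings(value, name))
        elif isinstance(value, bool):
            # JSON booleans are lowercase; check bool before falling back to str().
            settings[name] = "true" if value else "false"
        else:
            settings[name] = str(value)
    return settings


# Hypothetical input: the queue settings from the tip above.
overrides = host_json_to_app_settings(
    {"extensions": {"queues": {"batchSize": 1, "newBatchThreshold": 0}}}
)
for name, value in overrides.items():
    print(f"{name}={value}")
```

The values come out as strings because that is how app settings are stored in the portal.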

Improve performance for queue-triggered functions ingesting lots of messages

Instead of messing around with batchSize and newBatchThreshold, use dynamicConcurrencyEnabled and snapshotPersistenceEnabled. The first one makes the runtime keep increasing concurrency until the VM can’t take any more. The second one persists the learned concurrency values to storage, so that new instances start from those values instead of from scratch.

{ 
  "version": "2.0", 
  "concurrency": { 
    "dynamicConcurrencyEnabled": true, 
    "snapshotPersistenceEnabled": true 
  } 
} 

And if you just want to set them in the function’s Configuration blade:

  {
    "name": "AzureFunctionsJobHost__concurrency__dynamicConcurrencyEnabled",
    "value": "true",
    "slotSetting": false
  },
  {
    "name": "AzureFunctionsJobHost__concurrency__snapshotPersistenceEnabled",
    "value": "true",
    "slotSetting": false
  }

You will need Microsoft.Azure.WebJobs.Extensions.Storage.Queues v5.x to use this. See the next tip for fixing a still-undocumented behavior change in that version.

– via MS Learn

Fixing startup errors in queue-triggered function apps after upgrading to Microsoft.Azure.WebJobs.Extensions.Storage.Queues 5.x from 4.x

For example System.InvalidOperationException: Can't bind parameter 'dequeueCount' to type 'System.Int32'.

This happens because they changed the type of the DequeueCount parameter from Int32 to Int64 (long) 😕, as (un)documented in this bug right here.

To fix this, simply change the type of the dequeueCount parameter in your function signature from int to long.

Disable Application Insights sampling from the Configuration

{
  "name": "AzureFunctionsJobHost__logging__applicationInsights__samplingSettings__isEnabled",
  "value": "false",
  "slotSetting": false
}

Update a queue-triggered function to set a non-default time-to-live when moving messages to the -poison queue

You’ll need to override the default QueueProcessorFactory and QueueProcessor with your own implementations, as suggested here.

Sample IQueueProcessorFactory registration and implementation: https://github.com/Azure/azure-webjobs-sdk/blob/ed4ff86f527178bfb73c27e90793ea955c793841/test/Microsoft.Azure.WebJobs.Host.EndToEndTests/AsyncChainEndToEndTests.cs#L220

Ideally you would register your own implementation of IQueueProcessorFactory, which would return an instance of your custom TtlQueueProcessor that inherits from the original QueueProcessor. TtlQueueProcessor should override CopyMessageToPoisonQueueAsync (protected) and instead of poisonQueue.AddMessageAndCreateIfNotExistsAsync it should use an overload of AddMessageAsync that supports setting the TTL.

Default implementations:

Monitor execution count of queue-triggered functions

In the Azure Portal, open the Function App, then open your function. See the Total Execution Count chart, and click it to open it in Application Insights.

Alternatively, go to your Application Insights instance, Metrics blade, and pick Metric Namespace = Log-based metrics and Metric = Your_Function_Name Count. Aggregation should be Sum, of course.

Queue-triggered Azure functions would sometimes not execute (dashed line in the Metrics monitor), even though the queue had plenty of items

Man, it took some time to figure this out. The function would stop executing sometime during the evening (not always at the same time, mind you), and resume the next day, always at 5:15am. No auto-scaling settings, no Free tier limitations, no exceptions and/or traces logged in Application Insights, nothing.

Google didn’t help. GPT-4 didn’t help. I didn’t even get to DuckDuckGo.

What helped was that, in a call with a colleague, I randomly decided to explain the issue to him and tried showing him the activity log for the app service.

Only I had the Application Insights instance open and clicked its own Activity Log by mistake. I had never thought to look there.

And there they were, to my surprise: a series of messages along the lines of Application Insights component daily cap reached this, and Application Insights component daily cap warning threshold reached that, posted at the same hours my app service went down for the night.

Our app was logging too much, and was shut down automatically every time it reached a certain threshold (don’t ask me about the costs 🥶). We remembered setting samplingSettings\isEnabled to false some time ago, to debug some issue. We never re-enabled it. And now it was back to haunt us. 👻

I’ve just enabled sampling, and pushed the changes. Tonight should be a good night, filled with lots and lots of processed messages.

And remember, friends don’t let friends disable sampling in AppInsights.

Control the time-to-live for messages that get sent to the poison queue for queue-triggered functions

Create a custom IQueueProcessorFactory implementation, register it as a Singleton, and use it to return your own implementation of QueueProcessor.

Your very own MyQueueProcessor should override CopyMessageToPoisonQueueAsync, to specifically replace poisonQueue.AddMessageAndCreateIfNotExistsAsync with code that sets timeToLive when calling queue.SendMessageAsync. That’s pretty much it.

I’ve gotten the general idea from here, and used the following references:

Python

Issues when deploying functions with func azure functionapp publish <APP_NAME>

Such as Unable to connect to Azure. Make sure you have the az CLI or Az.Accounts PowerShell module installed and logged in and try again.

Make sure to run az login before running the command.

Deployment successful, No HTTP triggers found for Python functions v2

Aka The operation was a success, but the patient died.

No error messages, obviously. Everything runs fine locally, in a conda environment specially created for the function, using the same requirements.txt.

Utterly frustrating to debug. Such a poor development experience. Here’s what I’ve tried:

  1. Make sure your Azure function has the "AzureWebJobsFeatureFlags":"EnableWorkerIndexing" config key present.
  2. While you’re there, make sure all the configuration keys from local.settings.json have been copied to the function’s Configuration.
  3. If this still doesn’t work, move all your imports inside the main function, so they only get evaluated when main is run, and not when function_app.py is imported.

Before:

import azure.functions as func

from src.mypckg import foo
# ... other imports

app = func.FunctionApp()

@app.function_name('myfunction')
async def main(...):
    ...
    

After:

import azure.functions as func

app = func.FunctionApp()

@app.function_name('myfunction')
async def main(...):
    # YOUR IMPORTS HERE <--
    from src.mypckg import foo
    # ... other imports
    ...

Basically, if I were to generalize, it looks like any sort of runtime error within your function_app.py will cause a) the deployment to succeed and b) your triggers to be ignored.
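
To see why moving the imports helps, here is a small, self-contained Python sketch of the import semantics involved. It only uses the standard library and does not talk to Azure at all: the two module sources and the name a_module_that_does_not_exist_12345 are made up for illustration, and "loading the module" merely stands in for whatever the worker does when it indexes function_app.py (that part is my reading of the issue, not something I have verified in the worker code).

```python
import importlib.util
import tempfile
from pathlib import Path


def load(path: Path):
    """Execute the module at `path`, which also runs its top-level imports."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Same function body, but the failing import sits in two different places.
TOP_LEVEL = "import a_module_that_does_not_exist_12345\n\ndef main():\n    return 'ok'\n"
DEFERRED = "def main():\n    import a_module_that_does_not_exist_12345\n    return 'ok'\n"

results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, source in (("top_level", TOP_LEVEL), ("deferred", DEFERRED)):
        path = Path(tmp) / f"{name}_app.py"
        path.write_text(source)
        try:
            module = load(path)
            results[name] = "loaded"
        except ImportError as exc:
            # The module never finishes loading, so nothing in it can be discovered.
            results[name] = f"load failed: {type(exc).__name__}"
            continue
        try:
            module.main()
        except ImportError as exc:
            # The module loaded fine; the error only shows up when main() runs.
            results[name] += f", call failed: {type(exc).__name__}"

for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

With the import at the top, the module never loads. With the import inside main, the module loads and the failure moves to invocation time, where it at least ends up in the logs.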

– via this GitHub Issue

Using Playwright with Python Azure Functions

This was interesting to get working. In the end, I stumbled upon this GitHub repo of Anthony Chu, with very clear, working instructions. Which I’ve changed just a little bit to make things easier:

  • Make sure you’re referencing playwright in your requirements.txt
  • Make sure you’re using the default remote build when deploying (func azure functionapp publish $function)
  • Add two app settings to your function:
    • PLAYWRIGHT_BROWSERS_PATH = /home/site/wwwroot
    • POST_BUILD_COMMAND = PYTHON_EXECUTABLE=$(find /tmp/oryx/platforms/python -regex '.*/bin/python[0-9.]+$' -type f | sort -V | tail -n 1) && echo $PYTHON_EXECUTABLE && export PYTHONPATH=/tmp/zipdeploy/extracted/.python_packages/lib/site-packages && $PYTHON_EXECUTABLE -m playwright install
      • This is basically looking for the latest Python version available on the machine, and using it to install Playwright. I’ve only tested it on 3.10, so your mileage may vary
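
If the POST_BUILD_COMMAND one-liner looks cryptic, the selection step (find, then sort -V, then tail -n 1) can be sketched in Python as follows. This is only an approximation of what sort -V does, and the paths in the example are hypothetical, not copied from a real build machine.

```python
import re
from typing import Iterable, List, Optional, Union

# Same pattern as the find -regex argument: any path ending in bin/python<digits and dots>
PYTHON_BINARY = re.compile(r".*/bin/python[0-9.]+")


def version_key(path: str) -> List[Union[int, str]]:
    """Split a path into text and integer chunks so that 3.10 sorts after 3.9."""
    return [int(chunk) if chunk.isdigit() else chunk for chunk in re.split(r"(\d+)", path)]


def latest_python(paths: Iterable[str]) -> Optional[str]:
    """Return the highest-versioned Python binary, or None if nothing matches."""
    candidates = [p for p in paths if PYTHON_BINARY.fullmatch(p)]
    return max(candidates, key=version_key, default=None)


# Hypothetical directory listing
found = [
    "/tmp/oryx/platforms/python/3.9.18/bin/python3.9",
    "/tmp/oryx/platforms/python/3.10.4/bin/python3.10",
    "/tmp/oryx/platforms/python/3.10.4/bin/python3",
    "/tmp/oryx/platforms/python/3.10.4/bin/pip3.10",
]
print(latest_python(found))
```

A plain alphabetical sort would put 3.9 after 3.10, which is why the shell command uses sort -V and the sketch compares the numeric chunks as integers.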

Here’s an (older) blog post as well – https://anthonychu.ca/post/azure-functions-headless-chromium-puppeteer-playwright/

I’ve also seen Ceruleoscope and was planning to look at it further even though it’s for Node but in the end, well, there was no need.

Getting started with Python Azure Functions on an Apple Silicon Mac

brew tap azure/functions
brew install azure-functions-core-tools@4

Make sure to use Python 3.9, not 3.10 as the latter isn’t supported by Azure functions.

channels:  
  - conda-forge  
dependencies:  
  - python=3.9.*  
  - pip>=20.*  
  - pip:  
      - -r ./requirements.txt

conda env create -f environment.yml -n whatever
conda activate whatever
func init LocalFunctionProj --python -m V2

Either create the function app, or update an existing one to be httpsOnly.

# either
az functionapp create
# or
az functionapp update --set httpsOnly=true --resource-group rg-whatever --name fn-whatever

Change the function app’s Python version from 3.10 to 3.9, or risk getting: “Local python version ‘3.9.18’ is different from the version expected for your deployed Function App. This may result in ‘ModuleNotFound’ errors in Azure Functions. Please create a Python Function App for version 3.9 or change the virtual environment on your local machine to match ‘PYTHON|3.10’.”

az functionapp config set --name <FUNCTION_APP> --resource-group <RESOURCE_GROUP>  --linux-fx-version "PYTHON|3.9"
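
If you want to catch the mismatch before publishing, a quick check along these lines might help. It is a minimal sketch: the helper names are made up, and it assumes you have already fetched the app's linuxFxVersion value yourself (for example from the output of az functionapp config show) and pass it in as a string.

```python
import sys
from typing import Tuple


def parse_linux_fx_version(value: str) -> Tuple[int, int]:
    """Turn a linuxFxVersion string such as 'PYTHON|3.9' into (3, 9)."""
    runtime, _, version = value.partition("|")
    if runtime.upper() != "PYTHON" or not version:
        raise ValueError(f"not a Python linuxFxVersion: {value!r}")
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def matches_local(value: str, local: Tuple[int, int] = (sys.version_info[0], sys.version_info[1])) -> bool:
    """Compare the deployed runtime version with the local interpreter (major.minor only)."""
    return parse_linux_fx_version(value) == local


# Explicit local versions here, so the output does not depend on your interpreter.
print(matches_local("PYTHON|3.9", (3, 9)))
print(matches_local("PYTHON|3.10", (3, 9)))
```

Called with just the linuxFxVersion string, matches_local compares against the interpreter running the script, so run it inside the same conda environment you use for the function.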

Via SO, some good info on MS Learn as well.

CORS

Info here and here.

Locally, just add this to your local.settings.json

"Host": {
  "CORS": "*"
}

Performance

Async is a must – https://learn.microsoft.com/en-us/azure/azure-functions/python-scale-performance-reference#async
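
A tiny illustration of what async buys you, using nothing but asyncio. The fake_io coroutine is a stand-in for a real awaitable I/O call (an HTTP request, a database query) and the delays are arbitrary; none of this is Azure-specific.

```python
import asyncio
import time


async def fake_io(delay: float) -> float:
    """Stand-in for an awaitable I/O call; yields control while 'waiting'."""
    await asyncio.sleep(delay)
    return delay


async def handle_request() -> float:
    """Await three I/O calls concurrently and return the elapsed time in seconds."""
    start = time.perf_counter()
    await asyncio.gather(fake_io(0.2), fake_io(0.2), fake_io(0.2))
    return time.perf_counter() - start


elapsed = asyncio.run(handle_request())
print(f"three 0.2s waits finished in {elapsed:.2f}s")
```

The three waits overlap, so the total is close to the longest single wait rather than the sum. The same calls made with blocking code (time.sleep, or a synchronous HTTP client) would hold up everything else sharing the event loop.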

OpenTelemetry, OpenLLMetry, Tracing

OpenLLMetry calls get duplicated, plus I get a lot of noisy traces when instrumenting an Azure Function.

Some interesting links:

Other, maybe less interesting links:

  1. https://learn.microsoft.com/en-us/azure/azure-functions/opentelemetry-howto?tabs=app-insights&pivots=programming-language-python
  2. https://github.com/Azure/azure-sdk-for-python/issues/29672
  3. https://github.com/Azure/azure-sdk-for-python/issues/31292
  4. https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-add-modify?tabs=python
  5. https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-configuration?tabs=python
  6. https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=python

Set

OTEL_PYTHON_REQUESTS_EXCLUDED_URLS=".*.in.applicationinsights.azure.com/.*,.*.documents.azure.com/.*"

And configure your telemetry as follows

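import logging

from azure.monitor.opentelemetry import configure_azure_monitor
from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
from traceloop.sdk import Traceloop

# `config` is your own settings module holding the connection string
exporter = AzureMonitorTraceExporter(connection_string=config.APPLICATIONINSIGHTS_CONNECTION_STRING)

configure_azure_monitor(connection_string=config.APPLICATIONINSIGHTS_CONNECTION_STRING)
# Quiet the chatty HTTP pipeline and exporter loggers
logging.getLogger("azure.core.pipeline.policies.http_logging_policy").setLevel(logging.WARNING)
logging.getLogger("azure.monitor.opentelemetry.exporter.export").setLevel(logging.WARNING)

Traceloop.init(app_name="MyApp",
               exporter=exporter,
               disable_batch=True)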
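
About the OTEL_PYTHON_REQUESTS_EXCLUDED_URLS value above: as far as I understand it, the instrumentation splits the value on commas and treats each piece as a regular expression that is searched for in the request URL. Here is a standard-library sketch of that matching, handy for checking a pattern before deploying it. The helper names and the sample URLs are made up, and the real implementation may differ in its details.

```python
import re
from typing import List

EXCLUDED = ".*.in.applicationinsights.azure.com/.*,.*.documents.azure.com/.*"


def parse_excluded_urls(value: str) -> List[str]:
    """Split the comma-separated setting into individual regex patterns."""
    return [pattern.strip() for pattern in value.split(",") if pattern.strip()]


def is_excluded(url: str, patterns: List[str]) -> bool:
    """True if any pattern is found in the URL."""
    return any(re.search(pattern, url) for pattern in patterns)


patterns = parse_excluded_urls(EXCLUDED)

# Hypothetical URLs: telemetry ingestion, Cosmos DB, and a call worth tracing
for url in (
    "https://westeurope-5.in.applicationinsights.azure.com/v2.1/track",
    "https://myaccount.documents.azure.com/dbs/mydb",
    "https://api.example.com/v1/chat/completions",
):
    print(url, "excluded" if is_excluded(url, patterns) else "traced")
```

Note that the dots in these patterns are unescaped, so they match any character. That is harmless here, but worth remembering if you write stricter patterns.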

Some code that may be worth exploring as well


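from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from traceloop.sdk import Traceloop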

# Initialize TracerProvider
trace_provider = TracerProvider()

# Create Azure Monitor Trace Exporter
exporter = AzureMonitorTraceExporter(connection_string=config.APPLICATIONINSIGHTS_CONNECTION_STRING)

# Create and add BatchSpanProcessor with the trace exporter
span_processor = BatchSpanProcessor(exporter)
trace_provider.add_span_processor(span_processor)

# Set the global tracer provider
trace.set_tracer_provider(trace_provider)

Traceloop.init(app_name="MyApp",
               exporter=exporter, processor=span_processor,
               disable_batch=True)

# Get a tracer
tracer = trace.get_tracer_provider().get_tracer(__name__)

# Example of creating a span
with tracer.start_as_current_span("hijinks"):
    # Your traced code here
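    pass

from typing import Optional

from opentelemetry.context import Context
from opentelemetry.sdk.trace import Span, SpanProcessor
from opentelemetry.trace import SpanContext, SpanKind, TraceFlags


class SpanFilteringProcessor(SpanProcessor):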
    """
    Developed as per the [docs](https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-add-modify?tabs=python#filter-telemetry),
    should in theory prevent exporting spans from internal activities.
    However, when setting ALL spans as DEFAULT, the traces (but not the dependencies) still reach AppInsights.
    Kept for future reference.

    Usage:

    ```
    exporter = AzureMonitorTraceExporter(connection_string=config.APPLICATIONINSIGHTS_CONNECTION_STRING)
    proc = SpanFilteringProcessor()


    configure_azure_monitor(instrumentation_options={"azure_sdk": {"enabled": False}},
                            disable_logging=True, disable_tracing=True, disable_metrics=True,
                            enable_live_metrics=False, disable_azure_core_tracing=True, span_processors=[proc])


    Traceloop.init(app_name="MyApp",
                   exporter=exporter, processor=proc,
                   disable_batch=False)
    ```
    """

    def on_start(self, span: "Span",
                 parent_context: Optional[Context] = None):
        print(f"::::::::{span._kind}")
        print(f"\n{span._context}")
        print(f"\n{parent_context}")

        # Check if the span is an internal activity.
        if span._kind is SpanKind.INTERNAL:
            # Create a new span context with the following properties:
            #   * The trace ID is the same as the trace ID of the original span.
            #   * The span ID is the same as the span ID of the original span.
            #   * The is_remote property is carried over from the original span.
            #   * The trace flags are set to `DEFAULT`.
            #   * The trace state is the same as the trace state of the original span.
            span._context = SpanContext(
                span.context.trace_id,
                span.context.span_id,
                span.context.is_remote,
                TraceFlags(TraceFlags.DEFAULT),
                span.context.trace_state,
            )