Automating Local AI Inference with Azure Local Foundry and PowerShell

Cloud-based AI inference costs are rising, and privacy regulations like GDPR and the EU Cloud Act make local processing increasingly attractive. Azure Local Foundry offers a powerful CLI and SDK to run models directly on your machine, leveraging NPU accelerators. This guide explains how to automate Local Foundry with PowerShell by interacting with its REST API, enabling scripted, privacy-preserving AI workflows without cloud dependencies.

Why choose local AI models over cloud APIs?

Cloud APIs from providers like OpenAI, Anthropic, and Microsoft are raising prices and tightening token quotas. For simple tasks like formatting or summarizing, using a frontier model wastes money. Additionally, regulations such as GDPR or the upcoming EU Cloud Act may require that sensitive data never leave your perimeter. Local models solve both issues: they run inference on your hardware (including NPUs), eliminate ongoing token costs, and keep data fully under your control. Tools like Ollama and Azure Local Foundry make this practical.


What is Azure Local Foundry and how do I install it?

Azure Local Foundry is a toolkit that runs AI models locally using your machine's AI accelerator (NPU). It ships as an SDK (Windows, Linux, macOS) and as a CLI (Windows, macOS preview). Installation is straightforward: on Windows, run winget install Microsoft.FoundryLocal; on macOS, use brew install microsoft/foundrylocal/foundrylocal. Once installed, the CLI lets you list, download, load, and run models. For example, foundry model run deepseek-r1-14b handles all steps in one command. The service starts automatically when needed.
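Based on the commands mentioned above, a minimal install-and-run session looks like this (the model name deepseek-r1-14b is the example from this guide; any model from foundry model list works):

```powershell
# Install Foundry Local on Windows
winget install Microsoft.FoundryLocal

# ...or on macOS (preview)
# brew install microsoft/foundrylocal/foundrylocal

# Download, load, and start chatting with a model in one step;
# the background service is started automatically if needed
foundry model run deepseek-r1-14b
```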

How do I manage models with the Foundry CLI?

The CLI provides four key commands: foundry model list shows available models with details; download pulls a model into your local cache; load puts the model into the running service; and run does everything in one step (download, load, inference). For quick tests, foundry model run <model-name> is ideal. While the CLI is interactive, automation requires the REST API that the Foundry service exposes while it is running.
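The four commands side by side (using phi-3-mini-128k, the model loaded later in this guide, as a placeholder):

```powershell
foundry model list                       # list available models with details
foundry model download phi-3-mini-128k   # pull the model into the local cache
foundry model load phi-3-mini-128k       # load the cached model into the running service
foundry model run phi-3-mini-128k        # download + load + interactive inference in one step
```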

Can I use PowerShell to automate Local Foundry?

Yes. Because PowerShell lacks a direct SDK binding, the best approach is to use the REST API exposed by the Foundry service. This requires three prerequisites: the service must be running, a model must be loaded, and you need the service's REST API URI. The CLI's foundry service status returns the URI once the service is up. You can automate the entire flow: start the service if it's stopped, load a model, extract the base URI, then make API calls for inference or model management.

How do I start the Foundry service and get its REST API URI in PowerShell?

First, define a function to check service status: function Get-FoundryServiceStatus { return & foundry service status }. If the output contains 'service is not running', start it with & foundry service start | Out-Null. Next, load a model (e.g., & foundry model load phi-3-mini-128k | Out-Null) so the service exposes its URI. Parse the status output with a regex: $pattern = 'https?://[^\s"]+', then $uri = [regex]::Match($status, $pattern).Value. Remove the trailing /openai/status to get the base URL: $uri = $uri -replace '/openai/status$',''. Store this URI for subsequent API calls.
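Putting those steps together, a sketch of the bootstrap script (the 'service is not running' match string is taken from the CLI output described above and may differ across Foundry versions):

```powershell
function Get-FoundryServiceStatus {
    # Returns the raw text output of the Foundry status command
    return & foundry service status
}

# Start the service if it is not already running
$status = Get-FoundryServiceStatus
if ($status -match 'service is not running') {
    & foundry service start | Out-Null
}

# Load a model so the service exposes its endpoint
& foundry model load phi-3-mini-128k | Out-Null

# Extract the base REST API URI from the status output
$status  = Get-FoundryServiceStatus
$pattern = 'https?://[^\s"]+'
$uri     = [regex]::Match($status, $pattern).Value
$uri     = $uri -replace '/openai/status$',''
$uri     # e.g. the base URL used for all subsequent API calls
```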


How do I invoke the Foundry REST API from PowerShell?

Create a reusable function Invoke-FoundryRequest that takes method, base URL, path, optional headers, and body. Build the full URI as $FoundryBaseUrl$Path, then use Invoke-RestMethod with the parameters. For example, to get a completion, use POST /chat/completions with a JSON body containing the model name and messages. Remember to set the Content-Type header to application/json. This function lets you run any inference, list models, or perform other operations programmatically from PowerShell scripts, enabling scheduled tasks or integration with other tools.
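A sketch of such a helper, assuming $uri holds the base URL extracted earlier and phi-3-mini-128k is the loaded model (the exact completions path may vary by Foundry version; /v1/chat/completions is the common OpenAI-compatible form of the /chat/completions endpoint described above):

```powershell
function Invoke-FoundryRequest {
    param(
        [Parameter(Mandatory)] [string]    $Method,
        [Parameter(Mandatory)] [string]    $FoundryBaseUrl,
        [Parameter(Mandatory)] [string]    $Path,
        [hashtable] $Headers = @{ 'Content-Type' = 'application/json' },
        [string]    $Body
    )
    # Build the full URI from the base URL and endpoint path
    $requestUri = "$FoundryBaseUrl$Path"
    Invoke-RestMethod -Method $Method -Uri $requestUri -Headers $Headers -Body $Body
}

# Example: request a chat completion from the loaded model
$body = @{
    model    = 'phi-3-mini-128k'
    messages = @(@{ role = 'user'; content = 'Summarize this text: ...' })
} | ConvertTo-Json -Depth 5

$response = Invoke-FoundryRequest -Method Post -FoundryBaseUrl $uri `
    -Path '/v1/chat/completions' -Body $body
$response.choices[0].message.content
</imports></imports>
```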

What are the benefits of using the REST API over the CLI?

The CLI is great for interactive exploration and quick tests. However, for automation—such as batch processing, integration into CI/CD pipelines, or embedding inference in larger scripts—the REST API is far more flexible. It allows you to pass input dynamically, handle errors gracefully, combine with other PowerShell modules, and run headlessly. You can also load multiple models and switch between them without interactive input. The API follows OpenAI-compatible endpoints, which reduces the learning curve. Ultimately, the REST API unlocks the full potential of Local Foundry for production-like scripting scenarios.
