Control Azure Data Lake costs using Log Analytics to create service alerts

General Technology, 2018-05-23

Azure Data Lake customers use the Data Lake Store and Data Lake Analytics to store and run complex analytics on massive amounts of data. However, it can be challenging to manage costs, keep up to date with activity in the accounts, and know ahead of time when usage is approaching a limit. Using Log Analytics with Azure Data Lake, we can address these challenges and know when costs are increasing or when certain activities take place.

In this post, you will learn how to use Log Analytics with your Data Lake accounts to create alerts that can notify you of Data Lake activity events and when certain usage thresholds are reached. It is easy to get started!

Step 1: Connect Azure Data Lake and Log Analytics

Data Lake accounts can be configured to generate diagnostic logs. Some are generated automatically (e.g. regular Data Lake operations such as reporting current usage, or whenever a job completes). Others are generated in response to requests (e.g. when a file is created or opened, or when a job is submitted). Both Data Lake Analytics and Data Lake Store can be configured to send these diagnostic logs to a Log Analytics account, where we can query the logs and create alerts based on the query results.

To send diagnostic logs to a Log Analytics account, follow the steps outlined in the blog post "Struggling to get insights for your Azure Data Lake Store? Azure Log Analytics can help!"

Step 2: Create a query that can identify a specific event or aggregated threshold

Key questions about the state or usage of your Azure Data Lake account can generally be answered with a query that parses usage or metric logs. To query the logs in Log Analytics, go to the account home (OMS Workspace) and click Log Search.

In the Log Search blade, you can start typing queries using the Log Analytics query language.

There are two main types of queries that can be used in Log Analytics to configure alerts:

  • Queries that return individual events. Each event appears as a single row in the results (e.g. one row every time a file is opened).
  • Queries that aggregate over a specific window of time to produce a value that can be compared against a threshold. They aggregate either single events (e.g. 10 files opened in the past five minutes) or the values of a metric (e.g. total AUs assigned to jobs).

Here are some sample queries. The first two return events, while the third aggregates values:

  • This query returns a new entry every time a new Data Lake Store folder is created in the specified Azure Data Lake Store (ADLS) account:
AzureDiagnostics
| where Category == "Requests"
| where ResourceProvider == "MICROSOFT.DATALAKESTORE"
| where Resource == "[Your ADLS Account Name]"
| where OperationName == "mkdirs"
  • This query returns a new entry every time a job fails in any of the Data Lake Analytics accounts configured to the Log Analytics workspace:
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATALAKEANALYTICS"
| where OperationName == "JobEnded"
| where ResultType == "CompletedFailure"
  • This query returns, for each 24-hour interval, the user accounts that submitted jobs and the number of jobs each account submitted in that interval:
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATALAKEANALYTICS"
| where OperationName == "SubmitJob"
| summarize AggregatedValue = count(identity_s) by bin(TimeGenerated, 24h), identity_s
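
To make the grouping in the third query concrete, here is a minimal Python sketch of what binning by 24 hours and counting per user amounts to. It assumes bins are aligned to midnight UTC and simply counts rows; the sample rows, the user names, and the helper functions are hypothetical and are not part of Log Analytics.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bin_time(ts, size):
    """Round a timestamp down to a multiple of `size`, in the spirit of bin(TimeGenerated, 24h)."""
    return ts - ((ts - EPOCH) % size)


def jobs_per_user(rows, size=timedelta(hours=24)):
    """Count SubmitJob rows per (time bin, identity_s) pair."""
    counts = Counter()
    for row in rows:
        if row["OperationName"] == "SubmitJob":
            counts[(bin_time(row["TimeGenerated"], size), row["identity_s"])] += 1
    return counts


def utc(day, hour, minute=0):
    """Shorthand for a UTC timestamp in May 2018 (sample data only)."""
    return datetime(2018, 5, day, hour, minute, tzinfo=timezone.utc)


# Hypothetical sample rows; column names follow the query above.
SAMPLE = [
    {"TimeGenerated": utc(22, 9), "OperationName": "SubmitJob", "identity_s": "alice@contoso.com"},
    {"TimeGenerated": utc(22, 23, 30), "OperationName": "SubmitJob", "identity_s": "alice@contoso.com"},
    {"TimeGenerated": utc(23, 0, 15), "OperationName": "SubmitJob", "identity_s": "alice@contoso.com"},
    {"TimeGenerated": utc(22, 12), "OperationName": "SubmitJob", "identity_s": "bob@contoso.com"},
    {"TimeGenerated": utc(22, 13), "OperationName": "JobEnded", "identity_s": "bob@contoso.com"},
]

if __name__ == "__main__":
    for (day, user), n in sorted(jobs_per_user(SAMPLE).items()):
        print(day.date(), user, n)
```

Each (interval, user) pair becomes one result row, which is why the same user can appear once per day in the output.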

Queries like these will be used in the next step when configuring alerts.

Step 3: Create an alert to be notified when the event is detected or when the threshold is reached

Using a query such as those shown in the previous step, Log Analytics can create an alert that notifies users via e-mail, text message, or webhook when the event is captured or the metric threshold is reached. To create a new alert, see the blog post "Simple Trick to Stay on top of your Azure Data Lake: Create Alerts using Log Analytics".
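
As a rough mental model of a threshold alert, the sketch below checks aggregated rows against a limit and builds a message for each breach. It is an illustration under stated assumptions, not the alerting service's actual logic: the sample rows, the threshold of 20, and both function names are hypothetical.

```python
# Hypothetical aggregated rows, shaped like the output of the third sample query.
AGGREGATED = [
    {"identity_s": "alice@contoso.com", "AggregatedValue": 42},
    {"identity_s": "bob@contoso.com", "AggregatedValue": 7},
]


def breaches(rows, threshold):
    """Return the rows whose AggregatedValue is above the threshold."""
    return [row for row in rows if row["AggregatedValue"] > threshold]


def notification_text(row, threshold):
    """Build a plain-text message; the wording is illustrative only."""
    return (
        f"{row['identity_s']} submitted {row['AggregatedValue']} jobs "
        f"(threshold: {threshold})"
    )


if __name__ == "__main__":
    for row in breaches(AGGREGATED, threshold=20):
        print(notification_text(row, 20))
```

In the real service, the threshold and the notification channels are configured on the alert itself rather than in code.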

Please note that alerts are slightly delayed. For details on the delays and the Log Analytics SLAs, see "Understanding alerts in Log Analytics".

Tell us what you think

Setting up alerts in Log Analytics can help you understand usage and manage costs as utilization increases. The alert setup process is flexible enough to adapt to your specific needs. Are you looking for a specific metric or usage activity? Reach out and let us know in the comments, or on our feature requests UserVoice. Check out the Azure Data Lake blog, where we regularly share updates and tips on how to get the most out of your Azure Data Lake accounts.

Microsoft Azure Blog
