Storage Account for Azure Stream Analytics Monitoring

Azure Stream Analytics (ASA) is a Platform-as-a-Service (PaaS) offering for consuming streaming data from different types of inputs. It reads the data stream, processes it and then sends the results to an output. The inputs and outputs can be other Azure services such as IoT Hub, Event Hubs, Service Bus queues and topics, Blob storage, Power BI, etc. This service is recommended in IoT scenarios to ingest the stream of data from device to cloud (D2C) and to build hot- and cold-path analytics.

To check the throughput of a heavily loaded IoT or transactional system, a performance load test is required to find the Streaming Unit (SU) % utilization and ascertain how ASA jobs scale up under load. By default, Stream Analytics comes with key metrics such as input events, output events and SU % utilization, and displays them on the Monitor tab of the Azure portal page. These metrics help us check the performance of the ASA job.
The metrics data can also be used to monitor the Stream Analytics job programmatically. Under the hood, this metrics data is stored in table storage, in the storage account specified at the time of creating the Stream Analytics job. When the ASA job runs, the execution engine internally logs the metrics data into five different tables. It is worth mentioning that there is a certain delay (roughly 5 to 10 minutes) before the data reaches table storage. The naming convention of the tables is given below:

  • WADMETRICSP1DP10DV2S20150819
  • WADMETRICSPT1HP10DV2S20150819
  • WADMETRICSPT1MP10DV2S20150819
  • WADMETRICSPT30MP10DV2S20150819
  • WADMETRICSPT6HP10DV2S20150819
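
Before querying, it can be handy to confirm which of these tables actually exist in the storage account. The snippet below is a minimal sketch, assuming the classic Azure Storage client library (Microsoft.WindowsAzure.Storage) and the same storage connection string placeholder used in the monitoring code later in this post:

//* Requires the Microsoft.WindowsAzure.Storage and Microsoft.WindowsAzure.Storage.Table namespaces
var storageAccount = CloudStorageAccount.Parse("<storage connection string>");
var tableClient = storageAccount.CreateCloudTableClient();
//* List every metrics table the ASA execution engine has created so far
foreach (CloudTable table in tableClient.ListTables("WADMetrics")) {
 Console.WriteLine(table.Name);
}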

It is important to understand the naming rule, because the table names follow a certain pattern. The details are given below:

  • P1D – Data is aggregated over 1 day
  • PT1H – Data is aggregated over 1 hour
  • PT1M – Data is aggregated over 1 minute
  • PT30M – Data is aggregated over 30 minutes
  • PT6H – Data is aggregated over 6 hours

The date suffix format is YYYYMMDD, and new tables are created at intervals of 10 days based on the start date. If your ASA job starts on 09-Jan-2015, the table name suffixes would be 20150109, 20150119, 20150129 and so on.
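
As a rough illustration of the convention, a small helper (hypothetical, not part of any SDK) can compute the expected table name for a given aggregation interval and date, assuming the WADMetrics prefix, the P10DV2S segment and the 10-day rollover described above:

//* Hypothetical helper: builds the expected table name from the aggregation
//* interval (P1D, PT1H, PT1M, PT30M, PT6H), the job start date and the target date
static string GetMetricsTableName(string interval, DateTime jobStartDate, DateTime targetDate) {
 //* Step forward from the start date in 10-day blocks to the block containing the target date
 int daysSinceStart = (int)(targetDate.Date - jobStartDate.Date).TotalDays;
 DateTime tableDate = jobStartDate.Date.AddDays((daysSinceStart / 10) * 10);
 return "WADMetrics" + interval + "P10DV2S" + tableDate.ToString("yyyyMMdd");
}

//* Example: GetMetricsTableName("PT1M", new DateTime(2015, 11, 17), new DateTime(2015, 11, 19))
//* returns "WADMetricsPT1MP10DV2S20151117"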
Once the data is stored, the next step is to consume the metrics data; let's see how it can be retrieved from table storage.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

namespace ASAMetricsForMonitoring {
 class Program {
  static void Main(string[] args) {
   //* Refer the storage account from storage connection string
   var storageAccount = CloudStorageAccount.Parse("<storage connection string>");
   //* Instantiate a table storage client
   var tableClient = storageAccount.CreateCloudTableClient();
   //* Reference the particular table.
   //* Please consider the time interval and date format to identify the table name
   //* Following statement would give the data at the interval of 1 minute
   CloudTable MetricsTable = tableClient.GetTableReference("WADMetricsPT1MP10DV2S20151117");
   //* Select the partition accordingly
   string TargetPartition = ":002Fsubscriptions:002F78541c1a:002D5011:002D4beb:002Db21f:002D7fa41cba0743:002FresourceGroups:002Floadtest:002Fproviders:002FMicrosoft:002EStreamAnalytics:002Fstreamingjobs:002Floadtest:005FProcessEvent";

   //* select the target row accordingly
   string TargetRow = "InputEvents__";
   //string TargetRow = "OutputEvents__";
   //string TargetRow = "ResourceUtilization__";

   while (true) {
    //* Consider the start time to search the data
    DateTime TimeStart = TimeZoneInfo.ConvertTimeToUtc(new DateTime(2015, 11, 19, 19, 30, 0));

    //* Begin building the filter for the query with the PartitionKey
    string Filter = TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, TargetPartition);

    //* Add the TimeStamp https://msdn.microsoft.com/en-us/library/azure/microsoft.windowsazure.storage.table.tablequery.generatefilterconditionfordate.aspx
    Filter = TableQuery.CombineFilters(Filter, TableOperators.And,
     TableQuery.GenerateFilterConditionForDate("Timestamp", QueryComparisons.GreaterThan, TimeStart));
    //* Add the 1 hour end time from start time
    Filter = TableQuery.CombineFilters(Filter, TableOperators.And,
     TableQuery.GenerateFilterConditionForDate("Timestamp", QueryComparisons.LessThan, TimeStart.AddMinutes(60)));

    //* We can't filter on the start of a string, but we can use lt and gt to get all rows starting with InputEvents__
    Filter = TableQuery.CombineFilters(Filter, TableOperators.And,
     TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.GreaterThan, TargetRow));
    Filter = TableQuery.CombineFilters(Filter, TableOperators.And,
     TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.LessThan, TargetRow + "A"));

    //* Just add the Total and Timestamp columns and build up the query
     TableQuery InputEventsQuery = new TableQuery()
      .Select(new string[] { "TimeStamp", "Total" })
      .Where(Filter);

    //* Just add the Average and Timestamp columns and build up the query for SU % Utilization
    //* TableQuery InputEventsQuery = new TableQuery().Select(new string[] { "TimeStamp", "Average" })
    //* .Where(Filter);

    //* Clear the console window
    Console.Clear();

     IList<EntityData> _entities = new List<EntityData>();

     foreach (DynamicTableEntity entity in MetricsTable.ExecuteQuery(InputEventsQuery)) {
     EntityProperty Total;
     entity.Properties.TryGetValue("Total", out Total);
     //* Use Average for SU Utilization
     // entity.Properties.TryGetValue("Average", out Total);
     _entities.Add(new EntityData() {
      TimeStamp = entity.Timestamp, Total = Total.DoubleValue, RowKey = entity.RowKey
     });
    }

     var _entlist = from _ent in _entities
                    orderby _ent.TimeStamp ascending
                    select _ent;

    //* Loop through the results and display the Totals with the timestamps
     foreach (EntityData ent in _entlist) {
     Console.WriteLine("|{0}|{1}|{2}", ent.TimeStamp, ent.Total, ent.RowKey);
    }

     //* Pause before the next refresh so table storage is not queried in a tight loop
     Thread.Sleep(TimeSpan.FromSeconds(30));
    }
  }
 }

 class EntityData {
  public DateTimeOffset TimeStamp { get; set; }
  public double? Total { get; set; }
  public string RowKey { get; set; }
 }
}

This code displays the metrics data in a tabular format in the console window. The data can then be consumed by any data visualization tool (Power BI, D3.js, etc.) and charted to understand the behavior of the ASA jobs. Remember to switch the TargetRow value (InputEvents__, OutputEvents__ or ResourceUtilization__) as per your need, and to query the Average column instead of Total for SU % utilization, as shown in the commented-out lines in the code.
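
As a simple hand-off to such tools, the same result set can also be written to a CSV file instead of the console. The snippet below is a minimal sketch, assuming the _entlist collection from the code above; the output path is hypothetical and should be adjusted as required:

//* Sketch: export the queried metrics to a CSV file that Power BI or D3.js can import
using (var writer = new System.IO.StreamWriter(@"C:\temp\asa-metrics.csv")) {
 writer.WriteLine("TimeStamp,Total,RowKey");
 foreach (EntityData ent in _entlist) {
  writer.WriteLine("{0:o},{1},{2}", ent.TimeStamp, ent.Total, ent.RowKey);
 }
}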