This post was written by Gretchen Loihle and Andrew Richards in the Windows Reliability/Windows Error Reporting team.

Welcome to our second post to inform Windows Store app developers about the most common failures being seen in the Windows Error Reporting (WER) telemetry across multiple Windows Store apps, and provide recommendations for how to avoid them. We invite you to read the introductory article for a more thorough discussion about WER and the availability of failure telemetry through the Store dashboard.

In this segment we touch on new failures in a familiar area and introduce some new issues. In particular, we discuss JSON object deserialization, managing unexpected HTTP events, and updating Live Tiles.

Before we begin, note that Microsoft has released the code for .NET 4.5.1 under the Microsoft Reference Source License (MS-RSL), and made it available to browse and download at http://ReferenceSource.microsoft.com. We’ll occasionally reference the affected code, so please feel free to examine the functions shown in the call stacks to see some of the error origins.

JSON parsing failures

Newtonsoft.Json.dll!Newtonsoft.Json.JsonTextReader.ParseValue

The talented engineers contributing to the Newtonsoft JSON framework at Json.Net have produced a free framework for transmitting objects using the JSON (JavaScript Object Notation) model. Because these libraries are free, open source, and simple to use, they have become extremely popular, and we see them increasingly bundled with many Windows Store apps. Unfortunately, these libraries often take the blame for failures that occur when handling corrupted or invalid objects, even when they are not at fault. The call stack below shows one example of this failure, where an invalid sequence of characters was read while deserializing an object.

A review of a large number of crash dumps from the WER telemetry shows that most of the failures occur while deserializing objects, often in background tasks. This reinforces the belief that any content delivered from a remote source must be treated with suspicion, and apps should always protect themselves against invalid, unexpected, and corrupted content. Not surprisingly, the recommendation for this failure is to wrap calls that use the Newtonsoft JSON libraries with try/catch blocks, both in app code and background tasks. A decision can then be made on whether or not the app should continue, based on how crucial the objects are to app execution.
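As a minimal sketch of that recommendation, the helper below wraps a Json.NET deserialization call so that malformed content yields a failure result instead of an unhandled exception. The `SafeJson` class, the `TryDeserialize` method, and the `WeatherReport` type are hypothetical names used only for illustration. The sketch assumes a Json.NET release in which the reader and serialization exceptions derive from `JsonException`.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical payload type, for illustration only.
public sealed class WeatherReport
{
    public string City { get; set; }
    public double TemperatureC { get; set; }
}

public static class SafeJson
{
    // Returns true and the parsed object when the text deserializes to T.
    // Returns false, instead of throwing, when the content is empty or malformed.
    public static bool TryDeserialize<T>(string json, out T result) where T : class
    {
        result = null;
        if (string.IsNullOrWhiteSpace(json))
        {
            return false;
        }

        try
        {
            result = JsonConvert.DeserializeObject<T>(json);
            return result != null;
        }
        catch (JsonException)
        {
            // Assumption: JsonReaderException and JsonSerializationException
            // both derive from JsonException in the Json.NET version in use.
            return false;
        }
    }
}
```

A background task could call `SafeJson.TryDeserialize(payload, out report)` and simply skip the update when it returns false, while foreground code might show an error or retry, depending on how crucial the object is.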

HTTP request failures

The next two failures are seen while executing HTTP requests. While these are somewhat similar to the System.Net.Http.HttpClientHandler.GetResponseCallback errors discussed in the previous article (caused by name resolution failures), the underlying causes here are a bit different. Like the JSON failures, these errors are seen most frequently in background tasks.

mscorlib.dll!System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd

These failures occur when the app has not implemented a task cancellation handler. The exception type and message on the outer exception, together with the information in the inner exception, show that the web request has actually been cancelled.

The call stack below shows both exceptions, followed by the matching code from ReferenceSource that shows why the exception was thrown. Note that System.Net.HttpWebRequest.EndGetResponse throws the first exception when it detects that EndCalled has already been set on the async web result. This was set earlier as part of cancelling the web request.

From http://referencesource.microsoft.com/#System/net/System/Net/HttpWebRequest.cs#10e37e4a3f52a590

System.Net.Http.HttpClientHandler handles the inner exception (as it is supposed to), and lets cancellation and cleanup continue for the task. This ultimately calls into the app’s task, where the second (outer) exception occurs because the app’s task is not catching the System.Threading.Tasks.TaskCanceledException exception. The outer exception fails in mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd when checking to see if the task is wait-notification enabled, or if it has not been run to completion. If the app implemented a handler for System.Threading.Tasks.TaskCanceledException, execution would never reach this code.

The recommendation here is to expect that any asynchronous activity can be cancelled at any time, and handle the appropriate cancellation exceptions that might be passed.
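One way this might look in practice is sketched below. The `Downloader` class and `TryGetStringAsync` method are hypothetical names. The sketch handles cancellation only: other failures, such as `HttpRequestException`, are deliberately left unhandled here and would still propagate to the caller.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class Downloader
{
    // Returns the response body, or null if the request was cancelled
    // (either through the token or because the HttpClient timeout elapsed).
    public static async Task<string> TryGetStringAsync(
        HttpClient client, Uri uri, CancellationToken token)
    {
        try
        {
            using (HttpResponseMessage response = await client.GetAsync(uri, token))
            {
                return await response.Content.ReadAsStringAsync();
            }
        }
        catch (OperationCanceledException)
        {
            // TaskCanceledException derives from OperationCanceledException,
            // so this handler covers both.
            return null;
        }
    }
}
```

Catching the base `OperationCanceledException` rather than only `TaskCanceledException` is a design choice in this sketch, so that a single handler covers every cancellation path. A caller treats a null result as "nothing to do this time" and lets the background task complete normally.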

System.dll!System.Net.Sockets.Socket.EndConnect

This failure is something of a catch-all for a variety of networking-related issues, so there is no specific recommendation for resolution. But the underlying causes serve as a good demonstration of the variety of environments the app might be running in, and show why defensive programming is so important.

In this case, the inner exception shows that a managed network socket call is surfacing an error returned from the lower network layers when trying to connect to a network endpoint.

While this specific example implies a conflict between socket use and access permissions, examination of a large set of dumps from the WER telemetry shows a variety of error messages being returned:

“No connection could be made because the target machine actively refused it”

“An attempt was made to access a socket in a way forbidden by its access permissions”

“A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond”

These errors can be caused by a number of things in the user’s environment, most of them beyond the developer’s control. For example:

  • Firewall settings
  • Antivirus blocking ports
  • The port is already in use
  • The port number is incorrect
  • The machine has too many open connections
  • Other issues in the user environment (restricted network access settings, router configuration, etc.)

It’s important to recognize that your app may run in a huge variety of networking environments, and to handle network connection failures gracefully. For example, a polite message stating that the network is unavailable at the moment leaves a professional impression on the user. If the connection is sufficiently important to the app, you may even be able to offer a list of suggestions or troubleshooting items to check.
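A hedged sketch of this kind of graceful handling follows. `FetchResult` and `ResilientFetcher` are hypothetical types invented for the example, and the user-facing message text is only a placeholder. The sketch assumes the app makes its requests through `HttpClient`, which reports connection failures as `HttpRequestException` with the lower-level error carried in the inner exception chain. Timeouts and cancellation surface separately and are not handled here.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical result type: carries either the body or a message for the user.
public sealed class FetchResult
{
    public bool Succeeded { get; set; }
    public string Body { get; set; }
    public string UserMessage { get; set; }
    public string Diagnostic { get; set; }
}

public static class ResilientFetcher
{
    public static async Task<FetchResult> FetchAsync(HttpClient client, Uri uri)
    {
        try
        {
            string body = await client.GetStringAsync(uri);
            return new FetchResult { Succeeded = true, Body = body };
        }
        catch (HttpRequestException ex)
        {
            // Connection refused, blocked ports, unresponsive hosts, and
            // non-success status codes all arrive here.
            return new FetchResult
            {
                Succeeded = false,
                UserMessage = "The network is unavailable at the moment. Please try again later.",
                Diagnostic = InnermostMessage(ex)
            };
        }
    }

    // Walks the InnerException chain to find the lowest-level error text,
    // which is useful for logging but usually too technical to show users.
    private static string InnermostMessage(Exception ex)
    {
        while (ex.InnerException != null)
        {
            ex = ex.InnerException;
        }
        return ex.Message;
    }
}
```

Keeping the friendly message separate from the diagnostic text lets the app show one to the user and log the other.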

Live Tile failures

Windows.UI.dll!Windows.UI.Notifications.TileUpdateManager.CreateTileUpdaterForApplication

Windows.UI.dll!Windows.UI.Notifications.TileUpdateManager.CreateTileUpdaterForSecondaryTile

Many apps use tile updates to convey updates and status changes. Tile updates are often implemented as background tasks that run on a scheduled basis, or when some triggering event occurs. This failure occurs when an app (or background task) attempts to update a tile using the underlying notification system. The failure indicates that the system was unable to obtain a valid application identifier.

We’ve found two scenarios that cause this failure. The first occurs during app development, when the app is run in the Visual Studio simulator and attempts to update tiles. The recommendation is to run the app under the Local Machine setting, as seen below.

image

Secondly, this failure can occur when the underlying notification platform is not available on the user’s machine. If the notification platform has encountered an issue that caused it to terminate, it causes tile notification and updating to fail as well. The call to TileUpdateManager.CreateTileUpdaterForApplication normally retrieves the package full name, creates a notification endpoint, and performs package and app name validation for the notification subsystem. Problems with either of the last two steps can cause “The application identifier provided is invalid” to be returned, generating this exception.

This failure can also occur when using secondary tiles and calling TileUpdateManager.CreateTileUpdaterForSecondaryTile to change the content or appearance of a secondary tile. Make sure the secondary tile is being referenced correctly with the appropriate tileID, and that the TileDisplayAttributes are set correctly. Please see the forum post, Can’t update secondary tiles, for more information.

In either case, the presence of the notification system on the customer’s machine is outside the control of the app. If notifications aren’t available, inadvertently using the notification interfaces generates an exception that should be handled by the app. It’s recommended to wrap the call to CreateTileUpdaterForApplication in a try/catch block and allow the app to keep running. For most Windows Store apps, a temporary inability to update tiles should not be a fatal condition.
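The wrapper might look something like the sketch below. `TileHelper` and `TryUpdateTile` are hypothetical names, and the tile XML is assumed to be built elsewhere by the app. Catching the general `Exception` type is an assumption made for this sketch, on the basis that WinRT failures of this kind are generally projected into .NET as plain exceptions carrying an HRESULT rather than a dedicated exception type.

```csharp
using System;
using Windows.Data.Xml.Dom;
using Windows.UI.Notifications;

public static class TileHelper
{
    // Attempts to update the app's main tile. Returns false, instead of
    // crashing the app or background task, if the notification platform
    // is unavailable or rejects the application identifier.
    public static bool TryUpdateTile(XmlDocument tileXml)
    {
        try
        {
            TileUpdater updater = TileUpdateManager.CreateTileUpdaterForApplication();
            updater.Update(new TileNotification(tileXml));
            return true;
        }
        catch (Exception)
        {
            // Assumption: a failed tile update is not fatal for this app,
            // so the error is swallowed and the update is retried later.
            return false;
        }
    }
}
```

The same pattern would apply to `CreateTileUpdaterForSecondaryTile`, with the tile ID passed through as an extra parameter.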

Link: Understanding and resolving app crashes and failures in Windows Store apps, part 2: JSON and tiles