Rewrite File Access for Cloud Storage
Now that a system was in place for the web and worker roles to communicate with each other, the difference between cloud storage and local file access had to be addressed. The Monte Carlo simulation used a comma-delimited text data file of approximately 17,000 lines (512 KB) that was loaded into an array for processing. The simulation iterated through the array one row at a time with one set of randomized variables; each successive iteration used another random set of variables. However, the worker role could not access the local file system the way the standalone application could, hence the need to rewrite this code section for the cloud. Two options were considered:
- Embed the data file as a static VB resource
- Upload the file to the web role and then write it to BLOB storage, in which:
  - The web role would send a message to the worker role with the BLOB ID
  - The worker role would retrieve the file from BLOB storage
The second approach would be the most flexible for future implementations using different data files. Note that these were only “possibilities”. One Azure publication warned of “gotchas”, such as the inability of the Azure emulator to limit local functionality to what is available in the actual cloud environment (Krishnan, 2010). In other words, code that accessed the local file system or sent email with standard methods would run fine in the local emulator but would fail in the cloud.
Therefore, both approaches were coded and, by default, an attempt was made to load the data file as a VB resource. An operator-initiated action could start the second method of uploading the data file to the web role and using BLOB storage to store and retrieve it. The lab examples were helpful; however, they manipulated only binary files. Searching MSDN provided the answer: BLOB storage does support text files, and with the appropriate methods and properties the code was written. After debugging and testing both methods, everything could be tied together.
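For illustration, a minimal sketch of the two methods might look like the following, using the StorageClient library that shipped with the Azure SDK of that era. The container name, queue name, embedded resource name, and configuration setting name are placeholder assumptions, not the project's actual identifiers.

```vb
' Sketch only: "SimulationData", "simulationdata", "workerqueue", and
' "DataConnectionString" are illustrative names, not the project's actual ones.
Imports Microsoft.WindowsAzure
Imports Microsoft.WindowsAzure.ServiceRuntime
Imports Microsoft.WindowsAzure.StorageClient

Public Module DataFileAccess

    ' Default method: load the comma-delimited data file from an embedded VB resource.
    Public Function LoadFromResource() As String()
        Return My.Resources.SimulationData.Split(
            New String() {vbCrLf}, StringSplitOptions.RemoveEmptyEntries)
    End Function

    ' Alternate method, web role side: write the uploaded text file to BLOB storage
    ' and notify the worker role via a queue message carrying the BLOB ID.
    Public Sub PublishDataFile(ByVal fileText As String, ByVal blobName As String)
        Dim account = CloudStorageAccount.Parse(
            RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"))

        Dim container = account.CreateCloudBlobClient().GetContainerReference("simulationdata")
        container.CreateIfNotExist()
        container.GetBlobReference(blobName).UploadText(fileText)   ' BLOBs accept text, not just binary

        Dim queue = account.CreateCloudQueueClient().GetQueueReference("workerqueue")
        queue.CreateIfNotExist()
        queue.AddMessage(New CloudQueueMessage(blobName))           ' message carries the BLOB ID
    End Sub

    ' Alternate method, worker role side: retrieve the text file from BLOB storage.
    Public Function RetrieveDataFile(ByVal blobName As String) As String()
        Dim account = CloudStorageAccount.Parse(
            RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"))
        Dim container = account.CreateCloudBlobClient().GetContainerReference("simulationdata")
        Dim fileText = container.GetBlobReference(blobName).DownloadText()
        Return fileText.Split(New String() {vbCrLf}, StringSplitOptions.RemoveEmptyEntries)
    End Function

End Module
```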
Polishing the Application for the Cloud
As stated previously, this was a scientific application running Monte Carlo simulations on a dataset and producing a single final result. A typical operational scenario would therefore be to initialize n worker roles with different sets of parameters and have them start long-running simulations. Interim status and final results would be obtained by the web role polling the worker queue for messages and presenting them in a status list box. This status display was accomplished by selecting the Status command from the function list and clicking the Submit button. However, with 10 to 20 or more worker roles running, the status update needed to be automated.
A better approach would be to poll the worker message queue at a timed interval and repaint only the status list box. Visual Studio 2010 proved invaluable, as this was accomplished using a combination of three ASP.NET AJAX server controls: the ScriptManager control, the UpdatePanel control, and the Timer control. This function was made optional to give the user explicit control when needed.
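A minimal sketch of what that combination might look like follows; the control IDs, the 5-second interval, the queue name, and the Tick handler are illustrative assumptions rather than the project's actual code.

```aspx
<%-- Sketch only: IDs and interval are placeholders. The Timer starts disabled,
     matching the optional nature of the automated status update. --%>
<asp:ScriptManager ID="ScriptManager1" runat="server" />

<asp:Timer ID="StatusTimer" runat="server" Interval="5000"
           OnTick="StatusTimer_Tick" Enabled="false" />

<asp:UpdatePanel ID="StatusUpdatePanel" runat="server" UpdateMode="Conditional">
    <ContentTemplate>
        <%-- Only this list box is repainted on each tick. --%>
        <asp:ListBox ID="StatusListBox" runat="server" Rows="15" Width="600px" />
    </ContentTemplate>
    <Triggers>
        <asp:AsyncPostBackTrigger ControlID="StatusTimer" EventID="Tick" />
    </Triggers>
</asp:UpdatePanel>
```

```vb
' Sketch of the code-behind Tick handler (assumes Imports Microsoft.WindowsAzure,
' Microsoft.WindowsAzure.ServiceRuntime, and Microsoft.WindowsAzure.StorageClient):
' poll the worker status queue and append any new messages to the status list box.
Protected Sub StatusTimer_Tick(ByVal sender As Object, ByVal e As EventArgs)
    Dim account = CloudStorageAccount.Parse(
        RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"))
    Dim queue = account.CreateCloudQueueClient().GetQueueReference("statusqueue")

    Dim msg = queue.GetMessage()
    While msg IsNot Nothing
        StatusListBox.Items.Add(msg.AsString)   ' interim status or final result
        queue.DeleteMessage(msg)
        msg = queue.GetMessage()
    End While
End Sub
```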
Testing the Completed Azure App in the Local Emulator
The Windows Azure platform emulator, running under Visual Studio, allows testing locally in two stages: first with local code and local storage, then with local code and actual cloud storage. The first stage was completed without any notable issues.
In order to test code in the local Azure emulator using the cloud storage account, it was necessary to reconfigure both the web role and the worker role, specifically the properties on the Settings tab (when using Visual Studio 2010). The DataConnectionString and Diagnostics.ConnectionString settings needed to be changed from “UseDevelopmentStorage=true” to the storage credentials just obtained.
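In the ServiceConfiguration.cscfg file (which the Settings tab edits), the change amounts to replacing the development-storage shortcut with the real account credentials. A sketch follows; the account name and key are placeholders, and the exact setting names vary slightly by SDK version.

```xml
<!-- Before: both roles point at the local development storage emulator -->
<Setting name="DataConnectionString" value="UseDevelopmentStorage=true" />
<Setting name="Diagnostics.ConnectionString" value="UseDevelopmentStorage=true" />

<!-- After: both roles point at the actual cloud storage account (placeholder credentials) -->
<Setting name="DataConnectionString"
         value="DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=YOUR_ACCOUNT_KEY" />
<Setting name="Diagnostics.ConnectionString"
         value="DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=YOUR_ACCOUNT_KEY" />
```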
With the reconfiguration complete and the application rebuilt and restarted, the Azure application launched and ran. Since cloud storage latency was known to impact performance, the initial concern was only to verify that the application still functioned properly.
Deploying the Application to the Cloud
The final step in this test project was to actually move the application to the Azure cloud platform. Having tested the application thoroughly in the local cloud emulator, confidence was high that the objective of harnessing the power of the cloud was about to be achieved.
The Visual Studio 2010 Create Service Package Only option was used to package the Azure application, and the application files were then uploaded via the Windows Azure management portal. The application was deployed to the Azure staging area, which presented a globally unique identifier (GUID)-style URL for accessing the cloud application.
A mistake, but also a lesson learned, was that the application was configured for nine worker roles. The deployment time was directly proportional to the number of instances configured. Until a deployment is known to be solid, it is more expedient to configure the minimum number of roles needed for a preliminary test of the application.
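In the ServiceConfiguration.cscfg, the instance count is a single attribute per role, so scaling back for a preliminary test is a one-line change. A sketch, with the service and role names as placeholders:

```xml
<ServiceConfiguration serviceName="MonteCarloCloud"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WorkerRole1">
    <!-- Start with one instance; raise to nine only after a preliminary test succeeds. -->
    <Instances count="1" />
    <ConfigurationSettings>
      ...
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```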
The moment when the web role and nine worker roles turned from busy to ready, and finally green, was a significant milestone. When the application URL in the cloud was clicked, however, a long-spinning browser activity icon was followed by an error stating:
“Server Error – Unknown Error, Cannot display error details from a Remote Server”.
The default web page, which did not contain any Azure-specific code, did not appear.
The remoteness of the cloud was evident. The application was known to work fine in the local Azure emulator and, in fact, the ASP.NET application was running successfully on an external IIS web server, so the problem was not immediately apparent. It was a difficult problem because there was no detailed error message to explain the situation. At the end of the day, the decision was made to undeploy temporarily.
After searching the Internet, the conclusion was reached that the problem was really IIS- and ASP.NET-centric. A fix was found: web.config had to be modified to allow detailed error messages to be displayed to a remote client.
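The standard ASP.NET switch for this is the customErrors element in web.config; a sketch of the likely change follows (detailed errors should be exposed to remote clients only temporarily while diagnosing a deployment).

```xml
<configuration>
  <system.web>
    <!-- "Off" sends full error details to remote browsers; "RemoteOnly" is the safe default.
         Enable this only temporarily while troubleshooting a deployment. -->
    <customErrors mode="Off" />
  </system.web>
</configuration>
```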
The next day the application was rebuilt, redeployed, and tried again. After several attempts it was properly configured, finally producing the real error message:
“Default.aspx cannot be found or does not exist”.
After researching the problem and searching ASP.NET/IIS issues, the answer was found: in the three ASP.NET files copied from the ASP.NET application to the Azure application, “CodeFile” needed to be changed to “CodeBehind”. The root cause was related to how existing, working ASP.NET files are added to an Azure project.
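The change sits in the @ Page directive at the top of each copied .aspx file; the file and class names below are placeholders.

```aspx
<%-- As generated by an ASP.NET Web Site project (fails when added to the Azure project): --%>
<%@ Page Language="vb" AutoEventWireup="false" CodeFile="Default.aspx.vb" Inherits="_Default" %>

<%-- As required by the Azure (Web Application) project: --%>
<%@ Page Language="vb" AutoEventWireup="false" CodeBehind="Default.aspx.vb" Inherits="_Default" %>
```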
Table 1: Performance Metrics
| Application Type | Number of Simulations | Time (minutes:seconds) |
|------------------|-----------------------|-------------------------|
| Excel spreadsheet | 5,000 | 27:29 |
| VB standalone | 5,000 | 08:05 |
| ASP.NET | 5,000 | 02:55 |
| Azure emulator | 5,000 | 04:23 |
| Azure Cloud, Extra Small, 1 worker | 5,000 | 03:30 |
| Azure Cloud, Extra Small, 5 workers | 5 × 5,000 = 25,000 | 03:25 |
| Azure Cloud, Extra Small, 9 workers | 9 × 5,000 = 45,000 | 03:15 |
| Azure Cloud, Extra Small, 9 workers | 9 × 25,000 = 225,000 | 16:17 |
| Azure Cloud, Extra Small, 9 workers | 9 × 50,000 = 450,000 | 34:52 |
At last, the web page launched. Local testing in the Azure emulator paid off: the application worked properly and as intended. Its performance, as shown in Table 1, was as expected: each worker role that was added performed its simulations in the same amount of time, so the more workers that were added, the more work was accomplished.