Approaching performance issues

Performance issues can be due to many factors, and to identify the root cause, you may need to involve multiple groups within the organization. This makes it challenging and, sometimes, even political. Also, performance issues are complex and hard to reproduce. Therefore, it's important to understand the issue clearly, set priorities, and get the appropriate people involved for the analysis.

Understanding the issue

The very first step is understanding the issue. "We are having performance issues!" is a very broad statement. You need to identify all the symptoms, and these symptoms may help you define the course of action. It's important to ask the right questions, such as:

  • How many users are affected and in what areas of the business?
  • Is this a general performance issue or related to specific processes?
  • Is there a pattern for the issue like particular users and/or times of the day?
  • Can it be recreated in a test environment? If not, can it consistently be recreated in the production environment?
  • What are the expected results, such as duration, concurrent users, and so on?

Planning and defining the analysis strategy

When you have enough details and an understanding of the issue, it's time to formulate an action plan and an analysis strategy.

  • Set priorities. Performance issues that affect end-user productivity should usually be resolved before a nightly batch job that is taking longer than expected (unless the nightly job is impacting the business SLAs or slowing the business down).
  • You may need resources from different teams, or you may need to source an external/contract resource. Coordinate and find the right people in the team.
  • Identify appropriate performance monitoring tools and install and configure them in the affected environment to collect the data.

Based on the issues defined in the investigation strategy, the following diagram can be a good starting point on where to look:

[Figure: Planning and defining the analysis strategy]

Corrective action and review

A solution to a performance issue could be as small as changing a parameter setting, or so complex that it requires design or code changes. The following are a few tips on implementing the corrective actions:

  • If the problem is not just limited to a specific process, a quick validation of the environment setup and configuration is a good place to start.
  • Validate the SQL Server and AOS configurations against the recommended settings. Some issues can be resolved with simple fixes such as rebuilding indexes, updating the statistics on the affected tables, or setting the max degree of parallelism (MAXDOP) option in SQL Server to 1.
  • Performance tuning is an iterative process; try one tuning at a time and verify the result.
  • Minimum effort, maximum result: When analyzing a performance issue, you may discover many factors that contribute to it. Start with the one that requires the least effort and gives the greatest result.
  • You should also know when to stop tuning a particular scenario and move on. Remember the law of diminishing returns: each further iteration of performance tuning tends to yield a smaller improvement than the one before it.
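The last two tips can be sketched in a few lines of Python. This is only an illustration: the `Candidate` class, the candidate names, the effort and gain estimates, and the 5 percent stop threshold are all assumptions made up for the example, not values from any real analysis.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    effort_days: float        # estimated effort (hypothetical figure)
    expected_gain_pct: float  # estimated improvement (hypothetical figure)


def rank_candidates(candidates):
    """Order candidates by expected gain per day of effort, best first."""
    return sorted(
        candidates,
        key=lambda c: c.expected_gain_pct / c.effort_days,
        reverse=True,
    )


def should_stop(measured_gains_pct, threshold_pct=5.0):
    """True when the most recent tuning iteration gained less than the threshold."""
    return bool(measured_gains_pct) and measured_gains_pct[-1] < threshold_pct


candidates = [
    Candidate("Rewrite posting logic", effort_days=10, expected_gain_pct=40),
    Candidate("Update statistics on affected tables", effort_days=0.5, expected_gain_pct=15),
    Candidate("Add missing index", effort_days=1, expected_gain_pct=20),
]

ranked = rank_candidates(candidates)
for candidate in ranked:
    print(candidate.name)
```

Applying one candidate at a time and recording the measured gain after each gives `should_stop` the history it needs; the threshold itself is a judgment call for each scenario.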

General scenarios and investigation strategies

The following sections define a few scenarios from my experience to help in brainstorming the identification of the root cause for the performance issues.

Issue 1

The entire company reports slowness issues. The performance is getting worse day by day.

Investigation:

  • Check the application for the following:
    • The number of concurrent users connected
    • The batch jobs running at the moment
  • Check the AOS utilization of:
    • The CPU
    • Memory
  • Check the DB for symptoms like:
    • CPU, memory, and IO
    • Any blockage
    • Long running queries
    • Index statistics not being up-to-date

Root cause: The investigation found that no index maintenance had been put in place. The DBA used the DynamicsPerf tool and observed bad execution plans and several long-running queries.

Solution: Reindexing and defragmentation of the indexes resolved the issue. Index maintenance was put into place to avoid such issues in the future.
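A routine index maintenance job typically decides, per index, whether to reorganize or rebuild. The following Python sketch shows one way to make that decision from fragmentation figures such as those returned by the SQL Server view `sys.dm_db_index_physical_stats`. The 5 and 30 percent thresholds and the 1,000-page minimum are commonly cited starting points rather than values from this case, and the table and index names are hypothetical.

```python
def maintenance_action(avg_fragmentation_pct, page_count,
                       min_pages=1000, reorganize_at=5.0, rebuild_at=30.0):
    """Return 'REBUILD', 'REORGANIZE', or None for a single index.

    Thresholds are assumed starting points; tune them for your environment.
    """
    if page_count < min_pages or avg_fragmentation_pct < reorganize_at:
        return None  # too small or not fragmented enough to matter
    if avg_fragmentation_pct >= rebuild_at:
        return "REBUILD"
    return "REORGANIZE"


def build_statements(index_stats):
    """Turn fragmentation rows into ALTER INDEX statements for review."""
    statements = []
    for row in index_stats:
        action = maintenance_action(row["avg_fragmentation_pct"], row["page_count"])
        if action:
            statements.append(
                f"ALTER INDEX [{row['index']}] ON [{row['table']}] {action};"
            )
    return statements


# Hypothetical sample rows; real values would come from the database.
sample_stats = [
    {"table": "SALESLINE", "index": "I_EXAMPLE_A", "avg_fragmentation_pct": 62.0, "page_count": 48000},
    {"table": "INVENTTRANS", "index": "I_EXAMPLE_B", "avg_fragmentation_pct": 12.5, "page_count": 90000},
    {"table": "CUSTTABLE", "index": "I_EXAMPLE_C", "avg_fragmentation_pct": 80.0, "page_count": 40},
]

statements = build_statements(sample_stats)
for statement in statements:
    print(statement)
```

The sketch only generates statements; reviewing them and running them in a maintenance window is left to the DBA.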

Issue 2

Operations in all warehouses are slow.

Investigation:

  • Check the network for connectivity, bandwidth, and latency issues
  • Check SQL Server for blocking, CPU, and memory utilization

Root cause: Testing the network connectivity revealed that the bandwidth between the warehouse locations and the headquarters was limited.

Solution: The AX AOS configuration was updated to enable the sending of smaller data packets. This option is available under Microsoft Dynamics AX 2012 Server configuration/performance/minimum packet size to compress (in KBs). For more details, visit the TechNet article at https://technet.microsoft.com/en-us/library/aa569624.aspx.

Issue 3

Operations at specific locations are slow.

Investigation:

  • Check the network connectivity, bandwidth, and latency issues
  • If RDP or a Citrix layer is used, check the resources on the RDP and Citrix servers

Root cause: It was found that the RDP Server's CPU and memory utilization was very high. The RDP Server was under-provisioned for its load.

Solution: Upgrading the resources on the RDP Server resolved the issue.

Issue 4

Printing in the warehouses is slow.

Investigation:

  • Check the drivers on the printer
  • Check the bandwidth and latency
  • Check the resources on the print server

Root cause: An outdated driver on the printer.

Solution: Updating the printer driver resolved the issue.

Issue 5

The business users are experiencing performance issues when creating the PO invoices. The PO invoice form takes several minutes to open. The same behavior is observed in other environments with the same dataset.

Investigation: Since this is limited to one specific process, we used the Trace Parser tool to generate a trace for the invoice posting processes with specific datasets. It was observed that there are hundreds of receipts for each purchase order, and the system by default matches all the receipts against a new invoice. However, as per the business process, the customer usually gets an invoice for only a few receipts. The invoicing clerk was facing two problems: first, he was waiting for minutes for the invoice form to open, and then he had to deselect all the receipts and select the individual ones he needed.

Root cause: The code and business logic did not match the business process.

Solution: We created a new button on the purchase order to open the invoice form without matching any receipt. This enabled the invoice form to open within a fraction of a second. An additional index was added to improve the query performance during the posting process.

Issue 6

The nightly job that generates the file output for the e-commerce solution (a custom process) is taking several hours to finish when the data set is large.

Investigation:

  • Check the memory and CPU utilization on the batch server
  • Check the blocking processes when the batch process is running
  • Check if there are enough batch threads available for all the batch tasks
  • Check if we can utilize the regular AOS during the night for extra threads

Root cause: We found that the process used multiple nested while loops to look for different information, such as product, product dimension, trade agreement, and inventory on-hand, and then combined them in staging to generate the final file. The issue was too many database calls.

Solution: A development resource was assigned for investigation and performance tuning at the code level. The nested while loops were replaced with joins and set-based operations. The updated code was tested with a large set of data. The performance improved from 6-7 hours to under 30 minutes.
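The difference between row-by-row lookups and a set-based query can be illustrated outside of Dynamics AX. The following Python sketch uses an in-memory SQLite database with two made-up tables (`product` and `on_hand`); it is not the customer's code, only a minimal model of the same pattern. Both functions return the same rows, but the first issues one query per product while the second issues a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE on_hand (product_id INTEGER, qty INTEGER);
    INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget'), (3, 'Gizmo');
    INSERT INTO on_hand VALUES (1, 10), (1, 5), (2, 7);
""")


def export_row_by_row(conn):
    """Nested lookups: the number of database calls grows with the data."""
    rows, calls = [], 0
    products = conn.execute("SELECT id, name FROM product ORDER BY id").fetchall()
    calls += 1
    for product_id, name in products:
        qty = conn.execute(
            "SELECT COALESCE(SUM(qty), 0) FROM on_hand WHERE product_id = ?",
            (product_id,),
        ).fetchone()[0]
        calls += 1
        rows.append((name, qty))
    return rows, calls


def export_set_based(conn):
    """One joined, aggregated query: a single database call."""
    query = """
        SELECT p.name, COALESCE(SUM(o.qty), 0)
        FROM product AS p
        LEFT JOIN on_hand AS o ON o.product_id = p.id
        GROUP BY p.id, p.name
        ORDER BY p.id
    """
    return conn.execute(query).fetchall(), 1


loop_rows, loop_calls = export_row_by_row(conn)
set_rows, set_calls = export_set_based(conn)
print(loop_calls, set_calls)
```

With three products the loop version makes four calls; with hundreds of thousands of products, and several nested lookups per product, the call count is what dominates the run time.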

Issue 7

Users are getting kicked out (AOS is restarting).

Investigation:

  • Check whether a specific user's action triggers the restart (for example, every time the user tries to confirm an order, custom code goes into an infinite loop, the system reaches 100 percent memory utilization, and the AOS restarts)
  • Check the AOS server event log
  • Utilize the Windows memory dump from the AOS server if the crash happens frequently
  • Check if your AOS has the latest binary updates

Root cause: After analysis, it was found that the installed AOS server did not have the latest binary updates.

Solution: Installing the latest kernel version on the AOS server resolved the issue.

Issue 8

The system is slow at 6 p.m. every day.

Investigation:

  • Check the scheduled backups or maintenance activities running at this time
  • Check the CPU and memory utilization on the AOS and database servers
  • Check blocking at the database server
  • Check if you have any Dynamics AX batch processes running at 6 p.m.
  • Check whether an antivirus scan is running on the servers
  • Check for any network issues caused by massive data transfer (unrelated to Dynamics AX)

Root cause: An antivirus scan was scheduled to run every day at 6 p.m. causing high utilization of the memory and the CPU.

Solution: The antivirus scan was rescheduled to a later time, after business hours.
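A time-of-day pattern like this one is easier to confirm with data than with anecdotes. The following Python sketch groups response-time samples by hour and flags the hours whose average is well above the overall median. The sample values, the `slow_hours` function name, and the factor of 2 are assumptions chosen for illustration; real samples might come from monitoring logs or user-reported timings.

```python
from collections import defaultdict
from statistics import mean, median


def slow_hours(samples, factor=2.0):
    """Return hours whose mean response time exceeds factor x the overall median.

    samples: list of (hour_of_day, response_seconds) tuples.
    """
    by_hour = defaultdict(list)
    for hour, seconds in samples:
        by_hour[hour].append(seconds)
    baseline = median(seconds for _, seconds in samples)
    return sorted(
        hour for hour, values in by_hour.items()
        if mean(values) > factor * baseline
    )


# Hypothetical measurements: (hour of day, seconds to open a form)
samples = [
    (9, 1.0), (9, 1.2),
    (12, 0.9),
    (15, 1.1),
    (18, 6.0), (18, 5.8),
]

flagged = slow_hours(samples)
print(flagged)
```

Once an hour stands out, the checklist above narrows to whatever is scheduled at that time: backups, batch jobs, antivirus scans, or large data transfers.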
