This post is primarily aimed at non-technical readers and those responsible for coordinating or managing these issues, with links drilling through to more in-depth blogs containing further details.
By 'general performance', I mean a set of unidentified performance issues across one or more modules, or indeed the entire application.
This is intended to provide a suggested approach drawing on experience, to get you started in what can be a challenging area.
General performance issues can often be highly political and complex. Because of the number of potential factors and root causes (it is often a combination of factors), many organisations, and groups within organisations, may be involved. There is often no clear error message, no steps to reproduce each issue consistently, and the issue itself may be subjective and poorly defined. As such, collaboration and user perception are key, hence the importance of understanding the issues and setting the right level of expectation from the start. Generally speaking, though, the important thing to remember is that the same relatively straightforward approach applies every time you analyse AX performance issues.
Crashes and AOS memory issues, while sometimes related to performance, should at least initially be treated as a separate area. Once the call stack has been identified through crash dump analysis, some of the principles discussed here may be applied if the problem turns out to be performance related. A series of articles is available to help with these, as summarised in this article:
These steps are based on typical approaches we would follow on support combined with my own previous experience.
1. Set User Expectations
Setting the right level of expectation from the start is key to keeping any performance tuning project within scope. I say project here deliberately because general performance issues should be treated as such, including scoping, timelines and allocation of multiple resources. There may be questions like 'Could [X technical issue] in some way relate to this performance issue?'; bear in mind that while positive contributions should be encouraged, be careful how you use the information: stick to your original goals and don't get sidetracked.
Get a list of processes from the end users and validate the expected durations for those processes, i.e. whether they are actually realistic based on the underlying logic. If you don't think a target is realistic, say so; it's better to have these conversations as early as possible. Ask the users to define the requirement in business terms and, if possible, to provide supporting information to go with it. If possible, get the target defined in terms of volume and concurrent users as well. For example:
"With 200 concurrent users, we expect process 'X' to take a maximum of 30 seconds and an average of 10 seconds, for a 100 line order. We have calculated this based on the order volume a user in that area would typically need to process to meet their targets."
It relates back to the SMART principle: Specific, Measurable, Achievable, Realistic and Time-related.
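As an illustration only (the process name, sample durations and thresholds below are hypothetical, not from any real system), a SMART target like the one quoted above can be expressed as a simple, repeatable check:

```python
# Hypothetical sketch: checking measured durations against an agreed
# SMART target (e.g. "maximum 30 seconds, average 10 seconds").
# All figures are invented examples.

def meets_target(durations_s, max_s=30.0, avg_s=10.0):
    """Return True if every measured duration stays within the agreed
    maximum AND the average stays within the agreed average."""
    if not durations_s:
        return True  # nothing measured yet, so nothing violated
    worst = max(durations_s)
    average = sum(durations_s) / len(durations_s)
    return worst <= max_s and average <= avg_s

# Example: five timed runs of process 'X' for a 100-line order.
samples = [8.2, 9.5, 12.1, 7.8, 11.0]
print(meets_target(samples))  # → True (worst 12.1s <= 30s, average 9.72s <= 10s)
```

The point is less the code than the discipline: once the target is written down in measurable terms, "is it fixed?" becomes a yes/no question rather than a matter of opinion.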
Users saying "AX is generally slow" or "'X' process is slow" may be valid, but on its own it doesn't provide enough information to properly analyse the problem; the Performance Analyser can help, but there is nothing better than first-hand information from the users. It can help to stress the importance of their role in this process. This brings us on to point 2.
2. Gather detailed information from users
Ask for further details and if you can't get the required information by simply asking, try other approaches such as site visits to observe users while they are experiencing the issues. If you don't get your answers, keep asking - you may well find that a lot of "noise" simply disappears and some specific issues start bubbling to the surface which you can then begin to address.
Once you do start getting the information, users need to see that their efforts are worthwhile if they are to offer you continued support (i.e. first-hand information), so ensure you at least demonstrate that you are working on it and keep them informed (ideally directly, if you can). If you can also achieve some quick wins, even better.
Some examples of questions you might want to ask the end users:
- Is there a general performance issue or can specific processes be identified?
- For each process that is slow: is the issue intermittent, if so is there any pattern to it, e.g. particular users and/or times of day?
- In some instances, I have seen end users (or someone on the 'shop floor') recording details in a spreadsheet as and when they occurred - the more first-hand information the better.
- Even if it is described only as a 'general performance' issue:
- Can they provide examples of processes they found to be particularly slow to focus in on?
- Consider asking for the top 2 worst performing processes from each business area, or the top 20 worst performing processes overall, for example; it's subjective and different people from within the business may not entirely agree, but at least they are engaged in the process.
- How many users are affected and in what areas of the business?
- Following on from that, are there users or business departments which are not affected?
- Where an issue is identified with a specific process:
- Can it be recreated on a test environment?
- If not, can it consistently be recreated on the production environment?
- What are the steps for the process (clear and detailed enough for anyone to understand, specifying the AX path)?
- If applicable, what parameters were used?
- Is there any setup required before running the process, if so what are the steps?
(e.g. you might need to ask them to provide a file used for an import - one which would recreate the issue if it depends on the file type/size)
- What are the expected and actual results? As mentioned in point 1, make this as quantitative as possible, ideally including durations, transaction volumes and concurrent users.
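If users do record their observations (for example in a spreadsheet on the shop floor, as mentioned above), even a very simple aggregation can surface the worst offenders. The process names and figures below are purely illustrative:

```python
from collections import defaultdict

# Hypothetical sketch: ranking processes by how often users logged
# them as slow. Process names and durations are made-up examples.

def worst_processes(reports, top_n=3):
    """reports: iterable of (process_name, duration_seconds) tuples
    as users might record them. Returns the top_n most-reported
    processes, worst first."""
    counts = defaultdict(int)
    for process, _duration in reports:
        counts[process] += 1
    return sorted(counts, key=lambda p: counts[p], reverse=True)[:top_n]

reports = [
    ("Sales order posting", 45), ("Sales order posting", 60),
    ("MRP run", 7200), ("Report X", 120), ("Sales order posting", 50),
]
print(worst_processes(reports, top_n=2))  # → ['Sales order posting', 'MRP run']
```

A ranking like this is subjective and incomplete, but as noted above, the value is that the business is engaged and you have something concrete to prioritise.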
3. Formulate an action plan
By this point (if not earlier), you should be in a position to formulate an action plan based on:
- What the issues and priorities are.
- The resources (organisations and individuals) required to address the issues and their availability.
- Initial tasks required to address the issues and the start/end dates (including development, quality assurance, testing, deployment, etc.), taking into account the steps below.
Keep in mind that the plan may well change as time goes on and further issues/actions are identified, but obviously the key is to have one!
4. Install the key recommended tools
There are two main tools which, together, can be used to diagnose most performance issues in AX.
An added benefit is that these are also the main tools used by Microsoft support, Premier Field Engineers and product team members. If you need assistance from Microsoft, you can send a backup of the Performance Analyser database and/or AX traces, together with your findings so far, which can reduce the time to solution.
5. Validate setup
It is important to start troubleshooting on a good solid foundation. If the problem is clearly identified with a specific process, less of this may be necessary, but a quick validation of the environment should still be done for any performance case. The reason being that some issues with specific processes have been known to be resolved by addressing simple setup issues - for example, rebuilding indexes or updating statistics on the affected tables, or setting MAXDOP to 1 in SQL Server.
You can use the following guides to check known areas of setup that can affect AX performance:
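As a minimal illustration of this kind of validation (the recommended value below reflects the common AX guidance on MAXDOP mentioned above - always confirm the expected settings against the checklists for your version), configuration rows returned from SQL Server's `sys.configurations` view could be compared against expected values:

```python
# Hypothetical sketch: flagging SQL Server settings that deviate from
# common AX recommendations (e.g. 'max degree of parallelism' = 1).
# The query would be run against the instance; here we only show how
# the returned rows might be checked.

QUERY = "SELECT name, value_in_use FROM sys.configurations"

# Example recommendation; confirm against the setup checklists above.
AX_RECOMMENDED = {
    "max degree of parallelism": 1,
}

def find_deviations(rows, recommended=AX_RECOMMENDED):
    """rows: iterable of (name, value_in_use) tuples as returned by QUERY.
    Returns a list of (name, actual, expected) for settings that deviate."""
    expected_by_name = {k.lower(): v for k, v in recommended.items()}
    deviations = []
    for name, value in rows:
        expected = expected_by_name.get(name.lower())
        if expected is not None and int(value) != expected:
            deviations.append((name, int(value), expected))
    return deviations

sample_rows = [("max degree of parallelism", 0), ("max server memory (MB)", 4096)]
print(find_deviations(sample_rows))  # → [('max degree of parallelism', 0, 1)]
```

In practice the scripts included with the Performance Analyser package (mentioned in step 6) cover these checks; the sketch just shows the principle of comparing actuals to a known-good baseline.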
6. General performance analysis
General performance issues can involve many potential factors and therefore often require collaboration across many different organisations and groups within organisations, e.g. general infrastructure, development, database, AX support, desktop support, etc. In the most successful performance projects I have seen, there has been a spirit of openness and collaboration towards the shared goal of improving performance. Without this, people may tend to defend their own areas and be less willing to contribute information if they feel there is a risk in doing so.
If you haven't already, depending on the scale of the issue you may wish to consider setting up a regular conference call to share information, agree and assign actions and review progress, for example on a weekly basis.
Your main tool of choice to begin with should be the Performance Analyser (see step 4).
Following on from the configuration checklists (parts 1 a and b above), you should then move to Part 2, which can be found here: http://blogs.msdn.com/b/axsupport/archive/2014/09/08/ax-performance-troubleshooting-checklist-part-2.aspx
There are scripts included in the Performance Analyser package that will help you to check all of the above areas, as described here: http://blogs.msdn.com/b/axsupport/archive/2014/09/01/microsoft-dynamics-ax-general-performance-analysis-scripts.aspx
On the SQL Server tier, you may also be able to draw on the experience of the customer's DBA and they may already have a lot of relevant information which could provide some insight. Bear in mind though that they probably won't have a great deal of experience with AX and SQL tuning for AX can be different in some ways (e.g. in AX 2009, included columns are not available and index changes need to be made in the AOT).
The next phase, which can also overlap to a degree, is to analyse specific processes. Having said that, it's important to get as much done at the 'general' end (e.g. setup, hardware, etc) first when dealing with general performance issues, to avoid costly and potentially unnecessary additional monitoring / analysis time later.
One analogy is the funnel: the investigation starts wide, looking at general settings, gathering information, etc and can then gradually become more focused as you narrow things down. This would also apply to specific processes, but you would expect to filter down to a granular level much more quickly.
Another analogy is the onion: performance tuning is iterative, where the cycle includes analysis, corrective actions/tuning, deployment of changes then review.
7. Analyse specific processes
Even when investigating general performance, as mentioned above you should still get some examples from users of where processes are particularly slow. This is for 2 main reasons:
- You can measure durations before and after to see the impact of your changes (discussed in step 6) - but bear in mind a lot of those changes don't target specific processes, so again user expectations should be set accordingly beforehand.
- After general performance tuning, you may still need to look at X++ and specific processes in more granular detail.
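The before/after measurement in the first point above can be kept very simple. As an illustration (the figures are invented; in practice they would come from traces or from users timing the agreed example processes), comparing median durations avoids one outlier skewing the result:

```python
import statistics

# Hypothetical sketch: comparing durations measured before and after
# a tuning change. All figures are invented examples.

def improvement_pct(before_s, after_s):
    """Percentage reduction in median duration (positive = faster)."""
    before_med = statistics.median(before_s)
    after_med = statistics.median(after_s)
    return 100.0 * (before_med - after_med) / before_med

before = [30.1, 28.4, 35.0, 29.9, 31.2]
after = [14.8, 16.0, 15.2, 14.1, 15.9]
print(round(improvement_pct(before, after), 1))  # → 49.5
```

Sharing a figure like "process 'X' is now 49.5% faster" with the users also helps with the point made earlier: they can see their first-hand information being acted upon.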
Following on from step 4, you can analyse the AX traces using the Trace Parser tool. See the following blog regarding trace analysis.
8. Review
Review should be a regular part of the process because, as mentioned already, performance tuning is iterative: resolving some underlying technical issues may help, but more can be identified after changes are deployed, and the performance tuning can then become more focused and in-depth.
However, at the same time, it's important to recognise when to stop tuning and move on. A 'law of diminishing returns' generally applies here: with each iteration of tuning a specific process, you would expect the potential for improvement to shrink.
So some kind of exit criteria should be applied (and predefined as early as possible). For example, you may simply have reached the target duration agreed with the end users, or a cost/benefit decision may have been made, such as:
- The estimated hours of analysis to improve average duration of opening of form X by 0.2 seconds are too great to justify, or
- To do this requires changing the design and the end users would prefer to live with the additional 0.2 seconds and keep the existing design.
As well as reviewing performance fixes once deployed, I have also seen QA (quality assurance) processes put in place to check the performance of every code deployment. One benefit of this is that the customer can be reassured that any downturn in performance is not the result of a recent code deployment. Other measures I have seen (following a similar principle) include putting all other code deployments (i.e. anything other than performance fixes) on hold during the period of performance tuning, and being prepared to reverse a deployment if there is any doubt over whether it caused a performance issue.
Finally, review what could have been done pro-actively to avoid the issue happening again, and plan to have it in place on every AX implementation project (see also the Performance Resource Page for Microsoft Dynamics AX). Areas to consider may include:
- Benchmarking and user acceptance testing activities
- Agreeing key performance metrics from the start
- Hardware sizing
- Solution design
- Setup/configuration according to best practices
- Performance QA