Updated: 07 June 2022
Please note that this document, and all the documents it links to, are living documents. They will evolve over time as new things are discovered, new functionality is provided, best practices are adjusted and/or when I get time to add content, so please make sure you come back and visit this source document often.
When it comes to troubleshooting performance issues, the more complex your environment and the more intermittent the issue, the more difficult identifying the root cause is likely to be. Even for apparently isolated performance issues, getting to the root cause can be difficult and may need multiple iterations of testing and detailed data gathering. It can sometimes be tricky, but patience and a forensic approach will eventually win through!
In summary the basic methodology I would normally follow is:
- Understand system architecture and versions
- Understand the business’s performance requirements and expectations
- Understand the current overall system baseline performance
- Gather evidence to compare expectation against current performance
- Identify areas of concern
- Make any relevant change(s)
- Monitor impact of any changes introduced
- Repeat the gather, identify, change, and monitor steps as required
As you need to consider the system performance end-to-end, you must ensure you engage with any/all partners who look after some component of the system under consideration. For example, if a customer looks after their own IT, then their IT team needs to be involved in the data gathering and identification of root cause; it is not good enough for any partner to say “It’s Sage not us” unless or until it is proven beyond doubt by the evidence available. You may also need to engage Cloud Service providers, Third Party software providers, Database experts, etc. to get to the root cause.
An essential starting point is to understand the system architecture (hardware) and the current software versions in use. If you do not already have an up-to-date system architecture diagram, it would be a good starting point to create one.
Your document should include the hardware specification (even if the hardware is virtualised) i.e. CPU, Memory, Disk allocations, as well as number and speed of network cards.
Then perform an audit of the Operating System, Database and Sage X3 software component versions.
The key to solving any problem is to have an accurate and detailed definition of what the problem(s) is/are. We need to understand the nature, scope and extent of any problems being experienced. Saying “Sage is slow” is imprecise and is meaningless when trying to understand a performance issue. We also need precise details as to what is considered slow, how long “slow” is, what timing is expected and why.
My starting set of questions for general problem discovery is below. Some of these may also need to be directed to specific end users:
- When was the instance first installed, and how long has it been in use with the current user load?
- When did you start seeing this performance issue?
- Has performance been getting gradually worse, or did it suddenly start going much slower?
- What has changed in the environment just before the performance issue started to occur?
- Can the problem happen at any time of day, or does it seem to be only certain times of day or perhaps only certain days of the week/month when it happens?
- When the issue occurs, does it affect all users at the same time or do some users function OK and others do not?
- Does the issue only happen to a specific subset of users, for example: certain PCs or a specific location?
- What is the frequency of the issue?
- Does the issue happen with different browsers?
- Does the issue still occur if you launch the browser directly from the Sage X3 Windows Server?
- Can the issue be reproduced at will? If so, what actions do you need to take?
- Can it be reproduced in all environments? e.g. in both LIVE and TEST instances, or using the SEED folder
- Has this problem, or similar problems happened before?
- Are there any workarounds or things that change the nature of the problem? e.g. what if the user closes all browser windows and re-tries?
- Any other unusual things being observed?
- Do you have any customizations or other site-specific modifications that may affect this area? If so, does the issue reproduce with all customizations disabled?
There may be several iterations of questioning to get to the required level of detail, so the initial answers may then trigger additional questioning; and if necessary be prepared to watch users who are having issues to see exactly what keys they are pressing!
It is also good to know what a target solution might look like, by understanding the customer expectation. For example, if a process is taking 4 hours to complete, how long would be considered reasonable/expected by the customer, and on what are they basing this expectation?
If there are multiple performance issues, it is important to have a list of all such problems as there could be some inter-relation between the issues, but you then need to focus on the most important issue(s) first.
It is important to be able to measure and monitor system performance, firstly to identify current system performance, but also to see how it changes over time or after modifications are introduced into the system. The key thing any data you collect should provide is a picture of the system resources and database performance. Ideally you should capture this when the system is performing poorly, but it is just as important to get the same data when the system is performing well, as this will allow you to compare these two different states and identify differences in behaviour.
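As a rough illustration of comparing the "good" and "poor" states, the sketch below averages each metric in both captures and flags those that differ by a chosen factor. The metric names, sample values and the 1.5x threshold are all made up for the example; in practice the numbers would come from whichever monitoring tool you have available.

```python
from statistics import mean

def compare_states(good, poor, threshold=1.5):
    """Flag metrics whose average differs between the two states.

    `good` and `poor` map a metric name to a list of samples.
    A metric is flagged when poor_avg / good_avg is at least
    `threshold`, or at most 1 / threshold.
    """
    flagged = {}
    for metric in sorted(good.keys() & poor.keys()):
        good_avg = mean(good[metric])
        poor_avg = mean(poor[metric])
        if good_avg == 0:
            continue  # no meaningful ratio against a zero baseline
        ratio = poor_avg / good_avg
        if ratio >= threshold or ratio <= 1 / threshold:
            flagged[metric] = round(ratio, 2)
    return flagged

# Hypothetical samples, not real measurements
good = {
    "cpu_percent": [20, 25, 30],
    "disk_queue_length": [1, 2, 3],
    "sql_batch_requests_per_sec": [400, 420, 410],
}
poor = {
    "cpu_percent": [22, 28, 31],
    "disk_queue_length": [8, 12, 10],
    "sql_batch_requests_per_sec": [390, 405, 415],
}

print(compare_states(good, poor))  # {'disk_queue_length': 5.0}
```

A flagged metric is only a pointer to where to look next, not proof of root cause.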
Third party monitoring tools
Some customers may already have invested in performance monitoring software, or use a cloud provider who has such tools available. Such commercial software will often allow you to review Operating System and Database statistics in detail over time, providing extremely useful information as you may be able to look at these stats before, during and after a poor performance incident. If these tools are available, it obviously makes sense to engage with the expert in the tools to extract the required information.
Sage Support Investigation scripts
If no other tools are available, or to provide additional data, you may wish to implement the Sage Support provided “Investigation Scripts”. Review the presentation describing the Sage Support Investigation Scripts. In essence, they are PowerShell and SQL scripts that are run repeatedly over time, each run capturing a snapshot of the system at that moment. Reviewing the output can take some time, but these outputs could provide key evidence in the absence of any other monitoring tool.
Operating System tools
Both Linux and Windows offer their own performance monitoring tools. The Sage Support Investigation Scripts already leverage the Windows “PerfMon” tool, but you can set up your own monitoring using this, or use any home-grown scripts you have already built.
Performance testing is an essential part of any system implementation or major upgrade, but is sometimes done as an afterthought, or just omitted altogether due to budget or time restrictions.
Whilst the Sage Architecture and Prerequisite Guides give good generic indications of hardware and software needed for performance, these recommendations are based on a generic workload. Every customer has their own unique way of working, a unique dataset, and a mostly unique hardware architecture/setup. This means the only way to predict system performance for a specific customer is to configure a test system that is a mirror of the live system hardware/architecture and then test the system with a representative workload.
Generating a “representative workload” is perhaps the tricky part when considering testing methodology. Ideally you would also have the ability to perform multiple performance tests, using the same starting point for the system data, and applying the same workload.
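To show why repeatable runs matter, here is a minimal sketch that summarises the timings from several runs of the same workload and compares two sets of runs, for example before and after a change. The timings are invented for the example, and the choice of median and 95th percentile is my assumption of a reasonable summary rather than a prescribed method.

```python
from statistics import median, quantiles

def summarise(timings):
    """Summarise elapsed times (in seconds) from repeated test runs.

    Needs at least two timings. The last of the 19 cut points
    returned by quantiles(n=20) is the 95th percentile.
    """
    p95 = quantiles(timings, n=20, method="inclusive")[-1]
    return {
        "runs": len(timings),
        "median": median(timings),
        "p95": round(p95, 2),
    }

def percent_change(before, after):
    """Change in median elapsed time, as a percentage of `before`."""
    delta = after["median"] - before["median"]
    return round(delta / before["median"] * 100, 1)

# Hypothetical timings from five runs before and after a change
before = summarise([12.0, 11.5, 12.5, 13.0, 12.0])
after = summarise([9.0, 9.5, 8.5, 9.0, 10.0])

print(before["median"], after["median"])  # 12.0 9.0
print(percent_change(before, after))      # -25.0
```

Comparing like this is only meaningful if each run started from the same system data and applied the same workload.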
Option 1: Real users, real data
When I first started in computing, we generally recommended that customers run the new system in parallel with the original system for a month! Whilst this is unlikely to be feasible with anything other than a very small workload, having real users run through at least part of their real workload on the test system is the gold standard of testing, in my opinion. This option needs commitment from the users and the customer management team, together with co-ordination and control/monitoring to track activity and results. In mitigation of the resources needed for this activity, it could be considered as part of UAT and/or acceptance testing, rather than just being a performance test.
This option can be tricky to orchestrate and difficult to repeat with consistency. If there are external interfaces, these may also be challenging to capture and replay when repeating a test. However, with the correct overall approach, this option gives the most confidence in results as it is likely to be the most realistic workload and uses the real network and PCs.
Option 2: Automated testing
There is a plethora of commercial and free software tools which allow you to set up test scripts to run a performance test against your test environment.
The disadvantages of these tools tend to be the investment in software and training to become familiar enough with the tool in the first place to use it effectively, but then you also have the time investment needed to create test scripts for your customer site. These test scripts can generally be parameterised so could be re-used across different customer sites, but still need to be made customer specific as workload will depend on the customer configuration, specific data values and which functions need to be tested.
Having said that, automated testing scripts can be re-run as often as needed, so can be very effective in showing the impact of changes made to the test system. It is also definitely better than doing nothing, if having real users test the system in any depth is not possible.
At this point I will mention Sage’s Automated Test Platform (ATP) https://online-help.sageerpx3.com/erp/12/public/index.html but only to say this is definitely NOT a performance testing tool. ATP runs a series of scripts as a single user only and although this may be useful to confirm a customer site is functioning consistently, it does not prove anything with regards to how the system will perform under load.
I have played around with Apache JMeter (https://jmeter.apache.org/) in the past to test Sage X3. It is good in that it allows you to record a user session, and you can then edit the generated test script as needed. Be warned though, it takes a lot of work to create even one decent test script, but if nothing else I found it a good way to learn about the HTTP traffic flow from the client to the Syracuse server. Once you have your test scripts, you can then run them to simulate multiple user sessions.
Another free tool you could consider is Artillery (https://www.artillery.io/). I haven’t played with it much yet, but it is based on NodeJS so has more affinity with Syracuse. As with JMeter, it will still take a significant commitment of time to put together realistic test scripts.
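To illustrate what "simulating multiple user sessions" means in principle, the sketch below runs several sessions concurrently and records how long each takes. It is not how JMeter or Artillery work internally, and `placeholder_session` is a hypothetical stand-in that just sleeps; a real test would replay the recorded requests for one user at that point.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def placeholder_session(user_id):
    """Hypothetical stand-in for one scripted user session."""
    time.sleep(0.01)

def timed_session(session, user_id):
    """Run one session and return its elapsed time in seconds."""
    start = time.perf_counter()
    session(user_id)
    return time.perf_counter() - start

def run_load(session, users):
    """Run `users` sessions at the same time, one thread per user."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [
            pool.submit(timed_session, session, user_id)
            for user_id in range(users)
        ]
        return [future.result() for future in futures]

durations = run_load(placeholder_session, users=5)
print(f"{len(durations)} sessions, slowest took {max(durations):.3f}s")
```

Increasing `users` between runs and watching how the slowest session time changes gives a first feel for how response time degrades under load.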
- Architecture and system requirements
- V12 prerequisites
- Prerequisites overview
- Sage Summit 2014 BUS-316: Prepare Your Sage ERP X3 Solution for Infinity and Beyond–System Performance Best Practices