User sessions unexpectedly time out or give "Fatal Error"

Several issues have been raised recently where user sessions give intermittent "Fatal error" messages or time out even though the users are still working within Sage X3. Some of these issues can be technical, but some may be down to a misunderstanding of how session timeouts work, so this article will try to lift the lid on what’s going on with user sessions in X3.

This article focuses on Version 12 Patch 26 (Syracuse 12.11.0.58); however, the concepts also apply to earlier V12, V11 and PU9 systems.

The user session timeout is defined by the "timeout" value in the Syracuse configuration file "nodelocal.js", located in the Syracuse\bin directory.


 

In my case, it is set to 10 minutes.

First, I will demonstrate what we expect to happen in a simple use case. I will log in to X3 and launch a classic page, then keep my browser in the foreground but leave it alone until it times out, and we will see what happens.


Whilst I wait for the timeout to occur, I’ll check the user session information. Log in using a different browser as an Administration user, then navigate to Administration, Usage, Sessions management, Session information. (For versions earlier than V12 Patch 25, the Syracuse session information is in Administration, Usage, Sessions management, Web Client sessions and the classic session information is in Administration, Usage, Sessions management, Classic client sessions.)

Select the user of concern, then click the arrow on the left to see the full information about the user's X3 sessions, including the "Last access" and "Expiry date" values, which are the most important for understanding session timeout.


If you happen to have direct access to the MongoDB database and are familiar with navigating around it, you can also see this same information in the MongoDB collections using the Mongo shell. There is the Syracuse session collection (SessionInfo up to V12 P24, or SessionState from Patch 25).


 
The items of most interest are the "lastAccessUTC" and "expiredAtUTC" fields, as these show when that session was last used and the exact date/time when the session will be considered expired.
Notice the expiry for the Syracuse session is 15 minutes rather than 10 minutes. This is because 5 minutes is added to the Syracuse session to allow time for the classic sessions to expire first.
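
To make the arithmetic concrete, here is a minimal Python sketch of how the two expiry times relate to the last access time. It is only an illustration of the behaviour observed above, not Syracuse code: the function name, the example date and the assumption that the 5 minutes is a fixed offset added on top of the configured timeout are all mine.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the extra 5 minutes described above is a fixed offset that is
# added to the configured timeout for the Syracuse session only.
SYRACUSE_GRACE = timedelta(minutes=5)


def expected_expiry(last_access_utc, timeout_minutes, syracuse_session):
    """Return the expected expiry time for a session.

    A classic session is assumed to expire `timeout_minutes` after its last
    access; the Syracuse session gets the grace period on top of that.
    """
    expiry = last_access_utc + timedelta(minutes=timeout_minutes)
    if syracuse_session:
        expiry += SYRACUSE_GRACE
    return expiry


# Example "last access" value, invented purely for illustration.
last_access = datetime(2022, 1, 10, 9, 0, tzinfo=timezone.utc)

classic_expiry = expected_expiry(last_access, 10, syracuse_session=False)
syracuse_expiry = expected_expiry(last_access, 10, syracuse_session=True)

print(classic_expiry.isoformat())   # 2022-01-10T09:10:00+00:00
print(syracuse_expiry.isoformat())  # 2022-01-10T09:15:00+00:00
```

With a 10 minute timeout, the Syracuse session is therefore expected to expire 15 minutes after the last access, which matches the values seen in the session information.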

There is a separate Classic Session collection (CvgSession up to V12 P24, or X3ClientState from Patch 25), which shows two entries for my user. This is because I have one session for the home page (in effect), which has a "type" of "sessionPool", and another session for the classic page I also have open, with "type" set to "classicPage", as we saw in the Session information page earlier.


Whichever way you look at the user session information, Windows Task Manager also shows the adonix processes related to the user sessions.


If you are running Fiddler (or another network tracing tool) on the browser, you will see a "ping" URL (/pollSession) being requested every minute, even though you are not doing anything with the browser. NOTE: this "ping" request was removed as of Syracuse 12.14.


These "pings" are also reflected in the Syracuse logs, of course.

After 9 minutes of inactivity, the user will get a timeout countdown. This allows 1 minute to respond before the 10-minute timeout actually happens.
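
The timing above can be sketched as a small Python helper. This is only a model of what I observed with a 10 minute timeout: the function name is hypothetical, and I am assuming the countdown always starts 1 minute before the configured timeout, which I have only confirmed for this one setting.

```python
def idle_timeline(timeout_minutes, warning_minutes=1):
    """Return the minutes of inactivity at which each event is expected.

    Assumption: the countdown appears `warning_minutes` before the timeout,
    as observed with a 10 minute timeout in this article.
    """
    if not 0 < warning_minutes < timeout_minutes:
        raise ValueError("warning period must be shorter than the timeout")
    return {
        "countdown_shown_at": timeout_minutes - warning_minutes,
        "timeout_at": timeout_minutes,
    }


print(idle_timeline(10))  # {'countdown_shown_at': 9, 'timeout_at': 10}
```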


Once the timer expires, the browser is sent back to the login page.

Our Fiddler trace (or the Syracuse logs) shows the browser being redirected to the logout URL, followed by a status 401 (Not authenticated), and then the redirection back to the login page.

Checking the Syracuse collection in MongoDB, the session information has now disappeared, as you would expect (as the session is no more).

So now we understand the concept, what happens if the user has multiple tabs open? I will now create a new user session and open three tabs in different functions. I will stay on the third tab and click around the screen to keep the session active.


After 11 minutes, my first two tabs will have expired, but I have kept my third tab active. If I now click back to the second tab, I get the timeout message.

I say "Yes" here and this allows me to continue this session.

If I click on the first tab, I also get the same message. If I say "No" this time (or don’t respond in time), I get a message saying I cannot log out as other sessions are still running (which they are).


I have to click OK and then see a message that classic sessions have been closed because of a timeout. This is correct, as my original session on the first tab has been closed (because I said "No" when asked whether to maintain the session).


 
In fact, I can continue using this first tab and launch a new function, as I still have a valid Syracuse session. If all my tabs had expired, then I would need to log in again.

These steps have demonstrated that although multiple tabs can time out, the user can recover the original session if they wish and continue where they left off.


Other things to think about


Record Locking

Whilst it is good that users can have multiple tabs open, as this may help them be more efficient, there is a risk if they are editing records. If a user moves away from a tab whilst editing and then forgets about it whilst doing other things, the record may remain locked by that user. This could impact other users who want to access that same record.

Badges

Will users with multiple tabs open consume additional badges? The good news here is that multiple tabs do not consume additional badges. If a user launches different browsers and logs in twice from the same PC, they are also only consuming one badge, up to a limit of 5 sessions. This is discussed further in the online help ( https://online-help.sageerpx3.com/erp/12/wp-static-content/public/Licensing-information/Content/How-to%20guides/Platform/Sage%20X3%20licensing%20information/Topic%209%20Badge%20consumption.htm ).

Tuning Syracuse

Syracuse is sensitive to the number and type of users, so it needs to be tuned appropriately for the user load. This is documented in the online help ( https://online-help.sageerpx3.com/erp/12/public/administration-reference_node-js-sizing.html ).

Perhaps the most significant sentence in this context is "A session is a single browser tab running a classic page function, or a set of tabs opened by a user running functions in Syracuse mode. Make sure that two tabs of the same browser displayed in different windows are running under the same session. Be careful, if you open a browser tab in incognito/private mode, it is considered as a different session"

Essentially this means that, for Syracuse sizing purposes, every tab a user has open is a "session". For example, if you have 20 users who all have three tabs open, then this is 60 sessions, so you would need to ensure you have enough node.js processes to cope with the 60 sessions rather than 20.
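
The session count from the example above can be worked out with a short Python sketch. The function names are my own, and the "sessions_per_process" value used below is a placeholder for illustration only, not a Sage recommendation: take the real figure from the sizing page in the online help.

```python
import math


def total_sessions(tabs_per_user):
    """Count sessions for sizing, treating every open tab as one session."""
    return sum(tabs_per_user)


def node_processes_needed(sessions, sessions_per_process):
    """Round up to whole node.js processes for the given session count.

    `sessions_per_process` is a hypothetical input: use the figure from
    the Sage sizing documentation for your own installation.
    """
    if sessions_per_process <= 0:
        raise ValueError("sessions_per_process must be positive")
    return math.ceil(sessions / sessions_per_process)


# 20 users, each with three tabs open, as in the example above.
sessions = total_sessions([3] * 20)
print(sessions)  # 60

# Placeholder capacity of 15 sessions per process, for illustration only.
print(node_processes_needed(sessions, sessions_per_process=15))  # 4
```

Sizing on the 20 users alone would give a third of this figure, which is exactly the kind of under-sizing to avoid.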

Potential issues

Syracuse process crash

The Syracuse "nanny" process manages the child processes that service the user requests. (More information on this is in the "Syracuse logging" session from https://www.sagecity.com/gb/sage-x3-uk/b/sage-x3-uk-support-insights/posts/sage-x3-technical-support-tips-and-tricks---march-2021-index ) If a Syracuse child process becomes unresponsive or crashes for some reason, then all user sessions attached to that particular process will be lost. Appropriate Syracuse tuning, as discussed above, should mitigate this situation.

For example, if you have a screen which is handling large numbers of records in a grid (1000+ records would be considered a large data set), this can cause issues with Syracuse, depending on the browser being used. With Chrome/Edge you may find the node.exe is going into "clean_up mode". In this situation the Syracuse process being used by the user session is hitting the memory threshold, causing a new Syracuse process to be started. As the memory needed is greater than that available, the original process will eventually become unstable or crash, so the browser will have lost its session and will give the "Fatal error".

Specific issue with Chrome browser

When using the Chrome browser, you may find users regularly complain about "Fatal Error" messages, particularly if they have left their PC for a while and come back to it.

This specific issue may be caused by a feature introduced in Chrome 88 (January 2021) that throttles non-focused tabs in order to save performance/energy. The changes are explained in detail in this article: https://developer.chrome.com/blog/timer-throttling-in-chrome-88/ and in the Sage article "X3 Sessions do not adhere to Maintain Session Timeout setting when browser is minimized".

The solution is to disable the new throttling features in Chrome:
Go to chrome://flags/
Search for 'throttle'
Disable 'Throttle Javascript timers in background' and 'Calculate window occlusion on Windows'
Restart Chrome

Multiple Syracuse nodes

If you are using multiple Syracuse nodes, you need to ensure all of them have the same timeout specified, otherwise users will get different results depending on which server they are connected to.
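
As a simple way to check this, the sketch below compares timeout values that you have already noted down from the "nodelocal.js" file on each node. It does not read the files itself, and the node names, values and function name are all hypothetical.

```python
def find_timeout_mismatches(node_timeouts):
    """Group node names by timeout value.

    Returns an empty dict when every node has the same timeout, otherwise
    a dict mapping each distinct timeout to the nodes that use it.
    """
    by_value = {}
    for node, timeout in node_timeouts.items():
        by_value.setdefault(timeout, []).append(node)
    return by_value if len(by_value) > 1 else {}


# Hypothetical values (in minutes) noted from each node's nodelocal.js.
timeouts = {
    "syracuse-node-1": 10,
    "syracuse-node-2": 10,
    "syracuse-node-3": 20,
}

mismatches = find_timeout_mismatches(timeouts)
if mismatches:
    for value, nodes in mismatches.items():
        print(f"timeout {value}: {', '.join(nodes)}")
else:
    print("All nodes use the same timeout")
```

In this made-up example the third node would give users a 20 minute timeout whilst the other two give 10 minutes, so it would need to be brought into line.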

Browser crashes

If the browser crashes or is killed abnormally, perhaps by the PC itself being shut down, any user session the browser was being used for will remain open, but obviously will not be accessible. The session will, however, still be active on the server and show up in the session management until it expires by timeout.

Network devices

Load balancers, reverse proxies, and the like, which sit between the user's PC and the Sage Syracuse server, can also interfere with user sessions. Typically, this could be because:
-    Cookies are being modified or removed.
-    URLs are being rewritten.
-    Load balancing is not maintaining server "stickiness".

Sage do not explicitly certify or test particular network devices, but as we have fairly simple requirements, we would not expect any particular issues to be caused by these devices, provided the Sage cookies and URLs are not being interfered with.

Windows Server tuning

There is some evidence to suggest that the default configuration for TCP settings on the Sage X3 Windows servers, and the default Windows heap size, could cause intermittent "Fatal Error" messages. These settings can be modified as described in the Knowledgebase articles listed below, even though the article titles refer to the batch server:

"Batch Server Tasks increasingly failing with ECONNRESET errors" describes the recommended TCP settings. This change would need to be done on the Syracuse node, as well as the X3 Runtime node(s).

"Batch Server Tasks fail with ECONNRESET errors" describes how to change the Windows desktop heap. This change would need to be done on the Syracuse node, as well as the X3 Runtime node(s).

Conclusion

In order to give users the best experience when it comes to their sessions, you should:

  • Ensure you have tuned your Syracuse servers for the number of user sessions in use.
  • Make sure you are using the latest Syracuse version relevant for your installation. At least Syracuse 12.10.0.54 is needed to get the best user experience.
  • For Chrome users, you should implement the throttling changes described above to get consistent behaviour.
  • You could consider increasing the time allowed before a user session will time out, however this is not best practice from a security point of view.
  • There is some evidence to suggest Windows Server default TCP and Heap Size could cause intermittent "Fatal Error" messages, so consider implementing these changes if all the other options have been addressed.