Webinar Q&A: Citrix Troubleshooting 101

Citrix Virtual Apps and Desktops deployments are performance sensitive. Many components, both datacentre-side and client-side, must perform well together to deliver a consistent, responsive virtualised apps and desktops solution.

With so many components in play, it is often challenging for a Citrix administrator to determine the cause or impact of a Citrix-related problem.

Stats from the 2018 Citrix Migration Survey

From the 2018 Migration Survey conducted by eG Innovations, some interesting statistics surfaced:

  1. 59% of 795 Citrix professionals voted that slow logons were the number one problem for them.
  2. 44% voted that frozen sessions were a problem.
  3. 33% voted that slow application launches were almost as common as any other fault.

Recently, I joined forces with eG Innovations to deliver a webinar on the topic “Citrix Troubleshooting 101”. The webinar had a great turnout, with over 1450 people registering and over 650 joining on the day. It was hugely popular because, as mentioned, Citrix administrators want to be able to diagnose issues within their environment more quickly and efficiently.

As highlighted in the webinar, Citrix issues ultimately result in lost productivity and company revenue. The severity of that loss is mainly determined by the time it takes to resolve an issue. For an administrator to be successful in Citrix troubleshooting, process of elimination is key. It can be applied to the three troubleshooting tactics highlighted in the webinar. Following these tactics will help you become more efficient at diagnosing Citrix problems:

  1. Determine the scope of the problem – Does the user face an issue with a task they are trying to complete, or all tasks?
  2. Determine the magnitude of the problem – How many users are impacted?
  3. Determine the source of the problem – Does the issue reside client-side or within the corporate infrastructure?
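The three tactics above can be sketched as a small triage helper. This is only an illustration: the Report fields, the triage function and the rule used to guess the source (all complaints from one location suggest a client-side or local network cause) are assumptions made for the example, not something presented in the webinar.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """One user complaint. All field names are illustrative assumptions."""
    user: str
    failing_tasks: int  # tasks the user could not complete
    total_tasks: int    # tasks the user attempted
    location: str       # e.g. "home", "branch-a", "hq"


def triage(reports: list[Report], total_users: int) -> dict[str, str]:
    """Answer the three questions: scope, magnitude and source."""
    if not reports:
        return {"scope": "none", "magnitude": "none", "source": "none"}

    # 1. Scope: is every task failing, or only specific tasks?
    all_failing = all(r.failing_tasks == r.total_tasks for r in reports)
    scope = "all tasks" if all_failing else "specific tasks"

    # 2. Magnitude: how many distinct users are impacted?
    affected = len({r.user for r in reports})
    magnitude = f"{affected} of {total_users} users"

    # 3. Source (assumed heuristic): one location points client-side,
    #    several locations point at the corporate infrastructure.
    locations = {r.location for r in reports}
    if len(locations) == 1:
        source = "likely client-side or local network"
    else:
        source = "likely corporate infrastructure"

    return {"scope": scope, "magnitude": magnitude, "source": source}
```

In practice the source heuristic would need more inputs (endpoint type, Workspace app version and so on), but the structure mirrors the order in which the questions are asked.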

During the webinar around 40 questions were asked by the audience. Given that we had no time to answer them all live, I wanted to take this opportunity to answer them here. The questions are grouped into three categories for ease of readability: Citrix Troubleshooting, Citrix Optimisation and Citrix Monitoring.

Questions and Answers for Citrix Troubleshooting 101

Citrix Troubleshooting

1. Are there any tips to improve remote access performance?
  1. On Citrix ADC (formerly NetScaler), bind the TCP profile “nstcp_default_XA_XD_profile” to your Gateway virtual server.
  2. Edit the profile “nstcp_default_XA_XD_profile” on the ADC and uncheck “Use Nagle’s algorithm”.
  3. Take a look at the “Optimized for WAN” Citrix policy template within Citrix Studio, which gives pointers on configuring policy settings that can help improve performance over WAN.
  4. Consider preparing your ADC Gateway virtual servers and end-user devices to support Adaptive Transport. You can read more here: https://jgspiers.com/hdx-enlightened-data-transport/
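To make the Nagle setting more concrete, here is a minimal sketch of what disabling Nagle's algorithm means at the socket level. It is an analogy only: on the ADC the setting lives in the TCP profile, not in Python, and the helper names below are made up for the example. TCP_NODELAY is the standard socket option that turns Nagle off.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 turns Nagle off: small writes are sent immediately
    # instead of being buffered and coalesced into larger segments.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Return True if TCP_NODELAY is set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

Interactive protocols such as ICA send many small packets (keystrokes, mouse movements), which is why buffering them tends to add perceived latency.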
2. Can you configure Citrix Director’s application probing for published desktops?
No. Currently, Citrix Director’s application probing only supports published applications. You may want to consider the logon simulators and full session simulators available on the market. See the following links:
https://jgspiers.com/app-probing-vs-logon-simulation/

https://www.eginnovations.com/solutions/citrix-full-session-simulation
3. I was using Citrix Director and could not log off or disconnect a user’s session. What would the next step be?
The next step would be to log on to the VDA and attempt to end the user’s session from there. That may involve killing hung processes. If that does not work, see https://jgspiers.com/user-stuck-citrix-desktop-force-log-off/
4. Regarding brokering times with different versions, have you seen a significant difference between 7.15 and 7.18?
I haven’t personally seen any significant differences, nor have I come across any Citrix publication regarding this.
5. Our user logon times are about 30 seconds, with Internet Explorer initialization taking the most time. What would you advise to help us make logons faster?
To improve logon times on Citrix Virtual Apps and Desktops, you can use several optimisation scripts that I have created for Windows server and desktop operating systems. See https://jgspiers.com/category/scripts/

Besides this, other common practices for reducing logon times include Group Policy housekeeping, profile management best practice configuration, Write Cache best practice configuration, auto-logon and so on.

You can refer to this webinar “How to Make Citrix Logons 75% Faster” for additional details.
6. How do you quantify a slow logon? Is a 30-second logon considered slow?
30 seconds or below is what I like to achieve in all my deployments. I can accept 40 seconds or less, but 30 seconds or below is the real goal.
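The thresholds above translate directly into a simple rating function. This is a minimal sketch: the function name and the labels are illustrative, and the 30 and 40 second limits are the personal targets stated in the answer rather than any Citrix standard.

```python
TARGET_SECONDS = 30.0      # the real goal
ACCEPTABLE_SECONDS = 40.0  # still tolerable


def rate_logon(seconds: float) -> str:
    """Rate a logon duration against the 30/40 second targets."""
    if seconds <= TARGET_SECONDS:
        return "on target"
    if seconds <= ACCEPTABLE_SECONDS:
        return "acceptable"
    return "slow"
```

A rating like this could be applied to logon durations exported from whichever monitoring tool you use, to see what share of logons miss the goal.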
7. We are using Citrix Workspace Environment Management (WEM) in our infrastructure. Are there any disadvantages of using WEM over CPM? Also, is WEM available for on-premises XenDesktop?
WEM and CPM (Citrix Profile Management) are different products with different uses. It can actually be beneficial to run both, as they work well together. Citrix WEM applies printers, mapped drives, registry settings and other actions to a user’s desktop session. Profile Management captures the user profile and roams it between desktop or virtual application sessions.

WEM is available with XenDesktop Enterprise (now Virtual Apps & Desktops Advanced) and above subscriptions.
8. We have XenApp 6.5/XenApp 7.6, and we have published the same apps on Windows 2008 and Windows 2016 respectively. But performance is slow on XenApp 7.13/Windows 2016. Why do you think this would be?
One reason could be that you have not optimised the Windows Server 2016 image. Default settings in the operating system are not the best. Please refer to my optimisation script: https://jgspiers.com/windows-server-2016-optimisation-script/

Also keep in mind that, out of the box, Windows Server 2016 requires more resource than Windows Server 2008. Try assigning an extra 1-2GB RAM and another 1-2 vCPUs, and see if that narrows the performance gap between the two environments.
9. Often our users get the “session interrupted” notification in the corner of their session. Would this be network related? Or is it an issue on the client side?
“Session interrupted” notifications generally occur when there is a network issue between the Virtual Apps server or desktop and the client terminal. I would run through a process of elimination to see if the issue only happens at a specific user location, with specific endpoint clients, with specific Receiver versions and so on.

You could also monitor the VDAs and check to see if there are TCP connection drops being reported. Have your network team run tests on the networking devices that are client-side to see if there is any packet loss.
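The process of elimination described above can be sketched as a quick tally over incident records. The record layout (location, client, receiver) and the 80% cut-off are assumptions for illustration; the idea is simply to find any attribute that nearly all affected sessions share.

```python
from collections import Counter


def common_factors(incidents: list[dict[str, str]], min_share: float = 0.8) -> dict[str, str]:
    """Return attributes where one value covers at least min_share of incidents.

    Assumes every incident dict has the same keys as the first one.
    """
    suspects: dict[str, str] = {}
    if not incidents:
        return suspects
    for key in incidents[0]:
        # Most frequent value for this attribute, and how often it occurs.
        value, count = Counter(i[key] for i in incidents).most_common(1)[0]
        if count / len(incidents) >= min_share:
            suspects[key] = value
    return suspects
```

If the result singles out one location but not a client type or Receiver version, that is a hint to look at the network at that site first.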
10. Do you have any thoughts on the main cause of PVS target retries and how to troubleshoot them effectively?
Retries can be caused by network blips such as spikes in latency or packet loss. A slow or saturated storage array hosting the Target Devices or the PVS vDisks can also be the cause.
11. How often do you recommend we reboot XenServer hypervisors?
I only recommend rebooting XenServer hypervisors either during disaster recovery testing phases, or when applying hotfixes to XenServer.
12. How often should VDAs be rebooted?
I typically like to reboot my virtual apps workers at least every 1-3 days. However, it depends on how often the VDAs are used and how much resource is assigned to them.
13. Is there any tool that can identify slow printing in a Citrix session?
Third-party products can monitor print servers, VDAs, and the network to inform you if there are problems. Slow printing is often the result of bad printer routing, e.g. a printer and print server with a lot of latency between them or too many hops in the communication path. For this, take a look at the Citrix policy setting “Direct connections to print servers”, which is explained in detail here: https://jgspiers.com/citrix-universal-printing/
Other causes of slow printing include outdated or problematic print drivers, or a lack of bandwidth or prioritisation for the printing virtual channel (ICA).
14. What UDP port is needed for EDT?
UDP ports 1494 and 2598 are required. If you are providing EDT access via Citrix Gateway, then only UDP 443 is required to be open from the Internet to Gateway.
15. Is there any way of easily finding bandwidth issues with NetScaler? We have a VPX 200 and are wondering whether it is a bottleneck for external users.
The “Packet CPU Usage” counter on the Dashboard of Citrix ADC will show you if the ADC device is reaching its bandwidth limit or not. You can run reports from the Reporting tab. A built-in report named “CPU vs. Memory Usage and HTTP Requests Rate” can help.

The Citrix Gateway is likely not the bottleneck, though. If you are connecting in from a high-speed broadband link and you still see latency, that would be cause for concern and could point towards a Gateway or DMZ issue.

Citrix ADM (Application Delivery Management – formerly known as NetScaler Management and Analytics System/MAS) can help track HDX Insight data and get reports on WAN latency, ICA RTT, datacentre latency and so on. See https://jgspiers.com/citrix-netscaler-management-analytics-system/
16. Is it possible to enable Receiver logging on a thin client, i.e. HP thin client or Dell thin client?
Citrix sets out procedures for enabling logging in the Windows, Linux and other editions of Workspace app (Receiver). You should consult your thin client vendor on how these procedures can be carried out on the thin client.
17. Error 1102: The Citrix Broker Service failed to broker a connection for user ‘Domain.com\user’ to resource ‘Desktop1234’. The virtual machines ‘WIN10-091.domain.com’ rejected the request to prepare itself for a connection. This problem usually indicates that the virtual machine is engaged in an activity such as restarting, entering a suspended state, or processing a recent disconnection or logoff. Do you have any guidance to troubleshoot this?
Determine if this issue only happens to particular VDAs, particular VDA versions etc.

Also check how many users are currently connected and if you have enough VDAs/resource to handle more users.

If this issue is only experienced during logon storms such as in the morning, then there might be a lack of VDAs to handle the concurrent logon rate (which can be adjusted via policy).
18. I would really like to know if Citrix offers a documented protocol for troubleshooting the software stack. Something starting with what baselines to get when things are working and what corrective actions to take when a given part of the stack is not meeting those baselines at the time people are complaining.
You can capture baselines yourself at the beginning of a deployment, which helps when comparing the same metrics once users have been loaded on to the environment. However, you really need third-party monitoring solutions that can alert you when parts of the infrastructure are under stress or down. You can configure alerts for when metrics, for example logon times, breach defined thresholds. Citrix Director has some of this capability, but it is dictated by the license you own and is limited more to monitoring Citrix VDAs and sessions, and not so much the supporting infrastructure.
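As a rough illustration of comparing current readings against a captured baseline, the sketch below flags any metric that has drifted beyond a tolerance. The metric names and the 25% tolerance are assumptions; a monitoring product would normally do this for you, with far more nuance.

```python
def breaches(baseline: dict[str, float], current: dict[str, float],
             tolerance: float = 0.25) -> dict[str, float]:
    """Return metrics whose current value exceeds the baseline by more
    than the given tolerance (0.25 means 25% above baseline)."""
    flagged: dict[str, float] = {}
    for name, base in baseline.items():
        value = current.get(name)
        # Metrics missing from the current sample are skipped, not flagged.
        if value is not None and value > base * (1 + tolerance):
            flagged[name] = value
    return flagged
```

For example, with a baseline logon time of 30 seconds and a current reading of 45 seconds, logon time would be flagged, while CPU moving from 40% to 42% would not.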
19. We have seen many issues with respect to degraded performance and session disconnected issues. We are supporting multiple versions of XenApp & XenDesktop both on-premises and in the cloud. Performance degrades both in published application and VDI. We have not seen any network issues. Users can’t launch the session if the session is in a disconnected state (not all times). I have seen this issue often in cloud. Do you have any tips to investigate and address this?
It is a problem that could be caused by many things. I would try to eliminate possibilities of it being the image, high CPU/RAM consumption on VDAs, Workspace app version, client used, network location used, VDA version used and so on. If it happens in cloud, I assume you have VDAs in Azure or AWS. The degraded performance and disconnects typically relate to network issues but it could be the VDA itself hanging. You will have to start troubleshooting from a high level and work your way down as you rule factors out one by one.
20. How do I troubleshoot a TDIca.sys BSOD? This is on a Windows Server 2008 R2 image and our current version of Citrix is 7.15.3000. We have them hosted on VMware 5.5 using MCS. When I first created the Site, all was working fine without any issues. I didn’t even have to do a weekly reboot, but now this seems to happen on a weekly basis. Any tips to triage this issue?
I would look at what has changed in the environment. Sometimes it is quicker to build a fresh image, considering the problems you face and the time it may take to troubleshoot them.
21. Recently we upgraded XenApp 7.6 to 7.15 CU3 after which some features inside the published apps are not functioning when users launch URLs from 7.6 dedicated Windows 7 VDI. When they launch the same URLs outside of a VDI desktop (local computer), RDP, vSphere, all app features are opening as expected. Only VDI users are getting this problem. What could be the issue?
As I understand it, you have a XenDesktop 7.6 VDI site running Windows 7 VDAs, and those users launch published apps from a XenApp 7.15 CU3 site. Some of the features inside the published apps no longer work, whereas they used to work when running XenApp 7.6.
Has anything else changed on the Windows 7 image, such as an upgrade of Receiver for Windows?
22. When I undock the laptop from the desk (wired), continue working over wireless LAN for an hour in the conference room, and then return to the desk and dock the laptop back on the base station (wired), the Citrix virtualized application session that was running fails to respond. What is the likely cause, and how do I troubleshoot it? Is there any tool available to detect or even auto-fix the issue?
Have you tried updating to the latest Workspace app, or tried the same scenario using a newer VDA version? If that does not work, I would suggest you contact Citrix support.
23. Launching a virtual application takes forever, crawling so slowly that each launch stage takes 5 minutes or more, even though no network bottleneck is apparent for traffic outside Citrix. On another occasion, the same application launches within 1-2 minutes. What do you think would be causing the delay?
Monitoring tools such as those from eG Innovations can give detailed insight into the Citrix logon process and what is causing the various steps to take so long.

You should put one of the affected VDAs into an isolated Active Directory Organizational Unit with no Group Policies applying.

Other things to try are testing the logon time through a console session rather than ICA, disabling profile management (if in use) and so on.

Process of elimination will help find the root cause quicker.
24. Is any script available to auto-delete user profiles from a profile server, which would save admins from doing it manually?
Profile Management can auto-delete profiles from a VDA using the policy setting “Delete locally cached profiles on logoff”.
When it comes to deleting profiles from a profile server automatically, there isn’t any script out there to do that. The script wouldn’t know which profile to delete and when.
25. Does Citrix Cloud make troubleshooting any easier?
The answer is both yes and no! You have less to troubleshoot because management of the control plane (Delivery Controllers, SQL servers, etc.) is done by Citrix. That said:

– You are still responsible for monitoring and managing the virtual apps servers and virtual desktops;
– And you still have ownership of the overall service performance. When there is slowness, you will still need to understand how to pinpoint where an issue lies. If the issue is with Citrix Cloud, then you must depend on Citrix to fix it.

The eG Innovations webinar “Does Using Citrix Cloud services make performance monitoring easier?” may be something you want to review for more details.

Citrix Optimisation

26. I’m optimizing a Windows 10 image using App Layering but am unsure which layer I should remove UWP applications from.
You should remove these applications from the OS Layer.
27. What about antivirus solutions, do you recommend installing antivirus on the main image? Or do you recommend deploying antivirus on Delivery Group desktops?
Antivirus agents should be installed on the gold image, and on all other infrastructure components such as your Delivery Controllers and StoreFront servers.

Hypervisor introspection is a technology that allows for lightweight agents or no agents at all to be placed on the VDA to reduce footprint. This technology can help with scalability.
28. What is the ideal spec for a VDA?
There is no ideal specification as it depends on the workloads of your users. Typical “Task Worker”, “Knowledge Worker”, and “Power User” Windows 10 workloads may be able to use “2vCPU/2GB RAM”, “2vCPU/4GB RAM”, and “4vCPU/8GB RAM” configurations respectively, but you need to test these numbers in your own environment.
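As a starting point for capacity planning, the example specs above can be multiplied out per workload type. This is a minimal sketch that assumes one Windows 10 desktop per user and ignores hypervisor overhead and CPU overcommit; the dictionary keys and function name are made up for the example.

```python
# Starting-point specs from the answer above: workload -> (vCPUs, RAM in GB).
STARTING_SPECS = {
    "task": (2, 2),
    "knowledge": (2, 4),
    "power": (4, 8),
}


def estimate_totals(user_counts: dict[str, int]) -> tuple[int, int]:
    """Return (total vCPUs, total RAM in GB), assuming one desktop per user."""
    total_vcpu = 0
    total_ram_gb = 0
    for workload, users in user_counts.items():
        vcpu, ram_gb = STARTING_SPECS[workload]
        total_vcpu += vcpu * users
        total_ram_gb += ram_gb * users
    return total_vcpu, total_ram_gb
```

The output is only as good as the starting specs, so the numbers should be revisited once you have tested real workloads in your own environment.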
29. For best performance, you recommended, ‘Have enough DDCs to handle requests.’ What numbers would you recommend?
A Delivery Controller can support up to 5000 VDAs. If you have 10,000 VDAs for example, deploy 3 DDCs minimum. You should always follow the N+1 model. This allows you to endure a Delivery Controller failure without impact.
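The N+1 arithmetic above works out as follows: divide the VDA count by the per-controller capacity, round up, then add one spare. The sketch below uses the 5000 figure from the answer; the function name is illustrative.

```python
import math

VDAS_PER_CONTROLLER = 5000  # capacity figure quoted in the answer above


def controllers_needed(vda_count: int) -> int:
    """Minimum Delivery Controllers for vda_count VDAs under the N+1 model."""
    if vda_count <= 0:
        raise ValueError("vda_count must be positive")
    n = math.ceil(vda_count / VDAS_PER_CONTROLLER)  # N: enough to carry the load
    return n + 1                                    # +1: survive one failure
```

For 10,000 VDAs this gives 2 + 1 = 3 controllers, matching the example in the answer.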
30. I have seen dramatic differences in the use of Write Cache space if a machine vDisk is ‘optimized’ after imaging or updating a master VM. Can you explain why this is so?
An optimised image is leaner, so there is less going on. That is the reality. As a result, the Write Cache should not be used as much as a bulky image with everything turned on would use it. I suggest regularly performing a disk defragmentation on your vDisks as that also drives down Write Cache usage.
31. Will session pre-launch utilize system resources even if the user has not launched the application?
Yes. Some processes on the VDA will be running, ultimately consuming resource. However, the resource utilisation should be low given the session will be idle.

Citrix Monitoring

32. For monitoring AppFlow, you mentioned a Premium ADC license is required. Does an Advanced ADC license not give AppFlow monitoring? Is there any other option there to monitor AppFlow?
For HDX Insight, a Premium license offers historical capturing of this data. An Advanced license only provides 1 hour, so basically real-time capturing. Web Insight is different, and does not have a licensing requirement.

eG Innovations and other Citrix Ready monitoring partners offer AppFlow monitoring capabilities that will work if you have an Advanced ADC license.
33. I have some users who report that their session was slow outside of general business hours and Citrix Director doesn’t show if anything was wrong at that time. What tools can I use to capture historic statistics about each virtual desktop?
Citrix ADM (Application Delivery Management) is useful if the affected users are remote workers and your bottleneck is in the network.

Citrix Director provides visibility into specific parts of the infrastructure. To get complete end-to-end visibility into the Citrix tiers (StoreFront, Virtual app servers, license servers, Delivery Controllers, ADCs, PVS, WEM, etc.) and the supporting infrastructure, you can look at eG Innovations and other third-party monitoring solution vendors.
34. Can eG Enterprise detect issues with NetScaler? What if there are session drops? Can you monitor NetScaler devices and flows?
Yes, eG Enterprise monitors Citrix ADC/NetScaler in-depth. See https://www.eginnovations.com/solutions/citrix-netscaler
All the key NetScaler metrics can be monitored without agents. AppFlow data from NetScalers can be exported to eG Enterprise and analysed as well.
35. From a security standpoint, I don’t want to send monitoring data outside my datacenter. Is that possible with eG Enterprise?
Yes, eG Enterprise offers an on-premises solution. The management server, reporting engine and agents can all be deployed on-premises and no data is sent to the cloud.
36. Can Smart Tools only be used if you have an active Citrix Cloud platform?
Smart Tools is available to Citrix Cloud customers and to on-premises Virtual Apps and Desktops customers that hold a “Customer Success Services – Select” agreement. See https://www.citrix.co.uk/products/smart-tools/feature-matrix.html
37. Which tool(s) can provide a logon breakdown? GPO, full breakdown of interactive session, etc.?
Citrix Director 7.18 can provide statistics for Profile Load, Brokering time, GPO Processing time and so on. A further enhancement was made to the product in 7.18 that breaks Interactive Session time down into three sub-sections.

Third-party monitoring tools such as eG Enterprise provide breakdowns of Citrix logon time, including details of interactive session time. This helps administrators quickly determine which logons are slow because of profile loading and which GPOs are slowing down logons. See https://www.eginnovations.com/solutions/citrix-logon-monitoring
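Once you have a per-phase breakdown from whichever tool you use, finding where the time goes is a simple sort. The sketch below assumes the phase durations have already been exported into a dictionary of seconds; the phase names mirror those mentioned above, and the function name is made up for the example.

```python
def slowest_phases(phases: dict[str, float], top: int = 2) -> list[tuple[str, float]]:
    """Return the `top` longest logon phases as (name, seconds), longest first."""
    return sorted(phases.items(), key=lambda item: item[1], reverse=True)[:top]
```

Given a logon where GPO Processing takes 20 seconds and Profile Load takes 12, those two phases come out first, which tells you where to start digging.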
38. Can eG Enterprise be used in conjunction with 3rd party profile management tools such as FSLogix or Liquidware?
eG Enterprise is compatible with all third-party Citrix profile management solutions.
39. In the demo of eG Enterprise, you showed some use cases for detecting slow logon issues, virtualization issues, and non-corporate apps being the cause of resource depletion. What other Citrix problems can eG Enterprise help troubleshoot that Citrix Director cannot?
There are many ways in which eG Enterprise brings value to Citrix customers. These include:
– Ability to monitor the user experience using synthetic monitoring (logon simulation and full session simulation)
– More granular insights into why Citrix logons are slow.
– Real-time monitoring of application launch times and proactive alerting through auto-baselining.
– Unified visibility into all the Citrix tiers – StoreFront, WEM, PVS, ADC, License server, Delivery Controllers, Virtual App servers and Virtual Desktops
– Integrated monitoring of all the supporting tiers including network, virtualization, cloud, storage, Active Directory and so on.
– Embedded auto-correlation and root-cause diagnosis technology that helps easily determine if slowness is due to a Citrix problem or not.
Refer to this blog post for a detailed comparison of Citrix Director and eG Enterprise: https://www.eginnovations.com/blog/citrix-director-end-to-end-citrix-workspace-monitoring/
40. We end up chasing Citrix issues, only to find that it is a problem with a user’s network connection. Can eG Enterprise help us identify these types of problems?
Yes, eG Enterprise monitors ICA round trip time (ICA RTT) which is the latency that the user perceives. In addition, it can report the network latency between the user terminal and the server farm. By comparing these two values, administrators can easily identify if there is a network issue that is affecting Citrix performance. See this short video on how eG Enterprise makes Citrix troubleshooting simple.
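The comparison of ICA RTT against network latency can be sketched as below. The thresholds (150 ms of network latency, 100 ms of additional delay on top of the network) are arbitrary assumptions for illustration, not values from eG Enterprise or Citrix, and the function name is made up. The reasoning is that ICA RTT includes the network latency plus processing time on the client and server.

```python
def locate_delay(ica_rtt_ms: float, network_latency_ms: float,
                 network_threshold_ms: float = 150.0,
                 overhead_threshold_ms: float = 100.0) -> str:
    """Guess where session latency comes from by comparing ICA RTT
    with the network latency between the user terminal and the farm."""
    if network_latency_ms >= network_threshold_ms:
        # The network alone explains a poor experience.
        return "network"
    if ica_rtt_ms - network_latency_ms >= overhead_threshold_ms:
        # The network is fine, so the extra delay is added elsewhere.
        return "server or client processing"
    return "healthy"
```

A session with 320 ms ICA RTT and 300 ms network latency points at the user's connection, whereas 320 ms ICA RTT with only 20 ms of network latency points away from the network.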
41. How can we monitor and identify what is causing long VDI logons? We have been using VDIs for about a year now, but since last December something changed (we don’t know what) and our logon time has increased from about 45 seconds to 2 minutes or more. I know we can use Citrix Director, but I wanted to see if there is another tool that can show more details.
Third-party solutions such as eG Enterprise will provide you with more detail around the logon steps and what is causing them to take a long time. Since you don’t know what changed, it would be worth reviewing whether any Group Policy settings have been added, or what Windows updates have been installed since December. Other things to investigate are drive and printer mappings applied via Group Policy. Are the printers and drive map locations still available? Does Event Viewer on the VDAs give any hints?
42. I have no data being written to the monitoring database, so I see no data in Director. This is after migrating the database from the local SQL Express database to SQL Server 2016 on another server. What should I check to troubleshoot this?
Check that the connection strings have been correctly updated. You may refer to the following blog for assistance: https://www.citrix.com/blogs/2014/02/05/xendesktop-7-x-database-migration/

If you have any further questions on the topic of Citrix troubleshooting, or even if you want to let me know how some of the tips shared in this webinar had a positive effect on your ability to troubleshoot, please leave a comment in the comments section below.

You can watch the recording of the Citrix Troubleshooting 101 webinar at your convenience: Citrix Troubleshooting 101

5 Comments

  • Ray

    February 12, 2019

    Amazing stuff.

  • Anonymous

    February 13, 2019

    Why are you suggesting turning off Nagle’s algorithm? I have looked around and have not yet found a good explanation for turning it off on the netscaler.

    • George Spiers

      February 13, 2019

      Nagle’s algorithm reduces the number of packets that need to be sent by buffering multiple small packets together. When using ICA, we need to deliver small packets as fast as possible to avoid latency. That’s why it should be disabled.

  • Ray

    April 2, 2019

    @ Anonymous
    1.
    https://support.citrix.com/article/CTX121149
    Select the Use Nagle’s algorithm option. In NetScaler 10.5, 11.0 and 11.1 builds the “Nagle’s algorithm” option is under System > Settings > Change TCP Parameters.
    Points to Note:
    – Select this option for heavy flow of small packets. Not recommended for ICA traffic.
    – Nagle’s algorithm is disabled in default TCP profile. However, in a custom TCP profile it is enabled currently and this will be fixed in an upcoming release . Custom profiles have to be explicitly bound for the settings to be honored.

    2
    https://twitter.com/msandbu/status/1083493195969974272

  • Vinay Singh

    April 9, 2019

    Beautiful Article. Many Thanks George
