Java EE Enterprise Performance Problems and Reasons

Performance issues are among the biggest obstacles to anticipate when developing and implementing Java EE-related technologies. Some of these common issues surface in both lightweight and large IT environments, which typically consist of numerous distributed systems, from Web portals and ordering applications to enterprise service buses (ESB), data warehouses, and legacy Mainframe storage systems.

It is extremely important for IT architects and Java EE developers to understand their client environments and ensure that the proposed solutions will not only meet their growing business needs but also guarantee a scalable and reliable production IT environment over the long term, at the most affordable cost possible. Performance problems can disrupt your client’s business, resulting in short- and long-term loss of revenue.

Here are a few of the top causes of Java EE performance problems:

Lack of proper capacity planning

Capacity planning can be defined as a thorough and evolving process of measuring and predicting the current and future capacity required of an IT environment. A properly implemented capacity planning process will not only guarantee and monitor the capacity and stability of the current IT production environment but also ensure that new projects can be deployed into it with minimal risk. Such an exercise can also conclude that extra capacity (hardware, middleware, JVM, tuning, etc.) is required prior to project deployment.

One key element of capacity planning is load and performance testing, which everyone should be familiar with. This involves generating load against a production-like environment, or the production environment itself, in order to:

  • Determine how many concurrent users and what order volumes your application(s) can support
  • Expose your platform and Java EE application bottlenecks, enabling you to take corrective actions (middleware tuning, code changes, infrastructure and capacity improvements, etc.)


There are numerous technologies out there that allow you to achieve these objectives. Some load-testing products let you generate load from a test lab within your own network, while other, emerging technologies let you generate load from the “Cloud”.

Regardless of the load and performance testing tool you decide to use, this exercise should be done on a regular basis for any dynamic Java EE environment and as part of a thorough and adaptive capacity planning process. When done properly, capacity planning will help increase the service availability of your client’s IT environment.

Inadequate Java EE middleware environment specifications

The second most common cause of performance problems in Java EE enterprise systems is an inadequate Java EE middleware environment and/or infrastructure. Failing to make proper decisions at the inception of a new platform can lead to major stability problems and increased costs for your client in the long term. For that reason, it is important to spend enough time brainstorming on the required Java EE middleware specifications. This exercise should be combined with an initial capacity planning iteration, because the business processes, anticipated traffic, and application footprint will ultimately dictate the initial IT environment capacity requirements.

Below are common examples of such issues:

  • Deployment of too many Java EE applications in a single 32-bit JVM.
  • Deployment of too many Java EE applications in a single middleware domain.
  • Lack of proper vertical scaling and under-utilized hardware (e.g., traffic driven by only one or a few JVM processes).
  • Excessive vertical scaling and over-utilized hardware (e.g., too many JVM processes relative to the available CPU cores and RAM).
  • Absence of environment redundancy and fail-over capabilities.


Trying to leverage a single middleware domain and/or JVM for many large Java EE applications can be quite appealing from a cost standpoint. However, it can lead to operational headaches and serious performance problems, such as excessive JVM garbage collection and many domino-effect scenarios (e.g., stuck Threads) with high business impact (e.g., App A causing App B, App C, and App D to go down, because a full JVM restart is often required to fix such problems).

Recommendations

  • The project team should invest enough time creating a proper operational model for the Java EE production environment.
  • Try to find a good “balance” in your Java EE middleware specifications that provides the business and operations teams proper flexibility in the event of outage scenarios.
  • Avoid deploying too many Java EE applications in a single 32-bit JVM. The middleware is designed to handle many applications, but your JVM may suffer the most.
  • Select a 64-bit over a 32-bit JVM where required, but combine it with proper capacity planning and performance testing to ensure your hardware will support it.


Excessive Java VM garbage collections

Now let’s jump to purely technical problems, starting with excessive JVM garbage collection. Most of you are familiar with this famous (or infamous) Java error: java.lang.OutOfMemoryError. It is the result of the depletion of a JVM memory space (Java Heap, Native Heap, etc.).

Keep in mind that a garbage collection problem will not necessarily manifest itself as an OOM condition. Excessive garbage collection can be defined as an excessive number of minor and/or major collections performed by the JVM GC Threads (collectors) in a short amount of time, leading to high JVM pause time and performance degradation. There are numerous possible causes:

  • The chosen Java Heap size is too small for the JVM concurrent load and application memory footprint.
  • An unsuitable JVM GC policy is used.
  • Your application’s static and/or dynamic memory footprint is too big to fit in a 32-bit JVM.
  • The JVM OldGen space is leaking over time (a quite common problem); excessive GC (major collections) is observed after a few hours or days.
  • The JVM PermGen space (HotSpot VM only) or Native Heap is leaking over time (a rather common problem); OOM errors are often observed over time following application dynamic redeployments.
  • The ratio of YoungGen to OldGen space is not optimal for your application(s) (e.g., a bigger YoungGen space is required for applications generating a massive amount of short-lived objects, while a larger OldGen space is needed for applications creating a lot of long-lived/cached objects).
  • The Java Heap size used for a 32-bit VM is too big, leaving little room for the Native Heap. Problems can manifest as OOM errors when attempting to deploy a new Java EE application, create new Java Threads, or run any computing task that requires native memory allocation.


Before pointing a finger at the JVM, keep in mind that the actual “root” cause may relate to causes #1 and #2 above. An overloaded middleware environment will produce many symptoms, including excessive JVM garbage collection.

Proper analysis of your JVM-related data (memory spaces, GC frequency, CPU correlation, etc.) will allow you to determine whether you are facing a problem or not. A deeper level of analysis to understand your application’s memory footprint will require you to analyze JVM heap dumps and/or profile your application with a profiler tool of your choice (such as JProfiler).

Recommendations

  • Ensure that you monitor and understand your JVM garbage collection very closely. There are several commercial and free tools available to do so; at a minimum, you should enable verbose GC, which will provide all the data you need for your health assessment. A minimal GC-monitoring sketch follows this list.
  • Keep in mind that GC-related problems are unlikely to be caught during development or functional testing. Proper garbage collection tuning will require you to perform load and performance testing with a high volume of concurrent users. This exercise will allow you to fine-tune your Java Heap memory footprint as per your application’s behaviour and forecast load level.
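As a minimal illustration (the class name and output format are hypothetical, and this is only a sketch, not a replacement for verbose GC logs or a full monitoring tool), the standard java.lang.management API exposes per-collector counts and accumulated times that can be polled to detect excessive garbage collection:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcProbe {

    // Prints cumulative GC activity per collector; polling this periodically
    // (and diffing the values) reveals excessive minor/major collections.
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total collection time%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```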


Too many or poorly designed integrations with external systems

The next common cause of poor Java EE performance mainly applies to highly distributed systems, typical of Telecom IT environments. In such environments, a middleware domain (e.g., a Service Bus) will rarely do all the work itself; rather, it “hands over” some of the business processes, such as product qualification, customer profile lookup, and order management, to other Java EE middleware platforms or legacy systems such as the Mainframe, via various payload types and communication protocols.

Such external system calls mean that the client Java EE application will trigger the creation or reuse of socket connections to write and read data to and from external systems across a private network. Some of these calls can be configured as synchronous or asynchronous, depending on the implementation and the nature of the business process. It is important to note that response times can change over time depending on the health of the external systems, so it is crucial to shield your Java EE application and middleware through proper use of timeouts.

Major problems and performance slowdowns can be observed in the following situations:

  • Too many external system calls are performed in a synchronous and sequential manner. Such an execution is also fully exposed to instability and slowdowns in the external systems (one mitigation is sketched below this list).
  • Timeouts between the Java EE client applications and external systems are missing, or their values are too high. This will cause client Threads to become stuck, which can trigger a full domino effect.
  • Timeouts are properly implemented but the middleware is not fine-tuned to handle the “non-happy” path. Any increase in response time (or failure) of an external system will lead to increased Thread utilization and Java Heap usage (an increased amount of pending payload data). The middleware environment and JVM have to be tuned in a way that anticipates and handles both the “happy” and “non-happy” paths in order to prevent a full domino effect.
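As one mitigation sketch (the names, pool size, and timeout value are hypothetical, and in a real Java EE container you would typically use the container’s managed thread facilities rather than a raw ExecutorService), external calls can be parallelized and bounded with a timeout instead of leaving a container Thread waiting indefinitely:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ExternalSystemClient {

    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public String getCustomerProfile(final String customerId) throws Exception {
        Future<String> future = pool.submit(new Callable<String>() {
            @Override
            public String call() {
                // Placeholder for the real downstream call (product
                // qualification, customer profile, order management, etc.).
                return callExternalSystem(customerId);
            }
        });
        try {
            // Bound the wait: never let a Thread hang forever on a slow
            // external system.
            return future.get(5, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // free the worker thread
            return null;         // or a fallback / error response
        }
    }

    private String callExternalSystem(String customerId) {
        // The real socket/HTTP call, which should itself have connect and
        // read timeouts configured (see the timeout section below).
        return "profile-for-" + customerId;
    }
}
```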


Lack of appropriate database SQL tuning & capacity planning

The next common performance problem should not be a surprise to anyone: database issues. Most Java EE enterprise systems rely on relational databases for various business processes, from portal content management to order provisioning systems. A solid database environment and foundation will ensure that your IT environment scales properly to support your client’s growing business.

In my production support experience, database-related performance problems are very common. Because most database transactions are typically executed through JDBC datasources (including for relational persistence APIs such as Hibernate), performance problems will initially manifest as stuck Threads in your Java EE container Thread manager. The following are common database-related problems:

  • Isolated, long-running SQL statements. This problem manifests as stuck Threads and is typically a symptom of missing SQL tuning, missing indexes, a non-optimal execution plan, a returned dataset that is too big, etc. (a defensive query-timeout sketch follows this list).
  • Table- or row-level data locks. This problem can manifest especially when dealing with a two-phase commit transactional model (e.g., the infamous Oracle in-doubt transactions). In this situation, the Java EE container can leave some pending transactions awaiting a final commit or rollback, leaving data locks that can trigger performance problems until those locks are removed. This can happen as the result of a trigger event such as a middleware outage or server crash.
  • Sudden change of a query’s execution plan.
  • Lack of proper management of the database infrastructure. For example, Oracle has several areas to watch, such as REDO logs, database data files, etc. Problems such as a lack of disk space or a log file not rotating can trigger major performance problems and outage scenarios.
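As a defensive sketch against isolated long-running SQLs (the query, table, and 30-second value are hypothetical, and this complements rather than replaces proper SQL tuning), JDBC allows a per-statement query timeout:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class CustomerQuery {

    public String findName(Connection con, long customerId) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT name FROM customers WHERE customer_id = ?")) {
            // Ask the driver to abort the statement after 30 seconds instead
            // of leaving the container Thread stuck indefinitely.
            ps.setQueryTimeout(30);
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```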


Recommendations

  • Proper capacity planning involving load and performance testing is vital here to fine-tune your database environment and uncover any problems at the SQL level.
  • If you are using Oracle databases, ensure that your DBA team reviews the AWR report on a regular basis, especially in the context of an incident and root cause analysis process. The same analysis approach should be applied with other database vendors.
  • Take advantage of JVM Thread dumps and the AWR report to pinpoint slow-running SQLs, and/or use a monitoring tool of your choice to do the same.
  • Make sure to spend enough time fortifying the “operational” side of your database environment (disk space, data files, REDO logs, table spaces, etc.), along with proper monitoring and alerting. Failure to do so can expose your client’s IT environment to major outage scenarios and many hours of downtime.


Application-specific performance problems

To wrap up: so far we have seen the importance of proper capacity planning, load and performance testing, middleware environment specifications, JVM health, external systems integration, and the relational database environment. But what about the Java EE application itself? After all, your IT environment could have the fastest hardware on the market, with many CPU cores, huge amounts of RAM, and dozens of 64-bit JVM processes; yet performance can still be terrible if the application implementation is poor.

My main recommendation is to ensure that code reviews are part of your regular development cycle and release management process. This will allow you to identify major implementation problems, such as those described below, before the major testing and deployment phases.

Thread-safe code problems

Proper care is required when using Java synchronization and non-final static variables/objects. In a Java EE environment, any static variable or object must be thread safe to ensure data integrity and predictable results. Improper use of a static variable as a Java class member variable can lead to unpredictable results under load, because these variables/objects are shared between Java EE container Threads (e.g., Thread B can modify a static variable value in use by Thread A, triggering unexpected and incorrect behavior). A class member variable should be declared non-static to remain within the current class instance context, so that each Thread has its own copy.

Java synchronization is also quite vital when dealing with non-thread-safe data structures such as java.util.HashMap. Failure to synchronize can trigger HashMap corruption and infinite looping. But be careful with Java synchronization as well, since excessive use can also lead to stuck Threads and poor performance. A minimal sketch follows.
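Below is a minimal sketch of one safe alternative (the class and field names are hypothetical): a shared static map backed by ConcurrentHashMap instead of a plain HashMap, avoiding both corruption under concurrent writes and a coarse global synchronized block:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SessionRegistry {

    // Unsafe alternative: a plain static HashMap shared across container
    // Threads can be corrupted under concurrent writes (lost updates, and
    // even infinite loops on some JDKs):
    // private static final Map<String, String> SESSIONS = new HashMap<String, String>();

    // Thread-safe: ConcurrentHashMap allows concurrent access without a
    // global synchronized block.
    private static final Map<String, String> SESSIONS =
            new ConcurrentHashMap<String, String>();

    public static void register(String sessionId, String user) {
        SESSIONS.put(sessionId, user); // safe across container Threads
    }

    public static String lookup(String sessionId) {
        return SESSIONS.get(sessionId);
    }
}
```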

Lack of communication API timeouts

It is extremely important to implement and test transaction timeouts (Socket read() and write() operations) and connection timeouts (the Socket connect() operation) for every communication API. A lack of proper HTTP/HTTPS/TCP/IP … timeouts between the Java EE application and external system(s) can lead to severe performance degradation and outages due to stuck Threads. Proper timeout implementation will prevent Threads from waiting too long in the event of a major slowdown of your downstream systems. A minimal sketch follows.
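Here is a minimal raw-socket sketch (the host, port, and timeout values are hypothetical and should come from your capacity planning and SLAs) showing both a bounded connect() and a bounded read():

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;

public class TimeoutAwareClient {

    private static final int CONNECT_TIMEOUT_MS = 3000; // Socket connect() bound
    private static final int READ_TIMEOUT_MS = 10000;   // Socket read() bound

    public byte[] call(String host, int port, byte[] request) throws IOException {
        try (Socket socket = new Socket()) {
            // Bound the connection attempt instead of relying on OS defaults.
            socket.connect(new InetSocketAddress(host, port), CONNECT_TIMEOUT_MS);
            // Bound each blocking read(); a SocketTimeoutException is thrown
            // when the limit is exceeded, instead of the Thread hanging.
            socket.setSoTimeout(READ_TIMEOUT_MS);

            socket.getOutputStream().write(request);
            socket.getOutputStream().flush();

            byte[] buffer = new byte[4096];
            int read = socket.getInputStream().read(buffer);
            return read < 0 ? new byte[0] : Arrays.copyOf(buffer, read);
        }
    }
}
```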

I/O, JDBC, or relational persistence API resource management problems

Proper coding best practices are essential when implementing a raw DAO layer or using relational persistence APIs such as Hibernate. The goal is to ensure proper Session/Connection resource closure. Such JDBC-related resources must be closed in a finally block (or via try-with-resources on Java 7+) to properly handle any failure scenario. Failure to do so can cause JDBC connection pool leaks and, ultimately, stuck Threads and a full outage scenario. A minimal sketch follows.
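A minimal DAO sketch (the DataSource wiring, table, and column names are hypothetical) using try-with-resources, which guarantees the Connection, Statement, and ResultSet are closed even when an exception is thrown:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class OrderDao {

    private final DataSource dataSource; // container-managed JDBC DataSource

    public OrderDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String findStatus(long orderId) throws SQLException {
        // Resources are closed in reverse order of acquisition, on both the
        // success and the failure paths, preventing connection pool leaks.
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT status FROM orders WHERE order_id = ?")) {
            ps.setLong(1, orderId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("status") : null;
            }
        }
    }
}
```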

The same rule applies to I/O resources such as InputStream: when they are no longer used, proper closure is required; otherwise, this can lead to socket/file descriptor leaks and a complete JVM hang.

Lack of proper data caching

Performance problems can be the result of repeated and excessive computing tasks, such as I/O and disk access, fetching content data from a relational database, and retrieving customer-related data. Static data with a reasonable memory footprint should be cached properly, either in the Java Heap memory or via a data caching system.

Static files such as property files should also be cached to prevent excessive disk access. Simple caching strategies can have a very positive effect on your Java EE application performance, as the sketch below illustrates.
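A minimal property-file caching sketch (the class and the resource name "app.properties" are hypothetical): the file is read from disk once per JVM and served from memory thereafter:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public final class ConfigCache {

    // Loaded once at class initialization and reused, instead of re-reading
    // the file from disk on every request.
    private static final Properties PROPS = load();

    private ConfigCache() {
    }

    private static Properties load() {
        Properties props = new Properties();
        try (InputStream in =
                ConfigCache.class.getResourceAsStream("/app.properties")) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            throw new IllegalStateException("Unable to load app.properties", e);
        }
        return props;
    }

    public static String get(String key) {
        return PROPS.getProperty(key);
    }
}
```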

Data caching is also important when dealing with Web Services and XML-related APIs, since such APIs can generate excessive dynamic Class loading and I/O/disk access. Make sure that you follow each API’s best practices and use the appropriate caching strategies (singleton pattern, etc.) where suitable.

Excessive data caching

Ironically, while data caching is crucial for proper performance, it can also be responsible for major performance problems. Why? Well, if you attempt to cache too much data in the Java Heap, then you will be struggling with excessive garbage collection and OutOfMemoryError conditions. The goal is to find a proper balance (through your capacity planning process) between data caching, Java Heap size, and the available hardware capacity.

Excessive logging

Last but not least: excessive logging. It is a good practice to ensure proper logging within your Java EE application implementation. However, be careful with the logging level that you enable in your production environment. Excessive logging will trigger high I/O on your server and increase CPU utilization. This can especially be a problem for environments running older hardware or dealing with very heavy concurrent volumes. A minimal sketch follows.
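One common mitigation is to guard verbose log statements so their message-construction cost is only paid when the level is actually enabled. A minimal sketch using the JDK’s own java.util.logging (the class and messages are hypothetical; the same pattern applies to Log4j, SLF4J, etc.):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class OrderService {

    private static final Logger LOG =
            Logger.getLogger(OrderService.class.getName());

    public void process(String orderId) {
        // Guard expensive log statements so string concatenation and any
        // costly toString() calls only happen when FINE is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Processing order " + orderId + " with full payload dump...");
        }

        // ... business logic ...

        // Production should typically run at INFO or above; verbose levels
        // belong to troubleshooting windows, not steady state.
        LOG.info("Order processed: " + orderId);
    }
}
```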

Java EE middleware tuning issues

It is important to realize that your Java EE middleware specifications might be adequate while still lacking proper tuning. Most Java EE containers available today provide you with multiple tuning opportunities depending on your applications’ and business processes’ needs. Failure to implement proper tuning and best practices can leave your Java EE container in a non-optimal state.

Insufficient proactive monitoring

A lack of monitoring does not actually “cause” performance problems, but it can prevent you from understanding the Java EE platform’s capacity and health. Eventually, the environment can reach a breaking point, which may expose several gaps and problems (JVM memory leaks, etc.). From my experience, it is much harder to support an environment after months or years of operation than it is to have proper monitoring, tools, and processes implemented from day one.

That being said, it is never too late to improve an existing environment. Monitoring can be implemented fairly easily. My recommendations:

  • Review your existing Java EE environment monitoring capabilities and identify improvement opportunities.
  • Your monitoring solution should cover the end-to-end environment as much as possible, including proactive alerting (a minimal JVM health-probe sketch follows this list).
  • The monitoring solution should be aligned with the capacity planning process discussed in our first section.
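As a minimal illustration of a JVM-level health probe (the class name and console output are hypothetical; a real deployment would feed these metrics into your monitoring system rather than print them), the java.lang.management API can report heap usage, Thread counts, and even deadlocked Threads:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class JvmHealthProbe {

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        long usedMb = memory.getHeapMemoryUsage().getUsed() / (1024 * 1024);
        long maxMb = memory.getHeapMemoryUsage().getMax() / (1024 * 1024);
        System.out.printf("Heap used: %d MB of %d MB%n", usedMb, maxMb);

        System.out.println("Live Threads: " + threads.getThreadCount());

        // A non-null result means Threads are deadlocked and will never
        // recover without intervention; this should raise a proactive alert.
        long[] deadlocked = threads.findDeadlockedThreads();
        if (deadlocked != null) {
            System.err.println("ALERT: " + deadlocked.length + " deadlocked Threads");
        }
    }
}
```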


Saturated hardware on a shared infrastructure

Another common source of performance problems is hardware saturation. This problem is typically observed when too many Java EE middleware environments, along with their JVM processes, are deployed on the existing hardware. Too many JVM processes relative to the available physical CPU cores can be a real problem, killing your application performance. Again, your capacity planning process must also take care of hardware capacity as your client’s business grows.

My main recommendation is to look at hardware virtualization. Such an approach is quite common these days and has several advantages, such as a reduced number of physical servers, a smaller data center, dedicated physical resources per virtual host, fast implementation, and reduced costs for your client. Dedicated physical resources per virtual host are especially important, because the last thing you want is one Java EE container bringing down all the others due to excessive CPU utilization.

Network latency issues

Our last source of performance problems is the network. Major network problems can occur from time to time, such as router, switch, and DNS server failures. However, the more common problems observed are usually due to regular or intermittent latency in a highly distributed IT environment. Consider, for example, the network latency gap between two geographic regions hosting a WebLogic cluster that communicates with an Oracle database server located in only one of those regions.

Intermittent or regular latency problems can definitely trigger some major performance problems and affect your Java EE application in different ways:

  • Applications issuing database queries that return large datasets are fully exposed to network latency due to the high number of fetch iterations (round trips across the network).
  • Applications handling large data payloads (such as large XML payloads) to and from external systems are also exposed to network latency, which can trigger intermittent high response times when sending requests and receiving responses.
  • The Java EE container replication process (clustering) can be affected and its fail-over capabilities compromised (e.g., multicast or unicast packet losses).


Tuning strategies such as JDBC row data “prefetch”, XML data compression, and data caching can help mitigate network latency, as the sketch below illustrates for JDBC. But such latency problems must be reviewed closely when first designing the network topology of a new IT environment.
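A minimal JDBC row-prefetch sketch (the query, table, and fetch size of 500 are hypothetical values to be tuned through your own testing): a larger fetch size trades some client memory for far fewer network round trips when iterating large result sets:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class ReportQuery {

    private static final int FETCH_SIZE = 500;

    public int countOrders(DataSource ds) throws SQLException {
        try (Connection con = ds.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT order_id FROM orders")) {
            // Hint to the driver to fetch rows in batches of FETCH_SIZE
            // instead of its default (e.g., 10 rows for the Oracle driver),
            // cutting the number of network round trips.
            ps.setFetchSize(FETCH_SIZE);
            int rows = 0;
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows++;
                }
            }
            return rows;
        }
    }
}
```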
