This sponsored post explores new benchmarks from Computer Simulation Technology on their recently optimized 3D electromagnetic field simulation tools that compare the new Intel Xeon Scalable processors with previous generation Intel Xeon processors.
New benchmarks from Computer Simulation Technology (CST) on their recently optimized 3D electromagnetic (EM) field simulation tools compare the performance of the new Intel Xeon Scalable processors with previous generation Intel Xeon processors.
The company’s 3D EM field simulation tools are used in a diverse range of industries, including aerospace, automotive, defense, electronics, education, healthcare, energy, semiconductors and telecommunications. The insights CST has gleaned from optimizing its code let it make performance recommendations that help engineers and researchers get the most out of their devices.
Small changes to the dimensions of a component can have a big effect on its tuning and efficiency. When multiple variables are involved, the interactions between them can be complex, and finding the optimum values analytically is often impossible [1].
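When the optimum cannot be derived analytically, it is usually found numerically by evaluating many candidate parameter sets. The following is a minimal sketch of such a parameter sweep. It is not CST's API: the function `toy_return_loss`, the parameter names `length_mm` and `width_mm`, and the grid values are all hypothetical stand-ins for a real simulation run.

```python
import itertools


def toy_return_loss(length_mm, width_mm):
    """Hypothetical stand-in for a simulation result (lower is better).

    The cross term makes the two parameters interact, so neither can be
    tuned in isolation.
    """
    dl = length_mm - 30.0
    dw = width_mm - 12.0
    return dl ** 2 + dw ** 2 + 0.5 * dl * dw


def sweep(objective, grid):
    """Evaluate the objective on every combination of values in the grid.

    Returns the parameter set with the smallest objective value, together
    with that value.
    """
    names = list(grid)
    best_params, best_value = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        value = objective(**params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


# Hypothetical candidate dimensions for the sweep.
grid = {
    "length_mm": [28.0, 29.0, 30.0, 31.0, 32.0],
    "width_mm": [10.0, 11.0, 12.0, 13.0, 14.0],
}

best_params, best_value = sweep(toy_return_loss, grid)
print(best_params, best_value)
```

An exhaustive sweep grows quickly with the number of parameters, which is one reason each evaluation needs to be fast when every evaluation is a full 3D simulation.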
CST has extensive experience in configuring high performance computing (HPC) systems and optimizing parallel processing code. The parameterization and optimization tools built into CST STUDIO SUITE let users check how a device’s behavior is affected as its properties change, and find the parameters that maximize or minimize a given effect or fulfill a certain goal. The recent benchmarks of the optimized code were run on a system equipped with dual Intel Xeon Gold 6148 processors, which showed performance improvements across a broad range of applications and solvers compared with a system with dual Intel Xeon processors E5-2697 v4 [2].
“CST provides our customers with better performance and shorter development cycles. Customer compute models, such as time parameter studies, that were previously too compute intensive can be handled thanks to CST’s continuous code optimizations and improved hardware support including MPI clusters (especially parallelization for multi-core and many-core systems as well as for distributed memory systems),” states Dr. Ilari Hänninen, CST principal engineer.

Figure 1: Speedup for different solvers and models on the Intel Xeon Gold 6148 processor vs. the Intel Xeon processor E5-2697 v4 (Courtesy of CST)
Simulation of Lightning Strike with CST Transient Solver
The Transient Solver tool is one of the most important solvers that CST has optimized for parallel processing with Intel Xeon Phi processor support. Using the Transient Solver with Message Passing Interface (MPI) computing allows CST customers to handle electrically very large and complex models efficiently. Examples of such models include lightning strike simulations, large antenna arrays, and antenna placement studies [3].

Figure 2. Transient simulation of lightning strike (Courtesy of CST)
Implementing IA-Based Systems and Improving Code Performance
The CST technical team has extensive experience in IA systems and parallel code optimization including multi-threading, distributed computing and MPI computing. CST uses standard dual Intel Xeon Phi processor-based servers and customers use mostly Intel Xeon processor E5 based systems. According to Hänninen, “A key benefit for our customers will be enhanced performance of the Intel Xeon Phi platform over a normal high-end CPU platform allowing them to get a higher simulation performance throughput and shorter time to market.”
Dr. Fabrizio Zanella, Systems Manager at Computer Simulation Technology (CST), states, “We tune and optimize our code to take advantage of the parallel processing capabilities of the Intel Xeon and Xeon Phi processor.” CST ships the Intel MPI libraries with its software so that customers don’t need to worry about the cluster environment settings. CST works on parallelization, vectorization, multi-threading, cache optimization and NUMA awareness, with a special focus on optimizing CST software solvers on Intel-based cluster systems.
Benchmarking and Scaling of CST Transient Solver
CST has done extensive optimization and testing of the Transient Solver because it is a tool that CST customers use frequently. The company optimized the tool using hardware acceleration to improve its performance on multi-core systems relative to single-core systems.
MPI Computing and Speed up of the Solver Loop
Typically, the solver loop (or solver run) is where most of the computational time is spent during a transient simulation. The bottleneck that limits the performance of the transient solver is the memory bandwidth of the system, which indicates that the transient solver algorithm is memory bound and that many CPU cores are competing for memory access. MPI computing allows the computational workload of a large model to be split across a computer cluster with a high-speed interconnect. CST STUDIO SUITE 2017 adds support for Intel Omni-Path Architecture [4].
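To illustrate what splitting a model's workload across a cluster can look like, the sketch below divides a grid of cells into contiguous, nearly equal blocks, one per MPI rank. This is a generic domain-partitioning example, not CST's actual decomposition scheme: the function name `partition` and the cell and rank counts are assumptions for illustration, and no MPI library is called.

```python
def partition(n_cells, n_ranks):
    """Split n_cells into n_ranks contiguous [start, stop) ranges.

    Range sizes differ by at most one cell, so every rank gets a
    near-equal share of the work.
    """
    if n_ranks <= 0:
        raise ValueError("n_ranks must be positive")
    base, extra = divmod(n_cells, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional cell each.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Hypothetical example: 10 cells shared among 4 ranks.
blocks = partition(10, 4)
for rank, (start, stop) in enumerate(blocks):
    print(f"rank {rank}: cells {start}..{stop - 1} ({stop - start} cells)")
```

Because each cluster node has its own memory, every rank only streams its own block of cells, which is one reason distributing a memory-bound workload can help. In a real solver, neighboring ranks would also have to exchange boundary values over the interconnect at every time step.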
CST Solves HPC Challenges to Increase Performance for Customers
There are challenges in working with large, complex 3D EM models on high performance computing (HPC) systems due to the complexity of the system and model sizes. Hänninen states, “There are more cores, so scalability is a challenge with up to 100 cores. To optimize CST code to run well on parallel systems, we must consider NUMA aware shared memory systems, system topology as well as how coprocessors are connected to the CPU and if they are running on the same socket. There can be memory bandwidth bottlenecks in models with large amounts of data or streaming data. We need to be aware of the complexity of the systems and how to handle issues. Part of our service is to work with customers and use generic models in benchmark testing. Our team works with the customers in terms of testing of models and configuration settings to make good recommendations for customers so they get a well performing system and the best performance when running the models.”
Learn more about how CST MICROWAVE STUDIO* performs with the latest Intel HPC technologies in a December 19, 2017 webinar. Register for 3D EM MPI Simulation on Intel® Xeon® Scalable Processors with Intel Omni-Path Architecture.
References
[1] https://www.cst.com/products/hpc
[2] http://cst-simulation.blogspot.de/2017/09/performance-of-cst-studio-suite-solvers.html
[3] https://www.cst.com/products/cstmws/solvers
[4] http://cst-simulation.blogspot.de/2017/09/support-of-intel-omni-path-hpc.html
Linda Barney is the founder and owner of Barney and Associates, a technical/marketing writing, training and web design firm in Beaverton, Ore.