Issue 1,071 | 26 October 2020
Guest editorial by Roman Lygin, CAD Exchanger
In CAD, parallel computing is tough.
Parallelism is where the computer runs two or more pieces of computer code at the same time on CPUs and/or GPUs. Nearly all CPUs sport multiple cores (the equivalent of two or more CPUs in a single chip) and handle multiple threads (more than one software operation at a time). GPUs are the graphics boards in our computers, and they commonly boast thousands of tiny CPUs, huge amounts of on-board data storage, and high data speeds.
Parallel operations on CPUs in CAD are limited to loading files and regenerating drawings, for the most part. In GPUs, parallel operations are useful for speeding up predictable actions like renderings and finite element analyses. Most of what we do in CAD cannot, however, be predicted by software. When we, for instance, start drawing a 3D solid cylinder, the software has no idea what step we take next -- specify the radius, drag the height, switch to a mesh model, specify an angle... And so CAD software is, for the most part, single-threaded, running on a single core.
Should someone make headway with parallelism in CAD, it would be an important advance for our industry, because our favorite software would then run much faster. Roman Lygin heads up CAD Exchanger, and he talked with me about how his team works with parallelism in their software.
- Ralph Grabowski, editor
- - -
Indeed, parallelism in CAD is tough. There are two important issues here.
Firstly, parallelism is challenging for the ordinary software developer. And the truth is that not every developer on the team should be involved in parallelism implementation. The entry barrier is quite high and it is best to have skilled people to do it.
Secondly, bringing parallelism to CAD is especially difficult due to CAD-specific challenges.
Parallelism requires good understanding of lower-level architecture (CPU and memory, processes and threads) as well as parallelism-specific issues, such as data races, work imbalance, and associated overhead.
Truth be told, I have not seen decent GPU parallelism come to CAD, except in the limited case of triangulation of huge 3D models, where the overhead of data transfers between RAM and GPU memory is offset by speed-ups due to the massive parallelism made possible by the GPU. The problem is that GPU parallelism is vulnerable to branching (such as through if/else statements), which is very common in CAD. I believe that CPU parallelism has a greater potential for CAD, and so let's set aside the GPU to focus on the CPU.
CPU parallelism can be viewed as a three-level pie, a metaphor I first heard from Arch Robison, former Intel senior principal engineer and the first architect of Intel Threading Building Blocks, the C++ library for parallel computing.
The three levels are as follows:
1. Multi-process parallelism. Usually MPI-enabled [message passing interface], the program runs in independent processes, with instances communicating between one another. Communication can be via shared memory when running on the same computer, or through a network when running on a cluster.
2. Multi-threading. Each process works through multiple threads that share data in the same (i.e. shared) memory. This applies to multi-core CPUs and multi-CPU workstations.
3. SIMD (Single Instruction, Multiple Data). Thanks to long CPU registers, one arithmetic command can be applied simultaneously to several data items, such as two or four doubles [8 bytes each].
Multi-process parallelism (approach #1) can probably be applied reasonably well to processing huge assemblies, when the entire assembly simply cannot fit into a single process memory. We have never experimented with that, but we have seen our clients trying to achieve it with our APIs, and it made sense. You still have to apply a lot of intelligence to achieve efficient scalability and reduce overhead, as assemblies can be very imbalanced internally.
In CAD, the greatest potential in most cases lies in multi-threading (approach #2), especially when working with medium to large models. This is where we apply parallelism the most.
Approach #3 can be applied quite efficiently to NURBS [non-uniform rational B-splines] evaluation. In 2013, I wrote a paper, “Applying Vectorization Techniques for B-Spline Surface Evaluation” (see figure 1), that found the local speed-up was 15x or more, thanks to applying FMA [fused multiply-add] instructions in the most recent CPUs of that time.
Figure 1: From the paper, the function for determining surface point coordinates
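For readers without the figure at hand, the textbook definition of a point on a NURBS surface is given here. This is the standard formula, with degrees p and q, basis functions N, weights w, and control points P; the notation is an assumption and may differ from the one used in the paper.

```latex
S(u,v) = \frac{\displaystyle\sum_{i=0}^{n}\sum_{j=0}^{m} N_{i,p}(u)\, N_{j,q}(v)\, w_{i,j}\, P_{i,j}}
              {\displaystyle\sum_{i=0}^{n}\sum_{j=0}^{m} N_{i,p}(u)\, N_{j,q}(v)\, w_{i,j}}
```

Both numerator and denominator are running sums of products, which is the pattern that a fused multiply-add instruction computes in a single step, and the x, y, and z coordinates of P can be processed side by side in one SIMD register.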
CAD-specific challenges manifest themselves through several cases, especially the following ones:
Data structures with strong interdependence. For instance, a b-rep [boundary representation] graph has one shell referring to multiple faces. Each face refers to an edge shared by two faces (in a closed-manifold solid body). Each vertex can be shared by multiple edges, and so on. To efficiently process such a structure, you have to take care of proper data access synchronization to prevent data races, which can cause invalid executions and make it hard to trace bugs.
But if you do this naively by just applying a typical mutex [mutually exclusive flag], then you can easily kill all scalability because each fine-grain access to a vertex, an edge, and so on will be mutex-protected, incurring significant overhead -- which you don't have in pure serial code. [A mutex is a gatekeeper that allows one thread in but blocks access to the others.]
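The difference between the naive and the restructured approach can be sketched as follows. This is a minimal illustration, not CAD Exchanger's implementation: the toy `faces` list and both function names are made up for the example, and Python is used only for brevity (in standard CPython, threads do not execute this kind of code truly in parallel, so the sketch shows the locking pattern rather than the speed-up).

```python
import threading
from collections import Counter

# Hypothetical toy b-rep: each face is a tuple of vertex ids, and
# neighbouring faces share vertices, as they would in a solid body.
faces = [(i, i + 1, i + 2) for i in range(1000)]

def _run(work, items, n_threads):
    """Split items into n_threads slices and run work() on each in a thread."""
    chunks = [items[i::n_threads] for i in range(n_threads)]
    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def count_with_fine_grained_lock(faces, n_threads=4):
    """Naive version: every single vertex access takes the shared mutex."""
    usage = Counter()
    lock = threading.Lock()

    def work(chunk):
        for face in chunk:
            for vertex in face:
                with lock:               # one lock acquisition per access
                    usage[vertex] += 1

    _run(work, faces, n_threads)
    return usage

def count_with_local_results(faces, n_threads=4):
    """Restructured version: private results, one lock acquisition per thread."""
    usage = Counter()
    lock = threading.Lock()

    def work(chunk):
        local = Counter()                # no other thread can see this
        for face in chunk:
            local.update(face)
        with lock:                       # merge once, at the very end
            usage.update(local)

    _run(work, faces, n_threads)
    return usage
```

Both functions return the same vertex-usage counts; the first takes the lock 3,000 times for this toy model, the second only four times.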
Data imbalance. In a typical assembly, you likely have parts with distinct complexity, such as a bolt being different from an impeller. Inside each part, you can have very different geometries, such as planar and NURBS surfaces. Computation complexities (e.g. a 3D point from UV parameters) can differ by several orders of magnitude. So, again you cannot simply divide the work between n threads and expect good parallelism. You end up with pathological imbalances, with one unlucky thread crunching a big chunk of data, while the others sit idle.
We address these challenges efficiently by restructuring the data to reduce/avoid interdependence, and by applying work-stealing rather than work-sharing paradigms. We also heavily exploit nested parallelism, such as across parts in an assembly, across bodies in a part, and across faces in a body. Imagine a thread working on a first part with a thousand NURBS surfaces in one sub-assembly, while three other threads are processing hundreds of parts of twenty other sub-assemblies, with nesting.
This allows us to scale up 8x or more threads easily on typical medium-to-large models. Of course, the larger the model, the farther the algorithms can scale. In CAD Exchanger, these techniques are implemented in the core architecture, and most of our converters take advantage of that.
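A small simulation makes the imbalance concrete. The cost figures below are invented for illustration (four expensive faces followed by many cheap ones, in arbitrary time units), and the "dynamic" scheduler is a simplified stand-in in which whichever worker is idle takes the next task, not a full work-stealing scheduler.

```python
import heapq

def static_makespan(costs, n_workers):
    """Divide the task list into n contiguous blocks up front."""
    size = -(-len(costs) // n_workers)          # ceiling division
    blocks = [costs[i:i + size] for i in range(0, len(costs), size)]
    return max(sum(block) for block in blocks)  # slowest worker decides

def dynamic_makespan(costs, n_workers):
    """Hand each task to whichever worker becomes idle first."""
    finish_times = [0] * n_workers
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)

# Hypothetical workload: 4 costly NURBS faces, then 396 cheap planar ones.
costs = [100] * 4 + [1] * 396

print(static_makespan(costs, 4))   # 496: one worker got all four costly faces
print(dynamic_makespan(costs, 4))  # 199: the ideal, total cost 796 / 4 workers
```

With the up-front split, three workers finish after 100 units and then sit idle while the unlucky one needs 496.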
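The idea of work stealing over a nested structure can be sketched in a few dozen lines. Everything here is an assumption made for illustration: the assembly is modelled as nested lists whose leaves are face counts, and the scheduler is deliberately simple. Production libraries such as Intel TBB provide a far more refined work-stealing scheduler.

```python
import threading
import time
from collections import deque

def count_faces(assembly, n_workers=4):
    """Sum the face counts in a nested assembly using per-worker deques."""
    deques = [deque() for _ in range(n_workers)]
    locks = [threading.Lock() for _ in range(n_workers)]
    totals = [0] * n_workers         # one private slot per worker
    pending = [1]                    # tasks queued or in progress (the root)
    pending_lock = threading.Lock()
    deques[0].append(assembly)

    def take(me):
        with locks[me]:
            if deques[me]:
                return deques[me].pop()                  # own newest task
        for victim in range(n_workers):
            if victim != me:
                with locks[victim]:
                    if deques[victim]:
                        return deques[victim].popleft()  # steal oldest task
        return None

    def worker(me):
        while True:
            node = take(me)
            if node is None:
                with pending_lock:
                    if pending[0] == 0:
                        return       # nothing queued, nothing running
                time.sleep(0.001)    # others may still spawn work
                continue
            if isinstance(node, list):
                # Sub-assembly or part: its children become new tasks.
                with pending_lock:
                    pending[0] += len(node)
                with locks[me]:
                    deques[me].extend(node)
            else:
                totals[me] += node   # leaf: a body with `node` faces
            with pending_lock:
                pending[0] -= 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(totals)

# Hypothetical assembly: one heavy part plus several light sub-assemblies.
assembly = [[1000], [[20, 30], [5]], [[1, 2, 3]]]
```

A worker that runs dry takes the oldest task from another worker's deque, which tends to be a large sub-tree near the root, so one steal usually brings plenty of work with it.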
But the effect strongly depends on the capabilities of the file formats themselves. In STL, there is essentially no parallelism, as the format does not allow for anything except a flat list of triangles. In JT, by contrast, parallelism is leveraged very efficiently.
In addition to conversion procedures, we apply parallelism in triangulation and in organizing visualization pipelines, where we load/convert the data asynchronously and display complex models gradually. This video demonstrates the effect. We also apply a lot of parallelism tricks in I/O [input/output], reaching a great speed-up and efficient scalability.
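Asynchronous loading with gradual display is, at heart, a producer/consumer pipeline. The sketch below assumes a hypothetical `convert_part` step standing in for the real import and triangulation work; it is not the CAD Exchanger API.

```python
import queue
import threading

def convert_part(name):
    """Stand-in for an expensive import/triangulation step (hypothetical)."""
    return f"mesh({name})"

def load_async(part_names, ready):
    """Producer: convert parts one by one and hand each over when done."""
    for name in part_names:
        ready.put(convert_part(name))
    ready.put(None)                     # sentinel: nothing more to come

def show_progressively(part_names):
    """Consumer: 'display' each mesh as soon as the loader delivers it."""
    ready = queue.Queue()
    loader = threading.Thread(target=load_async, args=(part_names, ready))
    loader.start()
    displayed = []
    for mesh in iter(ready.get, None):  # blocks until the next mesh arrives
        displayed.append(mesh)          # a real viewer would redraw here
    loader.join()
    return displayed

print(show_progressively(["bolt", "impeller"]))
# ['mesh(bolt)', 'mesh(impeller)']
```

The viewer never waits for the whole model: the first part can be on screen while the loader is still busy with the rest.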
Q&A
Ralph Grabowski: When customers license your firm's SDK, what kinds of projects do they use it for?
Roman Lygin: The most typical use cases include:
- CAD/CAM/CAE, BIM/AEC, PCB, and the like, both general-purpose and domain-specific, such as in Altium Designer [PCB design software] and OMRON Sysmac Studio [virtual factory planning]
- Visualization, AR/VR including Unity integration
- On-demand manufacturing and in particular cost estimations
- 3D Web apps
Grabowski: You mentioned that CAD Exchanger is more than a converter or viewer. What other functions does it perform?
Lygin: It imports/exports assembly hierarchies, b-rep and mesh geometries, meta-data, and PMI [product manufacturing information]. The SDK and our other developer tools also offer:
- Measurements/computations of bounding boxes, surface areas, volumes, and so on
- Visualization in multiple display modes, with interactive selection/hover, and so on
- Direct Unity integration in run-time
- Manufacturing-related algorithms, such as for CNC [computer numerically controlled] machining feature recognition of milled surfaces, drilled holes, countersinks; sheet metal unfolding, and so on
- Advanced meshers for FEA [finite element analysis]
- Web toolkit for in-browser visualization
- Cloud-based conversion API [application programming interface]
- Model simplification through b-rep shrink-wrapping, internal body removal, and so on, along with mesh decimation by a factor of 100x or more; see figure 2
Figure 2: CAD Exchanger reducing meshes by 99%
Grabowski: You support five native file formats -- Solidworks, NX, Creo, Catia V5, and DWG. Why did you choose these ones?
Lygin: Based on the general market and particular customers' demands. We currently support 20+ formats, with three just released and more underway. All the converters are based on common-core architecture and design principles, so the converters can leverage scalable parallelism on multi-core architectures.
Grabowski: Do you write your own translators, or do you license them from other developers?
Lygin: All translators are our own, except for the proprietary Autodesk FBX SDK, which we use to parse/format FBX and related formats. Other than that, CAD Exchanger does not depend on any proprietary technologies, as we own the entire technology stack for CAD conversion. We certainly use several open source libraries, though.
Grabowski: Has the coronavirus benefited or hindered your company's operations?
Lygin: YTD [year to date] our product revenue has grown strongly, on par with previous years. In the early weeks of coronavirus, we did notice some cautiousness in customers' orders, as some orders were temporarily put on hold, but in the following weeks things got back to normal.
I believe we offer excellent value for the price and our customers appreciate that. We added many Fortune 500 companies to our customer list.
In operations, the team moved to remote work and continues to work remotely most of the time. We are now revisiting our lease to reduce our occupancy, given the ongoing remote working mode.
[Roman Lygin spent ten years at Intel working in the software division that develops Intel Parallel Studio and Intel Cluster Studio, both developer suites for parallel programming. For the last five years at Intel, he managed engineering teams developing Intel TBB, OpenMP, and MPI libraries. He is the founder of CAD Exchanger. Parallel computing has become his second professional passion, after CAD data exchange.]
Sponsor
== 3D Conversion of Ultra-Massive 3D Models via DWF-3D & Okino's PolyTrans|CAD ==
One of the most refined aspects of Okino's PolyTrans|CAD software is in transforming ultra-massive MCAD models of oil and gas rigs, LNG processing plants, 3D factories, and other unwieldy datasets into Cinema-4D, 3ds Max, Maya, and Unity (among others).
What often takes days using blindly incorrect methods takes minutes or an hour with Okino's well-defined optimization and compression methods using its DWF-3D conversion system.
Popular CAD data sources include SolidWorks, ProE/Creo, Inventor, AutoCAD, Revit, Navisworks, DGN, IGES, STEP, Parasolid, and JT. DCC data sources are Cinema-4D, 3ds Max, Maya, FBX/Collada, and many more.
Perfected over three decades, we know 3D data translation intimately, providing you with highly personalized solutions, education, and communication.
Contact CTO Robert Lansdale at [email protected] .
And in Other News
Bricsys announces the new release of BricsCAD v21 online tomorrow, 27 Oct. I'm expecting to see the company expand its offerings in direct modeling with BIM and MCAD.
To watch, register at https://summit.bricsys.com/
- - -
Heads up as Graebert breaks up their two-day in-person new-release event into two online events:
- neXt is a 30-minute what's-going-to-be-new Webinar on Dec 1 (or Dec 2 or 3, depending on your residential continent)
- ARES 2022 launch event is in April
You read correctly: Graebert is skipping a version number, going from ARES 2020 to 2022. Save the date, as the registration link is not yet live.
- - -
The CAD software business tends to be recession-proof, with sales going up in good times (people looking to spend money) and up in bad times (people looking to save money).
But this year the world's tiniest living object felled the world's largest CAD company, as Dassault Systemes’ Q3 license revenues fell 15% and the company lowered its expectations for the year: it now expects license revenues to fall 19-20%, worse than July's prediction of a 16-18% fall.
Now, Dassault gets revenues from more than just licenses, and so overall organic revenues [revenues excluding recent acquisitions] fell a more modest 3%, and stalwarts like Solidworks were even up 5% on the quarter.
Monica Schnitger does important work summarizing CAD earnings at https://schnitgercorp.com/2020/10/22/a-mixed-bag-schneider-electric-and-dassault-systemes-announce-very-different-q3s/ so be sure to follow her.
Thank You, Readers
Thank you to readers who donate towards the operation of upFront.eZine:
- Holly Stratford: "I continue to enjoy your ezine!"
- 3DBrains PTE, Singapore
To support upFront.eZine through PayPal.me, the suggested amounts are these:
$25 for individuals > paypal.me/upfrontezine/25
$150 for small companies > paypal.me/upfrontezine/150
$750 for large companies > paypal.me/upfrontezine/750
Should Paypal.me not operate in your country, then please use www.paypal.com and use the account of [email protected].
Or mail a cheque (US$ or CDN$ only, please) to upFront.eZine Publishing, Ltd., 34486 Donlyn Avenue, Abbotsford BC, V2S 4W7, Canada.