Finite possibilities of the finite element method

Recently, I have been on a conference of young scientists in Warsaw. These were great three days of meeting people, eating, drinking, sightseeing, and, occasionally, watching some presentations. I didn’t prepare anything myself, but it was still a good opportunity to see what other PhD students work on, and how they handle all the difficulties caused by the imperfect system of higher education in Poland (mainly related to money). One of the precious observations I have made during the conference is that engineers are not as fond of the finite element method as I had imagined. This is a great news for me because I want to put a lot of effort to improve the FEM by making it easier to use, and, this way, available for more people. Let me share with you some of my thoughts.

The finite element method has made an amazing career as a general and effective approach to solve partial differential equations. It has a well-developed theory, can handle fancy domain shapes, and is supported by plenty of numerical tools. The number of various applications of the method is uncountable, from a helmet design to an explosion simulation. Many scientists make their careers following the simple pattern: create a model, solve it with FEM, change the model a bit, solve it with FEM, change it a bit again, solve it with FEM, and so on. But, FEM has also disadvantages. It’s far from being automatic. Using FEM requires understanding how it works, and that’s not an easy thing to learn. Using FEM requires adjusting it to specific problems. It’s important whether you use triangles or pentagons as the elements. If you simulate a breaking desk, you have to know a priori where the break will occur and generate a mesh which suits the breaking edge. Generally, you have to try really hard to make the method work. There are plenty of those little things that just wait to ruin your results. This is why being a finite element method expert guarantees a good position in industry. There are people who are good in nothing else than FEM, and they make good money on that. It’s a huge business. Why should anyone care to improve the method, and make it easy to use? As long as it earns money, no one will bother to seek revolutionary inventions. It’s all about the money. I don’t mind, but I’m out. I’ve got other plans. The question is if the method can actually be automated. Obviously, it can, and it’s not even that hard!

Mesh generation is a way to make a continues problem domain discrete. Why to do it? Because, unfortunately, computers cannot solve continues problems. Problems given to a computer have to be not only discrete, but even finite. After all, the computer memory is finite. Stating a continues function as a finite formula is not good enough. Formulas look nice, but their worth is little. What we are usually interested in is the value of a function for a given argument. While knowing the formula, we still have to calculate the value somehow. If the formula is f(x,y) = x + y, we’re cool, but if it contains integrals, it starts to be messy. Actually, even a simple square root is more problematic than probably many of you think. There is no easy way to calculate a square root. The only way is to guess the value, check if it’s good enough, correct it, check again, correct it, and so on, until the square of our approximation is close enough to the argument. Now, pay attention to how I called the value of a square root calculated by a computer. I called it an approximation. Even if there is just such an innocent looking square root in your formula, some approximation has to be already made. Fortunately, this approximation is made automatically, fast, and as precisely as it’s possible. Unfortunately, it doesn’t work like that for integrals, or differentials.

So, formulas are not good enough. What else, then? I have already written what a mesh generation is. Now, it’s time to write what a mesh itself is. A mesh is a finite approximation of a function. This is it. So simple, so pretty. Finite means that it contains a finite amount of data. Solution to every problem can be expressed as a function. Thus a mesh can be used to approximate a solution to any kind of problem. Now, let’s think what kind of trouble we can run into when working with meshes. The problem with approximation is always the same: it may vary too much from the exact solution. OK, but what can we do with that? Well, generally, an approximation can get better when we let it store more data. However, with a given amount of data, a lot of approximations can be constructed, varying in exactness. Thus, we can expect from an approximation to be as close to the exact solution as it is possible for a given limit of a data amount. There’s a catch: how high should we set the limit? This is a very important question because the more data we agree to store the more time any data processing will take. And, processing the data is the purpose of having it. So, we have two contradictory goals: we want the approximation to be as exact as possible, but, in the same time, we want it to be as small as possible. Choosing between precision and efficiency is crucial in computer science. It would be an amazing achievement if we could generalize this problem and find an automatic approach to solve it. Such a solution would have to take into account our preferences, but setting only one parameter, with values from 0 to 1, should be enough. The value 0 would mean that the precision is unimportant, so the result can be random, but it has to be given immediately. The value 1 would mean give me the best result you can, no matter how long it takes. Sometimes, we may want to set also the minimal precision, and the maximal calculation time. Those three parameters are, in my opinion, completely sufficient to control the balance between precision and efficiency.

We know already what to aim at, but we still don’t know how to get it. Our approach will depend strongly on the mesh structure. So, the first thing to do is to choose how to represent a finite approximation of a function by mathematical and computer structures. After a brief research on approximation techniques, we end up with the strong impression that the biggest possibilities we can expect of an approximation are offered by the splines. Putting trust in this impression, we will follow up with defining a mesh as a multidimensional spline. There are some variations of splines, especially in multiple dimensions. In one dimension, a spline is defined as a set of points, values at those points, and, optionally, values of some number of derivatives (first, second, and so on). The points of a spline will be called the mesh points. Having just the values of a function at every mesh point lets us make a linear approximation between those points. Having the values of a function and the values of the first derivative of the function lets us make a second order approximation. The more derivatives the more accurate approximation we get. The accurateness depends also on the number and the location of the mesh points. The amount of data can be controlled by the number of the mesh points, while the location of the mesh points can be used to control the accurateness of the approximation. This way we can obtain a well-defined optimization problem, and try to solve it.

Taking splines to multiple dimensions is not obvious. The most popular way is to imagine a one-dimensional spline as it was consisted of segments which join the neighbouring mesh points. With such an interpretation of a one-dimensional spline, we can take it to a d-dimensional space by replacing the segments with d-dimensional polygons, eg. quadrilaterals in 2D. The polygons have to cover the whole domain. The vertices of the polygons are the mesh points. Such a multidimensional spline has the advantage of indicating the neighbourhood relation. The neighbours are simply the mesh points connected by an edge. Now, you probably think Hey, I know that from somewhere. Yep, this kind of a multidimensional spline is exactly what the finite element method uses. However, this approach has a disadvantage. A mesh consisted of polygons is not much flexible. The polygons are supposed to be as regular as they can. Moving one mesh point can ruin all the regularity. In order to keep the regularity after moving one point, we would have to move a lot of other points. That’s problematic. What does FEM do about it? It’s simple: it doesn’t move the mesh at all. The mesh in FEM is fixed, it’s not supposed to move. Why should it move? We have generated a mesh, and it’s fine, let’s not touch it, right? Well, it’s not exactly fine. You see, FEM doesn’t even try to optimize the location of the mesh points. In my humble opinion, this is a huge mistake. A simple example to make it clearer: computer graphics doesn’t use FEM to simulate fluids because it’s completely inefficient and ineffective. Instead, particle systems are widely in use. A particle system is a technique where a lot of particles of a given fluid are followed. Basing on how they move, the motion of the whole fluid is estimated. This way, a surface of a wavy sea can be effectively tracked. In FEM, when a surface moves between two mesh points, it diffuses due to the interpolation error. Actually, there are algorithms which incorporate a particle system into FEM. In those algorithms, a particle system is used to track a surface, and correct the results calculated by FEM. I’m sorry, but that’s ridiculous. The mesh should track any surface by itself, and it should be done automatically, without even requiring from the user to identify the surface. A surface can be identified by a computer easily. It’s just the place where the value of a function changes mostly. However, tracking the surface would require moving the mesh. As I mentioned before, a mesh consisted of polygons is a rather static structure. Something else is needed.

We will stay with the splines, but we will make them multidimensional in a different way. Let’s forget about the neighbourhood relation for a while. A one-dimensional spline is a set of points. A multi-dimensional spline can be defined in exactly the same way. Now, we just have to bring back the neighbourhood relation somehow. To do that, we must realize what is the relation needed for. For any point outside the mesh, the value of a function has to be interpolated using some surrounding mesh points. In FEM, every point is interpolated using the edges of the polygon it lays in. In our approach, we will choose the mesh points which satisfy the following criterion: the neighbours have to form the smallest polygon which contains the given point. The smallest means with the lowest sum of the lengths of the edges. The interpolation method is very important here. Note that there can be more than one polygon satisfying the given criterion. No matter which one we choose, the result should be the same. Such a definition of a mesh has a disadvantage. Every time we move a point, we have to find the neighbourhood relation again. It doesn’t seem to be more flexible than the mesh used in FEM. Finding a new triangulation of a set of points in 2D takes O(n log n) time, and we still have to find the proper polygon for a given point in space. The pessimistic complexity of finding the neighbourhood relation in our approach is the same. However, it is possible to do it with the average complexity equal to O(n) thanks to something called locality sensitive hashing. Also, we can consider our approach to be more handy. There is no good reason to make a triangulation of a space. The neighbourhood relation doesn’t require that. I prefer to do only the things which can be sensibly justified.

Once we have a definition of a mesh, we can try to formulate our optimization problem. In general, it will be something like: find the set of mesh points for which the cost function is minimal. Yeah, I know, that’s very general. What we can tell for sure about the cost function is that it has to contain the number of points and some estimate of the approximation error. The approximation error depends on the thickness of a mesh and the irregularity of the approximated function. The irregularity of a function is described by the second derivative, which can be easily estimated with the values stored in the mesh. Thus we have everything that we need to define the cost function. Unfortunately… I haven’t done it yet. But I will. Soon. Be patient and check my blog. Next entries will be more scientifically exciting, I promise. You can also check the web site of the project. It’s called The Magic Mesh :) I would appreciate any comments, especially harsh critique. Cheers.

Related