Some really basic curve fitting

Problem

I would like to be able to apply some really basic curve fitting to a plot without relying on other packages or complex solutions to smooth out the data. It's not the most useful feature for data analysis, but a smooth general trend is much more pleasant to look at than straight lines between points.

Proposed Solution

Add a plotting option that controls how smooth the curve is. A smoothness of 0 would be what plot already does, simply connecting the dots, and increasing values would apply progressively more smoothing to the curve.
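
To make the request concrete, here is a minimal sketch of what such a smoothness value might do, written as a standalone helper rather than a plotting option. The name `smooth_curve`, the choice of a centered moving average, and the reading of `smoothness` as the half-width of the averaging window are all assumptions for illustration; none of this is an existing plotting API.

```python
import numpy as np


def smooth_curve(y, smoothness=0):
    """Centered moving average; `smoothness` is the half-width of the window.

    Hypothetical helper, not part of any plotting library.
    """
    y = np.asarray(y, dtype=float)
    if smoothness < 0:
        raise ValueError("smoothness must be >= 0")
    if smoothness == 0:
        return y.copy()  # nothing to do: the same points plot would connect
    window = 2 * smoothness + 1
    kernel = np.ones(window) / window
    # Repeat the end values so the output has the same length as the input.
    padded = np.pad(y, smoothness, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


y = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
y_smooth = smooth_curve(y, smoothness=1)
```

The smoothed values could then be passed to plot in place of the original y. Even this simple version involves choices (window shape, edge handling) that change how the curve looks, so it is one possible behavior, not the obvious one.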

Additional context and prior art

There are some close solutions, but none is perfect. The closest I’ve gotten is

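import numpy as np
from matplotlib.pyplot import plot  # assuming plot is matplotlib.pyplot.plot
from scipy.interpolate import make_interp_spline

x_np = np.array(x)
y_np = np.array(y)
x_smooth = np.linspace(x_np.min(), x_np.max(), 200)  # dense grid to evaluate on
spl = make_interp_spline(x_np, y_np, k=3)  # cubic spline through every point
y_smooth = spl(x_smooth)
plot(x_smooth, y_smooth)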

where x and y are lists of data. This comes close in some cases, but it is very finicky and doesn’t always work. Every other solution I have found works about the same and isn’t a perfect fix either.

I know that curve fitting is a very complex problem, and I’m not looking for a perfect interpolation of the data. I just want to smooth out the jaggedness of a curve to show a general trend, purely for visuals’ sake.

1 possible answer(s) on “Some really basic curve fitting”

  1. Thanks for the proposal. As you say, there is no general good solution for smoothing data. It depends on the context. There is no free lunch here.

    I’m strictly 👎 on this for multiple reasons:

    • From a technical standpoint I’m against adding half-working features to the library. This will trigger complaints on the non-working cases.
    • Providing a “smooth this data” function is dangerous. It creates the illusion that one can simply apply a fixed algorithm and get meaningful data out.
    • Most of the time, you should plot data as is anyway. Choosing a good style helps to get the desired message across. If you have noisy data, it’s often good to use individual markers and not a line. Then you don’t have a jagged curve. You might add a model as a line on top, but then again, you have to think about what the model should be and cannot just “smooth”. In that sense, providing generic smoothing would rather tempt people into misrepresenting their data.
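
The suggestion in the last point, markers for the data with a deliberately chosen model drawn on top, might look roughly like the sketch below. The straight-line model, the helper names `fit_line` and `plot_data_with_model`, and the sample values are assumptions for illustration only; the right model depends on the data.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares straight line; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def plot_data_with_model(x, y):
    """Draw the raw data as markers and the fitted line on top."""
    import matplotlib.pyplot as plt  # imported here so fit_line works without it

    slope, intercept = fit_line(x, y)
    x_line = np.linspace(min(x), max(x), 100)
    fig, ax = plt.subplots()
    ax.plot(x, y, "o", label="data")  # markers only, no connecting line
    ax.plot(x_line, slope * x_line + intercept, "-", label="linear model")
    ax.legend()
    return fig, ax


# Made-up sample values, roughly linear with some noise.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
slope, intercept = fit_line(x, y)
```

Here the line is an explicit claim about the data (a linear trend) rather than generic smoothing, which is the distinction the reply draws.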