I have an unfinished project on calculating the kinetic parameters of the OH-CH3OH gas-phase reaction using linear regression.
In this project, I implemented two methods: one using the sklearn library and another using gradient descent from scratch. Surprisingly, the loss value for the gradient descent method was lower than the one from sklearn, which I wasn't expecting.
This project was inspired by what I learned in the Machine Learning Specialization on Coursera by DeepLearning.AI, combined with my background in chemical engineering. As a beginner in coding and machine learning, I'm still learning the ropes and figuring out the standard practices. If anyone can point out why the results weren't what I was expecting, I would really appreciate it!
You can find the project on GitHub: mjmortega/OH-methanol_LinearReg
Hi @Shota-kun,
I don’t understand this part in the version without sklearn:
```python
# multiply variables by 1000 for larger gradient
X = 1000 * X
y = 1000 * y
```
I believe the loss calculation is not accurate due to the line `loss /= (2 * m * 1000000)`.
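To illustrate the scaling issue: multiplying both X and y by 1000 scales every error by 1000, and hence every squared error by 1000² = 1,000,000, so dividing the scaled loss by 1,000,000 recovers the loss on the original scale (when b = 0). A minimal sketch with made-up numbers, not the project's actual data:

```python
import numpy as np

# Hypothetical small-magnitude data, standing in for the project's X and y
X = np.array([0.001, 0.002, 0.003])
y = np.array([0.010, 0.019, 0.031])
w, b = 10.0, 0.0  # arbitrary parameters for the demo; b = 0 so scaling is exact
m = len(X)

def half_mse(X, y, w, b):
    """Cost with the 1/(2m) convention used in the course."""
    errors = (w * X + b) - y
    return np.sum(errors ** 2) / (2 * m)

loss_original = half_mse(X, y, w, b)
# Scaling X and y by 1000 scales each error by 1000,
# hence each squared error (and the loss) by 1e6
loss_scaled = half_mse(1000 * X, 1000 * y, w, b)

assert np.isclose(loss_scaled / 1_000_000, loss_original)
```

Note that with a nonzero b the correspondence is no longer exact, since b is not scaled along with X and y.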
Here's Qwen2.5-Coder's reply when I asked what's wrong with the code:

The code you provided has a few issues that need to be addressed:

- The variables X and y are being multiplied by 1000, which affects the scale of the gradient descent process. This can lead to numerical instability because the loss function will be very large initially.
- The learning rate is set to a relatively small value (0.001), which might cause gradient descent to converge slowly, or not at all if the initial parameters are far from the optimal solution.
- The error history and parameter history are only recorded for the first 100,000 iterations, which might be too few to capture meaningful trends in the loss function.
Hi @mahmutc. When I initially ran the code without multiplying X and y by 1000, the w parameter converged very slowly.
If you inspect the values of X, they are very small. Also, this is the formula I used to calculate the gradient of the w parameter:

```python
dloss_dw += error * X[i]
```

The error is multiplied by the corresponding X value at each index, so when X is small, the gradient of w also becomes very small within a few iterations, and w does not converge as fast as the b parameter.
Basically, I multiplied X and y by 1000 to obtain a larger gradient for w so that it converges faster.
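The effect described above can be checked directly: with b = 0, scaling X and y by 1000 scales the error by 1000 and the X factor by 1000, so the w-gradient grows by a factor of one million. A sketch with hypothetical data (not the project's actual values):

```python
import numpy as np

# Hypothetical data with very small X values, mimicking the situation described
X = np.array([0.001, 0.002, 0.003])
y = np.array([0.010, 0.019, 0.031])
w, b = 0.0, 0.0  # starting parameters for the demo
m = len(X)

def grad_w(X, y, w, b):
    """Gradient of the 1/(2m) cost with respect to w."""
    errors = (w * X + b) - y
    return np.sum(errors * X) / m

g_small = grad_w(X, y, w, b)
g_scaled = grad_w(1000 * X, 1000 * y, w, b)  # the x1000 trick from the post

# Error scales by 1000 and the X factor scales by 1000: gradient grows by 1e6
assert np.isclose(g_scaled, 1_000_000 * g_small)
```

A more conventional alternative to manually multiplying by 1000 would be standardizing the features (e.g. with sklearn's `StandardScaler`) or using a larger learning rate, which avoids having to rescale the loss afterwards.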
Update
I found the issue. The way I implemented gradient descent and calculated the loss function (or cost function) is slightly different from how the sklearn library does it in its linear regression tools:
```python
sklearn.linear_model.LinearRegression()
loss = mean_squared_error(y, y_pred)
```

The functions above calculate the loss function as:

loss = (1/m) Σ (yᵢ − ŷᵢ)²
On the other hand, I drew inspiration from Andrew Ng’s course to use gradient descent, and this is the formula that I used for the loss function:
loss = (1/2m) Σ (yᵢ − ŷᵢ)²
We can see that the formula I used divides the standard cost function by 2, which is why the sklearn library gave me a loss of 0.0096 while the gradient descent from scratch gave me 0.0048. I fixed the issue in my code by using the standard cost function, where dividing by 2 is not needed, and multiplying the gradients of the parameters by 2 so that I am still following the rules of calculus. Shown below is the sample derivation for just the parameter b, or theta0:
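The chain-rule step for b under each convention can be sketched as follows (consistent with the two loss formulas above):

```latex
% Half-MSE (course convention): the 1/2 cancels the 2 from the power rule
L = \frac{1}{2m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2
\qquad\Rightarrow\qquad
\frac{\partial L}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)

% Standard MSE (sklearn convention): the factor 2 survives
L = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)^2
\qquad\Rightarrow\qquad
\frac{\partial L}{\partial b} = \frac{2}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)
```

Both gradients point in the same direction; they differ only by a constant factor of 2, which in practice can be absorbed into the learning rate.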
I now calculate a loss of approximately 0.0096 for both methods. I am not saying that Andrew Ng's method is wrong in any way; I think he only used a different formula so that the gradient computation would look simpler and be easier to remember. He is a great instructor, and I highly recommend his online courses to everyone.
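The factor-of-2 relationship is easy to verify numerically. A sketch with synthetic data (the numbers are illustrative, not the project's 0.0096/0.0048):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Synthetic targets and predictions (assumption, for illustration only)
y_true = np.array([0.10, 0.20, 0.30, 0.40])
y_pred = np.array([0.12, 0.18, 0.33, 0.37])
m = len(y_true)

sklearn_loss = mean_squared_error(y_true, y_pred)       # (1/m) Σ (y − ŷ)²
course_loss = np.sum((y_true - y_pred) ** 2) / (2 * m)  # (1/2m) Σ (y − ŷ)²

# The course-style loss is exactly half of sklearn's MSE
assert np.isclose(sklearn_loss, 2 * course_loss)
```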
I just want to share this little discovery of mine. Of course, I am still a beginner, so please tell me if I mentioned anything incorrectly, because I am always eager to learn something new.