Hi, I’m new to fine-tuning. I’ve prepared over 500 training examples and run two fine-tune jobs on gpt-4o-mini. Can someone help me analyze the results below and suggest better fine-tune hyperparameters?
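For reference, here’s roughly how I launch each job with the OpenAI Python SDK (a sketch: the file IDs are placeholders, and the commented values are what I changed between the two versions):

```python
from openai import OpenAI

client = OpenAI()

# Sketch of how I launch each job; the file IDs below are placeholders.
job = client.fine_tuning.jobs.create(
    model="gpt-4o-mini-2024-07-18",
    training_file="file-TRAIN_ID",       # ~500 examples (MCQ + comprehensive)
    validation_file="file-VALID_ID",     # MCQ-only validation set
    hyperparameters={
        "n_epochs": 5,
        "batch_size": 32,                # 16 in version 2
        "learning_rate_multiplier": 2,   # 1.2 in version 2
    },
)
print(job.id, job.status)
```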
—Version 1 Results—
Hyperparameters used:
Batch size = 32, Learning rate multiplier = 2, epochs = 5
step train_loss train_mean_token_accuracy valid_loss valid_mean_token_accuracy
1 1.223983883857727 0.7312172651290894 0.9252833432474432 0.799030122167304
2 1.1674647331237793 0.7475640177726746 0.5941609295564302 0.8627935723114957
3 0.9744173288345337 0.7645379304885864 0.5770454281352505 0.8494939884093071
4 1.015865445137024 0.741278350353241 0.43793509023616617 0.8864472071441079
5 0.7584841251373291 0.8055519461631775 0.561777539547724 0.8502039654681719
6 0.8092712163925171 0.7855374813079834 0.49266161459475993 0.8685709278892491
7 0.7327184081077576 0.8027288317680359 0.4328274742345327 0.8802526537415563
8 0.6495775580406189 0.829208254814148 0.5033842056043153 0.8619557872381504
9 0.854096531867981 0.7737263441085815 0.3729245558956796 0.8969718064740689
10 0.7585716843605042 0.80228191614151 0.4963070153092978 0.8583655438983987
11 0.9064979553222656 0.7615699172019958 0.35174561399713683 0.900052603892688
12 0.6902016401290894 0.8095133900642395 0.4106475178746978 0.8794130724766563
13 0.7084276676177979 0.8097351789474487 0.3385785975757844 0.902247778358599
14 0.8393263220787048 0.7761092185974121 0.37088152250284073 0.89281210592686
15 0.8501049280166626 0.7712512016296387 0.38096020136417175 0.8892114628207529
16 0.6419965028762817 0.8254801630973816 0.30945074553155355 0.9082270353565619
17 0.7826900482177734 0.7881622314453125 0.36574560079795254 0.8944336028250163
18 0.6780310273170471 0.8088566064834595 0.2534700838535518 0.9256697601120645
19 0.6242154836654663 0.824070394039154 0.3383867441213744 0.9008636200071969
20 0.4584072530269623 0.8679026365280151 0.26177146068130264 0.9228015020522226
21 0.5827918648719788 0.8392739295959473 0.31360479052779744 0.9101391271202592
22 0.6062675714492798 0.8269094228744507 0.2622073710659325 0.9258523953388001
23 0.5932649970054626 0.8292341828346252 0.25196206358057505 0.9268186564566655
24 0.47557249665260315 0.8599757552146912 0.30946511839485596 0.9148605306638157
25 0.4841950535774231 0.8645213842391968 0.22860966517647766 0.9357374807725175
26 0.7061476707458496 0.7996266484260559 0.2925021669254576 0.9165945997886038
27 0.6015442609786987 0.8315213322639465 0.20215039940516044 0.9420202714310256
28 0.43521547317504883 0.8741108775138855 0.24719053309984135 0.9313070976809558
29 0.4239364266395569 0.8793813586235046 0.2097700149081843 0.9422942206654992
30 0.5048272609710693 0.8593810796737671 0.2401556129483657 0.9309646539027983
31 0.489469975233078 0.8627908229827881 0.22536287135240027 0.9415836101882613
32 0.4647248387336731 0.8671607971191406 0.20903944469786978 0.940456081081081
33 0.5304650664329529 0.8471976518630981 0.24628775630858765 0.9372413129271344
34 0.34490254521369934 0.8998329043388367 0.1833824870476928 0.9495656798439993
35 0.5354068875312805 0.8458157181739807 0.23160343482017096 0.940521431727795
36 0.29858243465423584 0.9098154306411743 0.17850942183282695 0.9527180562838664
37 0.29642680287361145 0.9162688255310059 0.22617750950240242 0.9432742054693274
38 0.4861167371273041 0.8573485016822815 0.18226540506301675 0.9541189635878454
39 0.3010188639163971 0.9082650542259216 0.1837235573703967 0.9513513513513514
40 0.41039326786994934 0.8822667002677917 0.2379425308819243 0.9425186055620838
41 0.3613796830177307 0.8922756314277649 0.17909636414762375 0.954633939906898
42 0.3663441836833954 0.89372718334198 0.22591740866913815 0.9456902261330306
43 0.30851638317108154 0.9080442786216736 0.1494070034792702 0.9612565445026178
44 0.394888699054718 0.8823418617248535 0.19560620598128078 0.9520476931052358
45 0.36645153164863586 0.8998531699180603 0.16021833797492605 0.9587164073550212
46 0.3333863914012909 0.9018248915672302 0.1999048011272296 0.9505310229178312
47 0.2874438464641571 0.9160529971122742 0.16666816257604808 0.960165081643639
48 0.3492628335952759 0.8969290256500244 0.18108391457485634 0.9548982407726803
49 0.26795411109924316 0.9243243336677551 0.21701353653393765 0.9508528476438277
50 0.248086079955101 0.9305405616760254 0.16292359000351517 0.960112705820199
51 0.3259166181087494 0.9070965051651001 0.19723174867869758 0.9538953540019544
52 0.3302572965621948 0.9024418592453003 0.1508477964913319 0.961190855927698
53 0.27958548069000244 0.9207113981246948 0.1904797747824282 0.9543767538698289
54 0.16433565318584442 0.948958694934845 0.17251680215833257 0.9598856350719113
55 0.27173298597335815 0.92266845703125 0.16782141611709758 0.9599893541518808
56 0.2755664587020874 0.9186662435531616 0.22669832923696626 0.9516470814871894
57 0.189011812210083 0.9415751099586487 0.1675886069683395 0.9613572101790764
58 0.2607693374156952 0.9214391112327576 0.2151498398784825 0.9551431502378066
59 0.2746150493621826 0.918997585773468 0.1380430418927615 0.9676849726293484
60 0.21890798211097717 0.9355718493461609 0.19920565840509377 0.9581351094196003
61 0.15521526336669922 0.9544575214385986 0.14964216850493173 0.9644545613727894
62 0.25583744049072266 0.9254676103591919 0.20288241586991884 0.9565506118964046
63 0.17124205827713013 0.9474444389343262 0.16137715266712455 0.9648646280557259
64 0.21466396749019623 0.9381143450737 0.1749702367942389 0.9606983068690236
65 0.27771100401878357 0.9192952513694763 0.22375213640722855 0.9553045660778068
66 0.26340624690055847 0.9214857220649719 0.1539960362294639 0.9651931778628611
67 0.14390389621257782 0.9563758373260498 0.20468247180351035 0.9574820541137493
68 0.2060651183128357 0.9404266476631165 0.14046846865019044 0.968788357005085
69 0.22334806621074677 0.9356790781021118 0.1867908813625826 0.9606936416184971
70 0.263373464345932 0.9197981953620911 0.1636001367896333 0.9648893535459139
71 0.14088337123394012 0.9582729339599609 0.172205272998236 0.9629796433075122
72 0.16707344353199005 0.9497310519218445 0.22274104484705293 0.9572017231691328
73 0.18338549137115479 0.9464616775512695 0.16009202504360087 0.9630168649944354
74 0.1776522696018219 0.9469922184944153 0.2380899750669033 0.9558591208995446
75 0.15418972074985504 0.9549180269241333 0.13821197904527677 0.969094729469445
76 0.24340903759002686 0.9277198314666748 0.2047599586234242 0.9594278517452322
77 0.14811696112155914 0.9572144150733948 0.15122933005070688 0.9663784822286263
78 0.2442699372768402 0.9245946407318115 0.2007233374500947 0.9605488850771869
79 0.2504882216453552 0.9278743267059326 0.16786470853618676 0.9646957272334916
80 0.09753310680389404 0.9727811217308044 0.16190347678147662 0.9654542290288897
81 0.12955155968666077 0.9611342549324036 0.2350659158628621 0.9557780153562057
82 0.1640556901693344 0.9537267684936523 0.1616236491603457 0.9647923431891984
83 0.12819471955299377 0.9622443914413452 0.21609570307221665 0.9590660132603056
84 0.15176016092300415 0.9576533436775208 0.14456109323978342 0.9674454561071981
85 0.15537168085575104 0.9550252556800842 0.1806748856647033 0.9646872803935348
86 0.16997376084327698 0.9518248438835144 0.17191327273741286 0.9636602451838879
full_valid_loss and full_valid_mean_token_accuracy for 5 epochs:
epochs, full_valid_loss, full_valid_mean_token_accuracy
epoch 1: 0.31483180259075205, 0.9072434607645875
epoch 2: 0.21055486082311128, 0.9437122736418511
epoch 3: 0.17589645213044625, 0.9572434607645876
epoch 4: 0.17671230333431864, 0.9617706237424547
epoch 5: 0.1795135809861918, 0.9639336016096579
—Version 2 Results—
Hyperparameters used:
Batch size = 16, Learning rate multiplier = 1.2, epochs = 5
step train_loss train_mean_token_accuracy valid_loss valid_mean_token_accuracy
1 0.43189537525177 0.921484649181366 0.21273402596543523 0.9600138480180024
2 0.42393168807029724 0.9153726696968079 0.26159323018618164 0.9522846744844319
3 0.38738900423049927 0.9318317770957947 0.10961287551008395 0.973808719161879
4 0.2736471891403198 0.9544827342033386 0.28052981435899904 0.941198224852071
5 0.17514564096927643 0.9653102159500122 0.19717209323329266 0.9585482682387619
6 0.2140514850616455 0.9495588541030884 0.17734154048604325 0.9608674384477417
7 0.2762037515640259 0.940207839012146 0.24560403230626007 0.9407744874715261
8 0.2963683307170868 0.9292734861373901 0.13536482950312365 0.9636045494313211
9 0.324065625667572 0.926846981048584 0.21911179024692679 0.9520753102267865
10 0.3919684886932373 0.9076589941978455 0.11047193710251725 0.9710243736151355
11 0.2450418919324875 0.9404680132865906 0.2104776581458658 0.939612188365651
12 0.44408413767814636 0.8888119459152222 0.15874627543592817 0.9614871623874625
13 0.14514826238155365 0.9633381962776184 0.19065130261386587 0.9482791047583223
14 0.16256770491600037 0.9608996510505676 0.1439259528172953 0.9593883590924038
15 0.1808147430419922 0.9534288644790649 0.15932348770881766 0.9576492537313432
16 0.16920186579227448 0.9613741636276245 0.21277083900361615 0.9505901180236047
17 0.24061264097690582 0.9365837574005127 0.10922363706924373 0.9709871244635193
18 0.20352333784103394 0.9468533992767334 0.1833776504626997 0.9530615846126699
19 0.20636878907680511 0.9448661804199219 0.16456625673951222 0.9657142857142857
20 0.15028052031993866 0.9614084362983704 0.2033144525631935 0.950366151342555
21 0.2507861852645874 0.926957368850708 0.08347860418060059 0.9767624430836866
22 0.3309449553489685 0.9150654673576355 0.1891160676799787 0.95473496128648
23 0.2610990107059479 0.9295814037322998 0.20097168595318632 0.9559420289855073
24 0.2596650719642639 0.9278410077095032 0.11205066322691162 0.971499176276771
25 0.2815389633178711 0.925652265548706 0.17269353410075083 0.9543951261966928
26 0.18293045461177826 0.9501661062240601 0.1548588538323733 0.9644165358451072
27 0.15631188452243805 0.9607421159744263 0.22782525708598475 0.9487179487179487
28 0.1804698407649994 0.9483796954154968 0.0858754798165914 0.9774976061283115
29 0.15336890518665314 0.9549483060836792 0.17814949029768987 0.9543704274162496
30 0.23244576156139374 0.9358959794044495 0.20778908545217925 0.9542304886943836
31 0.27950039505958557 0.9250527620315552 0.11598562857591851 0.9724137931034482
32 0.17105869948863983 0.9572479128837585 0.16068660890964379 0.9561795743158648
33 0.2705594003200531 0.9306616187095642 0.15069855995224898 0.9638175144205559
34 0.14705871045589447 0.9563832879066467 0.23752587333558098 0.9474206349206349
35 0.12761597335338593 0.9674128890037537 0.08262021540758022 0.9785739374780471
36 0.11598173528909683 0.9694864153862 0.14186885170430444 0.9605446927374302
37 0.10020824521780014 0.9722169637680054 0.15782693691823965 0.9682539682539683
38 0.164589062333107 0.9512903094291687 0.17549285406646248 0.957002457002457
39 0.11086989939212799 0.9675757884979248 0.1191982862382632 0.9686334350627331
40 0.1451531946659088 0.9598774313926697 0.14608805179166595 0.9672249234647938
41 0.1073744148015976 0.9690449833869934 0.23428076060850228 0.9546019900497512
42 0.1486506313085556 0.9600721597671509 0.11234176238798381 0.9742504409171076
43 0.11642973870038986 0.9676239490509033 0.15819018105536578 0.9597327237559345
44 0.0905413031578064 0.9731410145759583 0.15203743556589544 0.9718548660562902
45 0.16237851977348328 0.9499695897102356 0.20854299338501456 0.9522298936368726
46 0.13676676154136658 0.9620065689086914 0.08107621569083795 0.9776618294472526
47 0.12156200408935547 0.9660566449165344 0.17481176353023742 0.9586648554240248
48 0.09535674750804901 0.9744645953178406 0.21507026436172913 0.9544392523364486
49 0.1322920024394989 0.963008463382721 0.11673265817584452 0.9729318220267298
50 0.10834038257598877 0.9700822830200195 0.1531171270076361 0.959419789328268
51 0.10997047275304794 0.9658938050270081 0.14998494743409518 0.9704399234915667
52 0.10207048058509827 0.9709548354148865 0.22470843095550014 0.9495274914089347
53 0.09919016063213348 0.9705097675323486 0.06274121156482627 0.9829201934703748
54 0.10392924398183823 0.9724887609481812 0.17875565583137606 0.9568245125348189
55 0.12309497594833374 0.9678566455841064 0.20332985016906147 0.9606049290515309
56 0.06669408082962036 0.9812448620796204 0.11175869313311561 0.9737889847378899
57 0.20481833815574646 0.9497835636138916 0.14889756411752358 0.9595959595959596
58 0.0723666101694107 0.977207362651825 0.14741052796276502 0.9703998615198199
59 0.1123598963022232 0.9685141444206238 0.2120405425439226 0.957541447634452
60 0.12750758230686188 0.9620202779769897 0.08094070166419913 0.9792159513349105
61 0.14131535589694977 0.9590163826942444 0.1457113344993817 0.9648668639053254
62 0.05263190716505051 0.9843288064002991 0.1654847123294344 0.9696020633750921
63 0.12790805101394653 0.9641807675361633 0.16833545417978463 0.9633132235447579
64 0.23157501220703125 0.9402003288269043 0.12245071040825771 0.9670579989486595
65 0.18075081706047058 0.9499608874320984 0.12963226899074445 0.9702537182852143
66 0.12532301247119904 0.9656912088394165 0.24372883724462388 0.9512195121951219
67 0.07458651810884476 0.9807178974151611 0.11254987949119712 0.9752854951423214
68 0.12526200711727142 0.968244731426239 0.15376408014271067 0.961034164358264
69 0.0428009107708931 0.9902346134185791 0.13807195407146214 0.9751583861287095
70 0.08501363545656204 0.9803309440612793 0.20190374049024742 0.9563663720142938
71 0.08386851847171783 0.9786530137062073 0.09830754181931811 0.9740217033870437
72 0.055533938109874725 0.986828088760376 0.15781847363087667 0.9639925373134328
73 0.095095694065094 0.9746028780937195 0.24272610984294027 0.9547909581916383
74 0.07834992557764053 0.979353666305542 0.10963620951247317 0.9769957081545064
75 0.055497217923402786 0.9858512282371521 0.16190037374797692 0.9604729133580377
76 0.05339839681982994 0.9848642349243164 0.1550326143793699 0.9719327731092438
77 0.08000563830137253 0.9774056673049927 0.23093709875840884 0.9530105777054516
78 0.053885579109191895 0.9862081408500671 0.06579741167861189 0.9836709059506987
79 0.053419698029756546 0.9838683009147644 0.1805136254412659 0.9602938256898947
80 0.10355999320745468 0.9720398783683777 0.22218169797445841 0.9592270531400966
81 0.07630997896194458 0.9818206429481506 0.11705187934234468 0.9736408566721582
82 0.05590583011507988 0.9854439496994019 0.14766124994263014 0.9641427328111402
83 0.05970679223537445 0.9821974635124207 0.13502232558062288 0.9754055468341183
84 0.04537366330623627 0.9877315759658813 0.24992251534221388 0.9520264681555004
85 0.08679049462080002 0.9771019220352173 0.06235492979038115 0.9851579955314396
86 0.058907411992549896 0.9838352799415588 0.16483020929359316 0.9634193299961494
87 0.09172443300485611 0.9750885367393494 0.21412671107953696 0.962253829321663
88 0.0542391799390316 0.9857635498046875 0.11546697976749715 0.9752052545155994
89 0.07171512395143509 0.9797830581665039 0.13252184272046977 0.967626542657843
90 0.0971238985657692 0.9729651808738708 0.13481341873593022 0.9734312183184758
91 0.042546115815639496 0.9882240295410156 0.25804788158053443 0.9535714285714286
92 0.0779477208852768 0.9787154793739319 0.07589705835361836 0.9810326659641728
93 0.07969354838132858 0.9782382249832153 0.13664538733786044 0.9697974860335196
94 0.11352572590112686 0.9676492214202881 0.14649453520466807 0.9760059062384644
95 0.04377336427569389 0.9892446398735046 0.1869914723346425 0.9592839592839593
96 0.06418538838624954 0.9842855334281921 0.09555899987749505 0.9759240420481519
97 0.03814198076725006 0.9884642362594604 0.14306646532868986 0.9702863317125878
98 0.17663612961769104 0.9541206359863281 0.22550649508512632 0.9606135986733002
99 0.08126658201217651 0.979973316192627 0.10007107009753138 0.9781305114638448
100 0.07513648271560669 0.981130838394165 0.1518653804563734 0.9637770353437665
101 0.07001636922359467 0.9806563258171082 0.13519827912887827 0.9772804340454392
102 0.04998511075973511 0.9875162839889526 0.19718266013932553 0.960440380668035
103 0.04813910275697708 0.987669050693512 0.06614426370742328 0.9835317136800913
104 0.05696578696370125 0.9856080412864685 0.17396543078253207 0.9640985833495052
105 0.06322606652975082 0.9858924746513367 0.20090452644312493 0.9643691588785047
106 0.038130585104227066 0.9908187389373779 0.11339125667927613 0.9764845203857215
107 0.03269226849079132 0.9920724630355835 0.16117233423922392 0.961837333793818
108 0.03444869443774223 0.990356981754303 0.14260113224778417 0.9756564075812902
109 0.049635160714387894 0.9876684546470642 0.2380153907533364 0.9561855670103093
110 0.031116027384996414 0.9937320947647095 0.061479720988613804 0.9847339782345829
111 0.07093395292758942 0.9830735921859741 0.19466885787957794 0.9588141663350577
112 0.05021617189049721 0.988501787185669 0.19748813310071903 0.9669529499626587
113 0.032724037766456604 0.9935305118560791 0.1246402749338131 0.9746184472461845
114 0.032692138105630875 0.9893417954444885 0.15682099690657864 0.9627857522594365
115 0.042003367096185684 0.9873976111412048 0.14692664645794087 0.9752466678206682
116 0.03907403349876404 0.9884692430496216 0.23557023890241166 0.9579458147998382
117 0.054958220571279526 0.9869385361671448 0.07920373392572432 0.9836093274754985
118 0.020278148353099823 0.994512677192688 0.15876741698507726 0.9654215976331361
119 0.0685969740152359 0.9816021919250488 0.16777123927015897 0.9723655121591747
120 0.07046201080083847 0.9825544357299805 0.17754496902970882 0.9647806946029676
121 0.051242418587207794 0.9861239790916443 0.12415607459745245 0.9714385841948484
122 0.03653191402554512 0.9905837774276733 0.13631388813596176 0.9721784776902888
123 0.07156213372945786 0.9814989566802979 0.2632400759989551 0.956140350877193
124 0.0380060039460659 0.9893411993980408 0.11569522352464076 0.9769899437531958
125 0.04269108548760414 0.9880558252334595 0.1634382739291627 0.9621421975992613
126 0.03521928936243057 0.9909871816635132 0.13859546657560984 0.9778259419806602
127 0.051650725305080414 0.9875317215919495 0.2083642093987758 0.9593755877374459
128 0.026555482298135757 0.9948462843894958 0.0909243445379175 0.9792831305491615
129 0.04678259417414665 0.9874835014343262 0.16256244502850434 0.966044776119403
130 0.03573854640126228 0.9924330711364746 0.2443116205791207 0.958991798359672
131 0.03978630155324936 0.9899039268493652 0.10500706390249882 0.9792274678111588
132 0.07893450558185577 0.9777491688728333 0.16234243671288667 0.9625904358567143
133 0.03901994973421097 0.9909347295761108 0.14526159655146237 0.9768067226890756
134 0.04094694182276726 0.9897339940071106 0.221758683393213 0.9566720911310008
135 0.0341397263109684 0.9905176758766174 0.06053765899327584 0.9866541058250903
136 0.0612371526658535 0.9837456941604614 0.18950698592015763 0.9593011713321421
137 0.0241122767329216 0.9946749210357666 0.2183083443019701 0.9634782608695652
138 0.019833408296108246 0.9951634407043457 0.11227877143972985 0.9784184514003295
139 0.03533019870519638 0.9921000599861145 0.15840974049115825 0.9651871192341166
140 0.0487610287964344 0.9890799522399902 0.1309906636651806 0.9762776905634049
141 0.036199625581502914 0.9888905882835388 0.2543711153508416 0.9561621174524401
142 0.054363977164030075 0.9871407151222229 0.06232680872610648 0.9865943185445261
143 0.022264618426561356 0.9946959018707275 0.17334869085112856 0.9638043896804005
144 0.036568548530340195 0.989724338054657 0.21561878810913174 0.9648067104303428
145 0.015750888735055923 0.9969753623008728 0.11606719795314745 0.9768472906403941
146 0.031156601384282112 0.9919814467430115 0.13393101549771153 0.9690574137005903
147 0.04011842608451843 0.9909259676933289 0.1373110232580752 0.9764027267960147
148 0.017958758398890495 0.9941520690917969 0.2762542751100328 0.9543650793650794
149 0.017880447208881378 0.9971751570701599 0.08136403606697513 0.9840182648401826
150 0.041828032582998276 0.9916222095489502 0.1467295149185138 0.9682262569832403
151 0.01896321401000023 0.9955583810806274 0.15006283104925763 0.978405315614618
152 0.025182830169796944 0.9938674569129944 0.2024714284798437 0.9613899613899614
153 0.02759905718266964 0.9938700199127197 0.1012680393916221 0.9779586300440828
154 0.029776209965348244 0.9937865734100342 0.15816446795842248 0.9706464973887988
155 0.030298767611384392 0.9918793439865112 0.25633053914033754 0.957089552238806
156 0.03261978179216385 0.9919827580451965 0.10996877613101266 0.9800705467372134
157 0.031137648969888687 0.99357670545578 0.16779633098271604 0.9660629505890628
158 0.034522317349910736 0.9909018278121948 0.14454619768879381 0.9782977280434045
159 0.034513749182224274 0.9913212656974792 0.213415914650483 0.960440380668035
160 0.011919124983251095 0.9977769255638123 0.06788786398767745 0.9854883417577043
161 0.025633661076426506 0.994873047 0.1894226844116898 0.9627401513681351
162 0.025274522602558136 0.9939469695091248 0.2290465742628151 0.9602803738317757
163 0.04713669419288635 0.9868891835212708 0.12256324321480294 0.9764845203857215
164 0.022655064240098 0.9947890639305115 0.16661001655685329 0.9656363322396823
165 0.02598702348768711 0.9945166110992432 0.1533924607562306 0.9763519387932533
166 0.030285196378827095 0.9904102087020874 0.24841255092948572 0.9553264604810997
167 0.03933889418840408 0.9910299777984619 0.06209488445498779 0.9862454655380894
168 0.06741807609796524 0.9826927185058594 0.19482203778450202 0.9623955431754875
169 0.031035343185067177 0.9929735660552979 0.2109367357785351 0.9669529499626587
170 0.014346349984407425 0.9971481561660767 0.12165767179867568 0.9761114797611148
171 0.03277391195297241 0.9909849166870117 0.15658672460528653 0.9654439128123339
full_valid_loss and full_valid_mean_token_accuracy for 5 epochs:
epochs, full_valid_loss, full_valid_mean_token_accuracy
epoch 1: 0.15247034518051916, 0.9098591549295775
epoch 2: 0.15076934432599626, 0.9138832997987928
epoch 3: 0.14382053937470649, 0.9175050301810865
epoch 4: 0.15336846777610855, 0.9181086519114688
epoch 5: 0.16031172299528987, 0.9179577464788733
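For what it’s worth, this is roughly how I pull each job’s per-step metrics out of its result file to build the tables above (a sketch: the job IDs are placeholders, and I’m assuming the result file is a base64-wrapped CSV, which is what the API has returned for me):

```python
import base64
import io

import pandas as pd
from openai import OpenAI

client = OpenAI()

def result_metrics(job_id: str) -> pd.DataFrame:
    """Fetch a fine-tune job's per-step metrics CSV as a DataFrame."""
    job = client.fine_tuning.jobs.retrieve(job_id)
    content = client.files.content(job.result_files[0]).text
    if content.lstrip().startswith("step"):
        csv_text = content  # already plain CSV
    else:
        # Result files have come back base64-encoded for me.
        csv_text = base64.b64decode(content).decode("utf-8")
    return pd.read_csv(io.StringIO(csv_text))

v1 = result_metrics("ftjob-V1_ID")  # placeholder job IDs
v2 = result_metrics("ftjob-V2_ID")

# Per-step valid_loss is noisy, so I compare smoothed curves, not single steps.
print(v1["valid_loss"].rolling(10).mean().iloc[-1])
print(v2["valid_loss"].rolling(10).mean().iloc[-1])
```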
Some background information:
I’m fine-tuning to answer exam questions in accounting and finance. There are both multiple-choice questions (MCQs) and comprehensive questions; the MCQs are easier, while the comprehensive questions are very long and difficult. I have both types in my training data, but the validation data contains only MCQs (because it’s hard to write new comprehensive questions for validation).
I need some advice on how to adjust the hyperparameters. Additionally, I’m also wondering: will not having comprehensive questions in the validation data be a big issue?
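In case the MCQ-only validation set turns out to be a problem, one workaround I’m considering is carving the validation split out of the data I already have, so both question types are represented without writing new comprehensive questions. A sketch (the filenames, split sizes, and the qtype tag are my own conventions):

```python
import json
import random

random.seed(0)
with open("train_all.jsonl") as f:              # placeholder filename
    examples = [json.loads(line) for line in f]

# "qtype" is my own bookkeeping key, stripped before upload;
# the fine-tuning API only expects the "messages" field.
mcq = [e for e in examples if e.get("qtype") == "mcq"]
comp = [e for e in examples if e.get("qtype") == "comprehensive"]
random.shuffle(mcq)
random.shuffle(comp)

valid = mcq[:30] + comp[:10]                    # hold out some of BOTH types
valid_ids = {id(e) for e in valid}
train = [e for e in examples if id(e) not in valid_ids]

for name, rows in (("train.jsonl", train), ("valid.jsonl", valid)):
    with open(name, "w") as f:
        for e in rows:
            f.write(json.dumps({"messages": e["messages"]}) + "\n")
```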
Thanks!