Measuring similarity in Python between vectors that represent music notes

I have the following vector, which represents 5 notes played on a guitar:

[(0, 0.06365224719047546), (41, 0.6289597749710083), (42, 0.6319441795349121), (43, 0.632896363735199), (44, 0.631447434425354), (45, 0.6318693161010742), (46, 0.6315509080886841), (47, 0.6318208575248718), (48, 0.6322312355041504), (49, 0.6312702894210815), (50, 0.6237916350364685), (51, 0.630915105342865), (52, 0.6276333928108215), (53, 0.6117454171180725), (54, 0.6141350865364075), (55, 0.6154367923736572), (56, 0.6177382469177246), (57, 0.6182602047920227), (58, 0.617703378200531), (59, 0.6159048080444336), (60, 0.6125935316085815), (61, 0.6102696657180786), (64, 0.5833065509796143), (66, 0.5837067365646362), (67, 0.5833760499954224), (68, 0.5839518904685974), (69, 0.5836052894592285), (70, 0.5835791826248169), (71, 0.5839493274688721), (72, 0.5813934803009033), (76, 0.6141220331192017), (77, 0.614814043045044), (80, 0.6165654063224792), (81, 0.6164389848709106), (82, 0.6160199642181396), (83, 0.610359787940979), (84, 0.6152002215385437), (85, 0.6141528487205505), (87, 0.5446829795837402), (88, 0.5510357022285461), (89, 0.552834153175354), (90, 0.5524792075157166), (91, 0.5520510077476501), (92, 0.5521120429039001), (93, 0.5521101355552673), (94, 0.5524383187294006), (95, 0.552492082118988), (96, 0.5522018074989319), (97, 0.5521131753921509), (98, 0.5523127317428589), (99, 0.5523053407669067), (100, 0.5521847009658813), (101, 0.5523706674575806), (102, 0.5523468852043152), (103, 0.5524667501449585), (104, 0.5524278879165649), (105, 0.552390992641449), (106, 0.5524452328681946), (107, 0.5524633526802063), (108, 0.5524865984916687), (109, 0.5526250600814819), (110, 0.5525843501091003), (111, 0.5524541139602661), (112, 0.5526156425476074), (113, 0.5528975129127502), (114, 0.5524407029151917), (115, 0.5524605512619019), (116, 0.5524886250495911), (117, 0.5525526404380798), (118, 0.5524702072143555), (119, 0.5525854229927063), (120, 0.5523728728294373), (121, 0.5524235963821411), (122, 0.5523437261581421), (123, 0.5518389940261841), (124, 0.5520192384719849), (125, 
0.5523939728736877), (126, 0.5523043870925903), (127, 0.5532050132751465), (132, 0.5515548586845398), (134, 0.5523167252540588), (135, 0.5519833564758301), (136, 0.5524169206619263), (137, 0.5527742505073547), (138, 0.5523315668106079), (139, 0.5523473024368286), (140, 0.5532975196838379), (141, 0.5522792935371399), (142, 0.5503222942352295)]

It has 89 elements. A spectrogram-like visual representation looks like this:

The Y-axis is basically pitch and the X-axis is time, so each tuple is (time index, pitch value). I have another vector that represents exactly the same notes played on another acoustic guitar, and the performance sounds quite similar if you listen to it. This vector has 94 elements:

[(31, 0.13769060373306274), (39, 0.15499019622802734), (40, 0.16191327571868896), (43, 0.16355487704277039), (59, 0.6356481313705444), (60, 0.634376585483551), (63, 0.6343578100204468), (64, 0.6335580945014954), (65, 0.6335859894752502), (66, 0.6335384845733643), (67, 0.6334232091903687), (68, 0.6339468955993652), (69, 0.630445122718811), (72, 0.6184465885162354), (73, 0.6183992028236389), (74, 0.6181117296218872), (75, 0.6186220049858093), (76, 0.6186297535896301), (77, 0.6185297966003418), (78, 0.618561327457428), (79, 0.618633508682251), (80, 0.6185418963432312), (81, 0.6184455752372742), (82, 0.6117323040962219), (83, 0.6014747619628906), (85, 0.39688345789909363), (90, 0.5867741107940674), (91, 0.5872393250465393), (92, 0.586899995803833), (93, 0.5866436958312988), (94, 0.5866578817367554), (95, 0.5864415168762207), (96, 0.5868685245513916), (97, 0.586477518081665), (104, 0.6182103157043457), (105, 0.6182997226715088), (106, 0.6188161969184875), (107, 0.618650496006012), (108, 0.6187460422515869), (109, 0.6181941628456116), (110, 0.6184064149856567), (111, 0.6148801445960999), (114, 0.5523382425308228), (115, 0.5557005405426025), (116, 0.5558828711509705), (118, 0.554828405380249), (119, 0.554919958114624), (120, 0.55497145652771), (121, 0.5546750426292419), (122, 0.5545178651809692), (123, 0.5545105338096619), (124, 0.5545050501823425), (125, 0.5544342994689941), (126, 0.5545479655265808), (127, 0.5543638467788696), (128, 0.5543714165687561), (129, 0.5545525550842285), (130, 0.5545058846473694), (131, 0.5547449588775635), (132, 0.5546910166740417), (133, 0.5545474290847778), (134, 0.5546845197677612), (135, 0.5546503663063049), (136, 0.5545172691345215), (137, 0.5548205971717834), (138, 0.5546956062316895), (139, 0.5547483563423157), (140, 0.5544265508651733), (141, 0.554632306098938), (142, 0.5543283820152283), (143, 0.5546634197235107), (144, 0.5543924570083618), (145, 0.5543931722640991), (146, 0.5547248721122742), (147, 0.5549289584159851), (148, 
0.5547417998313904), (149, 0.5546922087669373), (150, 0.5545686483383179), (151, 0.5547193884849548), (152, 0.5548165440559387), (153, 0.5544684529304504), (154, 0.5549207329750061), (155, 0.5548054575920105), (156, 0.5541093945503235), (157, 0.554355800151825), (158, 0.5545284152030945), (159, 0.5548104643821716), (160, 0.5544529557228088), (161, 0.5541993379592896), (162, 0.5540767312049866), (163, 0.5550277829170227), (164, 0.5545808672904968), (177, 0.1516854166984558), (182, 0.15804383158683777)]

Visual representation is:

(see my first comment, since the website allows only one attachment for new users)

As you can see, the data looks similar, with a little bit of noise. Time-wise, there is a shift at the beginning (the second vector starts at around 60, while the first one starts at around 40), and some notes were held a little longer (100-200 ms) than others. But if you look at the Y-axis, it’s pretty much the same, because the note frequencies are the same.
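
The time shift described above can be made concrete with a small sketch. It assumes that values below 0.5 are noise rather than notes; that threshold, and the helper name `onset_frame`, are illustrative choices and not part of the original data pipeline. The two lists are just the first few entries of each vector.

```python
def onset_frame(vector, min_pitch=0.5):
    """Return the first time index whose pitch value is at least min_pitch.

    min_pitch=0.5 is an assumed noise threshold, chosen by eye from the data.
    """
    for t, pitch in vector:
        if pitch >= min_pitch:
            return t
    return None  # no frame reached the threshold


# Leading entries of the two vectors (truncated for brevity).
first_head = [(0, 0.06365224719047546), (41, 0.6289597749710083),
              (42, 0.6319441795349121)]
second_head = [(31, 0.13769060373306274), (39, 0.15499019622802734),
               (40, 0.16191327571868896), (43, 0.16355487704277039),
               (59, 0.6356481313705444), (60, 0.634376585483551)]

shift = onset_frame(second_head) - onset_frame(first_head)
print(shift)  # 59 - 41 = 18 frames
```

With this threshold the first recording starts at frame 41 and the second at frame 59, which matches the rough "40 versus 60" reading of the plots.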

I am looking for a way to compare these vectors and measure their similarity, which in my subjective judgement is around 95%.

I tried to invent my own algorithms, but I’m getting nowhere so far. I would really appreciate any pointers. What I’m looking for is a function that, given two vectors, returns a similarity value from 0 to 1.
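
To make the desired interface concrete, here is a rough sketch of one possible direction: dynamic time warping (DTW), which tolerates notes being held for different lengths. Everything in it is an assumption rather than a tested solution: the 0.5 noise threshold, the function names, the choice to ignore time indices entirely (so silences between notes are not compared), and the normalisation, which assumes pitch values lie in [0, 1].

```python
import numpy as np


def to_dense(vector, min_pitch=0.5):
    """Keep only pitch values >= min_pitch (assumed noise cutoff), in time order."""
    return np.array([p for _, p in sorted(vector) if p >= min_pitch])


def dtw_cost(a, b):
    """Classic O(n*m) DTW: summed |a[i] - b[j]| along the cheapest alignment."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: repeat b[j], repeat a[i], or advance both.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def similarity(v1, v2, min_pitch=0.5):
    """Return a score in [0, 1]; 1.0 means the pitch sequences align perfectly."""
    a, b = to_dense(v1, min_pitch), to_dense(v2, min_pitch)
    if len(a) == 0 or len(b) == 0:
        return 0.0
    # The path has fewer than len(a) + len(b) steps, so this average stays
    # below 1 as long as pitch values are in [0, 1] (an assumption).
    avg = dtw_cost(a, b) / (len(a) + len(b))
    return float(max(0.0, 1.0 - avg))


# Usage with a few entries taken from each vector above.
first_part = [(41, 0.6289597749710083), (42, 0.6319441795349121),
              (53, 0.6117454171180725)]
second_part = [(59, 0.6356481313705444), (60, 0.634376585483551),
               (72, 0.6184465885162354)]
print(similarity(first_part, second_part))
```

Because the time indices are discarded, a pure time shift does not change the score, and DTW absorbs differences in note length. Whether the resulting number matches the subjective "around 95%" would need checking on the full vectors.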

Visual representation of the second vector: