Does anybody have a public sample showing how to run NER on annotations coming from SageMaker Ground Truth NER?
Hey @OlivierCR,
Sorry, I don't have an example using NER annotations coming from SageMaker Ground Truth NER.
Using the example you sent me, here are the two formats side by side:
HF Datasets (conll2003)
{'chunk_tags': [11, 21, 11, 12, 21, 22, 11, 12, 0],
 'id': '0',
 'ner_tags': [3, 0, 7, 0, 0, 0, 7, 0, 0],
 'pos_tags': [22, 42, 16, 21, 35, 37, 16, 21, 7],
 'tokens': ['EU',
            'rejects',
            'German',
            'call',
            'to',
            'boycott',
            'British',
            'lamb',
            '.']}

SageMaker Ground Truth
{ "crowd-entity-annotation": { "entities": [
    { "endOffset": 26, "label": "software", "startOffset": 0 },
    { "endOffset": 38, "label": "version", "startOffset": 35 },
    { "endOffset": 88, "label": "software", "startOffset": 84 },
    { "endOffset": 90, "label": "version", "startOffset": 89 },
    { "endOffset": 93, "label": "version", "startOffset": 92 },
    { "endOffset": 100, "label": "version", "startOffset": 98 } ] } }
You could use load_dataset to load the JSON files coming from SM Ground Truth, then use dataset.map() to iterate over the records and convert them to the datasets format shown above for conll2003 (a list of tokens plus one ner_tags entry per token).
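
To make that conversion step more concrete, here is a rough sketch of a function you could pass to dataset.map(). It is not an official recipe and rests on several assumptions: each record also carries the raw text under a "source" key (the sample above only shows the annotation part), endOffset is exclusive, tokens are split on whitespace, and tags are emitted as BIO strings rather than the integer ids conll2003 uses. The function name to_token_tags is made up for illustration.

```python
import re


def to_token_tags(example, text_key="source", label_key="crowd-entity-annotation"):
    """Turn character-offset entities into token-level BIO tags.

    Assumptions (not confirmed by the thread): the text lives under
    `text_key`, `endOffset` is exclusive, and whitespace tokenization
    is good enough for the data at hand.
    """
    text = example[text_key]
    entities = example[label_key]["entities"]

    tokens, tags = [], []
    for match in re.finditer(r"\S+", text):
        start, end = match.span()
        tag = "O"  # default: token is outside every entity
        for ent in entities:
            # The token belongs to the entity if the two character spans overlap.
            if start < ent["endOffset"] and end > ent["startOffset"]:
                # The first token of an entity gets B-, later tokens get I-.
                prefix = "B" if start <= ent["startOffset"] else "I"
                tag = f"{prefix}-{ent['label']}"
                break
        tokens.append(match.group())
        tags.append(tag)

    return {"tokens": tokens, "ner_tags": tags}


if __name__ == "__main__":
    # Made-up record in the assumed shape, only to show the call.
    record = {
        "source": "SageMaker Ground Truth 1.0 released",
        "crowd-entity-annotation": {
            "entities": [
                {"startOffset": 0, "endOffset": 22, "label": "software"},
                {"startOffset": 23, "endOffset": 26, "label": "version"},
            ]
        },
    }
    print(to_token_tags(record))
```

With the datasets library this would be applied along the lines of load_dataset("json", data_files="output.manifest") followed by dataset.map(to_token_tags), where the file name is a placeholder. To get integer ner_tags like conll2003, you would additionally need to map each tag string to an id.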