I have been having the time of my life playing around with this! I just discovered it recently, and think it may come in handy for a project of mine that had plateaued.
I am looking for a way to determine whether, at any given point in a Twitch stream, the streamer is on-topic or exchanging pleasantries with their chat.
I was able to use Whisper’s transcription model to get a transcript with timestamps, but I’m failing to get an accurate answer to “is this science talk or chit-chat?”.
For example, for the following text:
even if they weren’t born with the equipment to glow. Let’s revisit the anglerfish moments before it tried to eat you. That glowing bait on top of its head? It comes from a pocket of skin called the eska. The eska holds bioluminescent bacteria. The anglerfish can’t glow there by itself, so it holds a sack of glowing bacteria instead. Remember the firefly? It can actually make itself glow. Inside its lantern are two chemicals. is how much the researchers have utilized them to see if there’s any efficiency differences. So they looked at the sequences that they identified, both the sequences for luciferase and luciferin are contained in the glowworm. So that’s why presumably they have the glowing and bioluminescent ability. Why that is in terms of efficiency, we don’t know. We haven’t like a comparative look. They haven’t isolated this particular molecule because there are some differences in terms of sequence. If it’s more or less efficient, then let’s say a firefly is one. Why? Hold on, Cricut. I used to know this. Okay, so yes, that’s what Kith is right. I did find, I have confirmed this was my suspicion. So in Latin, as Grimley points out, to loose, in Latin origin means light. Sorry, the word luxe, it’s the word luxe. Just like Lucifer’s nightclub in the hit Netflix TV show, Lucifer starring Tom Ellis. I got you, Cricket, I got you on the modern day fancy pants stuff, I got you. Chad, do you know about Loving This Cricket? She’s on her 1,000 follower push, she’s almost there, she’s nine. One, two, three, four, five, six, seven, eight, nine followers away from getting 1,000. Please go follow our lovely, amazing friend Cricket. Back to the Luciferase and Luciferin mix together in the presence of oxygen and fuel for the cell called ATP, the chemical reaction gives off energy in the form of light. 
Once scientists figured out how the Firefly creates its Luciferase and Luciferin, they used genetic engineering to make this light-producing reaction occur inside other living things that can’t glow. For example, they inserted the genes, or instructions, for a in theory apply. New Zealand and when the other place heavily studied was also Australia. But it’s actually suspected when that they were colonized originally from New Zealand to Australia by human intervention, for better or worse. The beauty of bioluminescence, unlike the light from the sun or an incandescent bulb, is that it’s not hot. It takes place in a range of temperatures that don’t burn a living thing. And unlike a glow stick which fades out as the chemicals inside get used up, bioluminescent reactions use replenishable resources. That’s one reason engineers are trying to develop bioluminescent trees. Just think, if planted on the side of highways, they could light the way using harm to the local animals because of the light pollution. But the counter argument was if they’re in places with already existing light pollution, it’s not going to make things worse, maybe. And but it would help with in terms of the environment and oxygen production and CO2 reduction and things along those lines. Not sure I want to glow. Drink more, cook less. you glow every day. I see the way you look at your wife. Talk about survival advantage. That could help our planet live longer. Do you find yourself thinking of other ways to put bioluminescence to good use? That glow stick you swing at a rave may help you find a mate. But how else can bioluminescence improve your survival? If you start thinking in this way, you have seen the light.
I would expect that to be categorized mostly as science or education, since there’s a lot of talk about the technical stuff and not much plain chatting. However, I’ve tried a couple of zero-shot classifiers, some popular Twitter classifiers, and summarizing and then classifying, and none of them gave a result without a strong bias towards greetings, which I would expect NOT to be the case.
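To make the setup concrete, here is a minimal sketch of the general shape (not my exact code): group Whisper’s timestamped segments into fixed-length windows, then label each window. The segment dicts with `start`, `end`, and `text` keys mirror what Whisper returns under `segments`. The helper names, the 60-second window, the label wording, the hypothesis template, and the `facebook/bart-large-mnli` checkpoint are all assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple


def _merge(segments: List[dict]) -> Dict[str, object]:
    """Collapse a run of segments into one window with joined text."""
    return {
        "start": segments[0]["start"],
        "end": segments[-1]["end"],
        "text": " ".join(s["text"].strip() for s in segments),
    }


def window_segments(segments: List[dict], window_s: float = 60.0) -> List[Dict[str, object]]:
    """Group consecutive timestamped segments into windows of at most window_s seconds
    (a single segment longer than window_s still gets its own window)."""
    windows, current = [], []
    for seg in segments:
        # Close the current window if adding this segment would exceed the limit.
        if current and seg["end"] - current[0]["start"] > window_s:
            windows.append(_merge(current))
            current = []
        current.append(seg)
    if current:
        windows.append(_merge(current))
    return windows


def classify_windows(
    windows: List[Dict[str, object]], classify: Callable[[str], str]
) -> List[Tuple[float, float, str]]:
    """Apply any text -> label function to each window, keeping the timestamps."""
    return [(w["start"], w["end"], classify(w["text"])) for w in windows]


def make_zero_shot_classifier(
    labels: List[str], model_name: str = "facebook/bart-large-mnli"
) -> Callable[[str], str]:
    """Build a text -> label function on top of the transformers zero-shot pipeline.
    The model name and hypothesis template are assumptions, not recommendations."""
    from transformers import pipeline  # imported lazily; only needed for this helper

    clf = pipeline("zero-shot-classification", model=model_name)

    def classify(text: str) -> str:
        result = clf(
            text,
            candidate_labels=labels,
            hypothesis_template="In this transcript the streamer is {}.",
        )
        return result["labels"][0]  # labels come back sorted by score, best first

    return classify
```

Usage would look something like `classify_windows(window_segments(result["segments"]), make_zero_shot_classifier(["explaining a scientific topic", "chatting casually with viewers"]))`. Keeping the classifier as a plain function makes it easy to swap in a different model (or a fine-tuned one) later without touching the windowing.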
I posted this in ChatGPT (3.5) with the following question:
Is this bit of text moreso a twitch streamer exchanging pleasantries with their viewership and chat, or a twitch streamer discussion a scientific topic?
To which it responded with:
Based on the provided text snippet, it appears to be more of a twitch streamer discussing a scientific topic related to bioluminescence and the mechanisms behind it. The streamer is mentioning various examples, such as the anglerfish, fireflies, and the chemical reactions involved in bioluminescence. They also touch on topics like genetic engineering and the potential applications of bioluminescence in areas such as environmental conservation. While there might be some interaction with the viewership and chat, the focus seems to be on delivering scientific information rather than exchanging pleasantries.
Which is MUCH more along the lines of what I was expecting an AI to be able to pick up on.
Now, obviously I have hours and hours worth of snippets like this to do, so I’d love to do it with HuggingFace, but am not sure what type of model I would have to look into to have a solid base. I don’t mind training one up, but would like to start with some type of basic foundational success before I dive into any one approach over the other…