A useful tool
LCS for data mining
The exact same code to do supervised learning can be used for data mining. In fact, I’ve already shown that, with the “not-bit-4” example.
The approach is simple: Instead of training (and then testing with new data), suppose what you want to understand something about your data. Then you can show it all to the LCS for it to come up with rules. If you’re lucky, your data has hidden rules that you didn’t know about.
Demonstration
So we’ve seen it already, the “not-bit-4” example. Well, the LCS can try to find out for you:
But what if what you have is this? Not so easy to find the rule manually, right?
Maybe too easy? But it’s the concept that matters! Let’s try a different example. See if you can spot it:
The LCS only takes a few second…
And finds the rules (4 in this case):
That’s how the LCS here expresses “bit 4 XOR bit 5”. And yes, that was it.
Enough with the examples. But conceptually, the “Mux6” example for the last entry? Same-same.
And more to the point, I had 6 bits inputs here, but what if there are 10 bits? 20 bits? Would you be able to manually find similar rules? It would take me some time :D
Sure, the algorithm is slower with 20 bits instead of 6 (can you blame it?). But it’s machine time. :)
So what?
I actually have a use-case at work I want to test this on. As soon as I upload a first version of my project to GitHub, I’ll be able to test it there. I have inventories and some things about them I want to know whether I can explain with short, readable rules. I think the LCS might help. Although… Maybe there is no simple rule to be found, but that’s worth trying.
The trick there will be: How to encode the data into binary strings, as this is what my code accepts…? But I actually know how to go about that.
Conclusion
Well, I just like the idea that the same Supervised Learning algorithm can be used to do Data mining. And provide interpretable information about some datasets (if there is something to be found, that is).
Take that, Neural Networks! :D
Alright, I guess next week I’ll return to working on the package itself.