How do neural networks work?
Neural networks represent an approach to developing solutions to problems by attempting to mimic the behaviors of neurons in a simplified brain. That’s one big confusing sentence. So let’s try to deconstruct it so that we can better understand what neural networks are all about. Let’s start with the problems they solve.
Given enough data, neural networks can learn pretty much anything. But for most people, their power becomes obvious when you get them to recognize or modify images. A lot of image problems are trivial, even boring, for us to solve, yet it is nearly impossible to explain how we solved them. Let’s try a thought experiment: try to recognize what’s in the image below!
If your answer is “cat”, great job! 🙂 But I wonder if you could tell me what exactly happened in your brain, and how it came to a conclusion even before taking in the whole picture. Quite literally, your brain skimmed over the image, saw a focused yellow blob, and flagged it as “cat”. It didn’t get distracted by the artsy chair in the corner or the raised cover hiding the power cord. Not even by the little bell on the rope for the cat to play with! Your brain just focused on the important information. And while at it, you probably also quickly decided that this is a cat, and not a kitten or a kitty.
In general, our brains seem to be much better at recognizing cats than at explaining why those things are cats. We just seem to have a hard time understanding how exactly we did it. And by extension, we are better at getting computers to learn how to recognize cats on their own than we are at explicitly teaching them how to do it. Machine learning in general attempts to learn how to solve problems without starting with explicit ways of solving them. And neural networks, in particular, work by mimicking the learning process of a simplified brain.
Brain-like learning in neural networks
Neural networks are inspired by the way neurons interact in a brain. That is, they form connections with other neurons. Based on input from one neuron, they fire and trigger some other neuron. And these connections get stronger or weaker based on how often they get used. This rather simple explanation is also how artificial neural networks work, except that they use simulated neurons instead of actual brain cells.
The underlying principle is quite simple to implement. A neuron takes some inputs, calculates an output based on some function, and passes it on to the next neuron in line. There is also a “threshold”: if the result is lower than the threshold, it doesn’t get passed forward. After every run, the network compares its predictions with the expected results and readjusts the connection strengths and thresholds to increase accuracy. This amounts to the network getting better at ignoring things that don’t work and valuing the connections that increase accuracy. This process repeats many times until the network gets good at predicting results based on the training data.
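To make the description above a bit more concrete, here is a minimal sketch of a single simulated neuron with a threshold, trained by trial and error. It uses the classic perceptron update rule as a stand-in for “readjusting after every run”; real networks stack many neurons and use more elaborate update rules. The names neuron, train and AND_GATE, as well as the learning rate and the number of epochs, are assumptions made up for this illustration.

```python
def neuron(inputs, weights, threshold):
    """Fire (return 1) only if the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def train(samples, epochs=20, learning_rate=0.5):
    """Perceptron-style training: adjust weights and threshold after every wrong guess."""
    weights = [0.0, 0.0]
    threshold = 0.0
    for _ in range(epochs):
        for inputs, expected in samples:
            error = expected - neuron(inputs, weights, threshold)
            # Strengthen or weaken each connection in proportion to its input.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            # A higher threshold makes firing harder, so it moves against the error.
            threshold -= learning_rate * error
    return weights, threshold


# Toy training data (an assumption for this sketch): fire only when both inputs are on.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, threshold = train(AND_GATE)
predictions = [neuron(inputs, weights, threshold) for inputs, _ in AND_GATE]
print(predictions)
```

Nobody tells the neuron what the rule is. It only sees examples and gets nudged whenever it guesses wrong, which is the same idea the paragraph above describes in words.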
If this seems a bit confusing, that’s alright. After the neural network is trained, its model gets validated against another dataset. We can measure the NN’s accuracy, but we can’t know the criteria and reasoning it used. The easiest way to think about it is that neural networks train intuition rather than rational decision making.
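Validating against another dataset could look roughly like the sketch below: part of the data is set aside before training, and accuracy is measured only on that held-out part. The helper names split and accuracy, the 25% validation share, and the toy data are all assumptions for illustration. The predict function is only a stand-in for a trained network.

```python
import random


def split(samples, validation_share=0.25, seed=0):
    """Shuffle the samples and set a share of them aside for validation."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_share))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, samples):
    """Share of samples for which the prediction matches the expected label."""
    correct = sum(1 for inputs, expected in samples if predict(inputs) == expected)
    return correct / len(samples)


# Toy data (made up): the label is 1 whenever the input is above 0.5.
samples = [(i / 20, 1 if i / 20 > 0.5 else 0) for i in range(20)]
training, validation = split(samples)


def predict(x):
    """Stand-in for a trained network; it learned a slightly wrong cut-off."""
    return 1 if x >= 0.4 else 0


print("validation accuracy:", accuracy(predict, validation))
```

The number tells us how often the model is right on data it has never seen. It says nothing about why it is right.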
And this is basically why (and how) they work. They use trial and error and try to get as accurate as possible. And they are very well adapted for solving problems where the explicit description of minute detail is impossible or impractical. This is why they seem to be a good way to get computers to recognize cats.
What are they commonly used for?
The most apparent modern examples have to do with image recognition. I say apparent because these are the sort of things that are trivial for humans but that computers struggle with. One of the significant NN deployments has to do with number, text, and handwriting recognition. Neural networks can be faster and better than humans at recognizing handwriting and are often deployed in letter sorting centers to automate shipment sorting. For a detailed example of number recognition, I’d recommend this free book. You can also commonly find them as workhorses for recommendation engines and other similar machine learning workloads.
The neural network pictured above can recreate an image to appear as if painted by one of several famous painters.
As for other interesting NN projects, take a look at Nvidia’s image upscaling. Their NN implementation can upscale images to such a degree that they look nearly identical to the much more costly full-size rendering. Then there is the music composer neural network. And if you think all these examples are just great, there is also a neural network to detect your sarcasm! Deepfakes have recently gained some notoriety because they allow us to swap people’s faces in movies; here’s a rather humorous example. They might also mean that soon we won’t be able to trust any picture or video that we find on the internet.
Small scale NN
All this sounds great so far, you might say, but I don’t work for a multi-billion dollar company. Is this only a playground for Fortune 500 companies and university labs? No, you can integrate the technology into your SME too, particularly if it’s somewhat IT-oriented. A large part of it comes down to what data you have and what challenges you face in your daily work.
For instance, if you’re in the B2B sector, with many clients and many years of contracts, orders, invoices, and payments you might use the data to train a neural network to predict your cashflow. You might add some info from a business registry and have the network estimate in how many days you can expect each debtor company to pay their invoices. Alternatively, you could use it for support tickets, to try to offer similar previously solved issues, or to suggest who to contact inside the organization concerning the problem at hand.
Chances are you have a bunch of data in your company, so try to find some problems that need that data to be solved. Then it’s a matter of finding the scope of the problem and an exact definition of it that a neural network can solve. It might not work for everything, but there is very likely at least one problem that can be solved using a neural network and inexpensively deployed on a server or a VM.
Silver bullet
So, they seem to be able to learn pretty much anything. Aren’t neural networks a real silver bullet, then? Well, not exactly. While they should be able to extrapolate pretty much anything, they require lots of data. Provided with enough data, they can tackle almost any problem, in theory.
Difficulties with defining the problem
You might corner yourself by not defining the problem at hand properly or by defining the wrong problem. This will leave you with a neural network that’s very good at solving a problem, just one whose answers are useless to you. Your experts can help you define a solvable problem, but they can’t help you pinpoint which problem in your company fits the NN’s needs.
I hate to repeat myself! The answer is 42, but the neural network can’t tell you what the question is. And sadly, this is the one step that lacks a simple how-to guide. Start with what data you have and what strategic considerations, challenges, and opportunities you expect. Somewhere in the intersection of those lies the starting point towards defining a problem solvable by a learning algorithm.
The issues with data used to train neural networks
Another major implementation challenge lies with the data itself. Any bias in the data or the development team will also show up in the final model. Neural networks are like kids: tell one how evil a particular group is and it’s going to grow up hating that group. While high-profile biases such as racism and sexism draw attention, smaller, less morally charged things can also significantly affect results. Whenever you have some “type” underrepresented, be it people, hardware vendors, or invoices with a percentage discount for early payment, you risk ending up with much lower accuracy in such subsegments.
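One way to spot this kind of problem is to measure accuracy per subsegment instead of only overall. The sketch below assumes each prediction is stored together with the group it belongs to. The function name accuracy_by_group, the record layout, and the vendor data are made up purely for illustration.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group for (group, expected, predicted) records."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, expected, predicted in records:
        totals[group] += 1
        hits[group] += int(expected == predicted)
    return {group: hits[group] / totals[group] for group in totals}


# Made-up example: vendor_b is underrepresented in the data.
records = (
    [("vendor_a", 1, 1)] * 7
    + [("vendor_a", 1, 0)]
    + [("vendor_b", 1, 1), ("vendor_b", 1, 0)]
)

result = accuracy_by_group(records)
print(result)
```

In this toy data the overall accuracy is 80%, which hides the fact that the model is right only half the time for the smaller group.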
While training such networks, as everywhere else in life, we strive to get results that make sense. And what makes sense depends on what we know and how we view the world. On one extreme, this can lead to actively harmful racist outcomes. On the other, it essentially means that the people training the neural network are the ones who bias it towards the results that make sense to them. To go back to the cat example, the NN’s solution is only as good as our understanding of “what are cats” while training the network.
While they show much potential, in the real world neural networks tend to be quite hard to implement. In the end, I’ll leave you with this example of a failed neural network project that cost IBM $60 million.
Further reading
We didn’t go into the math behind neural networks, so I recommend this book as a very good starting point on the math, philosophy, and coding of neural networks. Hit me up on Facebook or Twitter if you’d like me to cover the technical side of implementing NNs.
A nice article further explaining how neural networks work
Play with Google’s quickdraw neural network.