# Deep Learning from Scratch to GPU – 14 – Learning a Regression

Once we step over the hype, what neural networks do is approximate functions. They
learn to do that from a lot of examples taken from the function’s output.
The real use case is learning to approximate functions that would be impossible to implement explicitly
in practice.

Their key ability is to approximate unknown functions. When I know a set of rules that map
inputs to outputs, I can think of many ways of implementing them in a programming language, and
all these ways are more efficient than using a neural network. If I don’t know the process
that transforms the inputs into the outputs, I can’t program it explicitly. If the process is known,
but the rules are numerous and it is not feasible to elicit them all, that can be hard to implement, too.

A typical example familiar to programmers would be an expert system.
Expert systems were a promising area of AI several
decades ago. The idea is to find an expert in a certain area, help her define the rules she
uses when making decisions, program these rules in fancy DSLs with an if/then/else flavor, and profit.
It turns out that expert time is expensive. Also, experts rely a lot on intuition; some rules work, but not
always, and they come with lots of ‘however’s. On top of that, the probabilistic nature of life kicks in, so
you can’t completely rely even on the rules that you can define.

Let’s say that I would like to analyze the traffic of a website to defend against malicious visitors.
I consult an expert, and he tells me most of the known ways of detecting them. I implement some filters:
if the user comes from this range of IP addresses, if he uses a web browser with a certain user agent, if
he comes via a proxy, if… Lots of rules are possible, and they would filter out a lot of unwanted traffic.
They would also filter out some wanted traffic. But, most importantly, the attackers also know a lot
of the detection rules, so they would adapt to get past them.
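
To make the explicit approach concrete, here is a minimal sketch of such a rule-based filter in plain Clojure. Everything in it is made up for illustration: the request map keys (`:ip`, `:user-agent`, `:via-proxy?`), the IP prefix (taken from a range reserved for documentation), and the `BadBot` marker are assumptions, not rules that a real expert gave me.

```clojure
(require '[clojure.string :as str])

;; Hypothetical hand-written rules; each rule is a predicate on a request map.
(def suspicious-rules
  [(fn [req] (str/starts-with? (:ip req) "203.0.113."))   ; made-up "bad" IP range
   (fn [req] (str/includes? (:user-agent req) "BadBot"))  ; made-up user agent marker
   (fn [req] (boolean (:via-proxy? req)))])               ; visitor came via a proxy

(defn suspicious?
  "Returns true when at least one rule matches the request."
  [req]
  (boolean (some #(% req) suspicious-rules)))

(suspicious? {:ip "203.0.113.7" :user-agent "Mozilla/5.0" :via-proxy? false})
;; => true

(suspicious? {:ip "198.51.100.4" :user-agent "Mozilla/5.0" :via-proxy? true})
;; => true, even if this is a legitimate visitor behind a proxy

(suspicious? {:ip "198.51.100.4" :user-agent "Mozilla/5.0" :via-proxy? false})
;; => false, which is also what an attacker who knows the rules would aim for
```

The second call is the kind of wanted traffic that gets filtered out, and the third is what an attacker gets after adapting to the rules.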

The approach that neural networks take is implicit. If I feed past data to the network, and label
which visitors were malicious, the network can figure out
how to recognize them on its own. Even better, it can figure out how to recognize traffic that it has never seen.
Of course, it may not do it perfectly, but it can learn this sort of stuff well enough.

To summarize, if I have lots of input/output examples, I can train a neural network to approximate
the unknown function that produced these examples.

We are going to do something less ambitious than website traffic filtering in this article.
We are going to train a neural network on a known function. Obviously, a neural network is not
the right tool for that job, since
it is much easier and more precise to code the function in Clojure right away.
It is a great first example, though, since it makes it easy to see what the network is doing.
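
As a rough sketch of where training on a known function starts, the snippet below codes a function directly in Clojure and uses it to produce input/output examples. The function `known-fn`, the helper `random-example`, the input range, and the number of examples are all arbitrary stand-ins chosen for illustration; the snippet uses plain Clojure data rather than any particular matrix library.

```clojure
;; An arbitrary stand-in for a known function of two inputs.
(defn known-fn [x y]
  (+ (Math/sin x) (* 0.5 y)))

(defn random-example
  "One example: random inputs from [0, 1) paired with the matching output."
  []
  (let [x (rand)
        y (rand)]
    {:input [x y] :output (known-fn x y)}))

;; 1000 input/output pairs that a network could be trained on.
(def examples (vec (repeatedly 1000 random-example)))

(first examples)
;; a map with :input and :output keys; the values differ on each run
```

Since the function is known, any prediction the network makes can be checked against the exact value that `known-fn` returns for the same inputs.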