A honeypot is a cybersecurity technique in which a bait (‘honey’) system or network is designed to emulate or act as a real system or network in order to divert malicious attacks away from the real one. The honeypot may mitigate, block, and in some cases capture the malicious behavior. The concept probably originated from two accounts, “The Cuckoo's Egg” by Clifford Stoll and “An Evening with Berferd” by Bill Cheswick, both describing the authors’ own efforts to catch computer hackers. The first publicly available honeypot was Fred Cohen's Deception ToolKit in 1998; since then, as malicious network attacks have become more prevalent, honeypots have grown in both use and sophistication.
Honeypot design and deployment is a tradeoff between realism and simplicity; this tradeoff can be characterized as the difference between high-interaction and low-interaction honeypots. A realistic design could use an actual operating system instrumented to detect and capture intruders (known as a high-interaction honeypot). However, detection is then greatly complicated, because it is difficult to distinguish normal traffic on the system from the attacker's. This is a low signal-to-noise detection problem: a modern operating system runs hundreds of threads that generate large volumes of traffic with complex signatures. A honeypot that is designed only to superficially mimic an OS (a low-interaction honeypot) can easily detect the attacker's actions, since there is no background noise. Unfortunately, the attacker can also recognize it as a decoy because of its inherent simplicity and shallowness. While low-interaction honeypots have evolved to reduce the chance of detection by implementing protocols more completely, some researchers consider this approach futile, because the attacker can detect honeypots more easily than the defender can create a plausible simulacrum. In fact, it can be argued that low-interaction honeypots can never be fully undetectable to attackers, since, by definition, they only partially simulate or emulate the service to be attacked.
Specifically, there are known methods for detecting low-interaction honeypots, some of them trivial. For example, the default setting for Beartrap, an FTP-based low-interaction honeypot, always returns an identifying banner, “220 BearTrap-ftps Service ready.” Conpot, another common honeypot based on SCADA emulation, uses the same implementation name (“Mouser Factory”) and the same serial number by default. Aside from these obvious examples, a common issue is that a honeypot emulating a service does not behave the way the real service does. For example, Honeyd, a platform for establishing multiple virtual honeypots of different server types, has service scripts for IIS and Linux/FTP. However, the IIS service script returns the same response to every GET request, reporting an abnormally long time since the latest update, and the Linux/FTP service script does not support the DELE command. Another example comes from Nova, which expands upon Honeyd to create emulations of complete machines. Its default Windows machine configuration does not have a NetBIOS service script, so it displays an open port and allows a connection but does not implement the service. These behaviors clearly indicate to attackers that they are dealing with a honeypot rather than a real machine.
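The default-string check described above can be sketched in a few lines of Python. This is only an illustration of how little effort such fingerprinting takes: the two marker strings are the ones quoted in the text, while the table layout and the function name `match_known_defaults` are hypothetical and not taken from any existing tool.

```python
from typing import List

# Marker strings quoted in the text above. The dictionary layout and the
# function name are illustrative assumptions, not part of any real scanner.
KNOWN_DEFAULTS = {
    "BearTrap": "220 BearTrap-ftps Service ready.",
    "Conpot": "Mouser Factory",
}


def match_known_defaults(response: str) -> List[str]:
    """Return the honeypots whose default marker string occurs in `response`."""
    return [name for name, marker in KNOWN_DEFAULTS.items() if marker in response]


if __name__ == "__main__":
    # A banner as an attacker might read it from a freshly opened FTP connection.
    print(match_known_defaults("220 BearTrap-ftps Service ready.\r\n"))
```

A check like this only catches unmodified defaults; the behavioral mismatches described above (identical GET responses, a missing DELE command) would need protocol-specific probes.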
A high-interaction honeypot is generally harder to detect than a low-interaction honeypot, but it can itself pose a threat. It is designed to allow infection, and unless the intrusion is detected quickly it can become a vector of attack against other systems. Running the honeypot in a virtual machine can contain the attacker, but this mode of operation can be detected, neutralizing the honeypot's effectiveness while incurring substantially higher operating costs than a low-interaction honeypot. Also, as mentioned above, it is difficult to distinguish actual attacks from normal activity (i.e., the low signal-to-noise issue) on a high-interaction honeypot.
Machine learning holds the promise of simulating protocols realistically enough to fool the attacker without compromising the system. This problem can be framed as a version of the Turing test, in which the attacker queries the system to see whether it is a decoy. In the Turing test, an interaction between a human and a machine is observed to see if the machine displays sufficient skill in imitating a human. Here we need to imitate a protocol. Large amounts of mined protocol commands and responses would serve as the training input to these machine learning systems, similar to what is done in building language translators and chatbots.
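As a rough illustration of what “mined protocol commands and responses” might look like as training input, the sketch below splits a recorded shell session into (command, response) pairs. The transcript format (commands prefixed by a `$ ` prompt) and the function name `pairs_from_transcript` are assumptions made for this example; real captures would need protocol-specific parsing.

```python
from typing import List, Tuple


def pairs_from_transcript(transcript: str, prompt: str = "$ ") -> List[Tuple[str, str]]:
    """Split a shell transcript into (command, response) pairs.

    Assumes every command line starts with `prompt` and that all lines up to
    the next prompt are the response to that command.
    """
    pairs = []
    command, output = None, []
    for line in transcript.splitlines():
        if line.startswith(prompt):
            # A new command begins: store the previous pair, if there was one.
            if command is not None:
                pairs.append((command, "\n".join(output)))
            command, output = line[len(prompt):], []
        elif command is not None:
            output.append(line)
    if command is not None:
        pairs.append((command, "\n".join(output)))
    return pairs


if __name__ == "__main__":
    session = "$ ls\nbin\netc\n$ cd etc\n$ pwd\n/etc"
    for cmd, resp in pairs_from_transcript(session):
        print(repr(cmd), "->", repr(resp))
```

Commands with no output, such as `cd etc` here, yield an empty response string, which is itself something a sequence model would need to learn.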
A type of recurrent neural network called an LSTM (long short-term memory) is a natural fit for a machine learning system that must capture patterns in sequences over time. LSTMs have been used to generate artificial examples of mathematical texts (algebraic geometry in LaTeX). Here we are interested in producing plausible responses to protocol requests such as ls or cd in SSH. LSTMs have an internal state that lets them remember the past, combined with an intelligent forgetting factor that allows them to continue to learn without saturating memory. It is plausible that these types of networks can be automatically trained and deployed at a much lower cost than the humans needed to continually patch the deficiencies in honeypot verisimilitude.
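The “internal state” and “forgetting factor” mentioned above correspond to the cell state and forget gate of the standard LSTM formulation. The sketch below implements one time step of a single-unit LSTM in plain Python to make those two pieces concrete. It is a toy under stated assumptions: scalar input and state, hand-picked weights, and a hypothetical weight layout `w[gate] = (w_x, w_h, bias)`. A practical system would use a deep learning library and learned weights.

```python
import math
from typing import Dict, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, c_prev: float,
              w: Dict[str, Tuple[float, float, float]]) -> Tuple[float, float]:
    """One time step of a single-unit LSTM; returns (h, c).

    `w` maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate: str) -> float:
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))   # fraction of the old cell state to keep
    i = sigmoid(pre("input"))    # fraction of the new candidate to admit
    o = sigmoid(pre("output"))   # fraction of the cell state to expose
    g = math.tanh(pre("cand"))   # candidate value computed from x and h_prev
    c = f * c_prev + i * g       # updated internal (cell) state
    h = o * math.tanh(c)         # output / hidden state
    return h, c


if __name__ == "__main__":
    # Illustrative weights only: a strongly positive forget bias keeps memory,
    # a strongly negative input bias ignores the new input.
    weights = {
        "forget": (0.0, 0.0, 20.0),
        "input": (0.0, 0.0, -20.0),
        "output": (0.0, 0.0, 0.0),
        "cand": (1.0, 0.0, 0.0),
    }
    print(lstm_step(x=0.7, h_prev=0.0, c_prev=1.0, w=weights))
```

With the forget gate near 1 and the input gate near 0, the cell state passes through almost unchanged; shifting those biases the other way makes the unit overwrite its memory with the new input, which is the mechanism behind the forgetting behavior described in the text.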